00:12 ephemer0l has quit IRC (Ping timeout: 615 seconds)
00:19 ephemer0l has joined #archiveteam-ot
00:23 icedice has quit IRC (Quit: Leaving)
01:07 ATrescue has quit IRC (Ping timeout: 260 seconds)
02:16 killsushi has quit IRC (Quit: Leaving)
02:33 Despatche has quit IRC (Quit: Read error: Connection reset by deer)
03:20 Stilettoo has joined #archiveteam-ot
03:21 <Flashfire> WHO WANTS A LEGO BATMAN STEAM KEY?
03:22 Stiletto has quit IRC (Ping timeout: 268 seconds)
03:23 <ayanami_> idk does it work on linux
03:23 <ayanami_> Flashfire
03:24 odemg has quit IRC (Ping timeout: 615 seconds)
03:24 <ayanami_> wait which lego batman is it? Flashfire
03:25 <Flashfire> The first one
03:25 <Flashfire> It doesnt look to run natively on linux
03:26 <ayanami_> alright
03:26 <ayanami_> maybe I'll ask if a friend wants one
03:26 <ayanami_> can I do that?
03:27 <Flashfire> I mean Id prefer if it went directly to people here as I have also asked in a few discord servers i am in ayanami_ ask again in maybe half an hour and if nobody wanted it or asked for it you can give it to your friend
03:28 <ayanami_> alright, fair to me
03:30 odemg has joined #archiveteam-ot
03:37 ayanami_ has quit IRC (Quit: Leaving)
03:55 m007a83 has joined #archiveteam-ot
05:38 Stilettoo is now known as Stiletto
05:43 dhyan_nat has joined #archiveteam-ot
06:17 Odd0002_ has joined #archiveteam-ot
06:19 Odd0002 has quit IRC (Read error: Operation timed out)
06:19 Odd0002_ is now known as Odd0002
07:43 Zerote has joined #archiveteam-ot
08:42 chirlu` has quit IRC (Read error: Operation timed out)
08:44 chirlu` has joined #archiveteam-ot
09:07 VerifiedJ has joined #archiveteam-ot
09:10 Verified_ has quit IRC (Ping timeout: 252 seconds)
09:48 Verified_ has joined #archiveteam-ot
10:55 dhyan_nat has quit IRC (Read error: Operation timed out)
11:12 Despatche has joined #archiveteam-ot
11:17 dhyan_nat has joined #archiveteam-ot
12:05 astrid has quit IRC (Read error: Operation timed out)
12:06 astrid has joined #archiveteam-ot
12:17 astrid has quit IRC (Ping timeout: 360 seconds)
12:27 icedice has joined #archiveteam-ot
12:44 BlueMax has quit IRC (Quit: Leaving)
12:49 astrid has joined #archiveteam-ot
12:49 Fusl sets mode: +o astrid
13:17 icedice has quit IRC (Read error: Operation timed out)
14:22 Zerote has quit IRC (Read error: Operation timed out)
14:38 morgan_ has quit IRC (Ping timeout: 263 seconds)
15:10 vitzli has joined #archiveteam-ot
15:11 vitzli has quit IRC (Client Quit)
15:49 Oddly has joined #archiveteam-ot
15:57 drcd has joined #archiveteam-ot
17:54 astrid has quit IRC (Read error: Operation timed out)
18:17 Oddly has quit IRC (Read error: Operation timed out)
18:24 Zerote has joined #archiveteam-ot
19:35 killsushi has joined #archiveteam-ot
19:39 MrRadar has quit IRC (Quit: Rebooting)
19:42 MrRadar has joined #archiveteam-ot
20:41 dhyan_nat has quit IRC (Read error: Operation timed out)
21:29 drcd has quit IRC (Read error: Connection reset by peer)
22:04 apt-get has joined #archiveteam-ot
22:04 <apt-get> Heya!
22:04 <apt-get> Got a question for those used to how the wayback machine works
22:05 <JAA> E: Invalid operation Heya!
22:05 <JAA> ;-)
22:05 <apt-get> heh
22:05 <apt-get> If a website has apparently been crawled two times, but neither attempt to access it on the calendar works, is there still a way to retrieve the page?
22:06 <apt-get> (website in question: http://imoutogensou.blog100.fc2.com/ )
22:08 <JAA> I'd say "try again in a few days" as that fixed it for me before, but if it has been like that for a while, it's probably worth sending an email to IA.
22:10 <apt-get> Alright, I'll try that. (attempting to access the site map gets me a 500 error as well...)
22:11 <ivan> not in https://www.webarchive.org.uk/wayback/archive/*/http://imoutogensou.blog100.fc2.com/ or http://webarchive.loc.gov/all/19960101000000-20180421235959*/http://imoutogensou.blog100.fc2.com/
22:11 <ivan> doesn't appear to be inoreader's feed cache
22:13 <ivan> http://web.archive.org/web/*/http://imoutogensou.blog100.fc2.com/* doesn't seem to have other pages
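[The capture check ivan does through the calendar/wildcard pages can also be scripted against the Wayback CDX server. A minimal sketch, assuming the publicly documented CDX query parameters (`url`, `matchType`, `output`, `limit`); the sample response below is illustrative only, not real capture data:]

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def cdx_query_url(url, match_type="prefix", limit=50):
    """Build a CDX server query URL; output=json returns rows as JSON arrays."""
    params = {
        "url": url,
        "matchType": match_type,  # "exact", "prefix", "host", or "domain"
        "output": "json",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)

def parse_cdx_json(text):
    """CDX JSON output is a list of rows; the first row is the field header."""
    rows = json.loads(text)
    if not rows:
        return []
    header = rows[0]
    return [dict(zip(header, row)) for row in rows[1:]]

# Illustrative response of the documented shape (made-up values):
sample = json.dumps([
    ["urlkey", "timestamp", "original", "statuscode"],
    ["com,example)/", "20170101000000", "http://example.com/", "200"],
])
captures = parse_cdx_json(sample)
```

[An empty row list means the CDX index knows of no captures for the query, which is a stronger signal than a broken calendar page.]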
22:14 <ivan> in summary you is hosed
22:15 <ivan> apt-get: same site? http://web.archive.org/web/20130327143033/http://imoutogensou.info/
22:17 <apt-get> different blog, I think
22:18 <apt-get> the fc2 one seemed to be up until roughly ~2017... according to links I've found around
22:18 <apt-get> the .info site closed much earlier
22:20 <apt-get> but yeah it's from the same person
23:15 <ivan> apt-get: I think JAA is right, I tried wayback just now with a URL I know is present and it has the same broken behavior right now
23:21 <apt-get> I'll try contacting them tomorrow if it's not fixed, then.
23:23 <JAA> I've had cases before where most things worked fine, but a few particular URLs I had just saved through the WBM gave me the "not saved yet" error. A few days later, my screenshot was visible.
23:23 <ivan> apt-get: when it's fixed this might also have content http://web.archive.org/web/20130629162858/https://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fimoutogensou.blog100.fc2.com%2F%3Fxml?r=n&n=1000&hl=en&likes=true&comments=true&client=ArchiveTeam
23:24 <JAA> I've also had working snapshots disappear from the WBM. I was writing with Mark about this a long time ago, and he told me they had a number of internal index problems. This was 3 years ago or something (before I even showed up here), so I'd assume it was fixed at some point, but well, I'm still seeing those issues from time to time.
23:25 BlueMax has joined #archiveteam-ot
23:25 <apt-get> Ohh, thanks for the additional url
23:26 <JAA> But yeah, any issues right now might still be related to the recent power outage which broke various things.
23:26 <apt-get> Ah, damn, did something happen to the IA's physical location?
23:27 <JAA> No clue what the problem was.
23:28 jeekl has quit IRC (Ping timeout: 615 seconds)
23:28 <JAA> ivan: FYI, I'm *finally* working on testing that prioritisation code.
23:28 <JAA> s/code/code's performance/
23:29 <JAA> Using a nice real-world DB with just over 50 million URLs.
23:30 <ivan> JAA: took me a minute to even remember what you were talking about, but cool :-)
23:30 <JAA> Haha, yeah, I delayed this for far too long.
23:40 Stilettoo has joined #archiveteam-ot
23:42 Stiletto has quit IRC (Ping timeout: 265 seconds)
23:43 <JAA> ivan: Welp, performance is terrible.
23:44 ayanami_ has joined #archiveteam-ot
23:45 <JAA> The old code's one is bad enough already, taking 10 seconds to check out and back in a thousand entries roughly 30 % into the table (5M done + 10M skipped of 50M total). I killed the new code after 2.5 minutes.
23:47 <ivan> womp
23:47 Zerote has quit IRC (Read error: Operation timed out)
23:47 <JAA> Oh wait.
23:48 <JAA> Yeah, no surprise there.
23:49 <JAA> The old DB doesn't have an index over priority, so it has to do a full table scan for every checkout.
23:50 <JAA> It managed 4 checkouts in 2.5 minutes. lol
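[JAA's diagnosis — an ORDER-BY-priority checkout against a table with no priority index degenerates into a full scan — can be reproduced with any SQL engine's query planner. A sketch using Python's stdlib sqlite3; the `urls` table and its columns are invented for illustration, since the actual schema isn't shown in the log:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        url TEXT,
        priority INTEGER,
        checked_out INTEGER DEFAULT 0
    )
""")

def checkout_plan():
    # EXPLAIN QUERY PLAN reveals whether the checkout query scans or seeks.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, url FROM urls WHERE checked_out = 0 "
        "ORDER BY priority LIMIT 1000"
    ).fetchall()
    return " ".join(r[-1] for r in rows)  # last column is the plan detail

before = checkout_plan()  # without an index: full table scan
conn.execute("CREATE INDEX idx_urls_priority ON urls (checked_out, priority)")
after = checkout_plan()   # with the index: an index search instead
```

[The composite index also satisfies the ORDER BY, so the planner can skip the sort step as well as the scan; that is why a single `CREATE INDEX` can turn "4 checkouts in 2.5 minutes" into something usable.]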
23:55 <JAA> But hey, at least HTML parsing wouldn't be the main performance bottleneck anymore. :-)
23:59 <JAA> I have to say that I really hate SQLAlchemy. I had nothing but issues with it when trying to get that distributed wpull version with a central high-ish latency PostgreSQL DB to run smoothly which all went away as soon as I just wrote everything directly with psycopg2.
23:59 <JAA> I'm sure it's possible to write performant code with SQLAlchemy, but getting familiar with psycopg2 (had never worked with it before) and writing the queries manually was less effort for me.
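[JAA doesn't show his code; purely as an illustration of the hand-written-query style he describes, here is a minimal batch-checkout sketch. It uses stdlib sqlite3 so it runs standalone; the schema and the `checkout_batch` helper are hypothetical, and with psycopg2 the placeholders would be `%s` against a PostgreSQL connection instead of `?`:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT,
                       priority INTEGER, checked_out INTEGER DEFAULT 0);
    INSERT INTO urls (url, priority) VALUES
        ('http://example.com/a', 2),
        ('http://example.com/b', 1),
        ('http://example.com/c', 3);
""")

def checkout_batch(conn, limit):
    """Claim the highest-priority unclaimed URLs (lower number = sooner)."""
    with conn:  # one transaction: select the batch, then mark it claimed
        rows = conn.execute(
            "SELECT id, url FROM urls WHERE checked_out = 0 "
            "ORDER BY priority LIMIT ?", (limit,)
        ).fetchall()
        conn.executemany(
            "UPDATE urls SET checked_out = 1 WHERE id = ?",
            [(r[0],) for r in rows],
        )
    return rows

batch = checkout_batch(conn, 2)
```

[Writing the two statements directly like this keeps the round trips and the generated SQL fully visible, which is the advantage JAA is pointing at when comparing psycopg2 to the ORM.]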