#archiveteam-ot 2019-05-01,Wed


Time Nickname Message
00:12 🔗 ephemer0l has quit IRC (Ping timeout: 615 seconds)
00:19 🔗 ephemer0l has joined #archiveteam-ot
00:23 🔗 icedice has quit IRC (Quit: Leaving)
01:07 🔗 ATrescue has quit IRC (Ping timeout: 260 seconds)
02:16 🔗 killsushi has quit IRC (Quit: Leaving)
02:33 🔗 Despatche has quit IRC (Quit: Read error: Connection reset by deer)
03:20 🔗 Stilettoo has joined #archiveteam-ot
03:21 🔗 Flashfire WHO WANTS A LEGO BATMAN STEAM KEY?
03:22 🔗 Stiletto has quit IRC (Ping timeout: 268 seconds)
03:23 🔗 ayanami_ idk does it work on linux
03:23 🔗 ayanami_ Flashfire
03:24 🔗 odemg has quit IRC (Ping timeout: 615 seconds)
03:24 🔗 ayanami_ wait which lego batman is it? Flashfire
03:25 🔗 Flashfire The first one
03:25 🔗 Flashfire It doesn't look like it runs natively on Linux
03:26 🔗 ayanami_ alright
03:26 🔗 ayanami_ maybe I'll ask if a friend wants one
03:26 🔗 ayanami_ can I do that?
03:27 🔗 Flashfire I mean, I'd prefer if it went directly to people here, as I have also asked in a few Discord servers I am in. ayanami_: ask again in maybe half an hour, and if nobody has asked for it, you can give it to your friend
03:28 🔗 ayanami_ alright, fair to me
03:30 🔗 odemg has joined #archiveteam-ot
03:37 🔗 ayanami_ has quit IRC (Quit: Leaving)
03:55 🔗 m007a83 has joined #archiveteam-ot
05:38 🔗 Stilettoo is now known as Stiletto
05:43 🔗 dhyan_nat has joined #archiveteam-ot
06:17 🔗 Odd0002_ has joined #archiveteam-ot
06:19 🔗 Odd0002 has quit IRC (Read error: Operation timed out)
06:19 🔗 Odd0002_ is now known as Odd0002
07:43 🔗 Zerote has joined #archiveteam-ot
08:42 🔗 chirlu` has quit IRC (Read error: Operation timed out)
08:44 🔗 chirlu` has joined #archiveteam-ot
09:07 🔗 VerifiedJ has joined #archiveteam-ot
09:10 🔗 Verified_ has quit IRC (Ping timeout: 252 seconds)
09:48 🔗 Verified_ has joined #archiveteam-ot
10:55 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
11:12 🔗 Despatche has joined #archiveteam-ot
11:17 🔗 dhyan_nat has joined #archiveteam-ot
12:05 🔗 astrid has quit IRC (Read error: Operation timed out)
12:06 🔗 astrid has joined #archiveteam-ot
12:17 🔗 astrid has quit IRC (Ping timeout: 360 seconds)
12:27 🔗 icedice has joined #archiveteam-ot
12:44 🔗 BlueMax has quit IRC (Quit: Leaving)
12:49 🔗 astrid has joined #archiveteam-ot
12:49 🔗 Fusl sets mode: +o astrid
13:17 🔗 icedice has quit IRC (Read error: Operation timed out)
14:22 🔗 Zerote has quit IRC (Read error: Operation timed out)
14:38 🔗 morgan_ has quit IRC (Ping timeout: 263 seconds)
15:10 🔗 vitzli has joined #archiveteam-ot
15:11 🔗 vitzli has quit IRC (Client Quit)
15:49 🔗 Oddly has joined #archiveteam-ot
15:57 🔗 drcd has joined #archiveteam-ot
17:54 🔗 astrid has quit IRC (Read error: Operation timed out)
18:17 🔗 Oddly has quit IRC (Read error: Operation timed out)
18:24 🔗 Zerote has joined #archiveteam-ot
19:35 🔗 killsushi has joined #archiveteam-ot
19:39 🔗 MrRadar has quit IRC (Quit: Rebooting)
19:42 🔗 MrRadar has joined #archiveteam-ot
20:41 🔗 dhyan_nat has quit IRC (Read error: Operation timed out)
21:29 🔗 drcd has quit IRC (Read error: Connection reset by peer)
22:04 🔗 apt-get has joined #archiveteam-ot
22:04 🔗 apt-get Heya!
22:04 🔗 apt-get Got a question for those used to how the wayback machine works
22:05 🔗 JAA E: Invalid operation Heya!
22:05 🔗 JAA ;-)
22:05 🔗 apt-get heh
22:05 🔗 apt-get If a website has apparently been crawled two times, but neither attempt to access it on the calendar works, is there still a way to retrieve the page?
22:06 🔗 apt-get (website in question: http://imoutogensou.blog100.fc2.com/ )
22:08 🔗 JAA I'd say "try again in a few days" as that fixed it for me before, but if it has been like that for a while, it's probably worth sending an email to IA.
22:10 🔗 apt-get Alright, I'll try that. (attempting to access the site map gets me a 500 error as well...)
22:11 🔗 ivan not in https://www.webarchive.org.uk/wayback/archive/*/http://imoutogensou.blog100.fc2.com/ or http://webarchive.loc.gov/all/19960101000000-20180421235959*/http://imoutogensou.blog100.fc2.com/
22:11 🔗 ivan doesn't appear to be in inoreader's feed cache
22:13 🔗 ivan http://web.archive.org/web/*/http://imoutogensou.blog100.fc2.com/* doesn't seem to have other pages
22:14 🔗 ivan in summary you is hosed
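
The wildcard lookup ivan ran above (web.archive.org/web/*/<prefix>*) can also be expressed as a query against the Wayback Machine's CDX server. The following is only a minimal sketch of how such a query URL might be built; the helper name cdx_query_url and the default limit are assumptions made for illustration, and nothing in the log says ivan used this approach.

```python
from urllib.parse import urlencode

# Public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def cdx_query_url(url_prefix: str, limit: int = 50) -> str:
    """Build a CDX query listing captures whose URL starts with url_prefix.

    Hypothetical helper for illustration; it only builds the URL and
    performs no network request.
    """
    params = {
        "url": url_prefix,
        "matchType": "prefix",  # match every URL under the prefix
        "output": "json",       # ask for JSON rows instead of plain text
        "limit": limit,         # cap the number of rows returned
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(cdx_query_url("imoutogensou.blog100.fc2.com/"))
```

Fetching the resulting URL (for example with urllib.request.urlopen) would list any captures under that prefix, subject to the same index problems discussed in the log.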
22:15 🔗 ivan apt-get: same site? http://web.archive.org/web/20130327143033/http://imoutogensou.info/
22:17 🔗 apt-get different blog, I think
22:18 🔗 apt-get the fc2 one seemed to be up until roughly 2017, according to links I've found around
22:18 🔗 apt-get the .info site closed much earlier
22:20 🔗 apt-get but yeah it's from the same person
23:15 🔗 ivan apt-get: I think JAA is right, I tried wayback just now with a URL I know is present and it has the same broken behavior right now
23:21 🔗 apt-get I'll try contacting them tomorrow if it's not fixed, then.
23:23 🔗 JAA I've had cases before where most things worked fine, but a few particular URLs I had just saved through the WBM gave me the "not saved yet" error. A few days later, my screenshot was visible.
23:23 🔗 ivan apt-get: when it's fixed this might also have content http://web.archive.org/web/20130629162858/https://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fimoutogensou.blog100.fc2.com%2F%3Fxml?r=n&n=1000&hl=en&likes=true&comments=true&client=ArchiveTeam
23:24 🔗 JAA I've also had working snapshots disappear from the WBM. I was writing with Mark about this a long time ago, and he told me they had a number of internal index problems. This was 3 years ago or something (before I even showed up here), so I'd assume it was fixed at some point, but well, I'm still seeing those issues from time to time.
23:25 🔗 BlueMax has joined #archiveteam-ot
23:25 🔗 apt-get Ohh, thanks for the additional url
23:25 🔗 JAA But yeah, any issues right now might still be related to the recent power outage which broke various things.
23:26 🔗 apt-get Ah, damn, did something happen to the IA's physical location?
23:26 🔗 JAA No clue what the problem was.
23:27 🔗 jeekl has quit IRC (Ping timeout: 615 seconds)
23:28 🔗 JAA ivan: FYI, I'm *finally* working on testing that prioritisation code.
23:28 🔗 JAA s/code/code's performance/
23:29 🔗 JAA Using a nice real-world DB with just over 50 million URLs.
23:30 🔗 ivan JAA: took me a minute to even remember what you were talking about, but cool :-)
23:30 🔗 JAA Haha, yeah, I delayed this for far too long.
23:40 🔗 Stilettoo has joined #archiveteam-ot
23:42 🔗 Stiletto has quit IRC (Ping timeout: 265 seconds)
23:43 🔗 JAA ivan: Welp, performance is terrible.
23:44 🔗 ayanami_ has joined #archiveteam-ot
23:45 🔗 JAA The old code's performance is bad enough already: it takes 10 seconds to check out and back in a thousand entries roughly 30 % into the table (5M done + 10M skipped of 50M total). I killed the new code after 2.5 minutes.
23:47 🔗 ivan womp
23:47 🔗 Zerote has quit IRC (Read error: Operation timed out)
23:47 🔗 JAA Oh wait.
23:47 🔗 JAA Yeah, no surprise there.
23:48 🔗 JAA The old DB doesn't have an index over priority, so it has to do a full table scan for every checkout.
23:49 🔗 JAA It managed 4 checkouts in 2.5 minutes. lol
23:50 🔗 JAA But hey, at least HTML parsing wouldn't be the main performance bottleneck anymore. :-)
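
JAA's diagnosis (no index over priority, so every checkout scans the whole table) can be reproduced on a toy scale. The sketch below assumes SQLite via Python's sqlite3 module and a made-up schema; the table name queue, its columns, and the index name queue_status_priority are all hypothetical and are not taken from the actual database discussed here. It compares the query plan of a priority-ordered checkout with and without an index.

```python
import sqlite3

# Hypothetical checkout query: pick the highest-priority unfinished entries.
CHECKOUT_SQL = (
    "SELECT id FROM queue WHERE status = 'todo' "
    "ORDER BY priority DESC LIMIT 1000"
)


def checkout_plan(with_index: bool) -> str:
    """Return SQLite's query plan for the checkout query as one string."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE queue ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL, "
        "status TEXT NOT NULL, priority INTEGER NOT NULL)"
    )
    # Fill the table with synthetic rows; two thirds stay 'todo'.
    conn.executemany(
        "INSERT INTO queue (url, status, priority) VALUES (?, ?, ?)",
        (
            (f"http://example.org/{i}", "todo" if i % 3 else "done", i % 10)
            for i in range(10_000)
        ),
    )
    if with_index:
        # Composite index so the filter and the ordering can both use it.
        conn.execute(
            "CREATE INDEX queue_status_priority ON queue (status, priority)"
        )
    rows = conn.execute("EXPLAIN QUERY PLAN " + CHECKOUT_SQL).fetchall()
    conn.close()
    # The last column of each plan row holds the human-readable detail.
    return " | ".join(row[3] for row in rows)


if __name__ == "__main__":
    print("without index:", checkout_plan(with_index=False))
    print("with index:   ", checkout_plan(with_index=True))
```

Without the index the plan reports a scan of the whole table; with it, the plan names the index instead. The exact wording of the plan varies between SQLite versions.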
23:55 🔗 JAA I have to say that I really hate SQLAlchemy. I had nothing but issues with it when trying to get that distributed wpull version with a central high-ish latency PostgreSQL DB to run smoothly, which all went away as soon as I just wrote everything directly with psycopg2.
23:59 🔗 JAA I'm sure it's possible to write performant code with SQLAlchemy, but getting familiar with psycopg2 (had never worked with it before) and writing the queries manually was less effort for me.
