[00:12] *** ephemer0l has quit IRC (Ping timeout: 615 seconds)
[00:19] *** ephemer0l has joined #archiveteam-ot
[00:23] *** icedice has quit IRC (Quit: Leaving)
[01:07] *** ATrescue has quit IRC (Ping timeout: 260 seconds)
[02:16] *** killsushi has quit IRC (Quit: Leaving)
[02:33] *** Despatche has quit IRC (Quit: Read error: Connection reset by deer)
[03:20] *** Stilettoo has joined #archiveteam-ot
[03:21] WHO WANTS A LEGO BATMAN STEAM KEY?
[03:22] *** Stiletto has quit IRC (Ping timeout: 268 seconds)
[03:23] idk does it work on linux
[03:23] Flashfire
[03:24] *** odemg has quit IRC (Ping timeout: 615 seconds)
[03:24] wait which lego batman is it? Flashfire
[03:25] The first one
[03:25] It doesnt look to run natively on linux
[03:26] alright
[03:26] maybe I'll ask if a friend wants one
[03:26] can I do that?
[03:27] I mean Id prefer if it went directly to people here as I have also asked in a few discord servers i am in ayanami_ ask again in maybe half an hour and if nobody wanted it or asked for it you can give it to your friend
[03:28] alright, fair to me
[03:30] *** odemg has joined #archiveteam-ot
[03:37] *** ayanami_ has quit IRC (Quit: Leaving)
[03:55] *** m007a83 has joined #archiveteam-ot
[05:38] *** Stilettoo is now known as Stiletto
[05:43] *** dhyan_nat has joined #archiveteam-ot
[06:17] *** Odd0002_ has joined #archiveteam-ot
[06:19] *** Odd0002 has quit IRC (Read error: Operation timed out)
[06:19] *** Odd0002_ is now known as Odd0002
[07:43] *** Zerote has joined #archiveteam-ot
[08:42] *** chirlu` has quit IRC (Read error: Operation timed out)
[08:44] *** chirlu` has joined #archiveteam-ot
[09:07] *** VerifiedJ has joined #archiveteam-ot
[09:10] *** Verified_ has quit IRC (Ping timeout: 252 seconds)
[09:48] *** Verified_ has joined #archiveteam-ot
[10:55] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[11:12] *** Despatche has joined #archiveteam-ot
[11:17] *** dhyan_nat has joined #archiveteam-ot
[12:05] *** astrid has quit IRC (Read error: Operation timed out)
[12:06] *** astrid has joined #archiveteam-ot
[12:17] *** astrid has quit IRC (Ping timeout: 360 seconds)
[12:27] *** icedice has joined #archiveteam-ot
[12:44] *** BlueMax has quit IRC (Quit: Leaving)
[12:49] *** astrid has joined #archiveteam-ot
[12:49] *** Fusl sets mode: +o astrid
[13:17] *** icedice has quit IRC (Read error: Operation timed out)
[14:22] *** Zerote has quit IRC (Read error: Operation timed out)
[14:38] *** morgan_ has quit IRC (Ping timeout: 263 seconds)
[15:10] *** vitzli has joined #archiveteam-ot
[15:11] *** vitzli has quit IRC (Client Quit)
[15:49] *** Oddly has joined #archiveteam-ot
[15:57] *** drcd has joined #archiveteam-ot
[17:54] *** astrid has quit IRC (Read error: Operation timed out)
[18:17] *** Oddly has quit IRC (Read error: Operation timed out)
[18:24] *** Zerote has joined #archiveteam-ot
[19:35] *** killsushi has joined #archiveteam-ot
[19:39] *** MrRadar has quit IRC (Quit: Rebooting)
[19:42] *** MrRadar has joined #archiveteam-ot
[20:41] *** dhyan_nat has quit IRC (Read error: Operation timed out)
[21:29] *** drcd has quit IRC (Read error: Connection reset by peer)
[22:04] *** apt-get has joined #archiveteam-ot
[22:04] Heya!
[22:04] Got a question for those used to how the wayback machine works
[22:05] E: Invalid operation Heya!
[22:05] ;-)
[22:05] heh
[22:05] If a website has apparently been crawled two times, but neither attempt to access it on the calendar works, is there still a way to retrieve the page?
[22:06] (website in question: http://imoutogensou.blog100.fc2.com/ )
[22:08] I'd say "try again in a few days" as that fixed it for me before, but if it has been like that for a while, it's probably worth sending an email to IA.
[22:10] Alright, I'll try that. (attempting to access the site map gets me a 500 error as well...)
[22:11] not in https://www.webarchive.org.uk/wayback/archive/*/http://imoutogensou.blog100.fc2.com/ or http://webarchive.loc.gov/all/19960101000000-20180421235959*/http://imoutogensou.blog100.fc2.com/
[22:11] doesn't appear to be inoreader's feed cache
[22:13] http://web.archive.org/web/*/http://imoutogensou.blog100.fc2.com/* doesn't seem to have other pages
[22:14] in summary you is hosed
[22:15] apt-get: same site? http://web.archive.org/web/20130327143033/http://imoutogensou.info/
[22:17] different blog, I think
[22:18] the fc2 one seemed to be up until roughly ~2017... according to links I've found around
[22:18] the .info site closed much earlier
[22:20] but yeah it's from the same person
[23:15] apt-get: I think JAA is right, I tried wayback just now with a URL I know is present and it has the same broken behavior right now
[23:21] I'll try contacting them tomorrow if it's not fixed, then.
[23:23] I've had cases before where most things worked fine, but a few particular URLs I had just saved through the WBM gave me the "not saved yet" error. A few days later, my screenshot was visible.
[23:23] apt-get: when it's fixed this might also have content http://web.archive.org/web/20130629162858/https://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fimoutogensou.blog100.fc2.com%2F%3Fxml?r=n&n=1000&hl=en&likes=true&comments=true&client=ArchiveTeam
[23:24] I've also had working snapshots disappear from the WBM. I was writing with Mark about this a long time ago, and he told me they had a number of internal index problems. This was 3 years ago or something (before I even showed up here), so I'd assume it was fixed at some point, but well, I'm still seeing those issues from time to time.
[23:25] *** BlueMax has joined #archiveteam-ot
[23:25] Ohh, thanks for the additional url
[23:25] But yeah, any issues right now might still be related to the recent power outage which broke various things.
[23:26] Ah, damn, did something happen to the IA's physical location?
[23:26] No clue what the problem was.
[23:27] *** jeekl has quit IRC (Ping timeout: 615 seconds)
[23:28] ivan: FYI, I'm *finally* working on testing that prioritisation code.
[23:28] s/code/code's performance/
[23:29] Using a nice real-world DB with just over 50 million URLs.
[23:30] JAA: took me a minute to even remember what you were talking about, but cool :-)
[23:30] Haha, yeah, I delayed this for far too long.
[23:40] *** Stilettoo has joined #archiveteam-ot
[23:42] *** Stiletto has quit IRC (Ping timeout: 265 seconds)
[23:43] ivan: Welp, performance is terrible.
[23:44] *** ayanami_ has joined #archiveteam-ot
[23:45] The old code's one is bad enough already, taking 10 seconds to check out and back in a thousand entries roughly 30 % into the table (5M done + 10M skipped of 50M total). I killed the new code after 2.5 minutes.
[23:47] womp
[23:47] *** Zerote has quit IRC (Read error: Operation timed out)
[23:47] Oh wait.
[23:47] Yeah, no surprise there.
[23:48] The old DB doesn't have an index over priority, so it has to do a full table scan for every checkout.
[23:49] It managed 4 checkouts in 2.5 minutes. lol
[23:50] But hey, at least HTML parsing wouldn't be the main performance bottleneck anymore. :-)
[23:55] I have to say that I really hate SQLAlchemy. I had nothing but issues with it when trying to get that distributed wpull version with a central high-ish latency PostgreSQL DB to run smoothly which all went away as soon as I just wrote everything directly with psycopg2.
[23:59] I'm sure it's possible to write performant code with SQLAlchemy, but getting familiar with psycopg2 (had never worked with it before) and writing the queries manually was less effort for me.
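
Editor's sketch, not part of the log: the [23:48] remark about a missing index over priority forcing a full table scan on every checkout can be illustrated roughly as below. The log does not show the real schema, so the `queue` table, its columns, the index name `queue_status_priority`, and the checkout query are all hypothetical, and SQLite is used only because it ships with Python (the DB discussed in the log may differ). The sketch asks SQLite for its query plan before and after adding the index.

```python
import sqlite3


def checkout_plan(conn):
    """Return SQLite's plan for a priority-ordered checkout query (hypothetical schema)."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id FROM queue WHERE status = 'todo' "
        "ORDER BY priority DESC LIMIT 1000"
    ).fetchall()
    # The human-readable plan detail is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for a URL queue table; not the schema from the log.
conn.execute(
    "CREATE TABLE queue ("
    "id INTEGER PRIMARY KEY, url TEXT, status TEXT, priority INTEGER)"
)
conn.executemany(
    "INSERT INTO queue (url, status, priority) VALUES (?, ?, ?)",
    (
        (f"http://example.com/{i}", "todo" if i % 3 else "done", i % 10)
        for i in range(10000)
    ),
)

# Without an index, the planner has to scan the table and sort the matches.
plan_without_index = checkout_plan(conn)

# With an index on (status, priority), it can seek to the 'todo' entries
# and read them in priority order instead.
conn.execute("CREATE INDEX queue_status_priority ON queue (status, priority)")
plan_with_index = checkout_plan(conn)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact plan wording varies between SQLite versions, so treat the printed text as indicative only; the point is that the second plan names the index while the first falls back to a scan of the whole table.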