00:12 ephemer0l has quit IRC (Ping timeout: 615 seconds)
00:19 ephemer0l has joined #archiveteam-ot
00:23 icedice has quit IRC (Quit: Leaving)
01:07 ATrescue has quit IRC (Ping timeout: 260 seconds)
02:16 killsushi has quit IRC (Quit: Leaving)
02:33 Despatche has quit IRC (Quit: Read error: Connection reset by deer)
03:20 Stilettoo has joined #archiveteam-ot
03:21 <Flashfire> WHO WANTS A LEGO BATMAN STEAM KEY?
03:22 Stiletto has quit IRC (Ping timeout: 268 seconds)
03:23 <ayanami_> idk does it work on linux
03:23 <ayanami_> Flashfire
03:24 odemg has quit IRC (Ping timeout: 615 seconds)
03:24 <ayanami_> wait which lego batman is it? Flashfire
03:25 <Flashfire> The first one
03:25 <Flashfire> It doesnt look to run natively on linux
03:26 <ayanami_> alright
03:26 <ayanami_> maybe I'll ask if a friend wants one
03:26 <ayanami_> can I do that?
03:27 <Flashfire> I mean Id prefer if it went directly to people here as I have also asked in a few discord servers i am in ayanami_ ask again in maybe half an hour and if nobody wanted it or asked for it you can give it to your friend
03:28 <ayanami_> alright, fair to me
03:30 odemg has joined #archiveteam-ot
03:37 ayanami_ has quit IRC (Quit: Leaving)
03:55 m007a83 has joined #archiveteam-ot
05:38 Stilettoo is now known as Stiletto
05:43 dhyan_nat has joined #archiveteam-ot
06:17 Odd0002_ has joined #archiveteam-ot
06:19 Odd0002 has quit IRC (Read error: Operation timed out)
06:19 Odd0002_ is now known as Odd0002
07:43 Zerote has joined #archiveteam-ot
08:42 chirlu` has quit IRC (Read error: Operation timed out)
08:44 chirlu` has joined #archiveteam-ot
09:07 VerifiedJ has joined #archiveteam-ot
09:10 Verified_ has quit IRC (Ping timeout: 252 seconds)
09:48 Verified_ has joined #archiveteam-ot
10:55 dhyan_nat has quit IRC (Read error: Operation timed out)
11:12 Despatche has joined #archiveteam-ot
11:17 dhyan_nat has joined #archiveteam-ot
12:05 astrid has quit IRC (Read error: Operation timed out)
12:06 astrid has joined #archiveteam-ot
12:17 astrid has quit IRC (Ping timeout: 360 seconds)
12:27 icedice has joined #archiveteam-ot
12:44 BlueMax has quit IRC (Quit: Leaving)
12:49 astrid has joined #archiveteam-ot
12:49 Fusl sets mode: +o astrid
13:17 icedice has quit IRC (Read error: Operation timed out)
14:22 Zerote has quit IRC (Read error: Operation timed out)
14:38 morgan_ has quit IRC (Ping timeout: 263 seconds)
15:10 vitzli has joined #archiveteam-ot
15:11 vitzli has quit IRC (Client Quit)
15:49 Oddly has joined #archiveteam-ot
15:57 drcd has joined #archiveteam-ot
17:54 astrid has quit IRC (Read error: Operation timed out)
18:17 Oddly has quit IRC (Read error: Operation timed out)
18:24 Zerote has joined #archiveteam-ot
19:35 killsushi has joined #archiveteam-ot
19:39 MrRadar has quit IRC (Quit: Rebooting)
19:42 MrRadar has joined #archiveteam-ot
20:41 dhyan_nat has quit IRC (Read error: Operation timed out)
21:29 drcd has quit IRC (Read error: Connection reset by peer)
22:04 apt-get has joined #archiveteam-ot
22:04 <apt-get> Heya!
22:04 <apt-get> Got a question for those used to how the wayback machine works
22:05 <JAA> E: Invalid operation Heya!
22:05 <JAA> ;-)
22:05 <apt-get> heh
22:05 <apt-get> If a website has apparently been crawled two times, but neither attempt to access it on the calendar works, is there still a way to retrieve the page?
22:06 <apt-get> (website in question: http://imoutogensou.blog100.fc2.com/ )
22:08 <JAA> I'd say "try again in a few days" as that fixed it for me before, but if it has been like that for a while, it's probably worth sending an email to IA.
22:10 <apt-get> Alright, I'll try that. (attempting to access the site map gets me a 500 error as well...)
22:11 <ivan> not in https://www.webarchive.org.uk/wayback/archive/*/http://imoutogensou.blog100.fc2.com/ or http://webarchive.loc.gov/all/19960101000000-20180421235959*/http://imoutogensou.blog100.fc2.com/
22:11 <ivan> doesn't appear to be inoreader's feed cache
22:13 <ivan> http://web.archive.org/web/*/http://imoutogensou.blog100.fc2.com/* doesn't seem to have other pages
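[The capture check ivan does through the calendar/wildcard pages can also be scripted against the Wayback CDX server. A minimal sketch, assuming the publicly documented CDX query parameters (`url`, `matchType`, `output`, `limit`); the sample response below is illustrative only, not real capture data:]

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"

def cdx_query_url(url, match_type="prefix", limit=50):
    """Build a CDX server query URL; output=json returns rows as JSON arrays."""
    params = {
        "url": url,
        "matchType": match_type,  # "exact", "prefix", "host", or "domain"
        "output": "json",
        "limit": str(limit),
    }
    return CDX_ENDPOINT + "?" + urlencode(params)

def parse_cdx_json(text):
    """CDX JSON output is a list of rows; the first row is the field header."""
    rows = json.loads(text)
    if not rows:
        return []
    header = rows[0]
    return [dict(zip(header, row)) for row in rows[1:]]

# Illustrative response of the documented shape (made-up values):
sample = json.dumps([
    ["urlkey", "timestamp", "original", "statuscode"],
    ["com,example)/", "20170101000000", "http://example.com/", "200"],
])
captures = parse_cdx_json(sample)
```

[An empty row list means the CDX index knows of no captures for the query, which is a stronger signal than a broken calendar page.]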
22:14 <ivan> in summary you is hosed
22:15 <ivan> apt-get: same site? http://web.archive.org/web/20130327143033/http://imoutogensou.info/
22:17 <apt-get> different blog, I think
22:18 <apt-get> the fc2 one seemed to be up until roughly ~2017... according to links I've found around
22:18 <apt-get> the .info site closed much earlier
22:20 <apt-get> but yeah it's from the same person
23:15 <ivan> apt-get: I think JAA is right, I tried wayback just now with a URL I know is present and it has the same broken behavior right now
23:21 <apt-get> I'll try contacting them tomorrow if it's not fixed, then.
23:23 <JAA> I've had cases before where most things worked fine, but a few particular URLs I had just saved through the WBM gave me the "not saved yet" error. A few days later, my screenshot was visible.
23:23 <ivan> apt-get: when it's fixed this might also have content http://web.archive.org/web/20130629162858/https://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fimoutogensou.blog100.fc2.com%2F%3Fxml?r=n&n=1000&hl=en&likes=true&comments=true&client=ArchiveTeam
23:24 <JAA> I've also had working snapshots disappear from the WBM. I was writing with Mark about this a long time ago, and he told me they had a number of internal index problems. This was 3 years ago or something (before I even showed up here), so I'd assume it was fixed at some point, but well, I'm still seeing those issues from time to time.
23:25 BlueMax has joined #archiveteam-ot
23:25 <apt-get> Ohh, thanks for the additional url
23:26 <JAA> But yeah, any issues right now might still be related to the recent power outage which broke various things.
23:26 <apt-get> Ah, damn, did something happen to the IA's physical location?
23:27 <JAA> No clue what the problem was.
23:28 jeekl has quit IRC (Ping timeout: 615 seconds)
23:28 <JAA> ivan: FYI, I'm *finally* working on testing that prioritisation code.
23:28 <JAA> s/code/code's performance/
23:29 <JAA> Using a nice real-world DB with just over 50 million URLs.
23:30 <ivan> JAA: took me a minute to even remember what you were talking about, but cool :-)
23:30 <JAA> Haha, yeah, I delayed this for far too long.
23:40 Stilettoo has joined #archiveteam-ot
23:42 Stiletto has quit IRC (Ping timeout: 265 seconds)
23:43 <JAA> ivan: Welp, performance is terrible.
23:44 ayanami_ has joined #archiveteam-ot
23:45 <JAA> The old code's one is bad enough already, taking 10 seconds to check out and back in a thousand entries roughly 30 % into the table (5M done + 10M skipped of 50M total). I killed the new code after 2.5 minutes.
23:47 <ivan> womp
23:47 Zerote has quit IRC (Read error: Operation timed out)
23:47 <JAA> Oh wait.
23:48 <JAA> Yeah, no surprise there.
23:49 <JAA> The old DB doesn't have an index over priority, so it has to do a full table scan for every checkout.
23:50 <JAA> It managed 4 checkouts in 2.5 minutes. lol
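[JAA's diagnosis — an ORDER-BY-priority checkout against a table with no priority index degenerates into a full scan — can be reproduced with any SQL engine's query planner. A sketch using Python's stdlib sqlite3; the `urls` table and its columns are invented for illustration, since the actual schema isn't shown in the log:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE urls (
        id INTEGER PRIMARY KEY,
        url TEXT,
        priority INTEGER,
        checked_out INTEGER DEFAULT 0
    )
""")

def checkout_plan():
    # EXPLAIN QUERY PLAN reveals whether the checkout query scans or seeks.
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, url FROM urls WHERE checked_out = 0 "
        "ORDER BY priority LIMIT 1000"
    ).fetchall()
    return " ".join(r[-1] for r in rows)  # last column is the plan detail

before = checkout_plan()  # without an index: full table scan
conn.execute("CREATE INDEX idx_urls_priority ON urls (checked_out, priority)")
after = checkout_plan()   # with the index: an index search instead
```

[The composite index also satisfies the ORDER BY, so the planner can skip the sort step as well as the scan; that is why a single `CREATE INDEX` can turn "4 checkouts in 2.5 minutes" into something usable.]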
23:55 <JAA> But hey, at least HTML parsing wouldn't be the main performance bottleneck anymore. :-)
23:59 <JAA> I have to say that I really hate SQLAlchemy. I had nothing but issues with it when trying to get that distributed wpull version with a central high-ish latency PostgreSQL DB to run smoothly which all went away as soon as I just wrote everything directly with psycopg2.
23:59 <JAA> I'm sure it's possible to write performant code with SQLAlchemy, but getting familiar with psycopg2 (had never worked with it before) and writing the queries manually was less effort for me.
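[JAA doesn't show his code; purely as an illustration of the hand-written-query style he describes, here is a minimal batch-checkout sketch. It uses stdlib sqlite3 so it runs standalone; the schema and the `checkout_batch` helper are hypothetical, and with psycopg2 the placeholders would be `%s` against a PostgreSQL connection instead of `?`:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT,
                       priority INTEGER, checked_out INTEGER DEFAULT 0);
    INSERT INTO urls (url, priority) VALUES
        ('http://example.com/a', 2),
        ('http://example.com/b', 1),
        ('http://example.com/c', 3);
""")

def checkout_batch(conn, limit):
    """Claim the highest-priority unclaimed URLs (lower number = sooner)."""
    with conn:  # one transaction: select the batch, then mark it claimed
        rows = conn.execute(
            "SELECT id, url FROM urls WHERE checked_out = 0 "
            "ORDER BY priority LIMIT ?", (limit,)
        ).fetchall()
        conn.executemany(
            "UPDATE urls SET checked_out = 1 WHERE id = ?",
            [(r[0],) for r in rows],
        )
    return rows

batch = checkout_batch(conn, 2)
```

[Writing the two statements directly like this keeps the round trips and the generated SQL fully visible, which is the advantage JAA is pointing at when comparing psycopg2 to the ORM.]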