Well, that reduced the number of pending URLs from 390k to 80k. Nice. :D
JAA: I kind of forgot about it.
arkiver: sweet baby Jesus, that's a ton of data
arkiver: yep, so the future of the project is a little unclear
Yeah, at more than 200 TB you need to start getting the IA tech team on it
well, we can just pump the data in, but they'll notice, especially with this size
It looks like an amazing datastore, and they might not be happy with this much data
yeah, we coordinated with IA, but it was paused after the estimate turned out to be 800 TB anyway
It would be best if they rsynced it right into their own search tool for it, so it had the right metadata
not sure
but I'm off, good day :P
So just curious, anybody have any scripts for downloading, say, WikiProjects? Trying to download one in particular and not the entirety of Wikipedia. Otherwise, looks like I'll need to write a spider and a basic implementation of PageRank to sort each topic into the correct subgrouping...
Gilfoyle: We have a set of scripts for scraping the API of MediaWiki installations: https://raw.githubusercontent.com/WikiTeam/wikiteam/master/dumpgenerator.py
A bit more info on the WikiTeam page: http://archiveteam.org/index.php?title=WikiTeam
Has anyone recorded things on Video 8 cassettes? https://www.youtube.com/watch?v=8t5TYw2bkOk
okay, NOW pixiv is done, right? or have we still not done the +18 sweep?
Doesn't look even slightly done. Still 200k items out.
Yep, we still need to do the 18+ rooms.
Tanobb grab complete. That was way quicker than I expected. I grabbed all 13 languages and the mobile versions as well. I skipped some redundant pages though, e.g. those listing all posts by one user in a thread.
arkiver: No, I'm using the same format bsmith093 used (https://github.com/JimmXinu/FanFicFare), which extracts the story text into markdown. Any other way and my limited disk space would be completely used up.
I see
What are you currently using?
maybe the WARCs could be uploaded to IA for the Wayback Machine
if there are any copyright issues, they'll probably block the pages or the website in the Wayback Machine
arkiver: Example format: http://storage.savefanfiction.tk/Prince%20Consort-ffnet_8902231.txt (re-archived today)
All fanfiction.net URLs are robots.txt blocked anyway, actually.
Yet another good reason to grab them all as WARCs... so when the site goes down in the future and someone removes the robots.txt block, there will be content to show
Also, I wonder how well git would handle it if you put all these files into it...
you could point FFF at warcproxy maybe?
Total of 7,382,393 plaintext files, all in the same folder? A few hundred gigabytes? No idea if git would cope with that.
you could move them into their own folders based on the first few letters of the SHA-1 of the file name, or even the first few letters of the title of the file
I've thought about doing WARCs and stuff like that, but I'm running everything on a couple of Raspberry Pis and a 2 TB HD, so...
thing is that once it's not in WARCs, it will probably never go into WARCs and therefore will never get into an archive like the Wayback Machine
tapedrive: you will probably get better performance in any case if you split that into subdirectories :D
username1: Yeah, I can't really do much in that directory any more
But it makes merging newly updated stories much easier.
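(Aside: a minimal sketch of the sharding suggestion above, i.e. moving each story file into a subdirectory named after the first couple of hex characters of the SHA-1 of its file name, so no single directory holds millions of entries. The source and destination paths are made up for illustration.)

```python
# Minimal sketch of the sharding idea: move each file into a subdirectory named
# after the first two hex characters of the SHA-1 of its file name, so no single
# directory ends up holding millions of entries.
# SRC and DST are hypothetical paths.
import hashlib
import os
import shutil

SRC = "/data/fanfic/flat"      # the one huge directory
DST = "/data/fanfic/sharded"   # new layout: <DST>/<ab>/<original file name>

for name in os.listdir(SRC):
    prefix = hashlib.sha1(name.encode("utf-8")).hexdigest()[:2]
    target_dir = os.path.join(DST, prefix)
    os.makedirs(target_dir, exist_ok=True)
    shutil.move(os.path.join(SRC, name), os.path.join(target_dir, name))
```

Two characters gives 256 buckets (roughly 29k files each for 7.4 million files); three characters gives 4096 buckets if that is still too many per directory. Sharding on the hash rather than on the title keeps the buckets evenly sized.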
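(Aside on the WikiProject question earlier: dumpgenerator.py is the WikiTeam tool for mirroring whole wikis, but if only one WikiProject is wanted, hitting the MediaWiki API directly may be enough. A rough sketch, assuming the project's pages can be enumerated from a category; the category name below is only an example, and on Wikipedia the WikiProject banners usually tag the talk pages, so the real category and title mapping may differ.)

```python
# Rough sketch (not dumpgenerator.py): pull just the pages of one WikiProject
# via the public MediaWiki API instead of mirroring the whole wiki.
# CATEGORY is a made-up example.
import requests

API = "https://en.wikipedia.org/w/api.php"
CATEGORY = "Category:WikiProject Mathematics articles"  # hypothetical example

def category_members(category):
    """Yield the titles of all pages in a category, following API continuation."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": "500",
        "format": "json",
    }
    while True:
        data = requests.get(API, params=params).json()
        for member in data["query"]["categorymembers"]:
            yield member["title"]
        if "continue" not in data:
            break
        params.update(data["continue"])  # cmcontinue etc. for the next batch

def page_wikitext(title):
    """Fetch the current wikitext of a single page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content",
        "rvslots": "main",
        "titles": title,
        "format": "json",
    }
    pages = requests.get(API, params=params).json()["query"]["pages"]
    page = next(iter(pages.values()))
    return page["revisions"][0]["slots"]["main"]["*"]

if __name__ == "__main__":
    for title in category_members(CATEGORY):
        print(title)  # or save page_wikitext(title) to disk
```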
arkiver: I'll think about WARCs for the future. To get them into the Wayback Machine, do I just upload them to IA with type "web"?
and let us know about them
we'll first have to make sure the WARCs are valid of course, and not edited
And if I upload more WARCs into that item, will they be auto-added to the Wayback Machine?
I think so, but multiple items might be better in that case
also, again depending on the number of WARCs, you could do like 1 or 10 GB per item
Okay, I'll see if I can add that in.
nice :)
quick question: do you guys know how to download from veoh.com? there's this rare video that i can only find there
link?
http://www.veoh.com/watch/v1313996wddKMNqf?h1=Super+Spacefortress+Macross
it's the only copy i can find of the uncut, hilariously bad dub of the first Macross movie
Ugh, Flash.
ndiddy: uh
ndiddy: youtube-dl "http://www.veoh.com/watch/v1313996wddKMNqf?h1=Super+Spacefortress+Macross"
done
i tried jdownloader but it only downloads the first 1/4 of it
go to the page source, search for fullPreviewHashLowPath, and download that URL
Low vs. High quality/resolution?
ah sorry
[download] 10.0% of 881.42MiB at 321.50KiB/s ETA 42:07
i can up it elsewhere afterwards if needed
tell me if it downloads all the way :)
also, youtube-dl gives me an "unsupported url" error
update: search for fullPreviewHashHighPath
that's the same resolution as when HQ is selected in the player
uh
i assumed so
same size
What does "Preview" mean in there, though?
no idea
it looks to be the same size though
but it's the version that's loaded in the browser
yeah, the way they have the site set up is kinda strange
it downloads the first couple megs at full speed, then throttles you to 300 kbps
Yeah, the whole page is really cancerous.
many streaming sites do that
The amount of third-party JavaScript code being pulled in there is ridiculous.
That kind of throttling is pretty common because most people don't watch most of the video anyway
So they give you just enough to ensure there's a decent buffer, then feed you the rest at approximately the playback rate
So they don't waste too much bandwidth if you close the tab after a minute
have you tried watching the network tab while watching via Flash?
no, why?
yeah, it's the same
to figure out how it works?
Could it be that the Preview is only the first 25% (what JDownloader grabs)?
looks like jdownloader was using a different url: http://fcache.veoh.com/file/f/h1313996.mp4?e=1497209817&ri=6000&rs=300&h=fa032ed6a31a75daab4b4503f60e35f9 vs the page source, which gives you http://content.veoh.com/flash/p/2/v1313996wddKMNqf/h1313996.mp4?ct=bdd4eff4404837d31915158fe9f6a3fe3c9aaf63da409283
I see.
damn it, same thing
arkiver: can you download the whole video?
well, yeah, why?
i just got 200 MB again from that link
in the source, search for fullPreviewHashHighPath and download that
that's what i did
i'll see if watching the video in a muted tab while downloading fixes anything
i'm assuming that that url is the one the player buffers from, and it won't buffer more than 25% into the video
before you can see fullPreviewHashHighPath you need to have the 18+ cookie, of course
sure you didn't use fullPreviewHashLowPath? that gives you a 200 MB one
fullPreviewHashHighPath gives 881 or so
881 MB*
[download] 100% of 881.42MiB in 12:40
ERROR: content too short (expected 924240150 bytes and served 239785474)
see what i mean
well, can you actually skip to later in the video via Flash?
yep
it's not like NND or something
Does anything happen in the Network tab?
you mean, like the Windows one?
no, the browser one
The browser's.
:O is that a thing
zomg
:P you have a lot to learn
Press F12 in the tab
Essential tool #1 for web devs.
:-P And people trying to download special things.
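(Aside on the "upload to IA with type web" question above: with the internetarchive Python library, that could look roughly like the sketch below. The item identifier and title are invented placeholders, and as discussed above the WARCs still have to be valid and coordinated before they actually end up in the Wayback Machine.)

```python
# Rough sketch, using the `internetarchive` Python library (pip install
# internetarchive, then `ia configure` for credentials). The identifier and
# title below are invented placeholders; whether the WARCs then get indexed
# into the Wayback Machine still depends on validation and coordination.
from internetarchive import upload

item_id = "fanfiction-warcs-batch-0001"   # hypothetical identifier
warcs = [
    "ffnet-batch-0001-part1.warc.gz",
    "ffnet-batch-0001-part2.warc.gz",
]

responses = upload(
    item_id,
    files=warcs,
    metadata={
        "mediatype": "web",   # the "type web" from the question above
        "title": "fanfiction.net WARCs, batch 0001",
    },
)
print([r.status_code for r in responses])
```

Keeping each item at roughly 1 to 10 GB, as suggested above, just means calling upload() once per batch with a different identifier.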
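(Aside: a rough sketch of the "search the page source for fullPreviewHashHighPath" step in Python, assuming the value is embedded as a quoted string somewhere in the HTML/JS. The regex is a guess at that embedding, and age-restricted videos would also need the 18+ cookie mentioned above.)

```python
# Sketch: fetch the watch page, pull the fullPreviewHashHighPath URL out with a
# regex, and stream the video to disk. The regex is a guess at how the value is
# embedded (it may live in a JSON blob or a JS variable).
import re
import requests

watch_url = "http://www.veoh.com/watch/v1313996wddKMNqf"

html = requests.get(watch_url).text
match = re.search(r'"?fullPreviewHashHighPath"?\s*[:=]\s*"([^"]+)"', html)
if not match:
    raise SystemExit("fullPreviewHashHighPath not found in the page source")

video_url = match.group(1).replace("\\/", "/")  # undo JSON escaping, if any
with requests.get(video_url, stream=True) as resp, open("macross.mp4", "wb") as out:
    for chunk in resp.iter_content(chunk_size=1 << 20):
        out.write(chunk)
```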
looks like there's a ?start parameter
sorry, &start
ex: http://content.veoh.com/flash/f/2/v1313996wddKMNqf/l1313996.mp4?ct=466b8df98e1762caf22b228f86934f1623d189f7d4c429bb&start=4659.76
it will only send you the video starting from that time
i guess what i have to do is download the video from the first link, count how many seconds are in it, then redownload and splice all the clips together
tapedrive: thank you for grabbing the text of that fanfiction, in any case. It's useful, even though WARCs would be very nice too.
I'm seeing how easy it would be to add WARC output to my system now. Although I think it would mean an entire recrawl, which has taken several (about 10) months due to their rate limiting.
https://archive.org/details/SuperSpacefortressMacross
SketchCow: Fix it please - Could not store file "/tmp/phphIFGty" at "mwstore://local-backend/local-public/d/da/Chatpixivicon.gif".
it's still broken :(
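(Aside on the &start idea above: a rough sketch of collecting the clips. It assumes the server always starts from the requested offset and cuts off after a while, that ffprobe is installed to measure how many seconds each clip actually contains, and that the ct token from the example URL is still valid for the session; the clips would still need to be spliced together afterwards, e.g. with ffmpeg concat.)

```python
# Rough sketch of that plan: download from offset 0, measure how many seconds
# actually arrived (via ffprobe), then request again with &start=<seconds
# reached>, until a request adds nothing new. base_url is the example URL from
# the log, without the &start part.
import subprocess
import requests

base_url = ("http://content.veoh.com/flash/f/2/v1313996wddKMNqf/l1313996.mp4"
            "?ct=466b8df98e1762caf22b228f86934f1623d189f7d4c429bb")

def clip_seconds(path):
    """Return the playable duration of a media file in seconds, via ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return float(out.stdout.strip())

offset = 0.0
for part in range(100):                      # hard cap as a safety net
    path = f"clip_{part:03d}.mp4"
    with requests.get(f"{base_url}&start={offset}", stream=True) as resp, \
         open(path, "wb") as f:
        for chunk in resp.iter_content(chunk_size=1 << 20):
            f.write(chunk)
    got = clip_seconds(path)
    if got < 1.0:                            # (almost) nothing new came back: done
        break
    offset += got
```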