#archiveteam-bs 2016-11-25,Fri


Time	Nickname	Message
00:24 ^🔗		antomati_ has joined #archiveteam-bs
00:24 ^🔗		swebb sets mode: +o antomati_
00:25 ^🔗		antomatic has quit IRC (Read error: Operation timed out)
01:53 ^🔗		ndiddy has quit IRC (Read error: Connection reset by peer)
02:04 ^🔗		wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
02:08 ^🔗		wp494 has joined #archiveteam-bs
02:48 ^🔗	godane	so i added some better code for grabbing Arirang streams
02:49 ^🔗	godane	i had to change it cause index_3_av.m3u8 was not always the 850x480 stream
02:49 ^🔗	godane	as i just curl -s "$masterurl" | grep -A1 850x480 | grep m3u8
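The idea in the pipeline above (pick the variant by its advertised resolution rather than by a fixed playlist name) can be sketched in Python as well. This is only an illustration, not godane's script: the function name `pick_variant` and the sample playlist, including its file names and bandwidth values, are made up, and it assumes the master playlist lists each variant as an `#EXT-X-STREAM-INF` line followed by its URI.

```python
import re


def pick_variant(master_text, resolution):
    """Return the URI of the first variant whose RESOLUTION matches, else None."""
    lines = [ln.strip() for ln in master_text.splitlines()]
    for i, line in enumerate(lines):
        if not line.startswith("#EXT-X-STREAM-INF"):
            continue
        match = re.search(r"RESOLUTION=(\d+x\d+)", line)
        if match is None or match.group(1) != resolution:
            continue
        # The variant URI is the next line that is neither blank nor a tag.
        for candidate in lines[i + 1:]:
            if candidate and not candidate.startswith("#"):
                return candidate
    return None


# Hypothetical master playlist, for illustration only.
SAMPLE_MASTER = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=426x240
index_1_av.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=850x480
index_2_av.m3u8
"""

if __name__ == "__main__":
    print(pick_variant(SAMPLE_MASTER, "850x480"))
```

Matching on the `RESOLUTION` attribute keeps working when the server reorders or renames the variant playlists, which is the problem described with `index_3_av.m3u8`.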
02:58 ^🔗		wp494_ has joined #archiveteam-bs
03:03 ^🔗		wp494 has quit IRC (Read error: Operation timed out)
03:03 ^🔗		wp494_ is now known as wp494
03:12 ^🔗		ravetcofx has joined #archiveteam-bs
03:47 ^🔗		Mayeau has joined #archiveteam-bs
03:56 ^🔗		Mayonaise has quit IRC (Ping timeout: 864 seconds)
03:56 ^🔗		Mayeau is now known as Mayonaise
04:08 ^🔗		ravetcofx has quit IRC (Ping timeout: 506 seconds)
05:09 ^🔗		Stilett0 has joined #archiveteam-bs
05:12 ^🔗		Stiletto has quit IRC (Read error: Operation timed out)
05:52 ^🔗		Sk1d has quit IRC (Ping timeout: 250 seconds)
05:59 ^🔗		Sk1d has joined #archiveteam-bs
06:49 ^🔗		Start has joined #archiveteam-bs
07:37 ^🔗		GE has joined #archiveteam-bs
10:47 ^🔗	Medowar	FYI: arkiver Sketchcow: bayimg is now done trackerside, I am currently syncing everything from my target over to fos.
11:09 ^🔗		GE has quit IRC (Remote host closed the connection)
11:19 ^🔗		BlueMaxim has quit IRC (Quit: Leaving)
11:48 ^🔗		ravetcofx has joined #archiveteam-bs
12:15 ^🔗		ravetcofx has quit IRC (Ping timeout: 260 seconds)
12:35 ^🔗		GE has joined #archiveteam-bs
12:54 ^🔗		vitzli has joined #archiveteam-bs
12:58 ^🔗	vitzli	Feel like shit, 4 years late to archive the wiki :[ wayback machine luckily has it, but still. (it's small udev rule, nothing super-important)
12:59 ^🔗	arkiver	what wiki?
12:59 ^🔗	vitzli	wiki.countercaster.com
13:00 ^🔗	vitzli	last update and online in 2012
13:01 ^🔗	arkiver	:(
13:01 ^🔗	arkiver	that sucks
13:06 ^🔗	vitzli	After I first met wikiteam tools - I began grabs of any small-ish or remotely interesting/useful wikis, it's awesome, but sometimes it's like to read one book, like it, read another, like it, I WANT MORE! HA! NO! fuck you! Author Existence Failure.
13:09 ^🔗	vitzli	and that is all for drunk Friday confessions, sorry about that
13:37 ^🔗	Whopper	Kaz: similar thing happened in Australia with metadata. It was originally to 'fight terrorists' and then we have https://www.crikey.com.au/2016/01/18/over-60-agencies-apply-to-snoop-into-your-metadata/ . The majority might have a legitimate use for the information but that's not the point. Race fixing, polluting, work health safety violations / fraud, mislabelling fruit? etc. ≠ terrorism
13:40 ^🔗	arkiver	Medowar: :D nice!
13:53 ^🔗		vitzli has quit IRC (Quit: Leaving)
15:28 ^🔗		superkuh has quit IRC (Remote host closed the connection)
15:29 ^🔗		Shakespea has joined #archiveteam-bs
15:30 ^🔗	Shakespea	FYI- http://www.panoramio.com/maps-faq
15:30 ^🔗		superkuh has joined #archiveteam-bs
15:30 ^🔗		Shakespea has left
15:44 ^🔗		Start has quit IRC (Quit: Disconnected.)
15:52 ^🔗		atrocity has quit IRC (Ping timeout: 260 seconds)
17:28 ^🔗		Stilett0 has quit IRC (Read error: Connection reset by peer)
18:03 ^🔗		Stiletto has joined #archiveteam-bs
18:06 ^🔗		ndiddy has joined #archiveteam-bs
19:33 ^🔗		RichardG_ has joined #archiveteam-bs
19:36 ^🔗		RichardG has quit IRC (Ping timeout: 250 seconds)
19:49 ^🔗		RichardG has joined #archiveteam-bs
19:50 ^🔗		RichardG_ has quit IRC (Ping timeout: 250 seconds)
19:51 ^🔗	tapedrive	I want to download a site (for a personal archive at the moment, so not with archivebot) but I'm not quite sure what options to use with wget. I want to download all pages under a specific domain, and all the resources (css, images, javascripts, etc) which will be under different domains. I also want to download all linked pages - so if the main site links to external.com/foo.html then I want that page and all its requisites downloaded
19:51 ^🔗	tapedrive	too. Is this possible with wget, or am I going to have to write something custom for this?
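The crawl scope described in this question can be written down as a small decision rule. The sketch below is not how wget or grab-site decide what to fetch; it only restates the three requirements (every page on the site, page requisites from any domain, and external pages one link away from the site). The names `in_scope`, `on_site` and `site_host`, and the host `example.com`, are hypothetical.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Lower-cased hostname of a URL, or an empty string if it has none."""
    return (urlsplit(url).hostname or "").lower()


def on_site(url, site_host):
    """True if the URL is on the site's host or one of its subdomains."""
    host = host_of(url)
    return host == site_host or host.endswith("." + site_host)


def in_scope(url, parent_url, is_requisite, site_host):
    """Decide whether a discovered URL should be fetched."""
    if is_requisite:
        # CSS, images, scripts: wanted regardless of which domain serves them.
        return True
    if on_site(url, site_host):
        # Every page under the site itself.
        return True
    # An external page is wanted only when a page on the site links to it.
    return parent_url is not None and on_site(parent_url, site_host)
```

With this rule `external.com/foo.html` is fetched when an `example.com` page links to it, but pages that `foo.html` itself links to are not, so the crawl cannot wander across the whole web.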
20:02 ^🔗		ndiddy has quit IRC (Ping timeout: 633 seconds)
20:04 ^🔗		ndiddy has joined #archiveteam-bs
20:09 ^🔗	Kaz	yes
20:10 ^🔗	Kaz	one sec
20:10 ^🔗	Kaz	tapedrive: http://archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget
20:11 ^🔗	Kaz	or you can use https://github.com/ludios/grab-site
20:11 ^🔗	tapedrive	I've read that (the wiki one), but I'm confused as to what options I should use for downloading all page dependencies from other domains.
20:13 ^🔗	tapedrive	Ah, that grab-site one looks perfect. Thanks!
21:32 ^🔗		VADemon has joined #archiveteam-bs
21:35 ^🔗		sep332_ has quit IRC (Konversation terminated!)
21:35 ^🔗		jrwr has joined #archiveteam-bs
21:50 ^🔗		kristian_ has joined #archiveteam-bs
22:02 ^🔗	tapedrive	Okay, I'm using grab-site, but there's an issue. The site I'm archiving has imgur images linked, but imgur's doing some weird redirect thing.
22:02 ^🔗	tapedrive	From the logs: 302 Moved Temporarily http://i.imgur.com/LrowbFM.jpg
22:02 ^🔗	tapedrive	200 OK http://imgur.com/LrowbFM
22:03 ^🔗	tapedrive	So it's not archiving the image, rather a stupid imgur page.
22:03 ^🔗	tapedrive	Any ideas to get round this?
22:22 ^🔗		Stilett0 has joined #archiveteam-bs
22:28 ^🔗		Stilett0 has quit IRC (Read error: Operation timed out)
22:28 ^🔗		Stiletto has quit IRC (Read error: Operation timed out)
22:29 ^🔗	ae_g_i_s	yeah, imgur needs you to follow the redirect
22:29 ^🔗	ae_g_i_s	i don't know exactly what weird magic they're using atm
22:29 ^🔗	ae_g_i_s	but if you essentially do the request twice (not with the same url, with the one it redirects you to) you'll have a page that has the "real" source image
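As a rough sketch of the second step described here (looking inside the page the redirect leads to for the "real" image source), the snippet below scans HTML for `<img>` tags that point at `i.imgur.com`. The actual markup imgur serves is not known from this log, so the sample page is invented, and the helper names `ImgurImageFinder` and `find_source_images` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class ImgurImageFinder(HTMLParser):
    """Collect <img> sources that are hosted on i.imgur.com."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        if src.startswith("//"):
            # Protocol-relative URL: assume plain http, as in the quoted log.
            src = "http:" + src
        if urlsplit(src).hostname == "i.imgur.com":
            self.image_urls.append(src)


def find_source_images(page_html):
    """Return every i.imgur.com image URL found in the page, in order."""
    finder = ImgurImageFinder()
    finder.feed(page_html)
    return finder.image_urls


# Invented page body, for illustration only.
SAMPLE_PAGE = (
    '<html><body><img src="//i.imgur.com/LrowbFM.jpg" alt="">'
    '<img src="/logo.png"></body></html>'
)
```

The page HTML would come from fetching the URL that the 302 points to; that fetch is left out so the sketch stays runnable offline.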
22:39 ^🔗	ae_g_i_s	second pitfall (since we're in -bs anyway and people might fall into that trap): if you upload an image nowadays, you can not copy the image link from the image it presents because they use a base64 blob (IIRC) in there
22:39 ^🔗	ae_g_i_s	you have to open the "sharing link" on the right...and _there_, the image source is as usual
22:40 ^🔗	tapedrive	So any way to add that rule into grab-site? Or will I just have to manually get them afterwards?
22:47 ^🔗	ae_g_i_s	< too noob to know
22:50 ^🔗		Start has joined #archiveteam-bs
22:54 ^🔗	tapedrive	It's not really a problem, as I can go through the log after it's complete, getting all the i.imgur.com images.
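A minimal sketch of that clean-up pass, assuming the log lines look like the two quoted earlier (status, reason, URL); the real grab-site log layout may differ, and the function name `imgur_image_urls` is made up.

```python
import re

# Matches direct image links such as http://i.imgur.com/LrowbFM.jpg
LOG_URL = re.compile(r"https?://i\.imgur\.com/\S+")


def imgur_image_urls(log_lines):
    """Return the unique i.imgur.com URLs in the log, in first-seen order."""
    seen = set()
    urls = []
    for line in log_lines:
        for url in LOG_URL.findall(line):
            if url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


# Lines in the format quoted in the channel; the third is a repeat.
SAMPLE_LOG = [
    "302 Moved Temporarily http://i.imgur.com/LrowbFM.jpg",
    "200 OK http://imgur.com/LrowbFM",
    "302 Moved Temporarily http://i.imgur.com/LrowbFM.jpg",
]

if __name__ == "__main__":
    for url in imgur_image_urls(SAMPLE_LOG):
        print(url)
```

The resulting list can be written to a file and handed to a separate download run once the main crawl has finished.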
23:03 ^🔗		BlueMaxim has joined #archiveteam-bs
23:32 ^🔗		ravetcofx has joined #archiveteam-bs
23:34 ^🔗		ravetcofx has quit IRC (Remote host closed the connection)
23:54 ^🔗		Yoshimura has quit IRC (Remote host closed the connection)
