[00:24] *** antomati_ has joined #archiveteam-bs
[00:24] *** swebb sets mode: +o antomati_
[00:25] *** antomatic has quit IRC (Read error: Operation timed out)
[01:53] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[02:04] *** wp494 has quit IRC (Quit: LOUD UNNECESSARY QUIT MESSAGES)
[02:08] *** wp494 has joined #archiveteam-bs
[02:48] so i added some better code for grabbing Arirang streams
[02:49] i had to change it cause index_3_av.m3u8 was not always the 850x480 stream
[02:49] as i just curl -s "$masterurl" | grep -A1 850x480 | grep m3u8
[02:58] *** wp494_ has joined #archiveteam-bs
[03:03] *** wp494 has quit IRC (Read error: Operation timed out)
[03:03] *** wp494_ is now known as wp494
[03:12] *** ravetcofx has joined #archiveteam-bs
[03:47] *** Mayeau has joined #archiveteam-bs
[03:56] *** Mayonaise has quit IRC (Ping timeout: 864 seconds)
[03:56] *** Mayeau is now known as Mayonaise
[04:08] *** ravetcofx has quit IRC (Ping timeout: 506 seconds)
[05:09] *** Stilett0 has joined #archiveteam-bs
[05:12] *** Stiletto has quit IRC (Read error: Operation timed out)
[05:52] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[05:59] *** Sk1d has joined #archiveteam-bs
[06:49] *** Start has joined #archiveteam-bs
[07:37] *** GE has joined #archiveteam-bs
[10:47] FYI: arkiver Sketchcow: bayimg is now done trackerside, I am currently syncing everything from my target over to fos.
[11:09] *** GE has quit IRC (Remote host closed the connection)
[11:19] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:48] *** ravetcofx has joined #archiveteam-bs
[12:15] *** ravetcofx has quit IRC (Ping timeout: 260 seconds)
[12:35] *** GE has joined #archiveteam-bs
[12:54] *** vitzli has joined #archiveteam-bs
[12:58] Feel like shit, 4 years late to archive the wiki :[ wayback machine luckily has it, but still. (it's a small udev rule, nothing super-important)
[12:59] what wiki?
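[Editor's note] The [02:49] messages describe selecting an HLS variant by its RESOLUTION attribute instead of trusting the index_3_av.m3u8 filename. A minimal sketch of that selection, with a made-up sample playlist (not Arirang's real one) so it runs offline; in real use the playlist text would come from `curl -s "$masterurl"`:

```shell
#!/bin/sh
# Pick the variant URL for a given resolution from an HLS master playlist
# read on stdin. The RESOLUTION attribute on #EXT-X-STREAM-INF, not the
# index_N filename, identifies the stream -- which is why hardcoding
# index_3_av.m3u8 broke.
pick_variant() {
    res="$1"
    grep -A1 "RESOLUTION=$res" | grep -v '^#' | grep 'm3u8' | head -n1
}

# Hypothetical sample playlist for demonstration:
sample='#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=426x240
index_1_av.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=850x480
index_2_av.m3u8'

printf '%s\n' "$sample" | pick_variant 850x480
# -> index_2_av.m3u8
```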
[12:59] wiki.countercaster.com
[13:00] last update and online in 2012
[13:01] :(
[13:01] that sucks
[13:06] After I first met the wikiteam tools I began grabbing any small-ish or remotely interesting/useful wikis. It's awesome, but sometimes it's like reading one book, liking it, reading another, liking it, I WANT MORE! HA! NO! fuck you! Author Existence Failure.
[13:09] and that is all for drunk Friday confessions, sorry about that
[13:37] Kaz: similar thing happened in Australia with metadata. It was originally to 'fight terrorists' and then we have https://www.crikey.com.au/2016/01/18/over-60-agencies-apply-to-snoop-into-your-metadata/ . The majority might have a legitimate use for the information but that's not the point. Race fixing, polluting, work health safety violations / fraud, mislabelling fruit? etc. ≠ terrorism
[13:40] Medowar: :D nice!
[13:53] *** vitzli has quit IRC (Quit: Leaving)
[15:28] *** superkuh has quit IRC (Remote host closed the connection)
[15:29] *** Shakespea has joined #archiveteam-bs
[15:30] FYI- http://www.panoramio.com/maps-faq
[15:30] *** superkuh has joined #archiveteam-bs
[15:30] *** Shakespea has left
[15:44] *** Start has quit IRC (Quit: Disconnected.)
[15:52] *** atrocity has quit IRC (Ping timeout: 260 seconds)
[17:28] *** Stilett0 has quit IRC (Read error: Connection reset by peer)
[18:03] *** Stiletto has joined #archiveteam-bs
[18:06] *** ndiddy has joined #archiveteam-bs
[19:33] *** RichardG_ has joined #archiveteam-bs
[19:36] *** RichardG has quit IRC (Ping timeout: 250 seconds)
[19:49] *** RichardG has joined #archiveteam-bs
[19:50] *** RichardG_ has quit IRC (Ping timeout: 250 seconds)
[19:51] I want to download a site (for a personal archive at the moment, so not with archivebot) but I'm not quite sure what options to use with wget. I want to download all pages under a specific domain, and all the resources (css, images, javascript, etc.) which will be under different domains.
[19:51] I also want to download all linked pages - so if the main site links to external.com/foo.html then I want that page and all its requisites downloaded
[19:51] too. Is this possible with wget, or am I going to have to write something custom for this?
[20:02] *** ndiddy has quit IRC (Ping timeout: 633 seconds)
[20:04] *** ndiddy has joined #archiveteam-bs
[20:09] yes
[20:10] one sec
[20:10] tapedrive: http://archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget
[20:11] or you can use https://github.com/ludios/grab-site
[20:11] I've read that (the wiki one), but I'm confused as to what options I should use for downloading all page dependencies from other domains.
[20:13] Ah, that grab-site one looks perfect. Thanks!
[21:32] *** VADemon has joined #archiveteam-bs
[21:35] *** sep332_ has quit IRC (Konversation terminated!)
[21:35] *** jrwr has joined #archiveteam-bs
[21:50] *** kristian_ has joined #archiveteam-bs
[22:02] Okay, I'm using grab-site, but there's an issue. The site I'm archiving has imgur images linked, but imgur's doing some weird redirect thing.
[22:02] From the logs: 302 Moved Temporarily http://i.imgur.com/LrowbFM.jpg
[22:02] 200 OK http://imgur.com/LrowbFM
[22:03] So it's not archiving the image, rather a stupid imgur page.
[22:03] Any ideas to get round this?
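[Editor's note] For the [19:51] wget question, a sketch of the kind of invocation involved, using only documented wget options; example.com and cdn.example.com are placeholder domains. Note that wget has no clean way to follow offsite links exactly one hop while also staying within the main domain, which is part of why grab-site was the better answer here:

```shell
# Mirror a site, fetching page requisites (css, images, js) even when they
# live on other hosts, and record everything to a WARC.
#   --mirror           recursive download with timestamping
#   --page-requisites  grab everything needed to render each page
#   --span-hosts       allow requisites/links on other hosts
#   --domains=...      restrict recursion to these hosts (list your CDN hosts too)
#   --convert-links    rewrite links for local viewing
#   --adjust-extension add .html where appropriate
wget --mirror --page-requisites --span-hosts \
     --domains=example.com,cdn.example.com \
     --convert-links --adjust-extension \
     --warc-file=example-site --wait=1 \
     http://example.com/
```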
[22:22] *** Stilett0 has joined #archiveteam-bs
[22:28] *** Stilett0 has quit IRC (Read error: Operation timed out)
[22:28] *** Stiletto has quit IRC (Read error: Operation timed out)
[22:29] yeah, imgur needs you to follow the redirect
[22:29] i don't know exactly what weird magic they're using atm
[22:29] but if you essentially do the request twice (not with the same url, with the one it redirects you to) you'll have a page that has the "real" source image
[22:39] second pitfall (since we're in -bs anyway and people might fall into that trap): if you upload an image nowadays, you can not copy the image link from the image it presents because they use a base64 blob (IIRC) in there
[22:39] you have to open the "sharing link" on the right...and _there_, the image source is as usual
[22:40] So any way to add that rule into grab-site? Or will I just have to manually get them afterwards?
[22:47] < too noob to know
[22:50] *** Start has joined #archiveteam-bs
[22:54] It's not really a problem, as I can go through the log after it's complete, getting all the i.imgur.com images.
[23:03] *** BlueMaxim has joined #archiveteam-bs
[23:32] *** ravetcofx has joined #archiveteam-bs
[23:34] *** ravetcofx has quit IRC (Remote host closed the connection)
[23:54] *** Yoshimura has quit IRC (Remote host closed the connection)
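[Editor's note] The [22:54] plan — going through the crawl log afterwards to grab the i.imgur.com images directly — can be sketched as below. The log line format here is invented to match the lines quoted at [22:02]; adjust the pattern to whatever grab-site actually logs. Re-fetching with `curl -L` follows the 302 that was tripping up the crawl:

```shell
#!/bin/sh
# Pull direct-image imgur URLs out of a crawl log (read on stdin) so they
# can be re-fetched with redirects followed.
extract_imgur() {
    grep -o 'http://i\.imgur\.com/[A-Za-z0-9]*\.[a-z]*' | sort -u
}

# Made-up sample log matching the lines quoted in the channel:
sample_log='302 Moved Temporarily http://i.imgur.com/LrowbFM.jpg
200 OK http://imgur.com/LrowbFM'

printf '%s\n' "$sample_log" | extract_imgur
# -> http://i.imgur.com/LrowbFM.jpg
# Each extracted URL can then be re-fetched, following the redirect:
#   curl -L -O "$url"
```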