[09:04] so, was that a small internet earthquake? [10:54] have some html injection https://startpage.com/do/search?query=Reginald+D.+Hunter&cat=web&pl=chrome&language=english [13:30] can someone point me in the right direction? i have a list of usernames for home.xmsnet.nl now i just need to figure out how i can get wget to work with this to warc it. [13:34] right, bad snort. [13:35] can someone point me in the right direction? i have a list of usernames for home.xmsnet.nl now i just need to figure out how i can get wget to work with this to warc it. [13:49] midas: is there any correlation between the usernames and urls? [13:51] home.xmsnet.nl/username IIRC [13:51] hmm, it looks like the urls are http://home.xmsnet.nl/username [13:51] right [13:52] Which version of wget was warc enabled? [13:52] so the easiest thing to do is to make a text file with a list of those urls and then do a wget -i [13:53] 1.14 maybe? [13:54] yup... >=1.14 has warc baked in [13:56] midas: with what I've suggested you'll end up with a warc for every line of the text file... if that's not ultimately desirable, you can megawarc them all together after the fact. [13:58] There are only 156 sites. Not too bad. [14:27] SadDM: sorry, was out of irc for a sec [14:27] yeah, -i would work with --warc-file ? [14:28] it sure does [14:28] you just end up with one for every url in the file [14:28] ok, let's start a screen :p [14:28] that's no issue [14:32] wget wants an argument for warc-file [14:33] (i feel stupid right now :p) [14:35] oh... uh [14:35] hmm... maybe I got ahead of myself [14:35] I keep cocking up my media types, wish users could move items :| [14:35] my guess was just wget -i -m --warc-file but that's no baloney [14:35] I know that I've done something like this [14:36] maybe I just built a bunch of wget commands in a file and then piped it through a shell [14:37] sorry midas, I think I've given you slightly bad advice [14:37] are you on a unix machine?
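[Editor's note: a minimal sketch of the `-i` approach discussed above. `--warc-file` takes a single base name, so one `wget -i` invocation writes one combined WARC for the whole list, which is why the one-command-per-site approach comes up next. Filenames and usernames are hypothetical; `echo` stands in for the real fetch so the sketch runs offline.]

```shell
# Hypothetical username list (the real one had ~156 entries).
printf 'alice\nbob\n' > usernames.txt
# Prefix each username to build the URL list.
sed 's|^|http://home.xmsnet.nl/|' usernames.txt > urls.txt
cat urls.txt
# One invocation, one WARC: wget >= 1.14 would write xmsnet.warc.gz
# covering every URL in the list. (echo'd here, not executed.)
echo wget -i urls.txt -m --warc-file=xmsnet
```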
[14:40] yeah [14:40] something like this should generate the wgets: cat usernames.txt | sed 's|\(.*\)|wget -m --warc-file=\1 http://home.xmsnet.nl/\1/|' > wgets [14:41] note that I just wrote that cold and didn't test it or anything [14:41] then just "sh wgets" [14:41] lol. thanks! i think that would work :) [14:42] :-D it's close-ish at least [14:42] indeed, and that's more than i could do on short notice :p [14:48] so i fixed some of the uploads of sydney morning herald and Australian women's weekly [14:48] some of the uploads were incomplete downloads [14:50] midas: around bud? [14:50] so i'm mirroring more glenn beck episodes [14:51] Smiley: yeah [14:51] i'm always somewhere [14:51] midas: how are/were you testing cartoon hd on android, [14:51] you running an avd on your system? [14:52] didn't have time since last week, work said i had to do something :< [14:53] i think i'll debug it tonight again [14:53] just have to get this grab started [14:54] yeah I'm just trying to set up an AVD so I can just grab stuff rather than grabbing on my phone and pushing across the network [14:54] ah got it working now :D [14:55] * Smiley waits to see if it loads [14:57] ah bloody cool Smiley ! [14:58] midas: hmmmmm just sitting on "android" screen atm :/ [15:24] midas: just run android avd [15:24] and create a new avd using a nice h/w [15:26] or at least that's what I'm trying to do. [15:26] it's quite slow. [15:27] as I just told it to create a 5Gb sdcard D: [15:37] lol [15:37] * midas throws stones at midas [15:46] grrr [15:46] none will boot D: [15:51] yay, got one booting \o/ [15:59] and using android monitor, it has a file explorer with push/pull [15:59] or i can use adb. sweeeet [16:45] lol Wheeler [16:45] "Simply put, when a consumer buys a specified bandwidth, it is commercially unreasonable and thus a violation of this proposal to deny them the full connectivity and the full benefits that connection enables."
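[Editor's note: the command-list generator above, sketched end to end with a hypothetical username list. Using `|` as the sed delimiter avoids having to escape the slashes in the URL; each generated wget run writes its own per-site WARC.]

```shell
# Hypothetical username list.
printf 'alice\nbob\n' > usernames.txt
# One wget command per username; each run writes its own WARC
# (<username>.warc.gz) alongside the mirrored files.
sed 's|\(.*\)|wget -m --warc-file=\1 http://home.xmsnet.nl/\1/|' usernames.txt > wgets
cat wgets
# then run them: sh wgets   (left as a comment so the sketch runs offline)
```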
[16:45] evidently Wheeler has never used Comcast services, only lobbied for them [17:10] hahaha [18:02] lol wtf, startpage.com's "community" link goes to their facebook site [18:07] so this is about archive.org and a little bit about archiveteam also ;) http://www.nu.nl/weekend/3769630/bibliothecaris-van-internet-wil-websites-niet-verloren-laten-gaan.html [18:07] pretty big website in .nl [20:37] i'm grabbing Bomb Patrol Afghanistan [20:37] cause it aired on G4 [20:37] i'm getting the 720p copies [20:43] ./wg 25 [22:17] so i'm up to 7600 items now in my godaneinbox [22:53] so i'm trying to download a mov file from the wayback machine and i can't get the full file [22:53] i'm trying this one at the moment: https://web.archive.org/web/20060602014144/http://www.commandn.tv/cN/044/commandN-044-h264.mov [22:54] the wayback machine url will just stop about 32.7mb into the file [22:54] even with wget [23:21] some good news [23:21] looks like i may be able to get some mp4 files of commandN through veoh.com [23:25] and also here is the sitemap of veoh.com: http://www.veoh.com/sitemap.xml [23:30] so based on one of the veoh.com html files you can get the full path of videos from them [23:40] here is some example code for grabbing the video files from veoh.com: [23:40] curl http://www.veoh.com/watch/e20095 | grep fullPreviewHashHighPath | sed 's|.*fullPreviewHashHighPath":"||g' | sed 's|",".*||g' | sed 's|.*/content|content|g' [23:41] you may want to add a -o h$id.mp4 or something or you get file names like this below: [23:41] h20095.mp4?ct=ebfae74e540fcd7e297a588892beca41022dc1cd2d5c355d [23:53] looks like there is no page on archiveteam.org for veoh.com
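[Editor's note: the veoh.com pipeline above can be exercised offline. The JSON fragment below is a made-up stand-in for the watch page; in practice the `echo` is replaced by `curl http://www.veoh.com/watch/e20095`, and the base URL in the final comment is a guess.]

```shell
# Hypothetical fragment of a veoh.com watch page; only the
# fullPreviewHashHighPath field matters to the pipeline.
echo '{"fullPreviewHashHighPath":"http://content.veoh.com/content/h20095.mp4?ct=abc123","other":"x"}' \
  | grep fullPreviewHashHighPath \
  | sed 's|.*fullPreviewHashHighPath":"||g' \
  | sed 's|",".*||g' \
  | sed 's|.*/content|content|g' > path.txt
cat path.txt
# fetch with a clean local name instead of the hash-suffixed one:
# curl -o h20095.mp4 "http://content.veoh.com/$(cat path.txt)"
```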