[09:04] <midas> so, was that a small internet earthquake?
[10:54] <schbirid> have some html injection https://startpage.com/do/search?query=Reginald+D.+Hunter&cat=web&pl=chrome&language=english
[13:30] <midas> can someone point me in the right direction? i have a list of usernames for home.xmsnet.nl now i just need to figure out how i can get wget to work with this to warc it.
[13:34] <midas> right, bad snort.
[13:49] <SadDM> midas: is there any correlation between the usernames and urls?
[13:51] <rocode_> home.xmsnet.nl/username IIRC
[13:51] <SadDM> hmm, it looks like the urls are  http://home.xmsnet.nl/username
[13:51] <SadDM> right
[13:52] <rocode> Which version of wget was warc enabled?
[13:52] <SadDM> so the easiest thing to do is to make a text file with a list of those urls and then do a wget -i <filename>
[13:53] <SadDM> 1.14 maybe?
[13:54] <SadDM> yup... >=1.14 has warc baked in
[13:56] <SadDM> midas: with what I've suggested you'll end up with a warc for every line of the text file... if that's not ultimately desirable, you can megawarc them all together after the fact.
[13:58] <rocode> There are only 156 sites. Not too bad.
[14:27] <midas> SadDM: sorry, was out of irc for a sec
[14:27] <midas> yeah, -i would work with --warc-file ?
[14:28] <SadDM> it sure does
[14:28] <SadDM> you just end up with one for every url in the file
[14:28] <midas> ok, lets start a screen :p
[14:28] <midas> thats no issue
[14:32] <midas> wget wants an argument for warc-file
[14:33] <midas> (i feel stupid right now :p)
[14:35] <SadDM> oh... uh
[14:35] <SadDM> hmm... maybe I got ahead of myself
[14:35] <ohhdemgir> I keep cocking up my media types, wish users could move items :|
[14:35] <midas> my guess was just wget -i <file> -m --warc-file but thats no balony
[14:35] <SadDM> I know that I've done something like this
[14:36] <SadDM> maybe I just built a bunch of wget commands in a file and then piped it through a shell
[14:37] <SadDM> sorry midas, I think I've given you slightly bad advice
[14:37] <SadDM> are you on a unix machine?
[14:40] <midas> yeah
[14:40] <SadDM> something like this should generate the wgets: cat usernames.txt | sed 's|\(.*\)|wget -m --warc-file=\1 http://home.xmsnet.nl/\1/|' > wgets
[14:41] <SadDM> note that I just wrote that cold and didn't test it or anything
[14:41] <SadDM> then just "sh wgets"
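A minimal sketch of the same "generate the commands, then run them" idea, written as a read loop instead of sed so there are no delimiters to escape. The function name gen_wgets is hypothetical, and it assumes one shell-safe username per line and the http://home.xmsnet.nl/username layout mentioned earlier. It only prints the commands, it does not run them.

```shell
#!/bin/sh
# Print one wget command per username read from stdin.
# Assumption: each site lives at http://home.xmsnet.nl/<username>/
# and usernames contain no spaces or shell metacharacters.
gen_wgets() {
  while IFS= read -r user; do
    # Skip blank lines so we never emit a command with an empty WARC name.
    [ -n "$user" ] || continue
    printf 'wget -m --warc-file=%s http://home.xmsnet.nl/%s/\n' "$user" "$user"
  done
}
```

Something like "gen_wgets < usernames.txt > wgets" followed by "sh wgets" would then run the grabs one after another, giving one WARC per username.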
[14:41] <midas> lol. thanks! i think that would work :)
[14:42] <SadDM> :-D it's close-ish at least
[14:42] <midas> indeed, and thats more than i could do on short notice :p
[14:48] <godane> so i fixed some of the uploads of sydney morning herald and Australian women's weekly
[14:48] <godane> some of the uploads were incomplete downloads
[14:50] <Smiley> midas: around bud?
[14:50] <godane> so i'm mirroring more glenn beck episodes
[14:51] <midas> Smiley: yeah
[14:51] <midas> im always somewhere
[14:51] <Smiley> midas: how are/were you testing cartoon hd on android,
[14:51] <Smiley> you running a avd on your system?
[14:52] <midas> didnt have time since last week, work said i had to do something :<
[14:53] <midas> i think ill debug it tonight again
[14:53] <midas> just have to get this grab started
[14:54] <Smiley> yeah I'm just trying to set up an AVD so I can just grab stuff rather than grabbing on my phone and pushing across the network
[14:54] <Smiley> ah got it working now :D
[14:55] * Smiley waits to see if it loads
[14:57] <midas> ah bloody cool Smiley !
[14:58] <Smiley> midas: hmmmmm just sitting on "android" screen atm :/
[15:24] <Smiley> midas: just run android avd
[15:24] <Smiley> and create a new avd using a nice h/w
[15:26] <Smiley> or at least that's what I'm trying to do.
[15:26] <Smiley> it's quite slow.
[15:27] <Smiley> as I just told it to create a 5Gb sdcard D:
[15:37] <midas> lol
[15:37] * midas throws stones at midas
[15:46] <Smiley> grrr
[15:46] <Smiley> none will boot D:
[15:51] <Smiley> yey got one booting \o/
[15:59] <Smiley> and using android monitor, it has a file explorer with push/pull
[15:59] <Smiley> or i can use adb. sweeeet
[16:45] <yipdw> lol Wheeler
[16:45] <yipdw> "Simply put, when a consumer buys a specified bandwidth, it is commercially unreasonable and thus a violation of this proposal to deny them the full connectivity and the full benefits that connection enables."
[16:45] <yipdw> evidently Wheeler has never used Comcast services, only lobbied for them
[17:10] <exmic> hahaha
[18:02] <schbirid> lol wtf, startpage.com's "community" link goes to their facebook site
[18:07] <midas> so this is about archive.org and a little bit about archiveteam also ;) http://www.nu.nl/weekend/3769630/bibliothecaris-van-internet-wil-websites-niet-verloren-laten-gaan.html
[18:07] <midas> pretty big website in .nl
[20:37] <godane> i'm grabbing Bomb Patrol Afghanistan
[20:37] <godane> cause it aired on G4
[20:37] <godane> i'm getting the 720p copies
[20:43] <exmic> ./wg 25
[22:17] <godane> so i'm up to 7600 items now in my godaneinbox
[22:53] <godane> so i'm trying to download a mov file from the wayback machine and i can't get the full file
[22:53] <godane> i'm trying this one at the moment: https://web.archive.org/web/20060602014144/http://www.commandn.tv/cN/044/commandN-044-h264.mov
[22:54] <godane> the wayback machine url will just stop about 32.7mb into the file
[22:54] <godane> even with wget
[23:21] <godane> some good news
[23:21] <godane> looks like i may be able to get some mp4 files of commandN through veoh.com
[23:25] <godane> and also here is the sitemap of veoh.com: http://www.veoh.com/sitemap.xml
[23:30] <godane> so based on one of the veoh.com html files you can get the full path of videos from them
[23:40] <godane> here is some example code for grabbing the video files from veoh.com:
[23:40] <godane> curl http://www.veoh.com/watch/e20095 | grep fullPreviewHashHighPath | sed 's|.*fullPreviewHashHighPath":"||g' | sed 's|",".*||g' | sed 's|.*/content|content|g'
[23:41] <godane> you may want to add a -O h$id.mp4 or something, or you get file names like this below:
[23:41] <godane> h20095.mp4?ct=ebfae74e540fcd7e297a588892beca41022dc1cd2d5c355d
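A rough sketch of the two steps described above: pulling the fullPreviewHashHighPath value out of a watch page, and deriving a file name without the ?ct=... suffix. The function names are hypothetical. It assumes the key appears as "fullPreviewHashHighPath":"value" on a single line, which has not been checked against live veoh.com pages.

```shell
#!/bin/sh
# Pull the value of "fullPreviewHashHighPath" out of a watch page read from stdin.
# Assumption: the key appears as "fullPreviewHashHighPath":"<value>" on one line.
extract_video_path() {
  sed -n 's/.*"fullPreviewHashHighPath":"\([^"]*\)".*/\1/p' | head -n 1
}

# Turn a video URL or path into a clean local file name:
# drop the ?ct=... query string first, then keep only the last path component.
clean_name() {
  no_query=${1%%\?*}
  printf '%s\n' "${no_query##*/}"
}
```

If the extracted value turns out to be a complete URL, it could be saved with something like curl -o "$(clean_name "$url")" "$url", where $url holds the output of extract_video_path.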
[23:53] <godane> looks like there is no page on archiveteam.org for veoh.com