Time |
Nickname |
Message |
09:04
🔗
|
midas |
so, was that a small internet earthquake? |
10:54
🔗
|
schbirid |
have some html injection https://startpage.com/do/search?query=Reginald+D.+Hunter&cat=web&pl=chrome&language=english |
13:30
🔗
|
midas |
can someone point me in the right direction? i have a list of usernames for home.xmsnet.nl now i just need to figure out how i can get wget to work with this to warc it. |
13:34
🔗
|
midas |
right, bad snort. |
13:35
🔗
|
midas |
can someone point me in the right direction? i have a list of usernames for home.xmsnet.nl now i just need to figure out how i can get wget to work with this to warc it. |
13:49
🔗
|
SadDM |
midas: is there any correlation between the usernames and urls? |
13:51
🔗
|
rocode_ |
home.xmsnet.nl/username IIRC |
13:51
🔗
|
SadDM |
hmm, it looks like the urls are http://home.xmsnet.nl/username |
13:51
🔗
|
SadDM |
right |
13:52
🔗
|
rocode |
Which version of wget was warc enabled? |
13:52
🔗
|
SadDM |
so the easiest thing to do is to make a text file with a list of those urls and then do a wget -i <filename> |
13:53
🔗
|
SadDM |
1.14 maybe? |
13:54
🔗
|
SadDM |
yup... >=1.14 has warc baked in |
13:56
🔗
|
SadDM |
midas: with what I've suggested you'll end up with a warc for every line of the text file... if that's not ultimately desirable, you can megawarc them all together after the fact. |
13:58
🔗
|
rocode |
There are only 156 sites. Not too bad. |
14:27
🔗
|
midas |
SadDM: sorry, was out of irc for a sec |
14:27
🔗
|
midas |
yeah, -i would work with --warc-file ? |
14:28
🔗
|
SadDM |
it sure does |
14:28
🔗
|
SadDM |
you just end up with one for every url in the file |
14:28
🔗
|
midas |
ok, lets start a screen :p |
14:28
🔗
|
midas |
thats no issue |
14:32
🔗
|
midas |
wget wants an argument for warc-file |
14:33
🔗
|
midas |
(i feel stupid right now :p) |
14:35
🔗
|
SadDM |
oh... uh |
14:35
🔗
|
SadDM |
hmm... maybe I got ahead of myself |
14:35
🔗
|
ohhdemgir |
I keep cocking up my media types, wish users could move items :| |
14:35
🔗
|
midas |
my guess was just wget -i <file> -m --warc-file but thats no balony |
14:35
🔗
|
SadDM |
I know that I've done something like this |
14:36
🔗
|
SadDM |
maybe I just built a bunch of wget commands in a file and then piped it through a shell |
14:37
🔗
|
SadDM |
sorry midas, I think I've given you slightly bad advice |
14:37
🔗
|
SadDM |
are you on a unix machine? |
14:40
🔗
|
midas |
yeah |
14:40
🔗
|
SadDM |
something like this should generate the wgets: cat usernames.txt | sed 's|\(.*\)|wget -m --warc-file=\1 http://home.xmsnet.nl/\1/|' > wgets |
14:41
🔗
|
SadDM |
note that I just wrote that cold and didn't test it or anything |
14:41
🔗
|
SadDM |
then just "sh wgets" |
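A minimal sketch of the approach discussed above, assuming `usernames.txt` holds one username per line and that each site lives at `http://home.xmsnet.nl/<username>/` (the trailing slash is an assumption). The function name `build_wgets` is hypothetical. It uses a `while read` loop instead of sed, so no slash escaping is needed, and it only prints the commands; nothing is fetched until the generated file is run.

```shell
#!/bin/sh
# Print one wget command per username; nothing is downloaded here.
# Assumptions: the input file has one username per line, and usernames
# contain no whitespace or shell metacharacters.
build_wgets() {
    # $1: file with one username per line
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r user || [ -n "$user" ]; do
        # Skip blank lines.
        [ -n "$user" ] || continue
        printf 'wget -m --warc-file=%s http://home.xmsnet.nl/%s/\n' "$user" "$user"
    done < "$1"
}
```

Usage would be something like `build_wgets usernames.txt > wgets` followed by `sh wgets`. Adding wget's `--no-parent` to the generated commands may be worth considering so each mirror stays under its own user directory, though that was not part of the original suggestion.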
14:41
🔗
|
midas |
lol. thanks! i think that would work :) |
14:42
🔗
|
SadDM |
:-D it's close-ish at least |
14:42
🔗
|
midas |
indeed, and thats more than i could do on short notice :p |
14:48
🔗
|
godane |
so i fixed some of the uploads of sydney morning herald and Australian women's weekly |
14:48
🔗
|
godane |
some of the uploads were incomplete downloads |
14:50
🔗
|
Smiley |
midas: around bud? |
14:50
🔗
|
godane |
so i'm mirroring more glenn beck episodes |
14:51
🔗
|
midas |
Smiley: yeah |
14:51
🔗
|
midas |
im always somewhere |
14:51
🔗
|
Smiley |
midas: how are/were you testing cartoon hd on android, |
14:51
🔗
|
Smiley |
you running an avd on your system? |
14:52
🔗
|
midas |
didnt have time since last week, work said i had to do something :< |
14:53
🔗
|
midas |
i think ill debug it tonight again |
14:53
🔗
|
midas |
just have to get this grab started |
14:54
🔗
|
Smiley |
yeah I'm just trying to set up an AVD so I can just grab stuff rather than grabbing on my phone and pushing across the network |
14:54
🔗
|
Smiley |
ah got it working now :D |
14:55
🔗
|
* |
Smiley waits to see if it loads |
14:57
🔗
|
midas |
ah bloody cool Smiley ! |
14:58
🔗
|
Smiley |
midas: hmmmmm just sitting on "android" screen atm :/ |
15:24
🔗
|
Smiley |
midas: just run android avd |
15:24
🔗
|
Smiley |
and create a new avd using a nice h/w |
15:26
🔗
|
Smiley |
or at least that's what I'm trying to do. |
15:26
🔗
|
Smiley |
it's quite slow. |
15:27
🔗
|
Smiley |
as I just told it to create a 5Gb sdcard D: |
15:37
🔗
|
midas |
lol |
15:37
🔗
|
* |
midas throws stones at midas |
15:46
🔗
|
Smiley |
grrr |
15:46
🔗
|
Smiley |
none will boot D: |
15:51
🔗
|
Smiley |
yey got one booting \o/ |
15:59
🔗
|
Smiley |
and using android monitor, it has a file explorer with push/pull |
15:59
🔗
|
Smiley |
or i can use adb. sweeeet |
16:45
🔗
|
yipdw |
lol Wheeler |
16:45
🔗
|
yipdw |
"Simply put, when a consumer buys a specified bandwidth, it is commercially unreasonable and thus a violation of this proposal to deny them the full connectivity and the full benefits that connection enables." |
16:45
🔗
|
yipdw |
evidently Wheeler has never used Comcast services, only lobbied for them |
17:10
🔗
|
exmic |
hahaha |
18:02
🔗
|
schbirid |
lol wtf, startpage.com's "community" link goes to their facebook site |
18:07
🔗
|
midas |
so this is about archive.org and a little bit about archiveteam also ;) http://www.nu.nl/weekend/3769630/bibliothecaris-van-internet-wil-websites-niet-verloren-laten-gaan.html |
18:07
🔗
|
midas |
pretty big website in .nl |
20:37
🔗
|
godane |
i'm grabbing Bomb Patrol Afghanistan |
20:37
🔗
|
godane |
cause it aired on G4 |
20:37
🔗
|
godane |
i'm getting the 720p copies |
20:43
🔗
|
exmic |
./wg 25 |
22:17
🔗
|
godane |
so i'm up to 7600 items now in my godaneinbox |
22:53
🔗
|
godane |
so i'm trying to download a mov file from the wayback machine and i can't get the full file |
22:53
🔗
|
godane |
i'm trying this one at the moment: https://web.archive.org/web/20060602014144/http://www.commandn.tv/cN/044/commandN-044-h264.mov |
22:54
🔗
|
godane |
the wayback machine url will just stop about 32.7mb into the file |
22:54
🔗
|
godane |
even with wget |
23:21
🔗
|
godane |
some good news |
23:21
🔗
|
godane |
looks like i may be able to get some mp4 files of commandN through veoh.com |
23:25
🔗
|
godane |
and also here is the sitemap of veoh.com: http://www.veoh.com/sitemap.xml |
23:30
🔗
|
godane |
so based on one of the veoh.com html files you can get the full path of videos from them |
23:40
🔗
|
godane |
here is some example code for grabbing the video files from veoh.com: |
23:40
🔗
|
godane |
curl http://www.veoh.com/watch/e20095 | grep fullPreviewHashHighPath | sed 's|.*fullPreviewHashHighPath":"||g' | sed 's|",".*||g' | sed 's|.*/content|content|g' |
23:41
🔗
|
godane |
you may want to add a -O h$id.mp4 or something, or you get file names like this below: |
23:41
🔗
|
godane |
h20095.mp4?ct=ebfae74e540fcd7e297a588892beca41022dc1cd2d5c355d |
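A sketch of the filename fix mentioned above, assuming the extracted path ends in something like `h20095.mp4?ct=...`. The helper names `extract_high_path` and `clean_name` are hypothetical, and the extraction assumes the `fullPreviewHashHighPath` key and its value sit on one line of the page, as the one-liner above implies.

```shell
#!/bin/sh
# Read page HTML on stdin and print the value of fullPreviewHashHighPath.
# Same grep/sed idea as the one-liner above, without the final
# "content" trim; assumes key and value are on a single line.
extract_high_path() {
    grep fullPreviewHashHighPath \
        | sed 's|.*fullPreviewHashHighPath":"||' \
        | sed 's|".*||'
}

# Turn a URL or path into a plain file name for use with -O.
clean_name() {
    # $1: URL or path, possibly with a ?query suffix
    path=${1%%[?]*}               # drop the query string, if any
    printf '%s\n' "${path##*/}"   # keep only the part after the last slash
}
```

With these, `clean_name 'h20095.mp4?ct=ebfae74e540fcd7e297a588892beca41022dc1cd2d5c355d'` prints `h20095.mp4`, which could then be passed to `wget -O` or `curl -o`.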
23:53
🔗
|
godane |
looks like there is no page on archiveteam.org for veoh.com |