#archiveteam 2011-09-10,Sat

Time Nickname Message
03:22 dashcloud just saw the archiveteam defcon talk, and it was awesome - is that the best one to show to people who wonder what it is and what they do?
03:31 SketchCow It's a good one.
03:34 dashcloud so is the soy sauce available for order yet?
03:40 SketchCow It will take years.
03:53 SketchCow http://www.yagisawa-s.co.jp/ is the company.
04:08 Coderjoe SketchCow: this seems like a silly thing to ask, but are you interested in the bits of stage6.com I managed to download a few years back?
04:11 Coderjoe some stats at http://wegetsignal.org/stage6.php
04:14 SketchCow Yes
04:19 Coderjoe i only managed to get metadata for 4.24% of all of the videos that were up there, and the actual videos for only 1.17% :(
04:24 SketchCow Understood.
04:24 SketchCow How big?
04:24 SketchCow I mean the final
04:25 Coderjoe 304G for the videos and less than 1G for the metadata (which currently lives in a mysql db)
04:25 SketchCow Oh, that's nothin'
04:33 Coderjoe I don't remember if the site had comments on the video pages, but if so, I don't have that. the metadata I have is id, original url, title, upload date, user, description, and tags. (which at the time was all I was concerned with)
04:36 godane did you guys back up starwars forums?
04:46 Coderjoe looks like the database dump is only 7.2M with extended inserts and 12M without extended inserts
04:51 godane it looks like stage6 was hacked a few months before the shutdown
04:52 godane At approximately 16:00 GMT on February 9, 2008, Stage6 was hacked. People that visited the front page of the website were redirected to multiple shock sites.
04:52 godane Several thousand user accounts, that were used to upload videos between December 7, 2007 and February 10, 2008, are thought to have been compromised by the attack.
04:53 godane i thought user accounts were hacked on December 7, 2007
04:53 godane :P
04:54 godane there were also only 3 days between the announcement that it would shut down and the actual shutdown
04:59 godane vreel was another site that died too
04:59 godane vreel was set to replace stage6 it looks like
05:01 Coderjoe i was downloading videos for at least 5 days since the announcement
05:02 Coderjoe and I had to alter my script a few times because they changed some things on the back end to make it more difficult for you to download the videos
05:02 godane ok
05:03 godane so the videos could still be downloaded even though the site may have been shut down
05:04 Coderjoe well, at the end, you had to get an access token from the video page in order to download the video. (before that, if you knew the video ID number you wanted to download, you could just get the video file directly)
05:05 godane i just hope youtube never goes down
05:06 Cameron_D it will, be it in 5 years or 50, it will.
05:07 blue_ do we start now?
05:07 godane maybe a good idea
05:07 godane since there is so much data
05:07 Cameron_D we could since the content that is already there isn't really going to change
05:08 godane start with the older videos and slowly go up
05:08 Cameron_D well youtube isn't incremental
05:08 Cameron_D so you'll need to brute-force the possible values
05:08 blue_ that's what i was thinking
05:08 Cameron_D and scrape them, if it 404's go back to it in a few years and try again
05:09 godane also if we find a way to not download 10 versions of the same video from 10 users that would also be good
05:09 Coderjoe i wonder if they reuse ids of videos that were removed, after some cooling-off period
05:09 Cameron_D and we don't really need to download every quality version
05:10 godane i know
05:10 godane but the highest version on youtube would be nice
05:11 godane so it can be down-encoded in years to come
05:11 Coderjoe and when higher quality options become available again?
05:11 blue_ i had problems after going for the giant robot project in one day :/ not sure if it was my end or if they were blocking attempts from youtube-dl sequentially hitting the playlist
05:12 godane at least we are downloading youtube and not youporn
05:12 blue_ i'm sure someone out there has that covered
05:13 godane yes
05:57 blue_ glad to see that phantomjs was already in the tools section, if anyone can school me in that that would be neat
06:54 Coderjoe ok, last night tar ran out of memory again. I just added even more swap (16G of swap and 4G of ram) and we'll see how it goes
06:54 Coderjoe tar is currently up to 9490M virtual
06:56 Coderjoe and back down to 5444M
07:05 kin37ik 3 more days then my bandwidth refreshes
07:13 Coderjoe up to 6690M again, and finally outputting data
07:13 db48x2 heh
07:13 db48x2 Coderjoe: what are you tarring?
07:14 Coderjoe i was curious about how well all of the friendster data I have compresses when in one big tarball. so i untarred them all and am doing that now, with a filelist to pull in files in a certain order, which should compress the best
07:15 db48x2 heh
07:15 Coderjoe that filelist is 4735134182 bytes
07:15 db48x2 sheesh
07:16 Coderjoe the uncompressed tarball is expected to be 1452086937600 bytes
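
[Editor's sketch] The idea Coderjoe describes, feeding the archiver a file list so that members land in the tarball in a chosen order, can be illustrated roughly as follows. This is not his actual setup (he was running tar itself, and GNU tar accepts such a list via --files-from); the function name build_ordered_tar, the one-path-per-line list format, and the gzip output mode are all assumptions made for illustration.

```python
import tarfile


def build_ordered_tar(filelist_path, out_path, mode="w:gz"):
    """Write a tarball whose members appear in the order given by a file list.

    filelist_path is a hypothetical text file with one path per line.
    Placing similar files next to each other is what gives the
    compressor a chance to find more redundancy.
    """
    with open(filelist_path, encoding="utf-8") as listing, \
            tarfile.open(out_path, mode) as tar:
        for line in listing:
            path = line.rstrip("\n")
            if not path:
                continue  # skip blank lines in the list
            # recursive=False: add exactly the listed entry and nothing
            # beneath it, so the list alone decides the ordering.
            tar.add(path, recursive=False)
    return out_path
```

Calling build_ordered_tar("filelist.txt", "out.tar.gz") would then produce an archive whose member order matches the list, which is the property the compression experiment above depends on.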
07:18 db48x2 mmm
07:18 db48x2 I have no idea how big mine is uncompressed
07:20 db48x2 hrm
07:20 db48x2 my center channel is very very quiet
07:20 Coderjoe is it hunting rabbit?
07:20 db48x2 not sure
07:20 db48x2 I can't hear what it's saying
07:21 db48x2 loose cable
07:44 godane SketchCow: found a video of you talking about textfiles.com back in 2000
07:44 godane at defcon 8
07:44 godane backing it up
08:19 ersi wget is now 2.5GB memory
08:59 Coderjoe hrm...
08:59 Coderjoe I think friendster.000023000-000240000.tar.gz is misnamed...
08:59 Coderjoe 000 023 000 - 000 240 000
09:03 Coderjoe btw, I'm running a program I wrote to run through the deflate stream and break in the debugger when it hits one of a few invalid conditions. Then I'll have the offset to the start of the block, and be able to see what the file looks like there
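
[Editor's sketch] Coderjoe's program is not shown in the log. A much coarser way to get a rough location for damage in a .tar.gz is to feed the file to zlib in small chunks and note where it first complains. The sketch below does that; it reports the offset of the input chunk in which the error surfaced, not the start of the deflate block, and the function name find_corruption_offset and the 4096-byte chunk size are assumptions. It also only looks at the first gzip member in the file.

```python
import zlib


def find_corruption_offset(path, chunk_size=4096):
    """Return the file offset of the input chunk at which zlib first
    reports an error, the file length if the stream ends prematurely,
    or None if the first gzip member decompresses cleanly.
    """
    # wbits=31 makes zlib expect a gzip header and trailer (CRC check).
    decomp = zlib.decompressobj(31)
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            try:
                decomp.decompress(chunk)  # output is discarded
            except zlib.error:
                return offset
            offset += len(chunk)
            if decomp.eof:
                return None  # reached a valid end of stream
    # Ran out of input before the end-of-stream marker: truncated file.
    return offset
```

The returned offset only narrows the search to one chunk; finding the exact block boundary, as described above, needs a decoder that tracks bit positions itself.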
11:36 godane do you guys take old fan fics?
11:37 godane i only ask cause i got an old dragon ball z fan fic from 2002
11:51 chronomex godane: paper or electronic?
11:54 godane electronic
11:54 chronomex you should put it on archive.org
12:00 chronomex godane: archiveteam is not a donation bin. we're volunteer firefighters. archive.org is the donation bin
12:01 godane ok
12:01 godane i think all of it is on archive.org
13:54 ersi Hehe.. my instructables.com wget thread is up to 3GB memory use and it's downloaded 13.5GB of data
13:56 Cameron_D ersi, I started getting instructables, ended up with ~40gb I think, before I had a power outage and never resumed it
13:56 ersi Jebus
13:57 ersi I think my machine will kill itself before getting that much data.. seeing how it's slurping up 3GB memory right now
14:57 SpaceCore you know you didn't archive something right when during extraction, 8157 errors are returned
15:08 ersi errors?
15:41 SpaceCore ersi: "cannot create directory"
15:45 ersi >_>
15:45 ersi bet it's inode related or something?
16:30 SpaceCore ersi: windows related.
16:30 SpaceCore "HERP DERP IT R HEER NOT THAR"
16:31 SpaceCore You know what windows is like at times, right?
17:32 ersi lol
17:33 ersi well, enjoy your fail mr windows guy
18:27 josephwd1 I was scanning old photos, and found this cigar box, mostly old photos of family, at the bottom? A picture of my uncles and grandpa in kkk uniforms, 8000 dollars, and 2 oz's of weed.
18:28 Schbirid you should make that a comic and become famous on reddit
18:29 josephwd1 This is a very good idea
18:29 josephwd1 isn't there a site to make rage comics?
18:30 SketchCow CETUSA.org
18:30 SketchCow Could someone please heritrix/wget that thing.
18:31 Schbirid Diversify Your Staff, Hire An Asian Women To Pose On Your Website
18:54 jch SketchCow: on it.
19:09 jch SketchCow: looks like it's down. all the paths for the css and stuff are given as absolute paths, do you mind that?
19:09 emijrp i dont remember who was archiving twit.tv, but here is the wiki dump http://code.google.com/p/wikiteam/downloads/list
19:09 jch /foo.css
19:09 jch so it won't work if you host it under some subdir.
19:11 jch never mind...
19:11 Schbirid ace http://buttcoin.org/bitcointalk-forums-hacked-bill-cosby-pimping-new-cosbycoinsΓ―ΒΏΒ½-to-all-the-members-breaking
19:13 Aranje oh man, that's pretty funny
19:13 Aranje good old-school hack
19:16 emijrp lul
19:19 jch jesus its filenames are annoying.
19:24 jch Could somebody try and throw heritrix at cetusa.org? wget does a pretty bad job at making sense of their pretty bad rewriting rules
19:36 jch <- giving up on cetusa.org until it's been determined if heritrix could mirror it more painlessly
22:15 godane SketchCow: it looks like defcon didn't get the videos and audio right until defcon 13
22:16 godane watch defcon 8 and 12 videos of you
22:17 chronomex that shit is haaaaard</wine>
22:17 chronomex *whine
22:25 ersi certainly
22:44 db48x2 didn't really get it right this year either
22:44 db48x2 the audio should have just been straight from the mic he was wearing, rather than mixed from both it and the mic at the podium
22:49 chronomex yeah, it was like that in the auditorium too
23:03 goekesmi That got on my nerves, and I was running that room half the time. Couldn't get the audio guys to choose one.
23:22 chronomex :(
23:22 chronomex goekesmi: you were in charge of audio in that room this year?
23:24 goekesmi Nope. I'm a stage manager for defcon. I can vaguely influence the A/V side of the operation.
23:26 chronomex ah neat
