#archiveteam-bs 2013-02-01,Fri


Time Nickname Message
00:49 🔗 godane SketchCow: looks like some g4tv.com-videos are in g4video
00:49 🔗 godane they should be in g4video-web
00:59 🔗 chronomex alard: is there a reasonable, compatible way to pad a warc file? I want each response body to start at the beginning of a disk block, so zfs block-level dedup can work with it
01:00 🔗 chronomex I'm thinking a blank 'metadata' block would be the ticket
01:00 🔗 chronomex oh, nevermind, readers are supposed to skip unknown blocks
01:00 🔗 chronomex perfect
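A minimal sketch of the padding arithmetic behind chronomex's idea, assuming a filler record (for example an otherwise empty 'metadata' record) is written immediately before each response record. The function name, its parameters and the `min_filler` floor are hypothetical and not part of any WARC library; the code only computes how many filler bytes would be needed and does not write WARC records.

```python
def filler_length(offset, next_header_len, block_size, min_filler=0):
    """Return how many filler bytes to write at `offset` so that the body of
    the next record (which follows `next_header_len` bytes of record header)
    starts on a multiple of `block_size`.

    `min_filler` is a hypothetical lower bound: a filler record has headers
    of its own, so it cannot be arbitrarily small.
    """
    # Distance from (offset + header) up to the next block boundary.
    pad = -(offset + next_header_len) % block_size
    # If that gap is too small to hold a filler record, skip ahead whole blocks.
    while pad < min_filler:
        pad += block_size
    return pad
```

For example, with 4096-byte blocks, a record starting at offset 1000 with a 300-byte header needs 2796 filler bytes, since 1000 + 2796 + 300 = 4096.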
01:41 🔗 db48x chronomex: interesting idea
01:52 🔗 db48x chronomex: have you seen how much memory that takes though?
01:53 🔗 chronomex I have not
01:54 🔗 chronomex is it horrendous?
01:55 🔗 db48x yes
01:56 🔗 db48x 320 bytes per block allocated in your filesystem
02:01 🔗 db48x the block size is variable, but assume 64k as a half-way point between the extremes
02:01 🔗 db48x how big do you expect your dataset to be?
02:01 🔗 db48x actually, if it's one giant warc file, then the block size will be 128k
02:02 🔗 db48x http://constantin.glez.de/blog/2011/07/zfs-dedupe-or-not-dedupe is an article I hadn't seen before, but it summarizes the data very well
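A rough sketch of the arithmetic db48x describes, using his figures of 320 bytes per allocated block and a 64k or 128k block size. The 1 TiB dataset size is a made-up example, the function name is hypothetical, and the estimate assumes the worst case where every block is unique.

```python
def dedup_table_bytes(dataset_bytes, block_size=64 * 1024, bytes_per_block=320):
    """Estimate dedup table memory: one entry per allocated block."""
    blocks = -(-dataset_bytes // block_size)  # ceiling division
    return blocks * bytes_per_block


TIB = 2 ** 40
GIB = 2 ** 30

# Hypothetical 1 TiB dataset at the two block sizes mentioned in the log.
at_64k = dedup_table_bytes(TIB, block_size=64 * 1024)
at_128k = dedup_table_bytes(TIB, block_size=128 * 1024)
print(at_64k / GIB, at_128k / GIB)  # 5.0 2.5
```

Under these assumptions, each TiB of unique data costs about 5 GiB of table at 64k blocks, or 2.5 GiB at 128k blocks.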
02:25 🔗 Brenry are you guys awake yet
02:32 🔗 db48x Brenry: nope. what's up?
02:33 🔗 Brenry did they ever scrape user data for geocities ?
02:34 🔗 Brenry or was it just those neighborhoods or commercial sites
02:34 🔗 db48x what do you mean by user data?
02:35 🔗 Brenry geocities.com/~username
02:35 🔗 Brenry so i can get my fkn .jpg pics
02:36 🔗 Brenry none of those were password protected.. user dirs.. just open files like a directory list unless it had an index.html file
02:37 🔗 db48x ah, hrm
02:40 🔗 db48x I don't really know
02:40 🔗 db48x the geocities project wasn't very organized, I'm afraid
02:41 🔗 db48x have you tried one of the mirrors?
02:42 🔗 Brenry one of those sites said the .jp geocities was still attached.. but that wouldn't have data from other regions eh ? tried that and said no longer exists
02:42 🔗 Brenry yeah spent a lot of time like a month after that crap in 2009.. and a year later
02:42 🔗 Brenry and trying back
02:43 🔗 Brenry db48x: keep at it ok.. i'll be back in like 2 years.. and i would really like my pictures
02:43 🔗 db48x what was your username?
02:43 🔗 Brenry oplazzz
02:43 🔗 db48x let me see
02:43 🔗 Brenry it was geocities.com/oplazzz or geocities.com/~oplazzz
02:43 🔗 Brenry i tried wayback machine.. but doesnt seem to have users
02:45 🔗 db48x hmm. you're not in the username list on oocities.org
02:47 🔗 db48x nor is it available on reocities.com
02:47 🔗 db48x but they'll email you if they come across it
02:47 🔗 Brenry k.. l8r
02:51 🔗 db48x it occurs to me that I probably should have said that they _can_ email him, if he goes and puts in his email address
02:53 🔗 DFJustin welp
03:04 🔗 godane found another broken video
03:04 🔗 godane not cause of errors but cause it slowed down for some reason at the 11min mark
03:11 🔗 godane so the 9200, 9559, and 9717 have some sort of bad encoding
03:12 🔗 db48x videos can have variable frame rates
03:12 🔗 db48x although it's rarely used, so it's more often a mistake
03:12 🔗 godane the frames were moving like 1 every 5 seconds
03:12 🔗 db48x does the video run on past the audio?
03:13 🔗 godane there is no audio when this starts happening
03:20 🔗 dashcloud got a sample godane?
03:20 🔗 godane https://archive.org/details/g4tv.com-video9200
03:36 🔗 godane so i found some g4 underground clips
03:36 🔗 godane it's better than the episodes that i have found
03:36 🔗 godane the episodes are all cropped
03:37 🔗 godane so top and bottom are cut
03:46 🔗 godane i found a microsoft keynote from tgs 2008
03:46 🔗 godane tgs = tokyo game show
03:54 🔗 DFJustin I uploaded this last night https://archive.org/details/osaka-game-show-2009
08:12 🔗 SketchCow godane - Stop uploading until I tell you to.
08:15 🔗 SketchCow We need to give you direct access to the g4 collections because you successfully killed out opensource_videos, which is frankly amazing.
08:36 🔗 godane SketchCow: you're joking right?
08:38 🔗 godane you always tell me to upload stuff then we will deal with it
08:38 🔗 SketchCow It won't last more than a day.
08:38 🔗 godane also you're still putting them into g4video
08:38 🔗 godane not g4video-web
08:38 🔗 SketchCow But it's midnight at Internet Archive, I need to have the privs modified during the busy day.
08:38 🔗 SketchCow Dude, one day
08:38 🔗 godane ok
08:39 🔗 SketchCow The most recent set wasn't put in there by me.
08:39 🔗 godane oh
08:39 🔗 SketchCow It was put in by a desperate jeff trying to stop g4 related video from completely choking our RSS feed
08:39 🔗 SketchCow And other things
08:39 🔗 godane oh
08:39 🔗 db48x lol
08:39 🔗 SketchCow Normally, my scooping up your uploads every once in a while was fine.
08:40 🔗 SketchCow But you turned up the heat.
08:40 🔗 godane yes but this is 35k+ videos
08:40 🔗 db48x godane: make them beg!
08:40 🔗 SketchCow So soon you will have the ability to declare g4video and g4video-web as the collection, and upload that way.
08:40 🔗 godane i hope i will get the twit collections access too
08:41 🔗 SketchCow But we need a day, it's all timeshifted now. It's 9:40 here and 12:40 in California.
08:41 🔗 SketchCow I'll get you ALL the collections you need.
08:41 🔗 godane ok
08:41 🔗 SketchCow I am surprised you're not aware you're one of the single largest non-institution uploaders
08:42 🔗 godane wow
08:43 🔗 SketchCow 35,000 videos is a lot of videos, sir.
08:43 🔗 SketchCow Anyway, like I said, one day, and we'll get this shored up.
08:43 🔗 godane thats ok
08:56 🔗 ersi godane: Bwahaha, you're TOO GOOD :)
08:56 🔗 ersi That's awesome
08:59 🔗 chronomex :)
09:04 🔗 godane also its about 255gb now
09:43 🔗 godane there looks to be a lot of first 15 mins previews of games
09:43 🔗 godane :-D
10:28 🔗 ersi What? How hard is tar? :| tar -xf <file> for extraction, tar -cf file <targets> for creation and
10:28 🔗 ersi tar -tf <file> to look at it without extracting ;o
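For scripted use, the three tar operations ersi lists map roughly onto Python's standard `tarfile` module. A small sketch; the helper names and the `hello.txt` file are made up for illustration.

```python
import os
import tarfile
import tempfile


def create_archive(archive_path, targets):
    """Rough equivalent of: tar -cf archive_path targets...
    (unlike tar, this sketch stores each target under its base name)."""
    with tarfile.open(archive_path, "w") as tar:
        for target in targets:
            tar.add(target, arcname=os.path.basename(target))


def list_archive(archive_path):
    """Rough equivalent of: tar -tf archive_path (no extraction)."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.getnames()


def extract_archive(archive_path, dest):
    """Rough equivalent of: tar -xf archive_path, extracting into dest."""
    with tarfile.open(archive_path, "r") as tar:
        tar.extractall(path=dest)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        src = os.path.join(work, "hello.txt")
        with open(src, "w") as f:
            f.write("hi\n")
        archive = os.path.join(work, "demo.tar")
        create_archive(archive, [src])
        print(list_archive(archive))
        out = os.path.join(work, "out")
        os.mkdir(out)
        extract_archive(archive, out)
```

As with the `-tf` habit mentioned above, listing an archive from an unknown source before extracting it is a sensible precaution.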
11:11 🔗 godane i'm back
11:12 🔗 godane my internet wifi when out
11:12 🔗 godane *went out
11:12 🔗 Schbirid SketchCow: dont miss going to http://www.computerspielemuseum.de/ !
11:13 🔗 Schbirid hm, their english site is incomplete
12:25 🔗 Cameron_D I prefer `tar -xvf <file>` so you can watch stuff scroll past
12:46 🔗 ersi I do that with tars from unknowns, if I made it myself - I just -xf it
15:59 🔗 godane uploaded: http://archive.org/details/BBV.Customer.Service.VHSCap-CG
23:16 🔗 Coderjoe wow. I just found the world's first website. In the footer: "There have been [counter] hits to this site since noon GMT, Jan 1st, 4713 BC."
23:20 🔗 chronomex lol
23:20 🔗 chronomex technically true on all levels
23:41 🔗 db48x hah
