#archiveteam-bs 2013-03-23,Sat

Time Nickname Message
00:52 🔗 godane Kevin Pereira's Imaginary Friend: https://archive.org/details/g4tv.com-video39529
01:11 🔗 godane my cpu temp is at 64.5 C
01:12 🔗 godane never seen that before compiling firefox
01:50 🔗 joepie91 godane: well hey, it's called *fire*fox
01:50 🔗 joepie91 :P
01:59 🔗 godane its made to kill cpus then
04:22 🔗 godane so i found a very old websites about laser discs and stuff
04:22 🔗 SketchCow Grab that shit
04:23 🔗 godane i'm mirroring it because i want to see if i can pass the 3585 files in wayback machine
04:24 🔗 godane this is the website: http://www.blam1.com/
04:24 🔗 godane best to have a stand alone archive of it
04:27 🔗 godane it has bumpers of DiscoVision
04:27 🔗 godane in real media format
04:28 🔗 godane i think this was a database of laser discs
04:28 🔗 godane with reviews
04:29 🔗 chronomex niiice
04:29 🔗 chronomex lddb.com is another I think
04:32 🔗 godane looks like lddb.com was japanld.free.fr
04:33 🔗 godane http://web.archive.org/web/20060114075257/http://japanld.free.fr/
04:33 🔗 godane that one doesn't exist anymore
04:34 🔗 godane must have been an old redirect since it had lddb.com in the page
04:34 🔗 godane the best part of old websites is that everything is on one domain
04:35 🔗 godane not freaking youtube redirect
04:35 🔗 godane no weird comments hosted on other sites
05:05 🔗 godane its over 70mb now
05:06 🔗 godane also i have passed 3585 files in wayback machine
05:06 🔗 godane i'm at 4746 now
05:25 🔗 godane ok so its done
05:25 🔗 godane 5628 files in warc.gz
05:37 🔗 godane uploaded: https://archive.org/details/www.blam1.com-20130323
05:43 🔗 godane its called the Blem Entertainment Group
05:49 🔗 godane so i found that discovision.com website is still alive
05:49 🔗 godane grabbing it
05:49 🔗 godane lets see if it beats the 301 total urls in wayback machine
05:49 🔗 DFJustin typo, should be Blam Entertainment Group
05:51 🔗 godane fixed
06:09 🔗 godane it only has 83 files
06:09 🔗 godane discovision.com that is
06:16 🔗 godane also know that blamld.com and blam1.com are the same
06:17 🔗 godane from what i could tell they bought blam1.com
06:17 🔗 godane maybe to stop a porn site or something
06:18 🔗 godane anyways even wayback doesn't have all the files under blamld.com host
06:29 🔗 godane uploaded: https://archive.org/details/www.discovision.com-20130323
06:34 🔗 godane i'm now grabbing cedmagic.com
06:37 🔗 godane its about this: http://en.wikipedia.org/wiki/Capacitance_Electronic_Disc
06:38 🔗 chronomex ceds are cool
09:43 🔗 GLaDOS kennethre: you around?
09:44 🔗 GLaDOS Nevermind!
12:09 🔗 zenpho hi there!
12:09 🔗 soultcer Konnichiwa
12:12 🔗 zenpho i'd like to dip into some of the archived bt internet dialup (http://archive.org/details/archiveteam-btinternet) stuff
12:13 🔗 zenpho i've obtained hanzo warc-tools, grepped thru the CDX files for stuff i'd like to get, and now I think I have some byte offsets for specific spots in specific warc files with the files I'd like to dip into
12:14 🔗 zenpho i don't fancy downloading the entire eleventy-billion gigabytes of warc files see ;o)
12:14 🔗 soultcer I think the IA servers support range requests
12:16 🔗 zenpho I'm struggling to see how to download specific parts of warc files - on a semi-automated basis - so I can unpack the files I'd like to see to my disk
12:17 🔗 zenpho I'm very new to the warc format and tools for working with it - do you guys know if there's a part of warc-tools (or some other nifty warc-friendly tool) which will do what I want?
12:19 🔗 soultcer I don't know about warc-tools, but basically you need to make a http request (be it with python's urllib or with curl) that tells the server to only return a specific range of bytes
12:20 🔗 soultcer curl -L -r 2000-5000 http://archive.org/download/archiveteam-btinternet-u-z/btinternet-u-z.megawarc.warc.gz > extract.warc.gz will fetch only bytes 2000-5000 from the given file
12:20 🔗 zenpho I think I can use wget or curl to specify a specific byte range to download, but I have a hunch I'll end up with just some data with no context, certainly not a valid warc which I can parse and extract data from?
12:21 🔗 zenpho ah. whoops - I was typing whilst you were answering. ;o)
12:21 🔗 soultcer A warc.gz file is basically a succession of warc records each individually gzipped, and then concatenated
12:21 🔗 soultcer As long as you start at the correct offset, it should work
12:21 🔗 zenpho oho, awesome sauce!
12:22 🔗 zenpho i'll give this a go and report back - thanks soultcer!
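A minimal Python sketch of the approach soultcer describes, not a recipe tested against the archive.org servers: request only the bytes of one record with an HTTP Range header, then gunzip just that one gzip member. The function names and the offset/length arguments are hypothetical. The offset is assumed to come from the CDX file, and the length is assumed to be known or generously overestimated, since bytes after the first gzip member are ignored.

```python
import urllib.request
import zlib


def fetch_range(url, offset, length):
    """Fetch `length` bytes starting at `offset` using an HTTP Range request.

    Hypothetical helper: assumes the server honours Range headers.
    The byte range in the header is inclusive on both ends.
    """
    last_byte = offset + length - 1
    request = urllib.request.Request(
        url, headers={"Range": f"bytes={offset}-{last_byte}"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def first_gzip_member(data):
    """Decompress only the first gzip member found at the start of `data`.

    A .warc.gz is a concatenation of individually gzipped records, so data
    that starts at a record offset begins with a complete gzip header.
    Anything after the first member is left untouched.
    """
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    record = decompressor.decompress(data)
    if not decompressor.eof:
        # The range ended before the gzip member did: fetch more bytes.
        raise ValueError("range too short: gzip member is incomplete")
    return record
```

Usage would look like `first_gzip_member(fetch_range(url, offset, length))`, with `url` pointing at the megawarc file and `offset` taken from the CDX line. If the `ValueError` is raised, retrying with a larger `length` is the obvious next step.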
13:45 🔗 Cameron_D Here, have some light (20k words) reading of tech support stories http://www.reddit.com/user/jon6/submitted/
13:45 🔗 Cameron_D There is great rage to be had
13:46 🔗 Cameron_D (despite the naming similarities it is different to BOFH)
13:51 🔗 nwh similarly r/talesfromtechsupport
13:52 🔗 nwh and r/cablefail
13:52 🔗 Cameron_D well, they are all submitted there, his user page is just a nice portal to list them all
13:57 🔗 godane hey everyone
13:57 🔗 godane i had to restart my cedmagic.com download
13:58 🔗 godane luckily i was only at 12mb and i just passed that without any long wait
13:58 🔗 godane my wifi dropped in my sleep is the reason
13:59 🔗 nwh so, any of you know how to set up an EC2 instance with a GPU?
14:01 🔗 Smiley nope
14:01 🔗 nwh they're not even on the damn lists.
14:02 🔗 nwh is there anywhere that WOULD know?
14:05 🔗 godane i found 10mins of news coverage
14:05 🔗 godane its from good day oregon
14:06 🔗 * nwh twitches
14:11 🔗 godane the video was with the guy that owns cedmagic.com
14:37 🔗 godane i'm past the number of files on wayback machine for cedmagic.com
15:48 🔗 godane is there a way to stop urls with multiple / from downloading
15:53 🔗 godane i will see if adding /// to reject-regex works
15:54 🔗 soultcer Ah, you mean URLs which have multiple "/" in them
15:55 🔗 godane yes
15:55 🔗 soultcer I know heritrix has a filter for that, but I don't know anything for wget
15:55 🔗 godane it has reject-regex
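A rough Python sketch for trying out a candidate pattern before handing it to wget's --reject-regex. The pattern below is an assumption, not the one godane ended up using: it matches a run of two or more slashes unless the run directly follows a colon, so the "//" after "http:" is not rejected. Whether wget's regex dialect accepts the pattern unchanged is not checked here, and the example paths are made up.

```python
import re

# Hypothetical pattern: a character that is neither ":" nor "/" followed by
# two or more slashes. The "//" in "http://" follows ":" and is not matched.
REPEATED_SLASH = re.compile(r"[^:/]/{2,}")


def has_repeated_slashes(url):
    """Return True if the URL contains a run of slashes outside the scheme."""
    return REPEATED_SLASH.search(url) is not None


if __name__ == "__main__":
    # Made-up paths, for illustration only.
    for url in (
        "http://www.cedmagic.com/page/index.html",
        "http://www.cedmagic.com/page//index.html",
        "http://www.cedmagic.com///page/index.html",
    ):
        print(url, has_repeated_slashes(url))
```

Checking the pattern against a handful of URLs from the crawl log first makes it easier to see whether it would also throw away pages that should be kept.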
18:12 🔗 kennethre GLaDOS: what's up?
19:00 🔗 alard kennethre: I think GLaDOS wanted to ask you about the ArchiveTeam warrior buildpack. The Python buildpack failed because of this https://github.com/heroku/heroku-buildpack-python/issues/79
19:00 🔗 kennethre alard: ah well my response is the proper answer :)
19:00 🔗 alard But that's fixed now that the AT buildpack uses the latest Python-buildpack tag.
19:00 🔗 kennethre excellent
19:00 🔗 alard So I think GLaDOS is running one Yahoo Messages instance on Heroku now.
19:00 🔗 kennethre awesome
19:01 🔗 kennethre i was going to run some
19:01 🔗 kennethre soon
19:02 🔗 alard Cool. There's a strong competition this time.
21:22 🔗 ersi http://i.imgur.com/z0R4kXI.jpg
21:22 🔗 ersi lul wut
22:05 🔗 Smiley fuck knows
22:05 🔗 Smiley "i think i'm cool because i charged someone $24 for a dongle" ?
22:07 🔗 ersi I was just thinking of the PyCon debacle the whole time
22:08 🔗 Smiley ersi: that too
22:16 🔗 ersi this movie is kinda dope
22:16 🔗 ersi Will Ferrell, time travel and dinosaurs - do I need to say more?
22:29 🔗 ivan` https://www.youtube.com/user/ISO8 who likes trains? ;)
22:29 🔗 ivan` I'm running low on disk after 422GB of k-pop
22:29 🔗 ersi oooh, k-pop
22:30 🔗 ersi hey! I've been on that user and watched some videos before
22:30 🔗 ivan` that was https://www.youtube.com/user/godmd6 which I have 1 copy of
22:30 🔗 ivan` there are at least two great cab view videos in ISO8
22:31 🔗 ivan` https://www.youtube.com/watch?v=632rDJGrH1M https://www.youtube.com/watch?v=cW7IdpV49h0
22:34 🔗 ivan` more, actually
22:45 🔗 ersi huh, Jason Segel was in Slackers
22:53 🔗 joepie91 <ivan`>I'm running low on disk after 422GB of k-pop
22:53 🔗 joepie91 someone I know would virtually orgasm if he read this
