Time |
Nickname |
Message |
00:52
🔗
|
godane |
Kevin Pereira's Imaginary Friend: https://archive.org/details/g4tv.com-video39529 |
01:11
🔗
|
godane |
my cpu temp is at 64.5 C |
01:12
🔗
|
godane |
never seen that before compiling firefox |
01:50
🔗
|
joepie91 |
godane: well hey, it's called *fire*fox |
01:50
🔗
|
joepie91 |
:P |
01:59
🔗
|
godane |
its made to kill cpus then |
04:22
🔗
|
godane |
so i found a very old website about laser discs and stuff |
04:22
🔗
|
SketchCow |
Grab that shit |
04:23
🔗
|
godane |
i'm mirroring it because i want to see if i can get past the 3585 files in wayback machine |
04:24
🔗
|
godane |
this is the website: http://www.blam1.com/ |
04:24
🔗
|
godane |
best to have a stand alone archive of it |
04:27
🔗
|
godane |
it has bumpers of DiscoVision |
04:27
🔗
|
godane |
in real media format |
04:28
🔗
|
godane |
i think this was a database of laser discs |
04:28
🔗
|
godane |
with reviews |
04:29
🔗
|
chronomex |
niiice |
04:29
🔗
|
chronomex |
lddb.com is another I think |
04:32
🔗
|
godane |
looks like lddb.com was japanld.free.fr |
04:33
🔗
|
godane |
http://web.archive.org/web/20060114075257/http://japanld.free.fr/ |
04:33
🔗
|
godane |
that one doesn't exist anymore |
04:34
🔗
|
godane |
must have been an old redirect since it had lddb.com in the page |
04:34
🔗
|
godane |
the best part of old websites is that everything is on one domain |
04:35
🔗
|
godane |
not freaking youtube redirect |
04:35
🔗
|
godane |
no weird comments hosted on other sites |
05:05
🔗
|
godane |
its over 70mb now |
05:06
🔗
|
godane |
also i have passed 3585 files in wayback machine |
05:06
🔗
|
godane |
i'm at 4746 now |
05:25
🔗
|
godane |
ok so its done |
05:25
🔗
|
godane |
5628 files in warc.gz |
05:37
🔗
|
godane |
uploaded: https://archive.org/details/www.blam1.com-20130323 |
05:43
🔗
|
godane |
its called the Blem Entertainment Group |
05:49
🔗
|
godane |
so i found that discovision.com website is still alive |
05:49
🔗
|
godane |
grabbing it |
05:49
🔗
|
godane |
lets see if it beats the 301 total urls in wayback machine |
05:49
🔗
|
DFJustin |
typo, should be Blam Entertainment Group |
05:51
🔗
|
godane |
fixed |
06:09
🔗
|
godane |
it only has 83 files |
06:09
🔗
|
godane |
discovision.com that is |
06:16
🔗
|
godane |
also know that blamld.com and blam1.com are the same |
06:17
🔗
|
godane |
from what i could tell they bought blam1.com |
06:17
🔗
|
godane |
maybe to stop a porn site or something |
06:18
🔗
|
godane |
anyways even wayback doesn't have all the files under blamld.com host |
06:29
🔗
|
godane |
uploaded: https://archive.org/details/www.discovision.com-20130323 |
06:34
🔗
|
godane |
i'm now grabbing cedmagic.com |
06:37
🔗
|
godane |
its about this: http://en.wikipedia.org/wiki/Capacitance_Electronic_Disc |
06:38
🔗
|
chronomex |
ceds are cool |
09:43
🔗
|
GLaDOS |
kennethre: you around? |
09:44
🔗
|
GLaDOS |
Nevermind! |
12:09
🔗
|
zenpho |
hi there! |
12:09
🔗
|
soultcer |
Konnichiwa |
12:12
🔗
|
zenpho |
i'd like to dip into some of the archived bt internet dialup (http://archive.org/details/archiveteam-btinternet) stuff |
12:13
🔗
|
zenpho |
i've obtained hanzo warc-tools, grepped thru the CDX files for stuff i'd like to get, and now I think I have some byte offsets for specific spots in specific warc files with the files I'd like to dip into |
12:14
🔗
|
zenpho |
i don't fancy downloading the entire eleventy-billion gigabytes of warc files see ;o) |
12:14
🔗
|
soultcer |
I think the IA servers support range requests |
12:16
🔗
|
zenpho |
I'm struggling to see how to download specific parts of warc files - on a semi-automated basis - so I can unpack the files I'd like to see to my disk |
12:17
🔗
|
zenpho |
I'm very new to the warc format and tools for working with it - do you guys know if there's a part of warc-tools (or some other nifty warc-friendly tool) which will do what I want? |
12:19
🔗
|
soultcer |
I don't know about warc-tools, but basically you need to make a http request (be it with python's urllib or with curl) that tells the server to only return a specific range of bytes |
12:20
🔗
|
soultcer |
curl -L -r 2000-5000 http://archive.org/download/archiveteam-btinternet-u-z/btinternet-u-z.megawarc.warc.gz > extract.warc.gz will fetch only bytes 2000-5000 from the given file |
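For the semi-automated case, the same byte-range request can be sketched in Python with urllib, which soultcer mentions as an alternative to curl. This is only a minimal sketch: the function name `build_range_request` is hypothetical, and it assumes you already have an offset and a compressed length for the record (for example from a CDX line).

```python
from urllib.request import Request


def build_range_request(url, offset, length):
    """Build a GET request for `length` bytes starting at byte `offset`.

    `build_range_request` is a hypothetical helper, not part of warc-tools.
    """
    # HTTP byte ranges are inclusive on both ends, so the last byte
    # is offset + length - 1.
    last_byte = offset + length - 1
    return Request(url, headers={"Range": f"bytes={offset}-{last_byte}"})


URL = ("http://archive.org/download/archiveteam-btinternet-u-z/"
       "btinternet-u-z.megawarc.warc.gz")

# Same range as the curl example: bytes 2000 through 5000 (3001 bytes).
req = build_range_request(URL, 2000, 3001)
print(req.get_header("Range"))
```

Passing `req` to `urllib.request.urlopen` would send it; a server that honours the range should answer with status 206 and only the requested bytes.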
12:20
🔗
|
zenpho |
I think I can use wget or curl to specify a specific byte range to download, but I have a hunch I'll end up with just some data with no context, certainly not a valid warc which I can parse and extract data from? |
12:21
🔗
|
zenpho |
ah. whoops - I was typing whilst you were answering. ;o) |
12:21
🔗
|
soultcer |
A warc.gz file is basically a succession of warc records each individually gzipped, and then concatenated |
12:21
🔗
|
soultcer |
As long as you start at the correct offset, it should work |
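A small sketch of why starting at the right offset is enough: since each record is its own gzip member, a decompressor started at a member boundary stops at the end of that member. The helper name `read_gzip_member` and the two fake "records" below are made up for illustration; real WARC records carry full headers.

```python
import gzip
import zlib


def read_gzip_member(data, offset=0):
    """Decompress the single gzip member that starts at `offset`.

    Returns (decompressed_bytes, compressed_bytes_consumed).
    Hypothetical helper for illustration only.
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    chunk = data[offset:]
    record = decomp.decompress(chunk)
    # Bytes after the end of this member are left in unused_data.
    consumed = len(chunk) - len(decomp.unused_data)
    return record, consumed


# Stand-in for a .warc.gz: two records, gzipped separately, concatenated.
first = gzip.compress(b"WARC/1.0\r\nfirst record\r\n")
second = gzip.compress(b"WARC/1.0\r\nsecond record\r\n")
blob = first + second

record, consumed = read_gzip_member(blob, offset=len(first))
print(record)
```

If the downloaded range is cut short, `decomp.eof` stays false, which is one way to notice that the byte range did not cover the whole record.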
12:21
🔗
|
zenpho |
oho, awesome sauce! |
12:22
🔗
|
zenpho |
i'll give this a go and report back - thanks soultcer! |
13:45
🔗
|
Cameron_D |
Here, have some light (20k words) reading of tech support stories http://www.reddit.com/user/jon6/submitted/ |
13:45
🔗
|
Cameron_D |
There is great rage to be had |
13:46
🔗
|
Cameron_D |
(despite the naming similarities it is different to BOFH) |
13:51
🔗
|
nwh |
similarly r/talesfromtechsupport |
13:52
🔗
|
nwh |
and r/cablefail |
13:52
🔗
|
Cameron_D |
well, they are all submitted there, his user page is just a nice portal to list them all |
13:57
🔗
|
godane |
hey everyone |
13:57
🔗
|
godane |
i had to restart my cedmagic.com download |
13:58
🔗
|
godane |
luckily i was only at 12mb and i just passed that without any long wait |
13:58
🔗
|
godane |
my wifi dropped in my sleep is the reason |
13:59
🔗
|
nwh |
so, any of you know how to set up an EC2 instance with a GPU? |
14:01
🔗
|
Smiley |
nope |
14:01
🔗
|
nwh |
they're not even on the damn lists. |
14:02
🔗
|
nwh |
is there anywhere that WOULD know? |
14:05
🔗
|
godane |
i found 10mins of news coverage |
14:05
🔗
|
godane |
its from good day oregon |
14:06
🔗
|
* |
nwh twitches |
14:11
🔗
|
godane |
the video was with the guy that owns cedmagic.com |
14:37
🔗
|
godane |
i'm past the number of files on wayback machine for cedmagic.com |
15:48
🔗
|
godane |
is there a way to stop urls with multiple / from downloading |
15:53
🔗
|
godane |
i will see if adding /// to reject-regex works |
15:54
🔗
|
soultcer |
Ah, you mean URLs which have multiple "/" in them |
15:55
🔗
|
godane |
yes |
15:55
🔗
|
soultcer |
I know heritrix has a filter for that, but I don't know anything for wget |
15:55
🔗
|
godane |
it has reject-regex |
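One way to try out a pattern before handing it to wget's `--reject-regex` is to test it in Python first. The pattern below is an assumption, not something from the chat: it looks for a doubled slash after the `scheme://` part and before any query string, and has not been checked against wget's own regex engine. The example paths are made up.

```python
import re

# Assumed pattern: scheme://, then any non-query characters, then "//".
# The "//" right after the scheme is consumed by the "://" part and
# is therefore not counted as a repeated slash.
REPEATED_SLASH = re.compile(r"^[a-z]+://[^?#]*//")


def has_repeated_slashes(url):
    """Return True if the path part of `url` contains '//' or more."""
    return REPEATED_SLASH.search(url) is not None


# Hypothetical example URLs on the site being mirrored.
for url in (
    "http://www.cedmagic.com/a/b.html",
    "http://www.cedmagic.com/a//b.html",
    "http://www.cedmagic.com/a///b.html",
):
    print(url, has_repeated_slashes(url))
```

Anchoring on the scheme is the main design choice here: a bare `//` pattern would match every absolute URL, because of the `//` in `http://`.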
18:12
🔗
|
kennethre |
GLaDOS: what's up? |
19:00
🔗
|
alard |
kennethre: I think GLaDOS wanted to ask you about the ArchiveTeam warrior buildpack. The Python buildpack failed because of this https://github.com/heroku/heroku-buildpack-python/issues/79 |
19:00
🔗
|
kennethre |
alard: ah well my response is the proper answer :) |
19:00
🔗
|
alard |
But that's fixed now that the AT buildpack uses the latest Python-buildpack tag. |
19:00
🔗
|
kennethre |
excellent |
19:00
🔗
|
alard |
So I think GLaDOS is running one Yahoo Messages instance on Heroku now. |
19:00
🔗
|
kennethre |
awesome |
19:01
🔗
|
kennethre |
i was going to run some |
19:01
🔗
|
kennethre |
soon |
19:02
🔗
|
alard |
Cool. There's a strong competition this time. |
21:22
🔗
|
ersi |
http://i.imgur.com/z0R4kXI.jpg |
21:22
🔗
|
ersi |
lul wut |
22:05
🔗
|
Smiley |
fuck knows |
22:05
🔗
|
Smiley |
"i think i'm cool because i charged someone $24 for a dongle" ? |
22:07
🔗
|
ersi |
I was just thinking of the PyCon debacle the whole time |
22:08
🔗
|
Smiley |
ersi: that too |
22:16
🔗
|
ersi |
this movie is kinda dope |
22:16
🔗
|
ersi |
Will Ferrell, time travel and dinosaurs - do I need to say more? |
22:29
🔗
|
ivan` |
https://www.youtube.com/user/ISO8 who likes trains? ;) |
22:29
🔗
|
ivan` |
I'm running low on disk after 422GB of k-pop |
22:29
🔗
|
ersi |
oooh, k-pop |
22:30
🔗
|
ersi |
hey! I've been on that user and watched some videos before |
22:30
🔗
|
ivan` |
that was https://www.youtube.com/user/godmd6 which I have 1 copy of |
22:30
🔗
|
ivan` |
there are at least two great cab view videos in ISO8 |
22:31
🔗
|
ivan` |
https://www.youtube.com/watch?v=632rDJGrH1M https://www.youtube.com/watch?v=cW7IdpV49h0 |
22:34
🔗
|
ivan` |
more, actually |
22:45
🔗
|
ersi |
huh, Jason Segel was in Slackers |
22:53
🔗
|
joepie91 |
<ivan`>I'm running low on disk after 422GB of k-pop |
22:53
🔗
|
joepie91 |
someone I know would virtually orgasm if he read this |