Time |
Nickname |
Message |
00:53
🔗
|
godane |
i'm finally finding old digital avenue files |
01:02
🔗
|
MrArgent |
:o |
01:05
🔗
|
godane |
they're like long ads |
01:05
🔗
|
godane |
90 seconds to 3 mins |
01:11
🔗
|
MrArgent |
i'm attempting to re-acquire the textfiles dump |
01:11
🔗
|
MrArgent |
i have the extracted files on a spare drive but not the 7z |
01:12
🔗
|
MrArgent |
also, the extant copy is compromised as my antivirus got carried away with the source code/etc. to MS-DOS era viruses stored in the dump |
01:12
🔗
|
MrArgent |
(not to mention moving ~8gb of plaintext from one drive to another would NOT be fun) |
01:14
🔗
|
MrArgent |
http://archive.org/details/textfiles-dot-com-2011 This guy |
01:19
🔗
|
MrArgent |
on a side note, i love the wiki's page on the Library of Alexandria |
01:19
🔗
|
MrArgent |
thankfully, one of the larger-scale implications of the internet's existence is that information destruction on that scale is now substantially more difficulty |
01:19
🔗
|
MrArgent |
*difficult |
01:36
🔗
|
kennethre |
oh my god |
01:36
🔗
|
kennethre |
http://keygenjukebox.com/ |
01:39
🔗
|
joepie91 |
kennethre: http://keygenmusic.net/ ;) |
01:39
🔗
|
joepie91 |
four times as much |
01:39
🔗
|
joepie91 |
sorry, 3 times * |
01:40
🔗
|
kennethre |
no player though |
01:40
🔗
|
kennethre |
still quite cool |
01:40
🔗
|
joepie91 |
well, that's what xmplay is for :) |
01:40
🔗
|
joepie91 |
xmplay is magic |
01:40
🔗
|
joepie91 |
I'm actually a bit sad that there's no xmplay for Linux |
01:40
🔗
|
joepie91 |
I love it to bits |
01:40
🔗
|
joepie91 |
(no pun intended) |
01:41
🔗
|
BlueMax |
too bad there's no keygen radio |
01:41
🔗
|
BlueMax |
like rainwave.cc but for keygens |
01:43
🔗
|
* |
joepie91 bookmarked rainwave |
01:44
🔗
|
joepie91 |
hm |
01:44
🔗
|
joepie91 |
considering setting up a keygen radio |
01:44
🔗
|
joepie91 |
anyone here that would like to volunteer in encoding all of keygenmusic.net into mp3? :P) |
01:44
🔗
|
joepie91 |
:) * |
01:47
🔗
|
BlueMax |
if I could I would |
01:47
🔗
|
BlueMax |
but I don't have the upstream |
01:47
🔗
|
BlueMax |
to actually upload them afterwards |
01:55
🔗
|
godane |
www.2600.com is 13gb+ |
02:33
🔗
|
omf_ |
remember this old fossil http://www.htdig.org/ |
02:46
🔗
|
godane |
so i think 2600.com is too big for me |
02:46
🔗
|
godane |
it's still growing at 14gb+ |
02:47
🔗
|
Lord_Nigh |
is there an automated way to mirror a pipermail archive w/metadata ? |
02:56
🔗
|
omf_ |
Lord_Nigh, possibly, gimme a link so I can check |
03:02
🔗
|
Lord_Nigh |
http://bluegrasspals.com/pipermail/dectalk/ |
03:04
🔗
|
omf_ |
I got a script that can grab all that but no warc data |
03:04
🔗
|
omf_ |
I used it for opensolaris |
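Note on the pipermail question above: omf_'s script is not shown in the log, so the following is only a minimal sketch of one way to approach it, not his method. It assumes the pipermail index page links to per-month downloads ending in `.txt` or `.txt.gz` (the usual Mailman layout, but not verified for this list), and that those mbox-style files keep the message headers. The names `ArchiveLinkParser` and `monthly_archive_urls` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ArchiveLinkParser(HTMLParser):
    """Collect hrefs that look like pipermail monthly archive downloads."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Assumption: monthly archives end in .txt or .txt.gz;
        # thread/date/author index pages end in .html and are skipped.
        if href and href.endswith((".txt", ".txt.gz")):
            self.hrefs.append(href)


def monthly_archive_urls(index_html, base_url):
    """Return absolute URLs of the monthly archive files linked from the index."""
    parser = ArchiveLinkParser()
    parser.feed(index_html)
    return [urljoin(base_url, href) for href in parser.hrefs]
```

The resulting list could then be handed to a downloader of choice. As omf_ says about his own script, an approach like this does not produce WARC data by itself.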
03:21
🔗
|
godane |
so i think i'm going to stop this archive of 2600.com |
03:21
🔗
|
godane |
it's too much data for me |
03:22
🔗
|
omf_ |
Yeah 2600 is huge, many, many years of data. I wouldn't be surprised if the site was over 100gb |
03:23
🔗
|
godane |
it's mostly audio files that make it so big |
03:23
🔗
|
godane |
i stopped it and am deleting it |
03:27
🔗
|
BlueMax |
D: |
03:28
🔗
|
godane |
it's just too big for me to do a site dump of |
03:29
🔗
|
godane |
here is the code for my script: |
03:29
🔗
|
godane |
website="www.2600.com" |
03:30
🔗
|
godane |
wget $website --mirror --warc-file=$website-$(date +%Y%m%d) --warc-cdx -e robots=off --warc-header="operator: Archive Team" --warc-max-size=1G -E -o wget.log |
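Note on the script above: a rough sketch of the same command built as an argument list, which avoids shell quoting problems with the `operator: Archive Team` header. The flags are copied from godane's command; the function name `build_wget_args` and the optional `day` parameter are hypothetical additions for illustration.

```python
from datetime import date


def build_wget_args(website, day=None):
    """Build the wget argument list from the command above for one site."""
    # Assumption: default to today's date, as $(date +%Y%m%d) does in the shell.
    day = day or date.today()
    stamp = day.strftime("%Y%m%d")
    return [
        "wget", website,
        "--mirror",                               # recursive mirror of the site
        f"--warc-file={website}-{stamp}",         # WARC file name prefix
        "--warc-cdx",                             # also write a CDX index
        "-e", "robots=off",                       # ignore robots.txt
        "--warc-header=operator: Archive Team",   # extra warcinfo header
        "--warc-max-size=1G",                     # split WARC output at 1 GB
        "-E",                                     # adjust file extensions
        "-o", "wget.log",                         # log to wget.log
    ]
```

The list can be passed to `subprocess.run(build_wget_args("www.2600.com"))`. A nonzero exit status is not necessarily a failed mirror, since wget also reports individual server errors that way.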
03:47
🔗
|
DFJustin |
I want an xmplay for android :( |
04:41
🔗
|
S[h]O[r]T |
http://www.huffingtonpost.com/2013/04/15/condom-challenge-snorting-condoms-videos_n_3085258.html |
04:41
🔗
|
S[h]O[r]T |
must archive all these videos |
04:44
🔗
|
BlueMax |
... |
12:20
🔗
|
joepie91 |
http://seclists.org/fulldisclosure/2013/Apr/28 |
12:20
🔗
|
ersi |
http://www.mymodernmet.com/profiles/blogs/dragon-ball-z-makankosappo-kamehameha/ |
13:19
🔗
|
BlueMax |
(if ersi says something something illegal re. #archiveteam I will murder him) |
13:45
🔗
|
ersi |
that would be illegal, so I won't. What if I get a DMCA? |
13:46
🔗
|
BlueMax |
you're murdered |
15:17
🔗
|
balrog |
http://www.geocities.com/bswadener/humor/umac606.htm |
15:17
🔗
|
balrog |
how the hell is that still up |
15:18
🔗
|
balrog |
heh, site:geocities.com shows 193K results in google |
15:18
🔗
|
BlueMax |
what the shit |
15:18
🔗
|
BlueMax |
I guess Yahoo can't even DELETE stuff right |
15:19
🔗
|
BlueMax |
should probably grab what's still up there huh |
15:20
🔗
|
balrog |
probably... would need to compile a list though |
15:20
🔗
|
BlueMax |
Google's a good start at least |
15:22
🔗
|
BlueMax |
also calling the project name: geobitties |
15:25
🔗
|
BlueMax |
anyway I need sleep before insanity sets in |
15:29
🔗
|
DFJustin |
there are some sites still up, it seems to be accounts where they also had a domain name hosted through geocities, i.e. paying customers |
19:05
🔗
|
SketchCow |
yes |
19:50
🔗
|
omf_ |
balrog, Got any other cool tools like plowshare |
23:17
🔗
|
ersi |
balrog: There's a bunch of Geocities sites up still. They had paid hosting as well, if I'm not mistaken. |
23:17
🔗
|
ersi |
I don't think those were killed. But most were of course not paid for |
23:34
🔗
|
dashcloud |
here's a post about that: http://contemporary-home-computing.org/1tb/archives/3022 |
23:34
🔗
|
godane |
i'm looking for a way to grab urls from google |
23:35
🔗
|
godane |
but without autoboting |
23:35
🔗
|
godane |
*blocking |
23:44
🔗
|
godane |
i'm mirroring fucking google |
23:45
🔗
|
godane |
i'm already at 500+ links |
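Note on compiling the URL list: how godane collected his links is not shown in the log. As a minimal sketch, assuming the search result pages have already been saved locally as text or HTML, the geocities.com URLs could be pulled out and deduplicated like this. The function name `extract_geocities_urls` and the regular expression are hypothetical, and the percent-decoding step assumes that some links carry the target URL in encoded form.

```python
import re
from urllib.parse import unquote

# Assumption: a URL ends at whitespace, a quote, an angle bracket, or "&".
# Stopping at "&" would truncate URLs with multi-parameter query strings.
GEOCITIES_URL = re.compile(
    r"https?://(?:www\.)?geocities\.com/[^\s\"'<>&]+", re.IGNORECASE
)


def extract_geocities_urls(text):
    """Return unique geocities.com URLs from text, in first-seen order."""
    seen = set()
    urls = []
    # Decode percent-encoding first so encoded copies of a URL also match.
    for match in GEOCITIES_URL.finditer(unquote(text)):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Running this over each saved page and merging the results gives a plain list that a mirroring tool can take as input.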
23:56
🔗
|
omf_ |
Does anyone actually use NutchWAX? It doesn't have warc support |