Time |
Nickname |
Message |
01:23
🔗
|
godane |
i have more of marxists.org now than what archivebot got |
01:24
🔗
|
godane |
it's around 87k urls now |
01:24
🔗
|
godane |
only 254 have 403 errors |
01:28
🔗
|
exmic |
neat |
01:36
🔗
|
ohhdemgir |
any OSX users around? |
01:56
🔗
|
godane |
92k now |
02:18
🔗
|
ohhdemgir |
http://www.reddit.com/r/DataHoarder/comments/245ij1/start_your_own_rgonewild_archive_automated_data/ |
04:36
🔗
|
godane |
99k urls now |
04:36
🔗
|
godane |
some of the pdfs will be uploaded so we can have collections of them |
04:36
🔗
|
godane |
like Peking Review |
04:40
🔗
|
godane |
i'm looking at the peking review pdfs and i don't think they have all of them linked from html pages |
04:41
🔗
|
godane |
i will see about grabbing the missing ones |
07:16
🔗
|
godane |
SketchCow: you may get another copy of the computer chronicles |
07:17
🔗
|
godane |
a guy on myspleen is taking the 1gb file and making 175mb mp4 |
07:22
🔗
|
godane |
the reason is that the mp4s are better than what archive.org makes |
07:34
🔗
|
godane |
i notice one of the computer Chronicles breaks around the 1:47 mark to a home video with a baby in it |
07:35
🔗
|
godane |
it's this one by the way: https://archive.org/details/CC601_macworld |
07:36
🔗
|
godane |
also another thing about the computer chronicles from myspleen |
07:36
🔗
|
godane |
we will have a collection that can be in season and episode order |
16:27
🔗
|
schbirid |
lol dreamspark |
16:27
🔗
|
schbirid |
Password: The password must be at least six characters long and cannot contain any of these characters <,>,',;,=,(,),|,[,],?,/,#. |
16:27
🔗
|
schbirid |
that looks like perl |
16:41
🔗
|
midas |
how much does archivebot like to grab github? |
16:48
🔗
|
DFJustin |
not much I would expect |
17:11
🔗
|
godane |
so i think my marxists.org mirror is redownloading pdfs that have already been downloaded |
17:11
🔗
|
godane |
very odd |
17:17
🔗
|
godane |
i'm stopping my mirror of marxists.org |
17:18
🔗
|
godane |
it was redownloading files that were already downloaded so it was best to stop it |
17:18
🔗
|
balrog |
really....? |
17:19
🔗
|
SketchCow |
Today, I found out that Bill Murray wasn't in Charlie's Angels 2 and 3 because he pointed to Lucy Liu and said "I have no idea why you're here." |
17:19
🔗
|
godane |
that's the way it looks anyways |
17:20
🔗
|
godane |
if nothing else you guys are getting 85gb of the website |
17:20
🔗
|
godane |
more than what archivebot got |
17:21
🔗
|
exmic |
fantastic |
17:22
🔗
|
godane |
and since i have the files i may make some collections out of the pdfs i got |
17:23
🔗
|
exmic |
cool |
17:23
🔗
|
exmic |
I don't know where you find the time or energy, man |
17:23
🔗
|
exmic |
you are a machine |
17:28
🔗
|
SketchCow |
He burns with the hate of a thousand suns |
17:28
🔗
|
exmic |
deletionists deserve all the hate they can get |
17:29
🔗
|
balrog |
godane: do we know which files are the ones that are getting removed? |
17:29
🔗
|
balrog |
it's all the ones published by that publishing company |
17:29
🔗
|
balrog |
aha: https://www.marxists.org/archive/marx/works/cw/ |
17:31
🔗
|
godane |
i think i have all that too in my dump |
17:31
🔗
|
balrog |
that's what's getting pulled |
17:34
🔗
|
godane |
i'm uploading the first 11gb of warc.gz right now |
17:47
🔗
|
SadDM |
"<@exmic> you are a machine" That might actually be it... godane's a robot! |
17:47
🔗
|
godane |
here is the item its being uploaded to: https://archive.org/details/www.marxists.org-20140426 |
17:48
🔗
|
exmic |
heh |
17:51
🔗
|
godane |
i'm also still mirroring nbc/cbs/abc news stuff |
18:09
🔗
|
godane |
i'm starting to upload some pdfs for collections: https://archive.org/details/v1n01-nov-15-1910-agitator |
18:37
🔗
|
CHRISTINA |
get in on the act Stratego MAGEGO JUMBASTIC http://ow.ly/vusaO |
18:45
🔗
|
godane |
i'm uploading American Appeal volume 7 and 8 |
18:57
🔗
|
SadDM |
Quite the up-tick of spam in the last few days. |
18:59
🔗
|
SketchCow |
You think this is an uptick? |
18:59
🔗
|
SketchCow |
Well, I mean, along the lines of 'saw second cow' |
19:00
🔗
|
SadDM |
Well, more than I've seen since I've been around. |
19:02
🔗
|
yipdw |
midas: honestly I wouldn't grab github with archivebot |
19:02
🔗
|
yipdw |
I'd use github-mirrorer |
19:03
🔗
|
yipdw |
er |
19:03
🔗
|
yipdw |
whatever closure's thing is called |
19:03
🔗
|
yipdw |
github-backup |
19:06
🔗
|
godane |
i'm getting a 2009 episode of 60 minutes that talks about movie pirates |
19:14
🔗
|
godane |
so i may have found a way to grab the original file names |
19:14
🔗
|
godane |
it was like what i thought it should be |
19:15
🔗
|
godane |
it's something like imagename_646.flv |
19:16
🔗
|
godane |
i'm trying to get the original files from cbs news cause a lot of the newer links go to media.cbsnews.com |
19:16
🔗
|
godane |
but they have black bars on the sides |
19:17
🔗
|
midas |
yipdw: will do |
19:27
🔗
|
balrog |
yipdw: can you back up winocm's repos? |
19:37
🔗
|
DFJustin |
http://gizmodo.com/inside-the-us-nuclear-silos-where-floppy-disk-are-still-1568609439? |
19:42
🔗
|
yipdw |
balrog: starting now; keep in mind that this only gets public data |
19:44
🔗
|
yipdw |
cabal install is toasting my laptop |
19:57
🔗
|
godane |
i'm now uploading American Socialist pdf collection |
20:08
🔗
|
yipdw |
gah, the github module doesn't work |
20:08
🔗
|
yipdw |
balrog: never mind, there's something wrong with my Haskell environment |
20:45
🔗
|
exmic |
why does it take haskell to run git clone a bunch? |
20:47
🔗
|
yipdw |
exmic: github-backup does more than that |
20:47
🔗
|
exmic |
sure, wikis and tickets and suchlike |
20:48
🔗
|
yipdw |
I could just clone them all I guess |
20:57
🔗
|
balrog |
yipdw: still having issues? |
20:59
🔗
|
SmileyG |
SketchCow: can u tell me how big the pdf collection is so far? |
20:59
🔗
|
SmileyG |
the ones im sending i mean |
21:00
🔗
|
yipdw |
balrog: haven't been able to get to it -- in the middle of app release procedure |
21:23
🔗
|
balrog |
ah ... ok |
21:23
🔗
|
balrog |
we probably have a few days |
21:25
🔗
|
godane |
i'm also starting to upload more buck sexton show: https://archive.org/details/the-buck-sexton-show-01-04-2014 |
21:26
🔗
|
SketchCow |
Which ones are yours, SmileyG |
21:30
🔗
|
SmileyG |
radioamerica i think the dir was called |
21:31
🔗
|
SketchCow |
22G american_radio/ |
21:31
🔗
|
SketchCow |
du -sh american_radio/ |
21:31
🔗
|
SmileyG |
urgh 1/4 |
22:07
🔗
|
godane |
I'M ON A SUGAR RUSH FROM DONUTS |
23:12
🔗
|
SketchCow |
godane: Is there an issue with me uploading these Amazon manuals? |
23:13
🔗
|
SketchCow |
I just removed the dupes. |
23:14
🔗
|
godane |
no |
23:14
🔗
|
godane |
i removed a lot of the dupes before uploading |
23:17
🔗
|
SketchCow |
I know. |
23:17
🔗
|
SketchCow |
And I got the rest. |
23:29
🔗
|
dashcloud |
just saw this today: http://rr-project.org/ rr records nondeterministic executions and debugs them deterministically |
23:42
🔗
|
exmic |
handy |
23:45
🔗
|
dashcloud |
it was designed for use with Firefox, but works with most programs |
23:46
🔗
|
SketchCow |
http://24.media.tumblr.com/ebe179bca4dc0d7c6bd0bd7d0cbbccd3/tumblr_n4md4pMruz1qa7q1no1_1280.jpg |
23:48
🔗
|
dashcloud |
MIDI sequencing in JS: http://mudcu.be/midi-js/ |