#archiveteam-bs 2014-04-28,Mon


Time Nickname Message
01:23 🔗 godane i have more of marxists.org now than what archivebot got
01:24 🔗 godane its around 87k urls now
01:24 🔗 godane only 254 have 403 errors
01:28 🔗 exmic neat
01:36 🔗 ohhdemgir any OSX users around?
01:56 🔗 godane 92k now
02:18 🔗 ohhdemgir http://www.reddit.com/r/DataHoarder/comments/245ij1/start_your_own_rgonewild_archive_automated_data/
04:36 🔗 godane 99k urls now
04:36 🔗 godane some of the pdfs will be uploaded so we can have collections of them
04:36 🔗 godane like Peking Review
04:40 🔗 godane i'm looking at peking review pdfs and i think they don't have them all linked from html pages
04:41 🔗 godane i will see about grabbing the missing ones
07:16 🔗 godane SketchCow: you may get another copy of the computer chronicles
07:17 🔗 godane a guy on myspleen is taking the 1gb file and making a 175mb mp4
07:22 🔗 godane the reason is that the mp4s are better than what archive.org makes
07:34 🔗 godane i notice one of the computer Chronicles breaks around the 1:47 mark to a home video with a baby in it
07:35 🔗 godane its this one by the way: https://archive.org/details/CC601_macworld
07:36 🔗 godane also another thing about the computer chronicles from myspleen
07:36 🔗 godane we will have a collection that can be in season and episode order
16:27 🔗 schbirid lol dreamspark
16:27 🔗 schbirid Password: The password must be at least six characters long and cannot contain any of these characters <,>,',;,=,(,),|,[,],?,/,#.
16:27 🔗 schbirid that looks like perl
16:41 🔗 midas how much does archivebot like to grab github?
16:48 🔗 DFJustin not much I would expect
17:11 🔗 godane so i think marxists.org is redownloading pdfs that have been downloaded
17:11 🔗 godane very odd
17:17 🔗 godane i'm stopping my mirror of marxists.org
17:18 🔗 godane it was redownloading files that are downloaded so it was best to stop it
17:18 🔗 balrog really....?
17:19 🔗 SketchCow Today, I found out that Bill Murray wasn't in Charlie's Angels 2 and 3 because he pointed to Lucy Liu and said "I have no idea why you're here."
17:19 🔗 godane thats the way it looks anyways
17:20 🔗 godane if nothing else you guys are getting 85gb of the website
17:20 🔗 godane more than what archivebot got
17:21 🔗 exmic fantastic
17:22 🔗 godane and since i have the files i may make some collections out of the pdfs i got
17:23 🔗 exmic cool
17:23 🔗 exmic I don't know where you find the time or energy, man
17:23 🔗 exmic you are a machine
17:28 🔗 SketchCow He burns with the hate of a thousand suns
17:28 🔗 exmic deletionists deserve all the hate they can get
17:29 🔗 balrog godane: do we know which files are the ones that are getting removed?
17:29 🔗 balrog it's all the ones published by that publishing company
17:29 🔗 balrog aha: https://www.marxists.org/archive/marx/works/cw/
17:31 🔗 godane i think i have all that too in my dump
17:31 🔗 balrog that's what's getting pulled
17:34 🔗 godane i'm uploading the first 11gb of warc.gz right now
17:47 🔗 SadDM "<@exmic> you are a machine" That might actually be it... godane's a robot!
17:47 🔗 godane here is the item its being uploaded to: https://archive.org/details/www.marxists.org-20140426
17:48 🔗 exmic heh
17:51 🔗 godane i'm also still mirroring nbc/cbs/abc news stuff
18:09 🔗 godane i'm starting to upload some pdfs for collections: https://archive.org/details/v1n01-nov-15-1910-agitator
18:37 🔗 CHRISTINA get in on the act Stratego MAGEGO JUMBASTIC http://ow.ly/vusaO
18:45 🔗 godane i'm uploading American Appeal volume 7 and 8
18:57 🔗 SadDM Quite the up-tick of spam in the last few days.
18:59 🔗 SketchCow You think this is an uptick?
18:59 🔗 SketchCow Well, I mean, along the lines of 'saw second cow'
19:00 🔗 SadDM Well, more than I've seen since I've been around.
19:02 🔗 yipdw midas: honestly I wouldn't grab github with archivebot
19:02 🔗 yipdw I'd use github-mirrorer
19:03 🔗 yipdw er
19:03 🔗 yipdw whatever closure's thing is called
19:03 🔗 yipdw github-backup
19:06 🔗 godane i'm getting a 2009 episode of 60 minutes that talks about movie pirates
19:14 🔗 godane so i may have found a way to grab the original file names
19:14 🔗 godane it was like what i thought it should be
19:15 🔗 godane its something like imagename_646.flv
19:16 🔗 godane i'm trying to get the original files from cbs news cause a lot of the newer links go to media.cbsnews.com
19:16 🔗 godane but they have black bars on the sides
19:17 🔗 midas yipdw: will do
19:27 🔗 balrog yipdw: can you backup winocm's repos?
19:37 🔗 DFJustin http://gizmodo.com/inside-the-us-nuclear-silos-where-floppy-disk-are-still-1568609439?
19:42 🔗 yipdw balrog: starting now; keep in mind that this only gets public data
19:44 🔗 yipdw cabal install is toasting my laptop
19:57 🔗 godane i'm now uploading American Socialist pdf collection
20:08 🔗 yipdw gah, the github module doesn't work
20:08 🔗 yipdw balrog: never mind, there's something wrong with my Haskell environment
20:45 🔗 exmic why does it take haskell to run git clone a bunch?
20:47 🔗 yipdw exmic: github-backup does more than that
20:47 🔗 exmic sure, wikis and tickets and suchlike
20:48 🔗 yipdw I could just clone them all I guess
20:57 🔗 balrog yipdw: still having issues?
20:59 🔗 SmileyG SketchCow: can u tell me how big the pdf collection is so far?
20:59 🔗 SmileyG the ones im sending i mean
21:00 🔗 yipdw balrog: haven't been able to get to it -- in the middle of app release procedure
21:23 🔗 balrog ah ... ok
21:23 🔗 balrog we probably have a few days
21:25 🔗 godane i'm also starting to upload more buck sexton show: https://archive.org/details/the-buck-sexton-show-01-04-2014
21:26 🔗 SketchCow Which ones are yours, SmileyG
21:30 🔗 SmileyG radioamerica i think the dir was called
21:31 🔗 SketchCow 22G american_radio/
21:31 🔗 SketchCow du -sh american_radio/
21:31 🔗 SmileyG urgh 1/4
22:07 🔗 godane I'M ON A SUGAR RUSH FROM DONUTS
23:12 🔗 SketchCow godane: Is there an issue with me uploading these Amazon manuals?
23:13 🔗 SketchCow I just removed the dupes.
23:14 🔗 godane no
23:14 🔗 godane i removed a lot of the dupes before uploading
23:17 🔗 SketchCow I know.
23:17 🔗 SketchCow And I got the rest.
23:29 🔗 dashcloud just saw this today: http://rr-project.org/ rr records nondeterministic executions and debugs them deterministically
23:42 🔗 exmic handy
23:45 🔗 dashcloud it was designed for use with Firefox, but works with most programs
23:46 🔗 SketchCow http://24.media.tumblr.com/ebe179bca4dc0d7c6bd0bd7d0cbbccd3/tumblr_n4md4pMruz1qa7q1no1_1280.jpg
23:48 🔗 dashcloud MIDI sequencing in JS: http://mudcu.be/midi-js/
