#archiveteam 2011-12-23,Fri


Time Nickname Message
00:06 🔗 SketchCow Dude, SOPA is filled with so much bad.
00:06 🔗 PatC Yes it is :/
00:06 🔗 PatC Isn't it against the first amendment?
00:07 🔗 SketchCow The thing is, we had DMCA get passed, full of lots of amendment busting bullshit
00:07 🔗 SketchCow And what happened was:
00:07 🔗 SketchCow - Lots of chilling effect letters
00:07 🔗 SketchCow - Lots of fucked people, businesses
00:07 🔗 SketchCow - Lawsuits
00:07 🔗 SketchCow - Some parts struck down
00:07 🔗 PatC ah
00:08 🔗 SketchCow But using courts as some sort of lint trap for bad law is just a non-starter.
00:08 🔗 SketchCow We do it, but it's a very bad approach.
00:09 🔗 soultcer I wonder what kind of ramifications it has for archive.org and other archive stuff
00:14 🔗 SketchCow Forget archive.org, man. Internet.
00:14 🔗 chronomex iiiiinternettt
00:14 🔗 SketchCow It makes it a years-penalty crime to stream copyrighted material.
00:14 🔗 SketchCow Years. A felony.
00:14 🔗 soultcer <-- European
00:14 🔗 chronomex <-- American
00:14 🔗 Coderjoe completely fucking overboard
00:14 🔗 SketchCow Yeah, but see, it kills things like youtube, google showing things, etc.
00:15 🔗 Coderjoe it kills the internet as we know it
00:15 🔗 soultcer And the Internet Archive, which is the american institution I care most about
00:15 🔗 Coderjoe which is, of course, what big content wants: another broadcast/cable TV type platform
00:18 🔗 PatC SketchCow, has archive.org been here since before the DMCA?
00:22 🔗 chronomex yes
00:22 🔗 PatC you guys have any problems with the DMCA?
00:22 🔗 chronomex DMCA, for all its flaws, does allow hosting to actually continue
00:23 🔗 chronomex it seems that SOPA does not
00:23 🔗 PatC :
00:23 🔗 PatC :/ *
00:25 🔗 PatC http://www.youtube.com/watch?feature=player_embedded&v=1w6GtwOvnWM
01:27 🔗 Paradoks I'm reading "Free Ride", and it rants about how bad the DMCA is -- because it has the safe harbor clause. The author figures that all the pro-industry stuff in there didn't help anywhere near as much as they had hoped.
01:28 🔗 Paradoks Well, "help" in a sense that I hope we stop helping, as soon as possible. It's not the easiest read.
01:29 🔗 Paradoks I do wonder what the internet would look like if that safe harbor hadn't made it into the DMCA.
01:36 🔗 Ymgve just wait till SOPA is in full effect
01:46 🔗 Coderjoe Paradoks: no google video, no youtube, no imgur and the like
02:52 🔗 kennethre SketchCow: if namecheap doesn't work out well, i *highly* recommend dnsimple.com
02:56 🔗 kennethre hmm, http://httparchive.org/
03:04 🔗 SketchCow No, they look pretty good
03:10 🔗 PatC kennethre, what's wrong with namecheap?
03:10 🔗 PatC SketchCow, i've had no issues with namecheap
03:10 🔗 kennethre PatC: I haven't used them, I just know that dnsimple is incredible
03:10 🔗 SketchCow Hey, so, sharing a project.
03:10 🔗 SketchCow http://statusboard.archive.org/
03:10 🔗 SketchCow Don't pass around
03:10 🔗 SketchCow Just put on a machine full screen and absorb
03:11 🔗 kennethre woah
03:12 🔗 PatC nice
03:15 🔗 PatC SketchCow, that's cool
03:22 🔗 Paradoks Nifty!
03:25 🔗 SketchCow Little project alard and I did for the archive.
03:25 🔗 SketchCow Realtime, minus an hour.
03:25 🔗 SketchCow For generating niceness.
03:25 🔗 SketchCow Still needs a round of adjustments, etc.
03:29 🔗 kennethre SketchCow: love the wordle clouds
03:42 🔗 SketchCow I do too
03:43 🔗 SketchCow A metric ton of yearbooks are going by.
03:43 🔗 SketchCow What happens is that you can see trends, as the scanning centers attack a shelf of books.
04:36 🔗 PatC SketchCow, haha nice emulator
06:34 🔗 kennethre holy shit.
06:34 🔗 kennethre http://www.youtube.com/watch?v=eFu71XeM998&feature=youtu.be
06:36 🔗 chronomex packet radio is cool
06:37 🔗 kennethre this will be the internet post-SOPA
06:37 🔗 kennethre we'll be back to bbs
06:37 🔗 kennethre :)
06:45 🔗 yipdw|_ that is really slick
06:45 🔗 yipdw|_ that's something that I've been meaning to complete, actually: my Technician license
06:45 🔗 yipdw|_ doesn't seem that hard to do, I've just been doing too many other things
06:46 🔗 chronomex I went straight to General a few weeks ago, it wasn't very hard
06:46 🔗 yipdw|_ General's above Extra, right?
06:46 🔗 chronomex yeah
06:46 🔗 yipdw|_ ok
06:46 🔗 chronomex Extra, General, Tech
06:46 🔗 chronomex erm
06:46 🔗 chronomex Tech -> General -> Extra
06:46 🔗 chronomex extra is the highest easy one to get
06:47 🔗 chronomex above that I think you can go to Experimenter, but that also requires a substantial annual fee
06:47 🔗 yipdw|_ I developed a kind of weird fascination with amateur radio in college
07:11 🔗 Coderjoe yipdw: it is incredibly easy to get a license
07:11 🔗 yipdw|_ I knw
07:11 🔗 yipdw|_ ow
07:12 🔗 yipdw|_ it's a matter of reviewing the material and going in for the test
07:12 🔗 Coderjoe so easy you can cram the question pool and then take the test and pass
07:13 🔗 kennethre do people actually police that?
07:14 🔗 Coderjoe http://www.arrl.org/question-pools
07:14 🔗 Coderjoe and if you create a qrz.com account, they have practice exams
07:50 🔗 * SketchCow is back to adding magazines!
07:50 🔗 SketchCow While the Jamendo burns on
08:01 🔗 SketchCow http://www.archive.org/details/warren-1984-magazine
08:04 🔗 arrith very neat
08:19 🔗 SketchCow Yeah, now I'm worried someone owns it.
08:19 🔗 chronomex worrying about copyright is new for you
08:20 🔗 kennethre has archive.org received many DMCA notices?
08:22 🔗 SketchCow Well, in many cases I'm putting up stuff that is not sold.
08:22 🔗 SketchCow The takedown recently was because it turns out it IS sold
08:23 🔗 SketchCow Now while getting info on this obscure comic, I see Fantagraphics released a softcover compilation of all four comics.
08:23 🔗 SketchCow Which they still sell.
08:23 🔗 SketchCow I'd much rather be saving obscure items nobody cares about, than spending all day making not-distributable online copies of books still being sold right now.
08:23 🔗 chronomex oh hey, fantagraphics is local to me
08:23 🔗 SketchCow Yes
08:26 🔗 SketchCow I mean, make no mistake, I can make the transfer in as easy as pie.
08:26 🔗 yipdw|_ oh, that Technology Review piece has some comments on it
08:26 🔗 yipdw|_ "A small team can't possibly back up much of import, but if they end up influencing many thousands or millions of web users to back up their wanted stuff, much more could be done.  Just my $0.02."
08:27 🔗 yipdw|_ fuck you you are all in archive team, etc
08:27 🔗 chronomex I can't really extract much meaningful anything from that comment
08:27 🔗 SketchCow Oh, that guy.
08:27 🔗 SketchCow Yeah, really seriously, ignore that thing.
08:27 🔗 chronomex done
08:28 🔗 SketchCow It's cute, it'll bring me celebrity, but people are ALWAYS coming to this project going "But you can't save it all! Go home and nap."
08:28 🔗 chronomex PARTIAL SUCCESS (aka PARTIAL FAILURE) is NO BETTER THAN INACTION
08:29 🔗 arrith building up a big list of successes (and getting it all prettily formatted), like we're doing now, is something one can then just point to
08:29 🔗 arrith "turns out all of these show you're wrong.. yeah"
08:31 🔗 yipdw|_ eh, it's the Internet, I've learned to treat it for what it is
08:31 🔗 SketchCow Here's the thing (I'm done talking about that article for the night.)
08:31 🔗 SketchCow If stuff is added to the archive, it's there forever, stored in non-browsable archives if there's an issue.
08:31 🔗 SketchCow So I am quite glad to add these items.
08:32 🔗 SketchCow The key is, I am refining and refining and refining my helper scripts, so adding items is as absolutely simple as possible.
08:32 🔗 SketchCow So I can add, say, 500 magazines in an hour or two, while listening to presentations/podcasts people think I should be listening to.
08:32 🔗 SketchCow So that's my challenge.
08:32 🔗 SketchCow Right now, it's pretty good.
08:32 🔗 SketchCow I dump dotfiles in the directories that are instructions.
08:33 🔗 SketchCow "Put this in this collection, name them all this prefix", etc.
08:33 🔗 SketchCow So if there's a .collection file, it means "put these in this collection".
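The dotfile convention described above could be sketched roughly as follows. This is not SketchCow's actual helper script, which the log does not show. Only the `.collection` name comes from the log; the function name `read_instructions`, the idea that the file's contents hold the collection name, and any other dotfile keys are assumptions made for illustration.

```python
from pathlib import Path


def read_instructions(directory):
    """Collect dotfile instructions from one directory.

    Hypothetical convention: every regular file whose name starts with
    "." is one instruction. The name minus the dot is the key and the
    stripped file contents are the value. Only ".collection" is named
    in the log; everything else here is an assumption.
    """
    instructions = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.startswith("."):
            instructions[path.name[1:]] = path.read_text(encoding="utf-8").strip()
    return instructions
```

An upload script could call `read_instructions()` once per directory and look up the `"collection"` key before adding the items in it, so the per-batch settings travel with the files rather than living in command-line arguments.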
08:33 🔗 SketchCow Over time, it won't matter if it goes dark - it's trivial to have added.
08:34 🔗 SketchCow Now, granted, I WANT this stuff out there, so the Jamendos are really getting the work done.
08:38 🔗 SketchCow Also, I am rocking the REALLY, REALLY Low-Hanging fruit here - I have a whole bunch of digitization stations I'm setting up in my room over the holidays, and I will be adding SCADS of my own stuff that NOBODY cared enough about to the archives.
08:38 🔗 SketchCow I mean, massive piles. I have a bookscanner coming in what looks like early January, and then watch out.
08:39 🔗 SketchCow That's how 2012 is going to be spent, when I'm not doing the documentaries. Just adding piles of data.
08:41 🔗 DFJustin so I just found out youtube-dl can download a user's entire video list with one command line
08:41 🔗 DFJustin gonna have lots of fun with this
08:44 🔗 SketchCow http://www.archive.org/details/close-encounters-warren
08:49 🔗 SketchCow http://www.archive.org/details/blazingcombat-warren
09:06 🔗 SketchCow http://www.archive.org/details/warrenpublishing will keep filling.
09:06 🔗 bsmith093 what's the youtube-dl command for all of a user's videos?
09:10 🔗 emijrp SketchCow: this newspaper is closing http://www.adn.es/
09:11 🔗 emijrp can you request a full scrape for wayback?
09:11 🔗 emijrp the current wayback version looks like only the main page is being saved
09:13 🔗 DFJustin youtube-dl http://www.youtube.com/user/usernamegoeshere
09:14 🔗 DFJustin I also throw this on there to get nice filenames: -o "%(title)s.%(ext)s"
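Putting DFJustin's two lines together, the full invocation can be assembled as an argument list. The sketch below only builds the command and does not run it. The `youtube-dl` program name, the `-o` template and the user URL pattern are quoted from the log; the helper name `youtube_dl_command` and its `nice_filenames` parameter are hypothetical.

```python
import shlex


def youtube_dl_command(username, nice_filenames=True):
    """Build the youtube-dl argument list for one user's uploads."""
    command = ["youtube-dl"]
    if nice_filenames:
        # Output template quoted in the log: video title plus file extension.
        command += ["-o", "%(title)s.%(ext)s"]
    command.append("http://www.youtube.com/user/" + username)
    return command


if __name__ == "__main__":
    # Print a shell-safe version of the command for copy and paste.
    parts = youtube_dl_command("usernamegoeshere")
    print(" ".join(shlex.quote(part) for part in parts))
```

Keeping the command as a list means the template's parentheses and percent signs never need shell escaping if the list is later handed to `subprocess.run()`.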
09:14 🔗 bsmith093 thanks DFJustin
09:15 🔗 emijrp DFJustin: youtube-dl works on usernames?
09:16 🔗 DFJustin yep, not sure when they added that
09:17 🔗 DFJustin playlists and searches also work apparently http://rg3.github.com/youtube-dl/documentation.html#d4
09:18 🔗 Nemo_bis I was trying to get this going: https://en.wikisource.org/wiki/Wikisource:WikiProject_Royal_Society_Journals/Uploading_progress#Volumes_to_be_uploaded_and.2For_downloaded_to_the_IA
09:18 🔗 Nemo_bis But the problem seems to be that nobody wants to scrape hathitrust.org even if they're public domain journals, even if we had the full list of journals missing on IA
09:19 🔗 Nemo_bis (and I don't know how to scrape it, btw)
09:22 🔗 yipdw|_ http://www.hathitrust.org/data_api
09:22 🔗 yipdw|_ that looks useful
09:22 🔗 yipdw|_ actually, all of http://www.hathitrust.org/data does
09:28 🔗 SketchCow http://www.archive.org/stream/teenagelovestories-03/teenagelovestories_warren_03#page/n41/mode/2up
09:29 🔗 kennethre hahaha
09:29 🔗 yipdw|_ wow
09:30 🔗 kennethre this makes me want to watch mad men
09:31 🔗 yipdw|_ the anachronicity in that reminds me of http://www.amazon.com/Everything-Always-Wanted-Know-About/dp/0312976569
09:42 🔗 SketchCow http://www.archive.org/stream/farmers-wife-v35-n10-1932-10/farmers_wife_v35n10_1932_10#page/n0/mode/2up
09:45 🔗 DFJustin that is some serious quality for 1932
09:45 🔗 SketchCow It's a nice one!
09:58 🔗 yipdw|_ looking through SSH auth logs yields some funny sites
09:59 🔗 yipdw|_ and I mean Web sites; for some reason hosts where break-in attempts are staged seem to have Web servers running
09:59 🔗 yipdw|_ one of them can be found at http://123.30.168.72/; looks like LNMP is some sort of My First Server package
10:07 🔗 SketchCow http://www.archive.org/details/whisper-magazine-v3-n7-1950-05
10:27 🔗 Nemo_bis yipdw|_, I doubt you can use the API to download content they don't want you to download
13:22 🔗 emijrp yes, downloading videos by users is cool
13:23 🔗 emijrp all conferences from wikimania 2011 http://www.youtube.com/user/WikimediaIL
13:24 🔗 emijrp is there a way to download the metadata from youtube? (description, upload date, etc)
13:25 🔗 emijrp using youtube-dl
17:04 🔗 Coderjoe http://i.imgur.com/04bUa.jpg
19:18 🔗 yipdw|_ ah, fuck
19:19 🔗 yipdw|_ I didn't archive GoDaddy's "we support SOPA" blog post
19:19 🔗 yipdw|_ anyone manage to get it?
19:21 🔗 Nemo_bis did they really delete it?
19:22 🔗 Nemo_bis not even the most stupid minister in Italian history did it with her press release about the CERN tunnel
19:22 🔗 yipdw|_ http://support.godaddy.com/godaddy/go-daddys-position-on-sopa/
19:22 🔗 yipdw|_ yes, it's gone
19:25 🔗 closure http://www.technologyreview.com/article/39317/#comment-239190 this comment is pure gold
19:32 🔗 Coderjoe godaddy's "we no longer support sopa" press release: http://www.godaddy.com/newscenter/release-view.aspx?news_item_id=378&isc=smtwsup
19:32 🔗 yipdw|_ yeah, but I can't find their original "we support SOPA" press release
19:33 🔗 Coderjoe wait... IA has a scanning center in Shenzhen, CN?
19:36 🔗 Coderjoe ah, the "memory hole" qualities of the internet
20:59 🔗 SketchCow Yes, they have a center in China
20:59 🔗 SketchCow It's cheaper for a range of books to send a shipping container to china and have them scan in the contents.
21:00 🔗 bsmith093 youtube-dl only grabs the first 30 or so videos from a user's page, how do i get them all?
21:00 🔗 bsmith093 [youtube] user Vihart: Collected 31 video ids (downloading 31 of them)
21:00 🔗 bsmith093 [youtube] user Vihart: Downloading video ids from 1 to 51
21:10 🔗 DFJustin as far as I can tell it gets all of them that exist, that output looks to me like there are only 31
21:10 🔗 DFJustin it just goes in chunks of 50 videos because that's how youtube breaks up the page for big users I guess
21:17 🔗 Coderjoe that's how it used to. youtube overhauled the UI a week or so ago
21:46 🔗 emijrp SketchCow: if you like counters, look at this one http://toolserver.org/~emijrp/wikimediacounter/ ; )
21:47 🔗 emijrp a bit old
21:56 🔗 emijrp yipdw|_: this? http://nwlinux.com/godaddys-official-position-on-sopa/
21:58 🔗 DFJustin <emijrp> is there a way to download the metadata from youtube? (description, upload date, etc)
21:58 🔗 DFJustin --write-info-json
21:58 🔗 emijrp thanks
21:58 🔗 DFJustin --help may also be of interest
21:58 🔗 DFJustin for some reason it seems to eat all the line breaks in the saved descriptions though
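Once `--write-info-json` has saved a metadata file next to a video, it can be read back with a few lines of Python. This is a minimal sketch, assuming the file is plain JSON and that it carries `title`, `description` and `upload_date` keys; those field names are not stated in the log and may differ between youtube-dl versions, and the helper name `summarize_info` is hypothetical.

```python
import json


def summarize_info(path):
    """Return a few metadata fields from a youtube-dl info JSON file."""
    with open(path, encoding="utf-8") as handle:
        info = json.load(handle)
    # The field names are assumptions; .get() yields None for missing ones.
    wanted = ("title", "description", "upload_date")
    return {key: info.get(key) for key in wanted}
```

Checking the returned `description` for newline characters is a quick way to confirm whether the line breaks were lost when the file was saved, as DFJustin suspects.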
