[00:04] Understood and known.
[00:04] Hey, anyone got a server they can give me an account on to run firefox and an X server (small one) on?
[00:35] SketchCow, still need that server?
[00:45] I'd like to play around on one, yes.
[00:46] ohhdemgir, are you there?
[09:16] https://blog.box.com/2014/06/box-acquiring-streem-bringing-the-cloud-to-your-desktop/
[09:20] "Streem has been acquired by Box! We're creating an optional migration path so all of your data will be safe!"
[09:39] "optional" and "all" don't get along well
[09:43] looks like streem has disabled all public videos.
[10:01] actually, the videos are still technically accessible
[10:01] e.g. https://streem.s3.amazonaws.com/objects/9b38131adb0b0c6e36830d8fbeeb3fb4/LinuxCon_and_CloudOpen_North_America_2013_-_Linux_Kernel_Panel.mp4
[10:02] SN4T14, you can give SketchCo1 an account on arc01 if you want
[11:30] * SketchCow jumps up and down at the counter
[11:32] one million hard drives, sir?
[11:45] Ryan Kearney is delivering his 1PB drive
[14:14] SketchCow, hang on, let me get you an account. :p
[14:19] Although, you will have to set up the DE and everything yourself (or you can set up a VM and have the installer do it for you)
[15:28] http://www.theguardian.com/technology/2014/jun/17/youtube-indie-labels-music-subscription
[15:33] Arkiver2: fun
[15:34] Arkiver2: how will we identify the videos that are likely to be taken down?
[15:34] db48x: I have no idea
[15:34] is it public with which music labels google made an agreement?
[15:34] Does google release those knd of contracts?
[15:35] I don't suppose there's a list of independant artists and their youtube channels...
[15:35] hah, no
[15:38] found some
[15:38] Adele and Arctic Monkeys are two examples of artists that are going to be blocked
[15:38] so let's do those at least
[15:39] https://www.youtube.com/playlist?list=PL55DF5F0E7C2C2DD3
[15:39] https://www.youtube.com/playlist?list=PLV7t4yekvqhv9x2_P8Jvue6IvtH4LCm20
[15:39] https://www.youtube.com/user/notsignedtv
[15:40] prepare for blocked youtube content http://www.theguardian.com/technology/2014/jun/17/youtube-indie-labels-music-subscription
[15:40] ivan`: we're discussing it already :P
[15:40] I missed it :)
[15:40] lol
[15:40] we're searching for indepenent artists
[15:41] the best way I know of to download a youtube channel is to use http://www.jwz.org/hacks/youtubefeed.pl
[15:41] does that get you 1080p and 256kbit DASH?
[15:41] db48x: but that doesn't create a warc with youtube vids right?
[15:41] yea, it sorts by quality and grabs the best one
[15:41] Arkiver2: nope
[15:42] ivan`: of course the list of video types could easily be wrong or out of date, we'd want to double-check :)
[15:42] db48x: have you observed it download a 1080p video after 2013-10?
[15:42] yes
[15:42] well
[15:42] db48x: I'll start crawls with heritrix on the channels of some artists so their pages of the videos are saved, BUT NOT THE VIDEOS THEMSELVES are in the warcs
[15:42] in 2013 yes, dunno about after november actually...
[15:43] ivan`: we could modify it to just grab all the offered videos, that would do the trick
[15:43] anyway this is how I use youtube-dl https://www.refheap.com/d97ee2660f3ebec52c8265f1e/raw
[15:43] emijrp made me upload thousands videos with https://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py :)
[15:43] already have to modify it not to skip videos older than 2 days
[15:45] I think the article is probably a bit alarmist though
[15:46] I can't see them banning every account where someone sat down with a guitar in front of a camera
[15:46] they'll ban you if you don't use your Real Name
[15:46] at least
[15:46] on the other hand, they could just review all videos that seem to have music but that don't fall afoul of their stupid copyrighted-music detector
[15:47] Rumour Has It that that's the case
[15:47] I guess everyone will have to find Someone Like Youtube
[15:48] still, let's engage our paranoia anyway
[15:51] Arkiver2, db48x: not a whole lot of "you" left in "youtube"
[15:51] heh
[15:52] also
[15:52] [17:46] I can't see them banning every account where someone sat down with a guitar in front of a camera
[15:52] you'd be amazed
[16:01] yea, I couldn't imagine them finding them all, but then I remembered their copyrighted-music detector
[16:02] anything that _doesn't_ get flagged by that but that doesn't look like speech is basically going to be independant music
[16:05] it's good to know that Youtube is no better than the labels
[16:05] creative destruction of expectations
[16:16] db48x, stop being silly and start using youtube-dl. :p
[16:20] youtube-dl won't download a whole rss feed
[16:20] Why do you need RSS feeds?
[16:20] so that I can go around finding and downloading whole channels
[16:21] rather than individual videos
[16:21] I have a windows program here that downloads videos in the best quality
[16:22] db48x, youtube-dl can download entire channels
[16:22] youtube-dl does that just fine afaik
[16:22] and playlists etc
[16:22] No need to mess around with RSS feeds when there's simpler ways. ;)
[16:23] it'd be nice if the documentation mentioned that
[16:24] man youtube-dl :P
[16:25] ^
[16:25] obviously that's no help if you haven't installed it because it looks like it won't do what you want :)
[16:25] heh true
[16:26] also https://www.google.is/search?q=youtube+dl+entire+channel
[16:26] ;)
[16:26] Yes, that is an Icelandic Google link because I'm lazy. :p
[16:27] you're supposed to use lmgtfy.com :)
[16:28] www.lmgtfy.com/?q=no
[16:28] :p
[16:28] heh
[16:28] should we build a list of channels on an etherpad or the wiki or something?
[16:29] piratepad! :D
[16:29] :)
[16:29] wiki
[16:30] isnt it what its for?
[16:30] wikis are great, but there's more round-trip time
[16:31] Make a wiki page for it, and link to the piratepad list. ;)
[16:31] db48x: or, y'know, the bottom of http://rg3.github.io/youtube-dl/supportedsites.html
[16:31] :P
[16:31] http://piratepad.net/C2ioWiy8fG
[16:32] youtube-dl supports downloading archive.org, we should archive it! :D
[16:32] lol
[16:32] db48x, I liked your old name! :p
[16:33] what was it?
[16:35] Add name here
[16:35] :p
[16:36] ah
[16:40] does freedb have information about labels in it?
[16:46] db48x: not sure, but MusicBrainz does
[16:46] db48x: and their data is freely available -> http://musicbrainz.org/doc/MusicBrainz_Database
[16:48] nice
[16:48] want to find all the artists with no label, then do some youtube searches?
[16:49] at some point, but it's also artists on non-participating labels
[16:49] e.g. for Adele you'd probably want to look up XL
[16:49] XL Recordings that is
[16:51] yea
[16:53] the constitutents of the Worldwide Independent Network might be a good place to start for that list
[16:54] good idea
[16:55] not many actual artists in a youtube search for 'indenendant artist'
[16:56] however I spell it
[16:56] mostly interviews, promoters and consultants
[16:59] this is good though: https://www.youtube.com/watch?v=Hbxy9xvpZ10&list=PLC32FEF51263DD92C
[16:59] db48x, it's spelled "independent"
[17:00] yes, I spelled it correctly when I did the search
[18:37] http://instagram.com/p/pWpuz6MxuB/ (Server decomissioned)
[18:39] That looks so fun. :D
[18:48] i think i surpass my old record in godaneinbox
[18:48] its at 18735 now
[18:51] cripes
[18:55] so, about this RAWPORTER, did I miss anything?
[18:56] I think I punched them
[18:56] Then we sat
[18:56] But if we can pull ANYTHING out of them, do it.
[18:57] Chances might be it's not possible.
[18:57] Might be limited release, but scan them
[18:57] someone did a scan already if im not mistaken
[18:58] or was that steem
[18:59] joepie91_: you did some rawporter work yesterday with the markers
[19:01] i think the s3 wasnt secured
[19:01] so we can grab all pictures and video's
[19:01] http://rawporter.s3.amazonaws.com/
[19:03] Well, do it.
[19:11] s3cmd du s3://rawporter
[19:11] WARNING: Retrying failed request: /?marker=thumbs/l_f5fnivczoddwq7.jpg (timed out)
[19:11] WARNING: Waiting 3 sec...
[19:11] 78880173037 s3://rawporter/
[19:11] well peeps?
[19:12] 78GB? That's pretty small...
[19:12] thats what is on the s3
[19:14] Weird
[19:18] :P
[19:18] just grab all of S3
[19:19] meh, good point
[19:20] grab first, assess later
[19:21] that advice has also served me well on the North Side of Chicago
[19:21] yipdw: ?
[19:21] bad regional joke
[19:21] (also, do we have a way of grabbing an entire S3 bucket with WARC?)
[19:21] lol
[19:22] * joepie91_ is not from that region
[19:22] basically the north side and the rest of the North Shore area is unusually sexually active
[19:22] So basically sex apartheid
[19:23] nah
[19:23] we have real racism in Chicag
[19:23] o
[19:23] "unusually sexually active"?
[19:23] it's on the high end of the curve
[19:23] anyway
[19:24] I think I passed the -bs threshold on line 1
[19:27] lol
[19:30] grabbing s3 now
[19:32] midas, according to my calculations, you're going to cost them $4-$9.5 in S3 costs from grabbing all of that. :p
[19:33] im going to grab it 20 times SN4T14 ;)
[19:35] earbits.com mp3 tars are incoming, grab them while you can: https://archive.org/search.php?query=subject%3A%22earbits.com%22%20mp3
[19:35] while true; do curl https://rawporter.s3.amazonaws.com/AWOL/gallery.swf -o /dev/null; done
[19:35] :p
[19:35] Not sure if that's correct, I rarely use curl. :p
[19:36] s3cmd get s3://rawporter --recursive /hurr/durr
[19:36] while true; do s3cmd get s3://rawporter --recursive /dev/null &; done
[19:36] :D
[19:36] lol, open s3 is the new ID iteration
[19:37] ID iteration?
[19:37] open s3 is running around with your middlefingers in the air and screaming
[19:38] schbirid: nah, open S3 is much more efficient
[19:38] than ID iteration
[19:38] :P
[19:38] :>
[19:38] SN4T14: wget http:///www.internet.com/file?id=123
[19:38] jesus wtf, 28KB/sec
[19:38] how congested is IA
[19:39] jeez joepie91_
[19:39] are you on dialup?
[19:39] 20kb/sec now, cancelled it
[19:39] midas: IA is, apparently
[19:40] not from here
[19:41] i
[19:41] i've seen worse according to the weathermap
[19:41] slowdown rate on s3 is also very low
[19:41] are you downloading or uploading joepie91_
[19:41] ?
[19:41] dfl
[19:41] dl *
[19:43] it's going over HE
[19:43] now brb
[19:49] http://arstechnica.com/business/2014/06/artists-who-dont-sign-with-youtubes-new-subscription-service-to-be-blocked/
[19:52] yeah that's fucked up
[19:55] Muad-Dib: we need someone to build a list of independent artists
[19:56] yipdw suggested looking at the members of the Worldwide Independent Network
[20:40] congrats midas and ohhdemgir :) https://archive.org/metamgr.php?f=histogram&group=uploader&w_collection=ftpsites
[20:40] "You must be logged in to access this service." >.>
[20:40] :)
[20:41] and why aren't you logged in on archive.org, aren't you going after spam and support requests in forums etc. etc.
[20:42] aww, I'm not authorized
[20:42] oh, look, there is only one 2331388015 KB item https://archive.org/metamgr.php?f=histogram&group=size&w_collection=ftpsites
[20:42] The second is 893892396 KB from another wikisourceror, I swear I didn't suggest him
[20:45] how can i see how much i uploaded?
[20:48] schbirid: https://archive.org/metamgr.php?f=histogram&group=size&w_uploader=spirit@quaddicted.com but I'm not sure if there's a way to sum the first column
[20:49] oi, dont leak my mail address to irc please
[20:49] thanks
[20:49] well, meatmgr
[20:49] sorry, I had a doubt for a moment but then thought it's in all the xml files anyway :p
[20:49] :P
[20:50] I should have used mine as example
[20:50] yeah but those are for admins only
[20:50] while in this channel i maybe know 10%
[20:50] no biggie
[20:50] what, the xml files?
[20:50] yeah
[20:51] everyone can download them
[20:51] hmm
[20:51] that's what I thought, Nemo_bis
[20:51] its kinda crazy how much archive.org shows stupid admins like me :D
[20:51] oh? :(
[20:51] * db48x spams schbirid
[20:52] they're not even hidden behind the "HTTPS"/download link in items like https://archive.org/details/wiki-wikiurbandeadcom
[20:54] joepie91, i've been downloading rawporter
[20:54] nearly done
[20:55] is anyone else downloading rawporter?
[20:56] I think midas was as well.
[20:58] just the s3 files
[20:58] 6200 of 39K
[20:59] probably done in the morning
[21:00] schbirid: anybody can see uploader, yes
[21:02] OTOH, http://blog.archive.org/2013/10/25/reader-privacy-at-the-internet-archive/ : almost nobody on the web is so good
[21:07] I do wish there was an uploader privacy option for items though
[21:07] other than registering a throwaway email
[21:08] DFJustin: darken it directly after uploading?
[21:08] altho, it wont be findable anymore
[21:09] also won't be downloadable or anything
[21:24] lol mistym
[21:25] * midas
[21:51] Nemo_bis, "User: ohhdemgirls is not authorized to access this service."
[21:52] well, you're second with most items in ftpsites
[21:53] i wanna see!!
[21:57] I think midas was as well.
[21:57] Whoops, this isn't Cygwin
[21:57] lol
[23:25] schbirid: Nemo_bis: yeah, they're totally open in the current system
[23:25] Much of the system is architected on that being the case but eventually we want to move to a different user ID
[23:25] as manpower allows