[00:26] *** mismatch has quit IRC (Ping timeout: 244 seconds)
[00:27] *** dashcloud has quit IRC (Read error: Operation timed out)
[00:28] *** mismatch_ has joined #archiveteam
[00:31] *** dashcloud has joined #archiveteam
[00:59] *** bwn has quit IRC (Read error: Operation timed out)
[01:00] *** slpeeds has quit IRC (Remote host closed the connection)
[01:00] *** fdo54ss has joined #archiveteam
[01:14] *** bwn has joined #archiveteam
[01:18] *** ariscop has quit IRC (Ping timeout: 506 seconds)
[01:25] *** bsmith094 has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
[01:33] *** WinterFox has joined #archiveteam
[01:39] there should be a seed vault for copyrighted material until they fall out of copyright
[01:39] this is some dedication: http://pastebin.com/raw/AssdQjfC
[01:39] RAtM's self titled album on what.cd
[01:40] some steps may be placebo
[01:41] ranma: there is, it's called 'darked on archive.org'
[01:41] :o
[01:41] * ranma looks it up
[01:42] probably not documented
[01:42] archive.org gets a takedown, they have a mechanism for making a thing appear to have been deleted
[01:42] yeah, not finding it
[01:44] gawd, that's a mindfuck knowing i'll probably be dead when this falls out of copyright D:
[01:46] i volunteer at a museum that was founded around the time i was born, where we take care of things that are a hundred years old
[01:46] the world is old
[01:46] the world will become much older
[01:49] related: https://archive.org/details/what_cd
[01:50] interesting @ both
[01:54] would be interesting hearing the opinions of people involved with museums
[01:54] esp younger, less tenured ones
[01:56] *** vegbrasil has quit IRC (*)
[01:57] *** vegbrasil has joined #archiveteam
[01:57] are a lot of "big sites" (cute overload, apkmirror, other ones you're archiving) on unlimited transfer connections?
[01:59] *** vegbrasil has quit IRC (Client Quit)
[02:00] coming from only having experience with "cheap hosting" ($10-50) with limits, i'm not familiar with where "unlimited" starts
[02:00] actual sites that are hosted on hardware probably pay for bandwidth at the 95th percentile
[02:01] *** vegbrasil has joined #archiveteam
[02:05] so cuteoverload might be $200-500+/mo?
[02:10] oh, they could be $150ish due to wordpress
[02:13] now i'm less worried about what archivebot/ATW are doing to sites' bw/transfer count
[02:14] *** vegbrasil has quit IRC (*)
[02:18] *** vegbrasil has joined #archiveteam
[02:19] is archive.org doing *.youtube?
[02:20] or archiveteam doing channels?
[02:20] *** vegbrasil has quit IRC (Client Quit)
[02:23] *** vegbrasil has joined #archiveteam
[02:23] xmc: it's documented now (in these channel logs) :-P
[02:24] *** vegbrasil has quit IRC (Client Quit)
[02:27] *** robink_ has quit IRC (Ping timeout: 633 seconds)
[02:32] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[02:35] it's exhaustively documented in these logs
[02:35] *** bwn has quit IRC (Quit: Quit)
[02:36] *** dashcloud has quit IRC (Read error: Operation timed out)
[02:37] *** bwn has joined #archiveteam
[02:37] *** robink has joined #archiveteam
[02:39] *** dashcloud has joined #archiveteam
[02:50] *** VADemon has quit IRC (Quit: left4dead)
[03:03] *** redlob has quit IRC (Read error: Operation timed out)
[03:06] *** redlob has joined #archiveteam
[03:22] *** ariscop has joined #archiveteam
[03:49] *** JesseW has joined #archiveteam
[03:53] *** ariscop has quit IRC (Quit: Leaving)
[04:11] *** dashcloud has quit IRC (Read error: Operation timed out)
[04:14] *** dashcloud has joined #archiveteam
[04:16] *** bsmith093 has quit IRC (Ping timeout: 370 seconds)
[04:19] *** bsmith093 has joined #archiveteam
[04:39] *** ariscop has joined #archiveteam
[04:42] *** ariscop has quit IRC (Leaving)
[04:43] *** ariscop has joined #archiveteam
[04:56] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[05:00] *** brayden has quit IRC (Quit: Leaving)
[05:03] *** Sk1d has joined #archiveteam
[05:04] ranma: bandwidth/traffic cost is a strange thing to calculate at times
[05:04] *** brayden has joined #archiveteam
[05:04] *** swebb sets mode: +o brayden
[05:04] ranma: you generally pay fuck all for it, until you reach a certain point where you're outside of what the lower-end oversold-bandwidth hosts can offer you
[05:05] at which point you indeed generally switch to 95th percentile billing or similar
[05:05] downside of 95th percentile is that it can very suddenly jack up your costs a lot
[05:07] this is probably more #archiveteam-bs material :P
[05:41] *** mismatch_ has quit IRC (Remote host closed the connection)
[05:41] *** mismatch_ has joined #archiveteam
[05:50] Where's my hug
[05:51] Whaddya need a hug for, and how close are you to Sunnyvale CA at the moment?
[05:53] Not even a little close to Sunnyvale.
[05:58] * JesseW mails a hug
[05:58] It'll get there in a couple of days
[05:59] I have a speech due for May 4th, first speaking engagement of the year, I think
[06:13] * Frogging hugs SketchCow
[06:17] I've also thrown out most of my clothing.
[06:25] I'd hug you but you sound naked.
[06:27] I sound like I'm not wearing clothes from my 20s
[06:42] *** dashcloud has quit IRC (Read error: Operation timed out)
[06:43] *** dashcloud has joined #archiveteam
[07:21] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[07:42] hi, first thank you all a lot for your work on this. Just found this project and love it
[07:44] just set up my second warrior on my root server, but now i see both waiting because of tracker rate limiting. i understand why this is active, but i think it would be better to add a second project to be worked on. is this possible?
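A note on the 95th-percentile billing mentioned at [02:00] and [05:05]: the usual scheme samples bandwidth at fixed intervals, discards the top 5% of samples, and bills on the highest remaining one. The sketch below uses the nearest-rank method; the function names, the sample data, and the price per Mbps are all made up for illustration, and real providers differ in sampling interval and rounding.

```python
def percentile_95(samples_mbps):
    """Return the 95th-percentile sample (nearest-rank method)."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    # Nearest rank: ceil(0.95 * n), done in integer arithmetic.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def monthly_bill(samples_mbps, price_per_mbps):
    """Bill = 95th-percentile rate times a (hypothetical) price per Mbps."""
    return percentile_95(samples_mbps) * price_per_mbps


if __name__ == "__main__":
    # 100 samples: bursts in 5% of them are free, bursts in 6% are not.
    quiet = [10.0] * 95 + [500.0] * 5
    busy = [10.0] * 94 + [500.0] * 6
    print(monthly_bill(quiet, 2.0), monthly_bill(busy, 2.0))
```

With these made-up numbers, one extra bursty sample moves the billable rate from 10 Mbps to 500 Mbps, which is the "very suddenly jack up your costs" effect described at [05:05].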
[07:47] *** Honno has joined #archiveteam
[07:47] *** Honno_ has joined #archiveteam
[07:49] xhdr: Seems like on weekends there are enough people running; during the week there seems to be more work, or at least that is how it looks to me :)
[07:50] Google Code and yuku need the most these days, it seems. One simple way to run multiple is to run multiple warriors in docker.
[07:51] That way you also do not have to pre-allocate memory and disk, just use a cgroup to limit. (Some recursive loops can run for days and eat the max memory allowed by the container (in my case 2GB))
[07:54] *** Honno has quit IRC (Read error: Operation timed out)
[07:55] google code also got rate limiting, yuku is auto-set for me by archive-team and on limiting. but fine, if you need the warriors on weekdays, i will see what happens over the next days
[07:58] would it be possible to monitor or mirror gamecopyworld.com ? it has no problems currently, but i think it is worth backing up
[08:05] would archivebot backup a channel?
[08:05] *a youtube channel
[08:06] say a lockpicking channel: https://www.youtube.com/user/bosnianbill/videos
[08:18] if not, it'd probably be pretty easy to integrate youtube-dl into it
[08:25] *** godane has quit IRC (Quit: Leaving.)
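The "multiple warriors in docker, limited by a cgroup" suggestion at [07:50] and [07:51] might look roughly like the sketch below, which only builds the `docker run` command lines. The image name `archiveteam/warrior-dockerfile`, the web UI port 8001, and the container naming scheme are assumptions that do not come from this log; `--detach`, `--name`, `--memory` and `--publish` are standard `docker run` options, and the 2g limit mirrors the 2GB mentioned above.

```python
import shlex


def warrior_run_command(index, memory_limit="2g",
                        image="archiveteam/warrior-dockerfile",  # assumed image name
                        base_port=8001):                         # assumed web UI port
    """Build a `docker run` argument list for one memory-capped warrior."""
    host_port = base_port + index
    return [
        "docker", "run", "--detach",
        "--name", f"warrior-{index}",          # hypothetical naming scheme
        "--memory", memory_limit,              # cgroup memory cap for the container
        "--publish", f"{host_port}:{base_port}",
        image,
    ]


if __name__ == "__main__":
    # Print the commands for two warriors instead of running them.
    for i in range(2):
        print(shlex.join(warrior_run_command(i)))
```

Each container gets its own host port, so the warriors' web interfaces do not collide, and a runaway job is stopped at the memory cap rather than taking down the host.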
[08:38] *** Tomcat_ has joined #archiveteam
[08:40] *** Honno__ has joined #archiveteam
[08:40] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)
[08:49] ranma: arkiver was working on something related to that, iirc
[08:53] *** Honno_ has quit IRC (Read error: Operation timed out)
[08:54] *** dashcloud has quit IRC (Read error: Operation timed out)
[08:58] *** godane has joined #archiveteam
[08:58] *** dashcloud has joined #archiveteam
[09:08] *** arkiver2 has joined #archiveteam
[09:08] *** swebb sets mode: +o arkiver2
[09:10] *** robink has quit IRC (Ping timeout: 260 seconds)
[09:12] *** bwn has quit IRC (Read error: Operation timed out)
[09:12] *** robink has joined #archiveteam
[09:20] http://gamecopyworld.eu/ seems definitely worth it, although they have mirrors. I wonder how large the site is. But at least they use old style static web, so it should take less space on that dimension. arkiver, if you could look at that, it would be nice.
[09:20] *** bwn has joined #archiveteam
[09:36] *** Medowar has joined #archiveteam
[09:41] *** atomotic has joined #archiveteam
[09:47] *** Honno has joined #archiveteam
[09:57] *** Honno__ has quit IRC (Read error: Operation timed out)
[10:06] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:10] *** dashcloud has quit IRC (Read error: Operation timed out)
[10:11] *** Emcy_ has joined #archiveteam
[10:13] *** Emcy has quit IRC (Ping timeout: 250 seconds)
[10:14] *** dashcloud has joined #archiveteam
[10:26] *** arkiver3 has joined #archiveteam
[10:26] *** swebb sets mode: +o arkiver3
[10:26] *** dashcloud has quit IRC (Read error: Operation timed out)
[10:29] *** arkiver2 has quit IRC (Ping timeout: 244 seconds)
[10:29] *** dashcloud has joined #archiveteam
[10:30] *** arkiver3 has quit IRC (Client Quit)
[10:49] ranma: if you just want all the content of a youtube channel, there's some download tools out there to do it (like jkdownloader2). Just pass it a channel name and it gets to work. Sadly, I'm pretty sure it runs on Java, and no idea if it takes command line stuff
[10:56] https://rg3.github.io/youtube-dl/ works off the command line, only needs Python
[10:57] *** Emcy has joined #archiveteam
[11:09] *** Emcy_ has quit IRC (Read error: Operation timed out)
[11:12] *** Honno_ has joined #archiveteam
[11:17] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:35] *** Emcy_ has joined #archiveteam
[11:40] *** Emcy has quit IRC (Ping timeout: 492 seconds)
[13:02] *** VADemon has joined #archiveteam
[13:58] *** WinterFox has quit IRC (Remote host closed the connection)
[14:24] nice
[14:31] dang, that script supports 800+ sites!
[14:42] Plus many more which just work without any special coding
[14:42] i hate being new here, but man, the stuff i learn about!
[14:45] There's worse things to hate.
[15:09] *** Honno has quit IRC (Quit: Leaving)
[15:10] true!
[15:12] *** Honno_ has quit IRC (Quit: Leaving)
[15:21] atrocity: ranma: fwiw, if you intend to use youtube-dl locally... use this alias:
[15:21] alias ytdl="youtube-dl --title --continue --retries 4 --write-info-json --write-description --write-thumbnail --write-annotations --all-subs --ignore-errors --merge-output-format mkv -f 'bestvideo+bestaudio/best'"
[15:21] gets you the highest possible quality version plus all the metadata of everything
[15:22] you will need ffmpeg or libav installed
[15:23] (by default it will just grab `best` which isn't necessarily the highest quality - often there are purely-video and purely-audio streams available separately that are of higher quality, and the `bestvideo+bestaudio` option along with the merge options will download them separately and then combine them into an MKV)
[15:23] (with a fallback to `best` if there are no separate streams available)
[15:25] *** Honno has joined #archiveteam
[15:28] *** Medowar has quit IRC (Quit: Connection closed for inactivity)
[16:16] *** RichardG has quit IRC (Read error: Operation timed out)
[16:33] joepie91: Newer versions have switched to bestvideo+bestaudio/best by default
[16:33] But yes, those are all good options
[16:35] *** Atom__ has joined #archiveteam
[16:37] sweet, thanks!
[16:47] *** pfallenop has quit IRC (Ping timeout: 260 seconds)
[17:02] *** Medowar has joined #archiveteam
[17:27] thanks for the command line strings.
[17:27] I was just thinking it might be valuable info worth archiving
[17:28] or slightly valuable :)
[17:33] MrRadar: oh, have they?
[17:33] have a link? :)
[17:42] *** goekesmi has quit IRC (Read error: Connection reset by peer)
[17:48] *** Stiletto has quit IRC (Read error: Operation timed out)
[17:54] *** JesseW has joined #archiveteam
[18:02] *** brayden_ has joined #archiveteam
[18:02] *** swebb sets mode: +o brayden_
[18:02] For a dark archive of youtube, see ivan` (https://docs.google.com/forms/d/1_kkpBe6abFQ5sznrMfWHhP7ZhdktKejJEpvCCcqVues )
[18:08] *** brayden has quit IRC (Read error: Operation timed out)
[18:09] everyone: please try the GameFront grab
[18:09] GameFront has banned a lot of countries from access, so if your IP has access, please run the grab.
[18:09] * JesseW spins up my warrior again
[18:10] *** goekesmi has joined #archiveteam
[18:11] gamefront running
[18:11] you're not banned?
[18:11] concurrent = 3?
[18:11] I haven't been running the warrior for a while, that's probably why
[18:11] any concurrent is fine.
[18:11] They banned a lot of countries, but I've not seen them ban an individual IP yet
[18:12] what would I see if I was banned?
[18:12] The grab has a check for that
[18:12] If you're banned, you'll see it in the output
[18:13] I saw three wget failures -- but nothing explicitly saying I was banned
[18:13] starting 20 concurrent, will let you know if I'm banned
[18:14] I may be banned -- I'm getting all wget failures
[18:14] *** atomotic has joined #archiveteam
[18:14] "The link to the download did not return status code 200. Are you banned? (This error might also be due to a problem on GameFront's side.)"
[18:15] all I'm seeing is different locales of facebook being downloaded
[18:15] Kazzy: yeah, that's fine
[18:16] JesseW: But it did get something, right?
[18:16] I haven't seen any jobs make it past the WgetDownload step, no.
[18:16] did you see it download a URL?
[18:17] 16=200 http://track1.breakmedia.com/track.jpg?ref=http%3A%2F%2Fwww.gamefront.com%2Ffiles%2F17245152%2FTheLair2minFade_mp3&metaFileId=17245152&fileId=17245152.
[18:17] 17=403 http://media1.gamefront.com/moddb/2009/12/05/TheLair2minFade.mp3?b17f4b620c6cf1393ffa644d1ceea1519471f50243241c9c351f544aefaeb617054856f45e07ae230795c14b30a53906a278cc670925e173f730b5fc39bd208db3b8eceddfd57da070f6effd84bb875ea8231bc73b452f7f18194cff069227ed82102d59138456617a72401b669fc54204dc.
[18:17] The link to the download did not return status code 200. Are you banned? (This error might also be due to a problem on GameFront's side.)
[18:17] Process WgetDownload returned exit code -6 for Item file:1724515
[18:17] Failed WgetDownload for Item file:1724515
[18:17] Waiting 10 seconds..
[18:17] that's fine
[18:18] ok good
[18:18] these items are requeued and problematic, so you'll get a lot of that
[18:18] but new items will be up soon
[18:18] as long as it's not known to be useless, I'm happy to run it
[18:19] and the tracker shows me having uploaded file:2539515
[18:19] (with only 0.1 MB, but at least it completed)
[18:23] the current project tab stays empty, the warrior is idle
[18:25] Meroje: try restarting the virtual machine?
[18:25] did that already
[18:26] and this happens even when you select other projects?
[18:26] it works for the team's choice
[18:28] Hm.
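The ban check described at [18:12] and visible in the pasted output at [18:17] can be illustrated with a small sketch. This is not the grab's actual code: the function names are invented, and the rule that file downloads come from `media*.gamefront.com` hosts is an assumption drawn only from the 403 line in the log. The URLs in the usage example are shortened versions of the pasted ones.

```python
from urllib.parse import urlparse


def is_download_url(url):
    """Assumption: file downloads are served from media*.gamefront.com hosts."""
    host = urlparse(url).hostname or ""
    return host.startswith("media") and host.endswith(".gamefront.com")


def check_item(responses):
    """Given (status_code, url) pairs for one item, return (ok, message).

    Only the download link has to return 200; other requests are ignored.
    """
    for status, url in responses:
        if is_download_url(url) and status != 200:
            return False, (
                f"download link returned {status}: possibly banned, "
                "or a problem on the site's side"
            )
    return True, "ok"


if __name__ == "__main__":
    log = [
        (200, "http://track1.breakmedia.com/track.jpg"),
        (403, "http://media1.gamefront.com/moddb/2009/12/05/TheLair2minFade.mp3"),
    ]
    print(check_item(log))
```

As the channel discussion notes, a failed check cannot by itself tell a ban apart from a server-side problem, which is why the message stays hedged.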
[18:29] ~/projetcs/gamefront seems good
[18:33] trying to run the pipeline showed requests was not installed
[18:33] sadly I'm banned
[18:34] I'm not seeing my ip in the list shown though
[18:34] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[18:37] yay, another one actually went through!
[18:42] browsing the checked file gets me a 403, I guess that ip list got copy pasted by error
[18:46] *** pfallenop has joined #archiveteam
[18:51] *** hook54321 has joined #archiveteam
[18:55] *** scyther has joined #archiveteam
[18:55] *** bwn has quit IRC (Read error: Operation timed out)
[19:05] *** JesseW has left
[19:15] *** bwn has joined #archiveteam
[19:16] *** JesseW has joined #archiveteam
[19:37] joepie91: It's mentioned in the readme https://github.com/rg3/youtube-dl/blob/master/README.md
[19:37] Search for 2015.04.26
[19:42] *** nwf has quit IRC (Read error: Operation timed out)
[19:43] *** Yoshimura has quit IRC (http://www.kiwiirc.com/ - A hand crafted IRC client)
[19:44] *** Yoshimura has joined #archiveteam
[19:44] *** nwf has joined #archiveteam
[19:51] *** Dark0ne_ has joined #archiveteam
[20:01] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[20:07] *** anjacks0n has joined #archiveteam
[20:12] i swapped to gamefront
[20:12] set it to 3
[20:13] How is the Game Front back up going?
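For scripted use, the youtube-dl options from the alias posted at [15:21] can be written as an argument list, which sidesteps shell quoting. The sketch below only builds the command; the function name is made up, the flags are copied from the alias as posted, and whether every flag is still needed depends on the youtube-dl version (per [16:33], newer versions already default to `bestvideo+bestaudio/best`).

```python
import shlex


def ytdl_command(urls):
    """Build a youtube-dl argument list mirroring the alias from the log."""
    return [
        "youtube-dl",
        "--title", "--continue", "--retries", "4",
        "--write-info-json", "--write-description",
        "--write-thumbnail", "--write-annotations",
        "--all-subs", "--ignore-errors",
        "--merge-output-format", "mkv",
        # Separate best video and audio streams, falling back to `best`.
        "-f", "bestvideo+bestaudio/best",
        *urls,
    ]


if __name__ == "__main__":
    cmd = ytdl_command(["https://www.youtube.com/user/bosnianbill/videos"])
    print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd)`; as noted at [15:22], merging the separate streams requires ffmpeg or libav to be installed.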
[20:14] *** RichardG has joined #archiveteam
[20:22] Very well
[20:22] http://tracker.archiveteam.org/gamefront/
[20:22] 28.5 TB saved already
[20:25] *** anjacks0n has quit IRC (anjacks0n)
[20:53] *** ariscop has quit IRC (Ping timeout: 506 seconds)
[20:55] *** RichardG has quit IRC (Ping timeout: 250 seconds)
[20:58] *** scyther has quit IRC (Quit: Leaving)
[20:59] *** Tomcat_ has quit IRC (Remote host closed the connection)
[21:00] *** anjacks0n has joined #archiveteam
[21:03] *** metalcamp has quit IRC (Ping timeout: 244 seconds)
[21:04] *** Honno has quit IRC (Read error: Operation timed out)
[21:13] *** anjacks0n has quit IRC (anjacks0n)
[21:15] *** Coderjoe has quit IRC (Read error: Operation timed out)
[21:18] *** RichardG has joined #archiveteam
[21:24] *** kris33 has joined #archiveteam
[21:36] *** Coderjoe has joined #archiveteam
[21:38] *** JesseW has joined #archiveteam
[21:44] *** Dark0ne_ has quit IRC (Ping timeout: 268 seconds)
[21:45] *** WinterFox has joined #archiveteam
[21:58] *** zino has quit IRC (Read error: Connection reset by peer)
[22:07] *** JesseW has quit IRC (Ping timeout: 370 seconds)
[22:12] *** DopefishJ has joined #archiveteam
[22:12] *** swebb sets mode: +o DopefishJ
[22:17] *** DFJustin has quit IRC (Read error: Operation timed out)
[22:17] *** anjacks0n has joined #archiveteam
[22:19] *** zino has joined #archiveteam
[22:21] *** kris33_ has joined #archiveteam
[22:21] *** Ymgve__ has joined #archiveteam
[22:24] *** anjacks0n has quit IRC (anjacks0n)
[22:28] *** kris33 has quit IRC (Ping timeout: 506 seconds)
[22:28] *** Ymgve has quit IRC (Ping timeout: 506 seconds)
[22:45] *** kris33_ has quit IRC (Textual IRC Client: www.textualapp.com)
[22:58] *** JesseW has joined #archiveteam
[23:03] *** BlueMaxim has joined #archiveteam
[23:08] *** Medowar has quit IRC (Quit: Connection closed for inactivity)
[23:24] *** mismatch_ has quit IRC (Ping timeout: 250 seconds)
[23:24] *** mismatch has joined #archiveteam
[23:40] *** hook54321 has quit IRC (Quit: Connection closed for inactivity)