[00:12] hey so gitorious.org is going away end of may. https://about.gitlab.com/2015/03/03/gitlab-acquires-gitorious/
[00:12] they have a migration path, but it's manual.
[00:12] *** brayden has joined #archiveteam
[00:12] (oddly, I have an email thread in progress with their CEO over a different matter)
[00:12] anyway, this is probably worth a project..
[00:14] closure: look up at the conversation with rolfb, they work there
[00:14] :)
[00:26] iirc, gitorious also has bug tracking and merge requests, which is data not covered by a git repository backup
[00:31] *** dashcloud has quit IRC (Read error: Operation timed out)
[00:37] *** dashcloud has joined #archiveteam
[00:41] *** cbb2 has joined #archiveteam
[00:43] *** cbb has quit IRC (Ping timeout: 240 seconds)
[00:47] *** toad has joined #archiveteam
[00:57] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:04] *** dashcloud has joined #archiveteam
[01:05] *** abartov has quit IRC (Remote host closed the connection)
[01:06] *** db48x` has joined #archiveteam
[01:08] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:13] What where what
[01:14] *** dashcloud has joined #archiveteam
[01:15] *** blha303 has joined #archiveteam
[01:36] *** Ymgve has quit IRC ()
[01:38] cancel that, no bug tracker
[01:46] *** chfoo has quit IRC (Read error: Connection reset by peer)
[01:53] *** chfoo has joined #archiveteam
[01:56] *** mistym has quit IRC (Remote host closed the connection)
[02:07] *** toad has quit IRC (Leaving.)
[02:09] *** mistym has joined #archiveteam
[02:09] *** dashcloud has quit IRC (Read error: Operation timed out)
[02:11] *** dashcloud has joined #archiveteam
[02:29] *** cbb2 has quit IRC (Quit: cbb2)
[03:07] *** primus104 has quit IRC (Leaving.)
[03:37] Ctrl-S: If someone were to be able to store that art archive, is there a place it could go eventually? I have the free space (or could without all too much trouble), but that's most of my free space. :)
[03:38] *** twrist has joined #archiveteam
[03:38] I would also be concerned about the feasibility of pulling ~10 TB out of Tor in any sort of short order.
[03:43] you'd need to parallelize over some huge number of tor instances to get enough bw
[03:45] *** twrist has quit IRC (Ping timeout: 370 seconds)
[03:52] I'd provide the space myself, but i'm australian and thus capped
[03:53] It'd take me a year to download that much if i dropped all my other internet stuff
[03:54] :(
[03:54] I really don't know where it'd end up
[03:55] I guess I could mail you hard drives. "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway," indeed.
[03:55] What country are you in?
[03:55] US.
[03:55] I have a friend there who might be able to accept them
[03:55] All of that depends on even getting the data in the first place.
[03:56] yeah
[03:56] that is a problem
[03:57] My bandwidth is probably much greater than what I'd expect to be able to pull out of Tor, which I expect would be the hangup.
[03:57] aschmitz, that onion site works surprisingly well for me.
[03:57] I admittedly haven't tried loading it.
[04:01] It's respectable, at least.
[04:02] *** mistym has quit IRC (Remote host closed the connection)
[04:03] No directory index on /fa is annoying, though: can't tell when subfolders have been updated
[04:04] indeed
[04:09] Ctrl-S: how much space do you need?
[04:09] I can get a few TB
[04:09] I think the estimate was ~10 TB.
[04:10] aschmitz, looking back it seems to be an extremely rough estimate
[04:14] Woah woah. the onion site is KIA
[04:14] ?
[04:14] I can't connect to it anymore....
[04:15] Currently doing a test download from it, seems to work.
[04:15] Oh wait. It's working now. :/ anyways /indexes might be of interest.
[04:15] Yeah.
[04:16] So I get 200 KB/s if I'm lucky.
[04:16] Saw it hit 230 once.
[04:17] Multiple connections might get a bit more, but not likely to be 100x better.
[04:17] Surely someone knows the owner of this beast...
[04:17] you just using wget to grab it?
[04:18] Ctrl-S: Yeah, at the moment. Not planning to continue past this one folder, though.
[04:18] okay
[04:18] What site was this? Probably worth moving off into a channel.
[04:19] Yeah if it is 10 TB we can expect it to take 12 years to finish
[04:20] The usernames I'm looking up are all coming back as likely to be DA?
[04:21] some might also be on DA
[04:21] site homepage: http://www.furaffinity.net/
[04:22] Anyone got a channel name they like?
[04:24] just reuse #iceking
[04:24] iceking?
[04:25] Okay.
[04:30] *** mistym has joined #archiveteam
[04:37] *** twrist has joined #archiveteam
[06:09] *** underscor has quit IRC (No Ping reply in 180 seconds.)
[06:10] *** underscor has joined #archiveteam
[06:10] *** swebb sets mode: +o underscor
[06:22] *** the_fox has joined #archiveteam
[06:39] *** db48x` has quit IRC (Read error: Operation timed out)
[06:41] *** sep332 has quit IRC (Read error: Operation timed out)
[07:07] *** Nertsy has quit IRC (Quit: Nertsy)
[07:07] *** Nertsy has joined #archiveteam
[07:54] *** mistym has quit IRC (Remote host closed the connection)
[08:09] *** sep332 has joined #archiveteam
[08:19] *** dashcloud has quit IRC (Read error: Operation timed out)
[08:25] *** dashcloud has joined #archiveteam
[08:51] *** signius has quit IRC (Read error: Operation timed out)
[08:57] *** rolfb has joined #archiveteam
[09:04] *** signius has joined #archiveteam
[09:09] *** primus104 has joined #archiveteam
[09:43] *** MMovie has joined #archiveteam
[09:46] *** MMovie1 has quit IRC (Ping timeout: 306 seconds)
[10:34] *** rolfb has quit IRC (Leaving...)
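The transfer-time figures traded above (roughly 10 TB, about 200 KB/s out of Tor, "12 years to finish") can be sanity-checked with a little arithmetic. What follows is only a rough sketch, assuming decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes) and a perfectly steady rate, neither of which the log states. The function names are illustrative and not from any tool mentioned in the channel.

```python
# Back-of-the-envelope transfer time for a bulk download at a fixed rate.
# Assumptions (not from the log): decimal units and a perfectly steady rate.

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def transfer_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move size_bytes at a constant rate_bytes_per_s."""
    if rate_bytes_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY


def required_rate(size_bytes: float, years: float) -> float:
    """Constant rate in bytes/s needed to finish within the given years."""
    if years <= 0:
        raise ValueError("years must be positive")
    return size_bytes / (years * DAYS_PER_YEAR * SECONDS_PER_DAY)


if __name__ == "__main__":
    size = 10e12   # the ~10 TB estimate from the channel
    rate = 200e3   # the ~200 KB/s observed on the test download
    days = transfer_days(size, rate)
    print(f"{days:.0f} days (~{days / DAYS_PER_YEAR:.1f} years) at 200 KB/s")
    print(f"{required_rate(size, 12) / 1e3:.0f} KB/s would take 12 years")
```

Under these assumptions a single 200 KB/s connection needs about 1.6 years for 10 TB, so the 12-year figure would correspond to a much lower sustained rate of roughly 26 KB/s, or may simply have been said loosely.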
[10:39] *** dcmorton has quit IRC (Quit: ZNC - http://znc.in)
[10:55] *** dcmorton has joined #archiveteam
[10:58] *** www2 has joined #archiveteam
[10:59] hi i found out that gitorious will close on the 1st of June http://thenextweb.com/insider/2015/03/03/gitlab-acquires-rival-gitorious-will-shut-june-1/
[11:04] *** Ymgve has joined #archiveteam
[11:07] www2: yeah, we are talking to them
[11:11] *** rolfb has joined #archiveteam
[11:17] Smiley: no problem and i know there are some repos that have more than 10 GB of data e.g. the repos from flightgear (fgdata)
[11:17] *** www2 is now known as away
[11:17] *** away is now known as www2-away
[11:18] *** deathy has quit IRC (Read error: Connection reset by peer)
[11:18] *** VonGuard has quit IRC (Ping timeout: 260 seconds)
[11:18] *** deathy has joined #archiveteam
[11:18] *** VonGuard has joined #archiveteam
[11:19] *** danneh_ has quit IRC (Ping timeout: 260 seconds)
[11:22] *** schbirid has joined #archiveteam
[11:22] *** danneh_ has joined #archiveteam
[11:26] DFJustin: i looked at the log page you linked yesterday, but the logs seem to be old?
[11:26] http://badcheese.com/~steve/atlogs/?chan=archiveteam
[11:34] www2-away: yeah, 10 TB total.
[11:34] Smiley: are you talking gitorious?
[11:39] *** Jonimus has quit IRC (Ping timeout: 370 seconds)
[11:43] *** Jonimus has joined #archiveteam
[11:48] nod?
[11:51] Smiley: it's more like 4.5 TB total
[11:51] at least according to `df -h`
[11:51] :-)
[11:51] \o/
[11:51] we've halved it overnight :D
[11:52] not sure where the 10 TB came from though
[11:52] even saw 12 TB mentioned earlier
[11:52] *** BlueMaxim has quit IRC (Leaving)
[12:00] [19:57:49] < yipdw> in any case a 10 TB job is really just a dick move at present time
[12:00] oh seems there was some mix up with a theoretical discussion
[12:01] Smiley, wait what?
[12:01] what was this discussion about?
[12:01] as before I have 2 TB storage on a semi slow machine
[12:01] rolfb: jobs into archivebot, don't worry about it
[12:02] sure? i'd like to be aware if i'm making any dick moves without knowing why
[12:02] *** dashcloud has quit IRC (Read error: Operation timed out)
[12:04] gitorious is experiencing some huge loads atm since the news hit slashdot
[12:04] rolfb: someone like said "why not just archivebot it".
[12:04] ah, i see
[12:04] That's our answer to any smaller sites
[12:04] As it works really nicely ;D
[12:05] nice
[12:05] There's no automatic validation of jobs tho, other than not doing the same site over and over
[12:05] Smiley: unrelated, but do you know anything about cleaning spam users?
[12:05] rolfb: we keep users as they are, spam and all
[12:05] ok
[12:05] i would prefer to clean the db a bit soon though
[12:05] if possible, we prioritize real users, but we want everything as it was.
[12:05] but i don't have tooling to identify
[12:06] So, what are you to gitorious, owner?
[12:07] Smiley: i'm lots of things. CEO of Gitorious AS, Chairman of the board of Powow which owns Gitorious company.
[12:12] ok
[12:13] best person to speak to for direct inclusion into IA is SketchCow, he's the only 'official' person here
[12:13] Smiley: thanks, I emailed him yesterday
[12:14] k cool
[13:25] *** primus104 has quit IRC (Leaving.)
[13:44] *** sankin has joined #archiveteam
[14:04] *** rolfb has quit IRC (Leaving...)
[14:22] Yeah, here I am.
[14:22] Send us a drive.
[14:22] *** trs80 has quit IRC (Ping timeout: 186 seconds)
[14:22] Jason Scott, c/o Internet Archive, 300 Funston Avenue, San Francisco, CA 94118
[14:36] \o/
[14:36] SEND ALL THE THINGS.
[14:51] *** mutoso has joined #archiveteam
[14:57] *** cbb has joined #archiveteam
[15:19] *** marvinw has quit IRC (Ping timeout: 600 seconds)
[15:27] *** Start has quit IRC (Disconnected.)
[15:31] *** primus104 has joined #archiveteam
[15:31] *** mistym has joined #archiveteam
[15:37] *** marvinw has joined #archiveteam
[15:43] *** mistym has quit IRC (Remote host closed the connection)
[15:45] *** Nemo_bis has quit IRC (Ping timeout: 240 seconds)
[15:51] SketchCow: awkwardly, rolfb left irc a few minutes before you arrived
[15:53] *** Nemo_bis has joined #archiveteam
[16:02] *** Start has joined #archiveteam
[16:08] *** kyan_ has quit IRC (Ping timeout: 258 seconds)
[16:19] *** mistym has joined #archiveteam
[16:43] *** primus104 has quit IRC (Leaving.)
[16:51] *** Start has quit IRC (Disconnected.)
[16:52] for those out of the loop: http://archive.fart.website/bin/irclogger_log/archiveteam?date=2015-03-03,Tue
[16:58] *** Start has joined #archiveteam
[17:00] *** mistym has quit IRC (Remote host closed the connection)
[17:20] still no plans for the Tor site then?
[17:22] #iceking
[17:35] Awwww
[17:35] I'll mail him after the GDC thing happens.
[17:35] That address is public anyway, send me your shit
[17:36] *** sydbarret has joined #archiveteam
[17:37] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[17:37] yahoosucks
[17:37] thanks :)
[17:37] welcome, sydbarret
[17:37] care to introduce yourself?
[17:41] sure, i have been running the warrior vm for a while and just registered on the wiki
[17:42] sweet :)
[17:45] *** Start has quit IRC (Disconnected.)
[17:49] *** primus104 has joined #archiveteam
[17:50] *** kyan has joined #archiveteam
[17:51] how long should the export to archive.org last?
[17:52] hmm?
[18:02] oh it's done..the warrior is getting stuff to pull down again
[18:02] i was getting errors before saying that there was no work to be done because an export was in progress
[18:03] *** Start has joined #archiveteam
[18:06] off to lunch...bbl :)
[18:10] *** sydbarret has quit IRC (Ping timeout: 260 seconds)
[18:11] *** mistym has joined #archiveteam
[18:14] Could someone add nickreboot.com to archivebot? It's going down today
[18:38] *** Start has quit IRC (Disconnected.)
[19:33] *** ash_ has joined #archiveteam
[19:34] hey, does anyone know anything about the old dailybooth archives and how to download them?
[19:41] *** cbb has quit IRC (Quit: cbb)
[19:45] *** Start has joined #archiveteam
[19:46] *** Start has quit IRC (Read error: Connection reset by peer)
[19:47] *** Start has joined #archiveteam
[19:47] *** Start has quit IRC (Remote host closed the connection)
[19:47] *** Start has joined #archiveteam
[19:54] *** ash_ has quit IRC (Quit: Page closed)
[20:19] *** mistym has quit IRC (Remote host closed the connection)
[20:20] *** sydbarret has joined #archiveteam
[20:25] *** Start_ has joined #archiveteam
[20:25] *** Start has quit IRC (Read error: Connection reset by peer)
[20:26] *** Start_ has quit IRC (Client Quit)
[20:26] *** Start has joined #archiveteam
[20:26] *** Start has quit IRC (Remote host closed the connection)
[20:26] *** Start has joined #archiveteam
[20:26] *** Start has quit IRC (Client Quit)
[20:32] *** mistym has joined #archiveteam
[20:32] *** www2-away has quit IRC (Remote host closed the connection)
[20:34] *** www2 has joined #archiveteam
[20:35] *** Start has joined #archiveteam
[20:36] EA murdered Maxis: https://archive.today/KVEDu
[20:38] \a@MaxisGuillaume looks like a bug
[20:45] *** BlueMaxim has joined #archiveteam
[20:50] "murdered"
[21:04] fuck
[21:04] some site started serving 500 and 404 after 3 days of wgetting :(
[21:05] "how to ruin an archive #1"
[21:12] *** mistym has quit IRC (Remote host closed the connection)
[21:13] http://www.wfaa.com/story/news/local/texas-news/2015/03/03/txcn-to-sign-off-on-april-1/24341769/
[21:13] paging godane
[21:14] txcn's official website is http://www.txcn.com, they're shutting down on april 1st
[21:14] *** schbirid has quit IRC (Leaving)
[21:18] no they're not
[21:18] it's April 1st
[21:18] nobody would be stupid enough to do that
[21:19] http://searchengineland.com/yahoo-clues-app-search-other-products-to-shut-down-april-1-150320
[21:19] except yahoo
[21:21] And on April 1st, those pages will be rickrolls because Yahoo.
[21:21] "Haha, April Fools! What? oh no, we really did shut down lol"
[21:22] *** edward_ has joined #archiveteam
[21:22] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[21:22] edward_: yahoosucks
[21:23] *** Start has quit IRC (Disconnected.)
[21:23] edward_: greetings! care to introduce yourself?
[21:23] xmc: thanks!
[21:23] * edward_ used to work at the internet archive
[21:23] ah, cool
[21:23] on the open library project with aaronsw
[21:24] * edward_ is interested in git-annex
[21:24] :3
[21:24] git-annex is good stuff.
[21:25] shame i don't speak haskell
[21:26] * edward_ made his own debian package of the latest version of git annex: http://edwardbetts.com/git-annex/
[21:27] *** mistym has joined #archiveteam
[21:40] *** Nertsy has quit IRC (Ping timeout: 512 seconds)
[21:41] *** Nertsy has joined #archiveteam
[21:53] *** sankin has quit IRC (Leaving.)
[22:43] *** sydbarret has quit IRC (Read error: Operation timed out)
[22:51] *** cbb has joined #archiveteam
[22:58] *** garyrh_ has quit IRC (Quit: Leaving)
[23:06] I'm getting a list of websites from the owner of trovebox.
[23:06] We have 10 days left, we'll start it asap
[23:06] We might need a lot of resources to get everything in time
[23:07] * arkiver is off to bed
[23:15] *** dashcloud has joined #archiveteam
[23:15] *** X-Scale has joined #archiveteam
[23:33] *** dashcloud has quit IRC (Read error: Operation timed out)
[23:36] *** Ravenloft has joined #archiveteam
[23:37] *** dashcloud has joined #archiveteam
[23:43] *** SN4T14_ has joined #archiveteam
[23:46] *** SN4T14 has quit IRC (Ping timeout: 306 seconds)
[23:49] *** kyan has quit IRC (Ping timeout: 258 seconds)
[23:51] would work equally well for Yahoo https://twitter.com/spamda/status/573262135746150400
[23:56] *** Start has joined #archiveteam
[23:58] !con 508j99b922l62aidl56s7tidv 3
[23:59] Start: wrong channel
[23:59] oh, whoops
[23:59] Start: Set 508j99b922l62aidl56s7tidv to 3000 workers.