[00:00] what's good everyone [00:00] https://twitter.com/CityPolicePIPCU/status/841695126665736194 [00:00] :| [00:00] Fried Tofu [00:00] Surprised me, too [00:01] https://twitter.com/CityPolicePIPCU/status/841699007973928960 [00:01] Since WHEN were torrents illegal [00:01] e.e [00:02] surprisingly not covered on torrentfreak yet [00:03] I have a 2gb file sitting here that's been here for months, because I kind of forgot what it is. [00:07] It's..... 2gb of World of Warcraft Armory XML [00:07] Why do I have this. [00:09] because why not [00:09] Yeah but I don't even know how I got to getting it now. [00:17] Anyway, it's flung on the Archive and I can keep going with my cleanups [00:17] I learned a lot importing the Frostbyte, mostly how I need to be more careful in the future [00:33] https://twitter.com/maddow/status/841795163664089089 [00:33] "BREAKING: We've got Trump tax returns. Tonight, 9pm ET. MSNBC. (Seriously)." [00:37] Yeah [00:37] It's blowing up [00:37] This is a good time for everyone not in the US to know about Bartnicki v. 
Vopper [00:38] Which says, basically, fuck it, we got it, we're the press, suck a dick [00:46] *** passerby_ has joined #archiveteam-bs [00:48] I love the (Seriously) [00:48] inb4 his tax returns are empty spreadsheets [00:49] *** GE has quit IRC (Remote host closed the connection) [00:52] *** passerby has quit IRC (Read error: Operation timed out) [00:57] *** odemg has joined #archiveteam-bs [01:09] *** Pudsey has joined #archiveteam-bs [01:21] *** Pudsey has quit IRC (Remote host closed the connection) [01:21] *** passerby has joined #archiveteam-bs [01:24] *** passerby_ has quit IRC (Read error: Operation timed out) [01:24] *** RichardG has quit IRC (Read error: Operation timed out) [01:28] *** RichardG has joined #archiveteam-bs [01:29] *** passerby_ has joined #archiveteam-bs [01:32] *** passerby_ has quit IRC (Client Quit) [01:32] *** passerby has quit IRC (Ping timeout: 492 seconds) [01:32] *** passerby has joined #archiveteam-bs [01:53] *** pnJay has quit IRC (Leaving) [01:53] *** j08nY has quit IRC (Quit: Leaving) [02:16] *** Pudsey has joined #archiveteam-bs [02:18] *** Pudsey has quit IRC (Remote host closed the connection) [02:20] Trump tax returns... oh lord, Reddit is going to be an even more insufferable place than usual. [02:34] *** BlueMaxim has joined #archiveteam-bs [02:39] *** winr4r has quit IRC (Read error: Operation timed out) [02:45] *** ndiddy has quit IRC () [03:00] *** winr4r has joined #archiveteam-bs [03:04] *** Coderjo has quit IRC (Ping timeout: 260 seconds) [03:24] *** Coderjo has joined #archiveteam-bs [03:38] *** Coderjo has quit IRC (Remote host closed the connection) [03:45] *** RichardG has quit IRC (Read error: Operation timed out) [03:45] *** RichardG has joined #archiveteam-bs [03:46] Sorry for the flaky connection a few days ago. And thanks Lord_Nigh for speaking up for me. 
(now sent to right channel) [03:57] *** pizzaiolo has left [05:05] *** RichardG has quit IRC (Read error: Operation timed out) [05:05] *** RichardG has joined #archiveteam-bs [05:25] Is it about time to do another census? [05:28] *** BlueMaxim has quit IRC (Read error: Operation timed out) [05:30] *** Muad-Dib has quit IRC (Ping timeout: 260 seconds) [05:36] HCross2: yes please! [05:36] I'll take a look over the weekend [05:36] If you're willing to step up, I'd be delighted to troubleshoot any problems you run into. [05:37] There are scripts on the wiki [05:37] Somebody2: what timezone are you in? [05:37] Pacific Time (same as IA). [05:38] But I'm up at weird hours. [05:38] and unavailable during usual work hours. [05:38] And yes, the scripts on the wiki are where I'd suggest you start. [05:39] I'll do some reading over it all. I'm in GMT so we may get the odd timezone issue. I've already briefly read the scripts, it looks simple-ish [05:41] Yep, mostly what had me stuck was uncertainty about what kinds/parts of the freely-downloadable data IA would prefer not to be aggregated and published. 
[05:51] *** icedice has joined #archiveteam-bs [05:54] *** Sk1d has quit IRC (Ping timeout: 250 seconds) [05:57] *** Muad-Dib has joined #archiveteam-bs [06:01] *** Sk1d has joined #archiveteam-bs [06:06] HCross2: i can help as much as i can as well [06:11] *** icedice has quit IRC (Quit: Leaving) [06:32] *** RichardG has quit IRC (Read error: Operation timed out) [06:32] *** RichardG has joined #archiveteam-bs [06:37] *** Yurume has quit IRC (Read error: Operation timed out) [06:39] *** Yurume has joined #archiveteam-bs [07:00] *** BlueMaxim has joined #archiveteam-bs [07:00] *** GE has joined #archiveteam-bs [07:13] *** Honno has joined #archiveteam-bs [07:29] *** masterX24 has joined #archiveteam-bs [07:29] back now after sleeping, still needing guidance on the upload of the crawl [07:43] paging xmc and SketchCow [07:47] looks like the buck sexton show ended on theblaze [07:52] *** Coderjo has joined #archiveteam-bs [07:54] *** odemg has quit IRC (Remote host closed the connection) [07:55] i'm grabbing it by hourly mp3s cause its easier to grab [07:56] also cause jan 27 and 31 was only the first 2 hours [07:57] *** kristian_ has joined #archiveteam-bs [08:25] *** RichardG has quit IRC (Read error: Operation timed out) [08:25] *** RichardG has joined #archiveteam-bs [08:28] *** antomatic has joined #archiveteam-bs [08:28] *** swebb sets mode: +o antomatic [08:31] *** antomati_ has quit IRC (Ping timeout: 244 seconds) [08:37] *** schbirid has joined #archiveteam-bs [08:40] *** Riviera has joined #archiveteam-bs [08:53] Whut [08:53] I'd say hold off for a bit, please [08:54] (Upload of crawl) [08:56] *** Honno has quit IRC (Ping timeout: 370 seconds) [08:56] *** Jonison has joined #archiveteam-bs [09:12] crawl is currently taking 10% of my webserver's disk capacity, (downloading to home computer and then uploading later isnt a viable option due to very slow uplink at home) [09:16] and holding off how long? @ sketchcow ? 
[09:55] *** bwn has quit IRC (Read error: Operation timed out) [10:00] *** JAA has quit IRC (Quit: Page closed) [10:01] *** JAA has joined #archiveteam-bs [10:02] I wish I could mirror the entire Canopus/GVG software download space. I hate that it is all hidden away behind a registration system. Even for ancient products like my Canopus ADVC-300 [10:10] *** j08nY has joined #archiveteam-bs [10:13] Urgent: Need a tracker admin to requeue app.net jobs. [10:17] *** bwn has joined #archiveteam-bs [10:32] *** pnJay has joined #archiveteam-bs [10:52] *** j08nY has quit IRC (Read error: Operation timed out) [10:55] *** GE has quit IRC (Remote host closed the connection) [11:13] *** pizzaiolo has joined #archiveteam-bs [11:15] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [11:17] *** j08nY has joined #archiveteam-bs [11:27] *** BartoCH has joined #archiveteam-bs [11:49] *** kristian_ has quit IRC (Quit: Leaving) [11:54] *** RichardG has quit IRC (Read error: Operation timed out) [11:54] *** RichardG has joined #archiveteam-bs [11:58] *** j08nY has quit IRC (Read error: Operation timed out) [12:07] *** Honno has joined #archiveteam-bs [12:33] *** j08nY has joined #archiveteam-bs [12:57] *** RichardG has quit IRC (Read error: Operation timed out) [12:58] *** RichardG has joined #archiveteam-bs [13:02] *** BlueMaxim has quit IRC (Read error: Operation timed out) [13:20] *** RichardG has quit IRC (Read error: Operation timed out) [13:20] *** RichardG has joined #archiveteam-bs [13:28] *** Honno_ has joined #archiveteam-bs [13:32] *** Honno has quit IRC (Ping timeout: 370 seconds) [13:53] *** GE has joined #archiveteam-bs [13:55] *** RichardG has quit IRC (Read error: Operation timed out) [13:55] *** RichardG has joined #archiveteam-bs [14:22] *** RichardG has quit IRC (Read error: Operation timed out) [14:22] *** RichardG has joined #archiveteam-bs [14:31] *** kyounko|2 has quit IRC (Read error: Connection reset by peer) [14:59] *** RichardG has quit IRC (Read error: 
Operation timed out) [14:59] *** RichardG has joined #archiveteam-bs [15:25] *** RichardG has quit IRC (Read error: Operation timed out) [15:25] *** RichardG has joined #archiveteam-bs [15:29] *** sep332 has joined #archiveteam-bs [15:32] *** sep332_ has quit IRC (Ping timeout: 260 seconds) [15:32] *** antomatic has quit IRC (Read error: Connection reset by peer) [15:33] *** antomatic has joined #archiveteam-bs [15:33] *** swebb sets mode: +o antomatic [15:53] *** masterX24 has quit IRC (Ping timeout: 268 seconds) [16:42] *** j08nY has quit IRC (Remote host closed the connection) [16:47] *** Stiletto has quit IRC () [17:18] *** odemg has joined #archiveteam-bs [17:22] *** icedice has joined #archiveteam-bs [17:32] *** Stilett0 has joined #archiveteam-bs [17:46] so i'm uploading koreanet-1 changwon world [17:46] it's kids singing basically [17:48] first one i could find: https://archive.org/details/koreanet-1_changwon_world-20010801 [17:54] *** me is now known as yipdw [18:10] *** Roelandus has joined #archiveteam-bs [18:10] *** SmileyG has joined #archiveteam-bs [18:11] *** SmileyG has quit IRC (Client Quit) [18:25] *** ndiddy has joined #archiveteam-bs [18:28] I have a (potentially incomplete) mobileme item that does not appear to be in the IA collection. (itemname is "inclusive.solutions") [18:29] I found it while trying to clean up disk space on a system I am nearing quota on [18:29] what should I do with it? [18:41] Could you explain "mobileme item"? [18:41] http://archiveteam.org/index.php?title=MobileMe [18:41] from that grab effort [18:42] aaaages ago [18:44] indeed. I've been gone for a while, mainly due to personal stuff. [18:48] can i make wpull strip things from URLs when recursing? [18:48] eg instead of grabbing a gazillion copies of http://media-cdn.sueddeutsche.de/image/sz.1.440537/135x101?v=1357596425000 with varying timestamps [18:48] grab http://media-cdn.sueddeutsche.de/image/sz.1.440537/135x101 once? 
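(Editor's sketch of the URL normalization being asked about: drop a cache-busting query parameter so that URLs differing only in their timestamp collapse to one. This is plain Python using only the standard library; the function name `strip_query_params` and the `seen` set are hypothetical, and how such a function would be wired into wpull's own hooks is not shown here and is left as an assumption.)

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_query_params(url, drop=("v",)):
    """Return url with the named query parameters removed.

    `drop` defaults to the `v` timestamp parameter from the example URL;
    any other query parameters are kept in their original order.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical de-duplication: fetch each normalized URL only once.
seen = set()
for candidate in (
    "http://media-cdn.sueddeutsche.de/image/sz.1.440537/135x101?v=1357596425000",
    "http://media-cdn.sueddeutsche.de/image/sz.1.440537/135x101?v=1357596425001",
):
    normalized = strip_query_params(candidate)
    if normalized not in seen:
        seen.add(normalized)
```

Parsing the query string rather than using a regex keeps unrelated parameters intact and removes the value along with the key.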
[18:48] 's#\?v=##' :} [18:54] *** j08nY has joined #archiveteam-bs [19:03] gone from AT stuff, that is. [19:09] *** pizzaiolo has quit IRC (Ping timeout: 260 seconds) [19:23] schbirid: You could probably do that with a plugin script. I know Archivebot/grab-site does something similar to strip session IDs from URLs [19:23] Exactly *how*, I couldn't tell you [19:32] 8) [19:48] *** GE has quit IRC (Remote host closed the connection) [19:57] *** RichardG has quit IRC (Read error: Operation timed out) [19:57] *** RichardG has joined #archiveteam-bs [20:22] *** RichardG has quit IRC (Read error: Operation timed out) [20:22] *** RichardG has joined #archiveteam-bs [20:27] *** odemg has quit IRC (Remote host closed the connection) [20:36] nightpool, it is common for open source business acquisitions. I generally grab a snapshot of the webpage anyways just in case. [20:37] Be careful of gitter however, it is a walking tarpit because of how their pages update. [20:42] rocode: I think the /archives pages are all basically static html? [20:45] Yeah, but you will need to whitelist the URL. [20:47] Yeah, just straight crawls aren't going to work [20:48] *** odemg has joined #archiveteam-bs [20:53] *** Aranje has quit IRC (Quit: Three sheets to the wind) [20:57] *** Aranje has joined #archiveteam-bs [21:14] *** GE has joined #archiveteam-bs [21:23] Ars has new article praising emulation for the purpose of game preservation: https://arstechnica.com/gaming/2017/03/how-emulation-helped-save-two-video-game-rarities/ [21:24] *** RichardG has quit IRC (Read error: Operation timed out) [21:24] *** RichardG has joined #archiveteam-bs [21:40] *** kristian_ has joined #archiveteam-bs [21:57] *** BlueMaxim has joined #archiveteam-bs [22:15] *** kristian_ has quit IRC (Quit: Leaving) [22:24] *** schbirid has quit IRC (Quit: Leaving) [22:27] is there any reason to back up something on archive bot if it's already on IA? 
[22:27] I can't remember [22:30] not really [22:30] if the IA grab is incomplete, that's a reason [22:49] If it isn't actually an item on IA, it is probably a partial grab. [23:02] One reason to grab something through Archivebot is you can download the archivebot WARC whereas IA keeps their raw scrape data private [23:02] yes, also that [23:11] *** Aranje has quit IRC (Quit: Three sheets to the wind) [23:15] kk thanks [23:16] *** bottymcbo has joined #archiveteam-bs [23:16] KeyError: Identifier('#archiveteam-bs') (file "/usr/local/lib/python3.5/dist-packages/sopel/coretasks.py", line 363, in track_join) [23:16] oh ffs you idiotic bot. [23:16] *** bottymcbo has quit IRC (Client Quit) [23:19] Interesting. I never noticed that those "liveweb" items on IA aren't downloadable. [23:19] JensRex: no bots that talk in #archiveteam or #archiveteam-bs, please [23:20] Not that it would make much sense to do so (unless you're backing up the Wayback Machine), given that those would contain all sorts of unrelated things [23:20] xmc: I know. It's not supposed to spew errors in chat... testing in separate channel now. [23:48] *** RichardG has quit IRC (Read error: Operation timed out) [23:48] *** RichardG has joined #archiveteam-bs