[00:00] *** xk_id has quit IRC (Read error: Connection reset by peer) [00:04] *** Atom__ has quit IRC (Ping timeout: 483 seconds) [00:05] *** Atom__ has joined #archiveteam [00:08] *** gibigiana has joined #archiveteam [00:19] *** dashcloud has quit IRC (Read error: Operation timed out) [00:22] *** dashcloud has joined #archiveteam [00:47] *** primus104 has quit IRC (Leaving.) [00:54] *** Start has quit IRC (Read error: Connection reset by peer) [00:54] *** Start has joined #archiveteam [00:55] *** JesseW has joined #archiveteam [01:29] *** Start_ has joined #archiveteam [01:33] *** Start has quit IRC (Ping timeout: 306 seconds) [01:34] *** xk_id_ has quit IRC (Remote host closed the connection) [01:35] *** xk_id has joined #archiveteam [01:41] *** xk_id has quit IRC (Ping timeout: 369 seconds) [01:57] *** Emcy has quit IRC (Ping timeout: 306 seconds) [02:02] *** vitzli has joined #archiveteam [02:09] https://ipfs.io/ipfs/QmNhFJjGcMPqpuYfxL62VVB9528NXqDNMFXiqN5bgFYiZ1/its-time-for-the-permanent-web.html <- interesting [02:10] vitzli: I saw you asked about IA collections contents. I have a pull-request to the internetarchive python library to improve that handling [02:12] JesseW: that reminds me about the warc -> zip thing you wanted. [02:13] you can probably tweak the extracttool in https://github.com/chfoo/warcat or put a shell script wrapper around warcat [02:20] aaaaaaaaa: yeah, warcat seemed like the answer [02:20] Should we ever try to archive darknets such as i2p, tor, or freenet, or other intranets such as cjdns, ipfs, and anonet? [02:21] JesseW, thanks, hope PR will land. [02:23] oops, I meant to link https://github.com/chfoo/warcat/blob/master/warcat/tool.py#L201 [02:30] anomie: certainly we should [02:30] vitzli: I think jake has just been busy lately [02:32] JesseW: Do you think we should apply extra caution if/when we do that, to try and avoid stuff that's doubleplusungood illegal? 
[02:35] *** BlueMaxim has joined #archiveteam [02:35] *** xk_id has joined #archiveteam [02:46] anomie: I think such archives should lean hard towards 1) on-demand, rather than wide-grabs; 2) invite-only, rather than public-access. I think those features should handle the troublesome material issue. [02:47] JesseW: That might not be a bad idea, at least until the rough spots can be worked out. [02:47] (BTW, I'm another person who confused you with https://en.wikipedia.org/wiki/User:Anomie ) [02:48] anomie: I think it's a good idea even after the rough spots are worked out. [02:50] JesseW: Well, i think content can be made public after it gets vetted, if that ever happens. [02:51] *** xk_id has quit IRC (Ping timeout: 600 seconds) [02:51] anomie: I think most of the protocols you mentioned already provide sufficient reliability that making a separate, public archive of such material wouldn't be very necessary/useful. [02:52] Maybe. [02:52] I think writing software to automatically transform, say, a Tor hidden service into something equivalent on IPFS or Tahoe -- that would be a good idea, certainly. [02:52] I don't think all tor sites are that risky though. [02:52] anomie: all of them, certainly not. [02:53] For example, I don't think that much caution should be taken with ebook libraries. [02:53] And making a tor hidden service that mirrored other tor hidden services -- that might be worth doing. [02:53] Maybe. [02:54] I remember I found a warc of the Fairplay hidden service once. [02:55] iTunes doesn't sell DRM'd music anymore, but I imagine it could still be useful to those with old purchases. [02:56] That makes sense [02:59] lol: https://archive.org/stream/CreativeComputingBuyersGuide1985/Creative_Computing_Buyers_Guide_1985 [03:00] sorry, better link: https://archive.org/stream/CreativeComputingBuyersGuide1985/Creative_Computing_Buyers_Guide_1985#page/n31/mode/2up [03:04] There is a little bit of goofiness going on with IA today. Lot of derives failing. 
Will be fixed and re-submitted. [03:05] *** nadams has joined #archiveteam [03:10] *** brayden__ has quit IRC (Read error: Connection reset by peer) [03:10] *** brayden__ has joined #archiveteam [03:10] *** swebb sets mode: +o brayden__ [03:15] SketchCow: nothing on my end failed to derive today [03:16] but i do see a lot of machines down [03:22] i'm up to 2015-06-25 of medium.com grabs [03:23] so 2015-06-25 grab has 379 410 gone errors [03:23] *379 'ERROR 410' errors [03:27] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [03:32] *** zenguy_pc has joined #archiveteam [03:32] https://archive.org/stream/CreativeComputingBuyersGuide1985/Creative_Computing_Buyers_Guide_1985#page/n71/mode/2up <- "One of the biggest kicks we got out of the documentation was the international unpacking instruction card. It is an over-sized fold-out pictorial and has been drawn by the same person who draws the escape instructions for passenger airplanes. It breaks the Apricot unpacking procedure into 15 easy steps, not counting inflation of your li [03:32] Delightful. [03:41] *** JesseW has quit IRC (Leaving.) [03:41] *** JesseW1 has joined #archiveteam [03:43] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [03:49] *** zenguy_pc has joined #archiveteam [03:52] https://archive.org/stream/CreativeComputingBuyersGuide1985/Creative_Computing_Buyers_Guide_1985#page/n163/mode/2up <- "You can often judge a personal computer user by the hardware he selects. If the modem comes from UDS, chances are he has a serious investment in computer and software, a serious data communications requirement and serious computer-based decisions to make." [03:53] I love the phrase "serious computer-based decisions to make". [04:05] *** nadams has quit IRC (Quit: Leaving) [04:25] *** aaaaaaaaa has quit IRC (Leaving) [04:27] Yeah… that's interesting. [04:27] is that from one of those manuals? [04:28] not one of the Manuals+ ones, no. 
This is from Creative Computing, a magazine. [04:28] I'm doing QA on some scans of it. [04:28] Ah, okay. [04:55] *** JesseW1 has quit IRC (Read error: Operation timed out) [05:01] *** wyatt8740 has quit IRC (Read error: Operation timed out) [05:03] *** wyatt8740 has joined #archiveteam [05:05] *** vitzli has quit IRC (Quit: Leaving) [05:10] !ao https://ipfs.io/ipfs/QmNhFJjGcMPqpuYfxL62VVB9528NXqDNMFXiqN5bgFYiZ1/its-time-for-the-permanent-web.html [05:10] er oops [05:11] *** wyatt8740 has quit IRC (Quit: I just saw a red door, now I want to paint it black...) [05:12] *** wyatt8740 has joined #archiveteam [05:13] yipdw: I think you already shared that one. [05:13] Oh well, it's still plenty relevant. [05:16] wasn't a share, I !aoed it in the wrong channel [05:19] Oh… I just realized that. [05:19] Wouldn't it be better to archive the original article though? [05:21] This is the original. https://blog.neocities.org/its-time-for-the-permanent-web.html [05:33] the one thing i question is the ipfs.io domain [05:33] block that you block all ipfs sites [05:34] second is what do we do with ipfs sites if ipfs.io goes down [05:35] godane: Well, we could run ipfs locally, which is pretty lightweight in my experience. [05:35] ok [05:35] To view ipfs sites when running it locally, you use http://localhost:8080/ipfs/QmNhFJjGcMPqpuYfxL62VVB9528NXqDNMFXiqN5bgFYiZ1/its-time-for-the-permanent-web.html [05:37] I don't think it will be that must of an issue, but if we find ourselves archiving ipfs sites often, we might want to install our own ipfs node so that ipfs.io isn't hugged, and contribute to the network. [05:37] *much [05:37] But hey, let's see if there's much of value first. 
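Editor's sketch of the local-gateway idea discussed above: a small helper that rewrites a public gateway URL (such as one on ipfs.io) so it points at a locally running node instead. The gateway address `http://localhost:8080` and the function name `to_local_gateway` are assumptions for illustration, not anything the channel settled on.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: a local IPFS node serves its HTTP gateway at this address.
LOCAL_GATEWAY = "http://localhost:8080"


def to_local_gateway(url: str, gateway: str = LOCAL_GATEWAY) -> str:
    """Rewrite a gateway URL with an /ipfs/ or /ipns/ path to use another gateway.

    Only the scheme and host are swapped; the content path, query and
    fragment are kept unchanged.
    """
    parts = urlsplit(url)
    if not parts.path.startswith(("/ipfs/", "/ipns/")):
        raise ValueError(f"not an IPFS gateway path: {url!r}")
    local = urlsplit(gateway)
    return urlunsplit(
        (local.scheme, local.netloc, parts.path, parts.query, parts.fragment)
    )
```

Because the path carries the content address, the same path should resolve on any gateway that can reach the content, which is why swapping only the host is enough in this sketch.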
[05:42] as I understand it part of the point of ipfs is eliminating the need for Archive Team [05:42] is ipfs like a new incarnation of freenet [05:42] if that is true, that is a good thing [05:43] i think Archive Team of some sort is always going to be needed [05:43] xmc: not sure about the specifics but it is content-addressed, distributed, etc [05:43] yep [05:43] freenet [05:43] this is something that was proposed for #internetarchive.bak [05:43] mostly used these days for distributing child porn [05:43] it is not mutually exclusive with the current git-annex approach [05:44] *** qwebirc56 has quit IRC (Ping timeout: 240 seconds) [05:44] #-bs I guess if someone wants to talk about this more [05:44] yipdw: Things will stay on ipfs as long as it's in demand, or people are willing to save it. Just because neither is true doesn't mean it isn't worth saving. [05:45] anomie: I don't know what that has to do with ipfs [05:45] yipdw: In order for content to stay on ipfs, there has to be at least one person that has "pinned" it onto his computer. [05:47] It's not a magic cloud of infinite capacity. Eventually participants will need to make compromises between how much disk space they have, what they want to use it for, and which sites they want to help keep online. 
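Editor's sketch of the pinning point made above: a toy content-addressed store in which unpinned blocks disappear at garbage collection. This is an illustration only. `ToyStore` is a hypothetical class, and the plain SHA-256 hex address used here is an assumption that does not match IPFS's real identifier format or storage behavior.

```python
import hashlib


class ToyStore:
    """Toy content-addressed block store with pinning (not IPFS's real format)."""

    def __init__(self):
        self.blocks = {}     # address (hex digest) -> stored bytes
        self.pinned = set()  # addresses that must survive garbage collection

    def add(self, data: bytes, pin: bool = False) -> str:
        # The address is derived from the content itself.
        address = hashlib.sha256(data).hexdigest()
        self.blocks[address] = data
        if pin:
            self.pinned.add(address)
        return address

    def get(self, address: str) -> bytes:
        return self.blocks[address]

    def collect_garbage(self) -> int:
        """Drop every unpinned block and return how many were removed."""
        unpinned = [a for a in self.blocks if a not in self.pinned]
        for address in unpinned:
            del self.blocks[address]
        return len(unpinned)
```

In this model a block stays retrievable only while some store still holds it, which is the compromise between disk space and what to keep that the last message describes.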
[06:36] *** Aranje has joined #archiveteam [06:49] *** kyan has quit IRC (Quit: Leaving) [07:07] *** primus104 has joined #archiveteam [07:12] *** xk_id has joined #archiveteam [07:26] *** arkiver2 has joined #archiveteam [07:26] *** russss__ has joined #archiveteam [07:26] *** kevin has joined #archiveteam [07:26] *** Ungstein has joined #archiveteam [07:26] *** aliz has joined #archiveteam [07:26] *** diacope has joined #archiveteam [07:26] *** karissa has joined #archiveteam [07:26] *** Ctrl-S has joined #archiveteam [07:26] *** zyphlar has joined #archiveteam [07:26] *** codl has joined #archiveteam [07:26] *** afics has joined #archiveteam [07:26] *** JSharp has joined #archiveteam [07:26] *** deathy has joined #archiveteam [07:26] *** _desu_ has joined #archiveteam [07:26] *** Fletcher has joined #archiveteam [07:26] *** HCross- has joined #archiveteam [07:26] *** sigkell has joined #archiveteam [07:26] *** irl1 has joined #archiveteam [07:26] *** Rickster has joined #archiveteam [07:26] *** Muad-Dib has joined #archiveteam [07:26] *** GLaDOS has joined #archiveteam [07:26] *** schbirid has joined #archiveteam [07:31] *** Coderjoe has quit IRC (Read error: Operation timed out) [07:38] *** xk_id has quit IRC (Remote host closed the connection) [07:39] *** xk_id has joined #archiveteam [07:39] *** atomotic has joined #archiveteam [07:44] *** Coderjoe has joined #archiveteam [07:48] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [07:49] *** xk_id has quit IRC (Remote host closed the connection) [07:51] *** xk_id has joined #archiveteam [07:51] *** xk_id has quit IRC (Remote host closed the connection) [07:58] *** arkiver2 has joined #archiveteam [08:17] *** vitzli has joined #archiveteam [08:25] *** MMovie2 has joined #archiveteam [08:27] *** MMovie has quit IRC (Ping timeout: 306 seconds) [09:12] *** habi has joined #archiveteam [09:16] *** habi has left [09:43] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [09:44] *** vitzli has quit IRC (Ping 
timeout: 483 seconds) [09:54] *** arkiver2 has joined #archiveteam [10:16] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [10:27] *** logan2 has quit IRC (Remote host closed the connection) [10:27] *** logan has joined #archiveteam [10:39] *** anomie has quit IRC (Read error: Connection reset by peer) [10:41] *** anomie has joined #archiveteam [10:49] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) [11:01] *** arkiver2 has joined #archiveteam [11:24] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [11:38] I thought IPFS was too much of a hat trick on top of IA.BAK for now. [11:39] *** atomotic has joined #archiveteam [11:46] *** habi has joined #archiveteam [11:48] *** arkiver2 has joined #archiveteam [11:50] *** habi has left [12:05] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [12:30] *** Ymgve has quit IRC () [12:40] *** Ymgve has joined #archiveteam [13:01] *** arkiver2 has joined #archiveteam [13:05] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [13:17] *** BlueMaxim has quit IRC (Quit: Leaving) [13:36] *** wednesday has quit IRC (Ping timeout: 483 seconds) [13:48] *** arkiver2 has joined #archiveteam [13:51] *** Start_ has quit IRC (Quit: Disconnected.) [13:53] *** primus104 has quit IRC (Leaving.) [14:00] *** arkiver2 has quit IRC (Quit: Nettalk6 - www.ntalk.de) [14:08] *** arkiver2 has joined #archiveteam [14:31] *** scyther has joined #archiveteam [14:42] *** Start has joined #archiveteam [14:42] *** PurpleSym has joined #archiveteam [14:43] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [14:49] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) [15:56] *** JesseW has joined #archiveteam [16:04] *** nertzy has joined #archiveteam [16:07] *** Start has quit IRC (Quit: Disconnected.) 
[16:12] *** Start has joined #archiveteam [16:14] *** Start_ has joined #archiveteam [16:15] *** Start has quit IRC (Read error: Connection reset by peer) [16:15] *** Start has joined #archiveteam [16:16] *** Start_ has quit IRC (Read error: Connection reset by peer) [16:24] *** JesseW has quit IRC (Read error: Operation timed out) [16:27] *** primus104 has joined #archiveteam [16:37] *** nertzy has quit IRC (Quit: This computer has gone to sleep) [16:44] SketchCow: I agree, I want to just try it on my own data for now [16:57] *** primus104 has quit IRC (Leaving.) [17:01] *** nertzy has joined #archiveteam [17:25] *** anomie has quit IRC (Read error: Connection reset by peer) [17:35] *** anomie has joined #archiveteam [17:35] *** RichardG has quit IRC (Read error: Connection reset by peer) [17:35] *** RichardG has joined #archiveteam [17:38] *** Start has quit IRC (Quit: Disconnected.) [17:39] *** nertzy has quit IRC (Quit: This computer has gone to sleep) [18:04] *** arkiver2 has joined #archiveteam [18:14] *** nertzy has joined #archiveteam [18:15] *** Boppen has quit IRC (Ping timeout: 198 seconds) [18:15] *** Boppen has joined #archiveteam [18:25] *** nertzy has quit IRC (Quit: This computer has gone to sleep) [18:38] *** antomati_ has quit IRC (Read error: Connection reset by peer) [18:39] *** antomatic has joined #archiveteam [18:39] *** swebb sets mode: +o antomatic [18:41] *** wm_ has quit IRC (Ping timeout: 240 seconds) [18:42] *** nertzy has joined #archiveteam [18:45] *** primus104 has joined #archiveteam [18:46] *** wm_ has joined #archiveteam [18:50] SketchCow: Why is it bad if services freak out? https://twitter.com/textfiles/status/641637629876903936 [18:51] I like the publicity for our archive projects [18:53] I think people should be kept up-to-date on the ArchiveTeam projects. [18:53] Our archives will be used more. [18:54] The publicity also makes us aware sooner when a website is going offline. 
[18:55] People now know ArchiveTeam for these web-archiving projects and are faster reporting to us when a website is going offline. [18:56] And it's of course nice to see how people respond to our projects, what they think about them. [18:57] But that is all only possible if people know about our newest projects. [18:57] *** arkiver2 has quit IRC (Ping timeout: 252 seconds) [18:57] chfoo: can you please add googlecode to the projects.json? [19:02] If you just like the publicity, then it doesn't matter when it's announced [19:02] chfoo: and create a FOS rsync for it? [19:19] *** rddfg165 has joined #archiveteam [19:21] Seems like a good time to grab anything and everything National Geographic: http://www.npr.org/sections/thetwo-way/2015/09/09/438853832 [19:21] *** nertzy has quit IRC (Quit: This computer has gone to sleep) [19:31] *** aaaaaaaaa has joined #archiveteam [19:31] *** swebb sets mode: +o aaaaaaaaa [19:40] http://lostmedia.wikia.com/wiki/Lost_Media_Wiki [19:53] *** Start has joined #archiveteam [20:04] *** schbirid has quit IRC (Quit: Leaving) [20:10] *** godane has quit IRC (Ping timeout: 492 seconds) [20:11] *** xk_id has joined #archiveteam [20:19] *** xk_id has quit IRC (Remote host closed the connection) [20:20] *** godane has joined #archiveteam [20:22] *** xk_id has joined #archiveteam [20:27] *** godane has quit IRC (Quit: Leaving.) [20:36] arkiver: because they freak out and try to stop us, that's why [20:37] Given that AT identifies itself in the UA, it is also a great test of who looks at their logs. [20:49] *** PurpleSym has quit IRC (Remote host closed the connection) [20:49] *** godane has joined #archiveteam [21:02] *** xk_id has quit IRC (Ping timeout: 606 seconds) [21:03] *** Start has quit IRC (Quit: Disconnected.) 
[21:28] *** scyther has quit IRC (Leaving) [21:32] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [21:32] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer) [21:32] *** aaaaaaaaa has joined #archiveteam [21:32] *** swebb sets mode: +o aaaaaaaaa [21:32] *** zenguy_pc has joined #archiveteam [21:42] *** rddfg165 has quit IRC (Quit: Page closed) [21:43] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [21:44] *** zenguy_pc has joined #archiveteam [21:50] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [21:50] *** zenguy_pc has joined #archiveteam [21:51] *** zenguy_pc has quit IRC (Read error: Connection reset by peer) [21:55] *** zenguy_pc has joined #archiveteam [22:24] arkiver: yes, done [22:29] *** xk_id has joined #archiveteam [22:31] *** xk_id_ has joined #archiveteam [22:31] *** xk_id has quit IRC (Read error: Connection reset by peer) [22:35] thank you [22:35] first test warc of google code: https://archive.org/details/googlecode-project_mechedit-20150909-204213.warc [22:38] *** xk_id has joined #archiveteam [22:38] *** xk_id_ has quit IRC (Read error: Connection reset by peer) [22:51] *** xk_id has quit IRC (Read error: Connection reset by peer) [23:11] *** xk_id has joined #archiveteam [23:24] *** Start has joined #archiveteam [23:55] *** BlueMaxim has joined #archiveteam [23:55] *** RichardG has quit IRC (Remote host closed the connection)