[00:02] *** philpem has quit IRC (Ping timeout: 252 seconds)
[00:18] *** primus104 has quit IRC (Leaving.)
[00:24] *** marvinw has quit IRC (Read error: Operation timed out)
[00:34] *** BlueMaxim has joined #archiveteam
[00:53] DFJustin: Killing the Touhou
[00:53] Main issue is that it blows right out to a directory and it's doing it with unicode
[00:53] I feel like that won't survive to the archive
[00:57] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[00:57] *** BlueMaxim has joined #archiveteam
[01:10] *** aaaaaaaaa has quit IRC (Read error: Connection reset by peer)
[01:14] *** aaaaaaaaa has joined #archiveteam
[01:14] *** swebb sets mode: +o aaaaaaaaa
[01:22] unicode works ok since the v2
[01:24] *** dashcloud has quit IRC (Read error: Operation timed out)
[01:28] *** dashcloud has joined #archiveteam
[01:28] *** MMovie2 has joined #archiveteam
[01:29] *** MMovie has quit IRC (Ping timeout: 306 seconds)
[01:33] *** mistym has joined #archiveteam
[01:56] *** dashcloud has quit IRC (Read error: Operation timed out)
[02:00] *** dashcloud has joined #archiveteam
[02:07] *** Jonimus has joined #archiveteam
[02:13] *** mistym has quit IRC (Remote host closed the connection)
[02:26] *** qrstuv has joined #archiveteam
[02:37] *** tomwsmf-a has quit IRC (Read error: Operation timed out)
[02:38] *** mistym has joined #archiveteam
[03:45] *** marvinw has joined #archiveteam
[04:12] *** aaaaaaaaa has quit IRC (Leaving)
[04:52] *** dashcloud has quit IRC (Read error: Operation timed out)
[04:55] *** dashcloud has joined #archiveteam
[05:09] *** Emcy has quit IRC (Read error: Connection reset by peer)
[05:27] *** vitzli has joined #archiveteam
[05:52] *** microguru has joined #archiveteam
[05:52] Hello. I've just found out about Archive team. I support your cause and am running a Warrior as we speak.
[06:00] great!
[06:12] is it OK if i only run the warrior for a few hours a day and with a limit of 1 MBp/s?
[06:15] *** bzc6p_ is now known as bzc6p
[06:15] if you stop it in an orderly way, that should be fine
[06:15] as in stop it how the documentation says you should, and waiting for it to finish before you close it down
[06:15] I use the "stop warrior" button on the web interface to stop it every time
[06:16] sounds good!
[06:16] alright.
[06:17] you guys (we?) do some pretty good work. I first found out about archive team after looking for a file on pomf.se and being told archive team made an archive
[06:18] if you're running a warrior, you definitely get to say we :)
[06:19] did you get the thing you wanted out of pomf?
[06:19] I realised the importance of archiving after losing one too many youtube videos to the DMCA and started youtube-dl'ing everything
[06:19] :|
[06:19] yeah. I know
[06:20] Since then I've archived at least ~500 GB ish for my personal use
[06:20] mostly videos and websites (sometimes full copies via wget)
[06:21] * xmc nod
[06:21] archivebot is good for that sort of thing
[06:21] archivebot?
[06:21] i throw stuff into archivebot if it seems useful, because i don't really trust anything
[06:21] ooh
[06:21] you're in for a treat
[06:21] join #archivebot on this server
[06:22] you can submit wget jobs, which get downloaded and then sent to web.archive.org
[06:22] turnaround time for going into the archive is usually a day or so after the download completes
[06:22] cool!
[06:23] I use "wget -k -m -p -c --wait=10 --random-wait" for my archiving needs
[06:23] * xmc nod
[06:23] archivebot lets you wander into irc, say "this website is cool, go download it" and the bot takes care of everything else
[06:23] the --wait=10 --random-wait makes everything take forever, but it keeps me unbanned
[06:23] you have to keep an eye on it in case it gets stuck in a corner, but that's less and less lately
[06:24] ya
[06:25] just tried out archivebot on http://donh.best.vwh.net for a test
[06:25] I'm surprised that website is still up
[06:26] what is it?
[06:26] it's a personal webpage
[06:27] it had a copy of a book that I really liked (http://donh.best.vwh.net/Esperanto/eaccess/eaccess.book.html), so I kept a copy
[06:27] * xmc nod
[06:28] that book is why I'm interested in esperanto.
[06:28] lots of other good things there too
[06:30] depending on what it is I'm preserving for personal use, I've used print-to-file, wget, and screenshots
[06:31] archivebot has a phantomjs mode where it executes javascript and scrolls to the bottom of the page, one pagedown at a time
[06:31] it's kind of brute force and it works pretty well
[06:32] archivebot's really cool
[06:32] not only does it make copies for me, but everyone else can have a copy too without having to email me for it
[06:32] yuuuup
[06:32] and the copies go in an obvious place
[06:33] speaking of that, I wonder how many other people keep personal archives
[06:33] there should be a website where people post what things they have and are willing to share, and people can request a copy
[06:33] hmmm
[06:34] i'd suggest that said website not personally handle the files, for bandwidth and copyright reasons
[06:35] people would have to use file hosts to transfer the files
[06:35] copyright is not a concern for archiveteam
[06:37] exactly, because just about everything that's being archived is copyrighted
[06:37] it's just a conversation that we've had too much
[06:38] the Library of congress archives stuff, so why not us?
[06:38] archivebot also has an improvement list three miles long, no doubt
[06:38] and I wish all I had to do was rub like so, and oh -
[06:38] shit that doesn't really scan the same way, does it
[06:39] they have different meter
[06:39] given that copyright isn't a concern, is there anything off the table for archiving?
[06:39] the second one is almost iambic
[06:40] microguru: we prioritize by size/benefit
[06:41] so that means something that's hugely important text gets prioritized over not very important videos?
[06:41] yeah
[06:41] I was thinking that illegal things and encrypted files would be prohibited
[06:42] more 'evil' than 'illegal'
[06:42] many illegal things aren't actually wrong
[06:42] many wrong things aren't illegal
[06:42] although maybe even illegal things provided there's a good reason (archiving hate speech for historical analysis)
[06:43] "many illegal things aren't actually wrong; many wrong things aren't illegal" isn't that the truth
[06:44] kind of like that guy I read about in the news with his basement full of KKK pamphlets he kept for future historians
[06:45] at some point it might be useful to take this to #archiveteam-bs
[06:46] i'm of two minds
[06:46] http://america.aljazeera.com/watch/shows/america-tonight/articles/2015/7/8/Inside-a-security-experts-collection-of-hateful-artifacts.html
[06:57] *** mistym has quit IRC (Remote host closed the connection)
[07:00] ok
[07:00] *** microguru has left
[07:28] *** primus104 has joined #archiveteam
[07:32] *** schbirid has joined #archiveteam
[07:40] *** vitzli has quit IRC (Quit: Leaving)
[07:55] *** bzc6p has quit IRC (Read error: Connection reset by peer)
[07:58] *** mistym has joined #archiveteam
[08:04] *** mistym has quit IRC (Read error: Operation timed out)
[08:31] *** primus104 has quit IRC (Leaving.)
[08:33] *** atomotic has joined #archiveteam
[09:19] *** vitzli has joined #archiveteam
[09:52] *** xk_id has joined #archiveteam
[10:03] *** mistym has joined #archiveteam
[10:08] *** bzc6p has joined #archiveteam
[10:08] *** swebb sets mode: +o bzc6p
[10:11] *** mistym has quit IRC (Read error: Operation timed out)
[10:11] *** zyphlar has quit IRC (Read error: Connection reset by peer)
[10:11] *** codl_ has quit IRC (Ping timeout: 252 seconds)
[10:11] *** Boltsie has quit IRC (Ping timeout: 252 seconds)
[10:11] *** russss__ has quit IRC (Ping timeout: 252 seconds)
[10:11] *** deathy has quit IRC (Ping timeout: 252 seconds)
[10:11] *** Ctrl-S has quit IRC (Read error: Connection reset by peer)
[10:12] *** zyphlar has joined #archiveteam
[10:12] *** codl_ has joined #archiveteam
[10:12] *** russss__ has joined #archiveteam
[10:12] *** Ctrl-S has joined #archiveteam
[10:12] *** Boltsie has joined #archiveteam
[10:13] *** deathy has joined #archiveteam
[10:25] apinc.org, a french hoster, will close its shared hosting in 2 months
[10:26] https://twitter.com/aubreymcfato/status/618466994317316096
[10:26] 18k sites in google
[10:26] *referenced by*
[10:28] i really need to create an account on archiveteam.org
[11:20] *** primus104 has joined #archiveteam
[11:21] yahoosucks
[11:21] *** bzc6p_ has joined #archiveteam
[11:21] *** swebb sets mode: +o bzc6p_
[11:26] *** bzc6p has quit IRC (Read error: Operation timed out)
[11:38] *** RichardG has quit IRC (Read error: Connection reset by peer)
[11:43] *** RichardG has joined #archiveteam
[11:51] *** xk_id has quit IRC (Remote host closed the connection)
[12:04] *** mistym has joined #archiveteam
[12:04] *** ats has quit IRC (Read error: Connection reset by peer)
[12:06] *** oldcad has joined #archiveteam
[12:09] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[12:10] *** ats has joined #archiveteam
[12:11] *** mistym has quit IRC (Read error: Operation timed out)
[12:23] *** xk_id has joined #archiveteam
[12:27] *** xk_id has quit IRC (Remote host closed the connection)
[13:06] *** bzc6p_ is now known as bzc6p
[13:12] *** philpem has joined #archiveteam
[13:29] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[13:32] *** Medowar has quit IRC (Quit: Leaving)
[13:37] *** Froggypwn has quit IRC (Read error: Connection reset by peer)
[13:38] *** Froggypwn has joined #archiveteam
[14:16] *** atomotic has joined #archiveteam
[14:19] *** primus104 has quit IRC (Leaving.)
[14:32] *** mistym has joined #archiveteam
[14:51] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[15:09] *** Jonimus has quit IRC (Ping timeout: 370 seconds)
[15:25] *** Emcy has joined #archiveteam
[15:33] *** mistym has quit IRC (Remote host closed the connection)
[15:34] *** jmc_ has joined #archiveteam
[15:35] *** jmc has quit IRC (Ping timeout: 255 seconds)
[15:42] *** primus104 has joined #archiveteam
[15:53] *** nox has quit IRC (Read error: Connection reset by peer)
[16:11] *** nox has joined #archiveteam
[16:14] *** primus104 has quit IRC (Leaving.)
[16:23] *** mistym has joined #archiveteam
[16:27] *** SimpBrain has joined #archiveteam
[16:42] *** tomwsmf-a has joined #archiveteam
[16:46] *** philpem has quit IRC (Ping timeout: 252 seconds)
[16:55] *** dashcloud has quit IRC (Read error: Operation timed out)
[16:58] *** dashcloud has joined #archiveteam
[17:34] *** aaaaaaaaa has joined #archiveteam
[17:34] *** swebb sets mode: +o aaaaaaaaa
[17:42] *** Start has quit IRC (Read error: Connection reset by peer)
[17:43] *** Start has joined #archiveteam
[17:48] *** habi has joined #archiveteam
[18:05] *** habi has quit IRC (Quit: Leaving.)
[18:17] *** K4k has joined #archiveteam
[18:19] *** primus104 has joined #archiveteam
[18:34] Any big projects on atm except URLTeam?
[18:43] *** vitzli has quit IRC (Quit: Leaving)
[19:00] *** aliz has quit IRC (hub.se irc.du.se)
[19:00] *** Rotab has quit IRC (hub.se irc.du.se)
[19:00] *** Boppen has quit IRC (hub.se irc.du.se)
[19:21] *** aliz has joined #archiveteam
[19:21] *** Rotab has joined #archiveteam
[19:23] *** Boppen has joined #archiveteam
[19:44] *** atomotic has joined #archiveteam
[19:50] *** khaoohs_ is now known as khaoohs
[19:52] *** mistym has quit IRC (Ping timeout: 252 seconds)
[20:03] *** SimpBrain has quit IRC (Quit: Leaving)
[20:06] The usual large downloads.
[20:06] Wiki keeps a lot of them
[20:09] *** Froggypwn has quit IRC (Read error: Connection reset by peer)
[20:10] *** Froggypwn has joined #archiveteam
[20:12] *** pfallenop has joined #archiveteam
[20:21] *** xtr-201 has quit IRC (Read error: Connection reset by peer)
[20:29] *** schbirid has quit IRC (Leaving)
[20:40] *** mistym has joined #archiveteam
[20:46] *** Froggypwn has quit IRC (Read error: Connection reset by peer)
[20:48] *** Froggypwn has joined #archiveteam
[21:16] *** K4k has quit IRC (Read error: Connection reset by peer)
[21:16] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[21:18] *** K4k has joined #archiveteam
[21:43] *** mistym_ has joined #archiveteam
[21:43] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[21:45] *** dashcloud has joined #archiveteam
[21:45] *** ripvanwin has quit IRC (Read error: Operation timed out)
[21:45] *** xtr-201 has joined #archiveteam
[21:49] *** mistym has quit IRC (Read error: Operation timed out)
[22:05] *** chfoo has quit IRC (Remote host closed the connection)
[22:09] REDDIT CEO OUT
[22:09] Obviously, the new guy is interim and we should replace him with godane
[22:10] I guess the no-reddit-day thing worked
[22:11] *** philpem has joined #archiveteam
[22:11] *** chfoo has joined #archiveteam
[22:17] *** habi has joined #archiveteam
[22:20] *** tomwsmf-a has quit IRC (Ping timeout: 258 seconds)
[22:20] *** xk_id has joined #archiveteam
[22:20] *** habi has left
[22:30] SketchCow: i would be the guy to get reddit subs to have git repos
[22:31] that way EVERYTHING is saved
[22:31] hah
[22:32] a way to make reddit portable for meshnet
[22:32] that and maybe web archives for urls after it's submitted
[22:32] there will still be a live link but also an archive link
[22:34] SketchCow: also dailymail.co.uk is full archive up to 2004-01
[22:37] *** K4k has quit IRC (Ping timeout: 186 seconds)
[22:38] godane: this may be relevant to your interests: https://speakerdeck.com/ussjoin/the-perfectly-legitimate-project
[22:41] *** BlueMaxim has joined #archiveteam
[22:45] great, now we're going to get a bunch of people saving every goddamn reddit thread again
[22:45] at least they'll go "fast"
[22:49] *** wyatt8740 has quit IRC (Read error: Operation timed out)
[22:52] *** superkuh_ has joined #archiveteam
[22:52] *** superkuh has quit IRC (Read error: Operation timed out)
[22:52] *** kyan has quit IRC (Ping timeout: 258 seconds)
[22:52] *** RKenshin has joined #archiveteam
[22:54] *** db48x has quit IRC (hub.efnet.us irc.Prison.NET)
[22:54] *** sunnymilk has quit IRC (hub.efnet.us irc.Prison.NET)
[22:54] *** Kenshin has quit IRC (hub.efnet.us irc.Prison.NET)
[23:10] *** RKenshin is now known as Kenshin
[23:18] *** wyatt8750 has joined #archiveteam
[23:19] *** wyatt8750 is now known as wyatt8740
[23:21] *** sunnymilk has joined #archiveteam
[23:26] *** kyan has joined #archiveteam
[23:43] *** bzc6p_ has joined #archiveteam
[23:43] *** swebb sets mode: +o bzc6p_
[23:49] *** bzc6p has quit IRC (Ping timeout: 600 seconds)
[23:58] *** mistym has joined #archiveteam