[00:04] *** odemg has joined #archiveteam [00:14] *** pnJay has quit IRC (Leaving) [00:33] *** SketchCow has quit IRC (Quit: leaving) [00:48] *** SketchCow has joined #archiveteam [00:48] *** swebb sets mode: +o SketchCow [00:48] HEY SO [00:49] If there are other channels and projects for me to get in on, invite me. [00:49] I just upgraded my system from 10.0 to 10.3 FreeBSD [00:49] All hail [01:23] *** Morbus has joined #archiveteam [02:04] *** db48x` is now known as db48x [02:04] *** Atom has joined #archiveteam [02:18] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [02:23] congrats on a successful upgrade! [02:26] *** BartoCH has joined #archiveteam [02:40] *** pizzaiolo has quit IRC (Remote host closed the connection) [02:46] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [03:03] *** BlueMaxim has quit IRC (Quit: Leaving) [03:05] *** ndiddy has quit IRC () [03:16] *** BlueMaxim has joined #archiveteam [03:19] *** VADemon has quit IRC (Quit: left4dead) [03:24] *** Stilett0 has quit IRC (Read error: Connection reset by peer) [03:32] *** squires has quit IRC (Ping timeout (120 seconds)) [03:38] *** BlueMaxim has quit IRC (Read error: Operation timed out) [03:47] *** joshuatj has joined #archiveteam [04:01] *** Stilett0 has joined #archiveteam [04:17] *** odemg2 has joined #archiveteam [04:20] *** odemg has quit IRC (Read error: Operation timed out) [04:36] *** odemg has joined #archiveteam [04:39] *** Stiletto has joined #archiveteam [04:39] *** odemg2 has quit IRC (Read error: Operation timed out) [04:40] *** Stilett0 has quit IRC (Ping timeout: 244 seconds) [04:41] *** odemg2 has joined #archiveteam [04:42] *** odemg has quit IRC (Read error: Operation timed out) [04:43] *** inittux has quit IRC (Read error: Operation timed out) [04:51] *** inittux has joined #archiveteam [04:56] *** BlueMaxim has joined #archiveteam [05:37] Took 12 minutes. 
[05:37] FreeBSD is badass [05:43] *** Sk1d has joined #archiveteam [06:05] *** rocode has quit IRC (Ping timeout: 246 seconds) [06:05] *** luckcolor has quit IRC (Ping timeout: 245 seconds) [06:06] *** luckcolor has joined #archiveteam [06:07] *** topdownji has quit IRC (Ping timeout: 246 seconds) [06:07] *** rocode has joined #archiveteam [06:07] *** topdownji has joined #archiveteam [06:24] *** brayden has quit IRC (Read error: Connection reset by peer) [06:25] *** brayden has joined #archiveteam [06:25] *** swebb sets mode: +o brayden [06:47] *** inittux has quit IRC (Read error: Operation timed out) [06:48] *** inittux has joined #archiveteam [06:49] *** Stiletto has quit IRC (Read error: Operation timed out) [06:49] *** Stilett0 has joined #archiveteam [07:08] *** arbin_ has quit IRC (Read error: Connection reset by peer) [07:35] *** odemg2 has quit IRC (Remote host closed the connection) [07:37] *** vitzli has joined #archiveteam [07:55] *** w0rp has quit IRC (Read error: Connection reset by peer) [07:55] *** w0rp has joined #archiveteam [07:56] *** JAA has joined #archiveteam [08:04] Mininova update: 31k links archived, currently 125k pending (at least 400k estimated) [08:10] *** joshuatj has quit IRC (Read error: Operation timed out) [08:13] Actually, that estimate is incorrect. I expect at least 500k URLs. 
[08:18] *** bwn has quit IRC (Read error: Operation timed out) [08:22] *** Ymgve has quit IRC (Remote host closed the connection) [08:22] *** bwn has joined #archiveteam [09:24] *** Jonison has joined #archiveteam [09:24] *** Ymgve has joined #archiveteam [09:29] *** Ymgve_ has joined #archiveteam [09:33] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [09:36] *** Ymgve has joined #archiveteam [09:38] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [09:39] *** Ymgve_ has joined #archiveteam [09:44] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [09:48] *** atomotic has joined #archiveteam [09:51] *** joshuatj has joined #archiveteam [09:52] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [09:52] *** Ymgve_ has joined #archiveteam [10:01] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [10:01] *** Ymgve has joined #archiveteam [10:05] *** Ymgve_ has joined #archiveteam [10:07] *** atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…) [10:10] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [10:11] *** Ymgve has joined #archiveteam [10:14] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [10:20] *** Ymgve_ has joined #archiveteam [10:20] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [10:28] *** Ymgve has joined #archiveteam [10:28] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [10:36] *** Ymgve_ has joined #archiveteam [10:37] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [10:41] *** Ymgve has joined #archiveteam [10:44] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [10:47] *** Ymgve_ has joined #archiveteam [10:49] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [10:51] *** MMovie has quit IRC (Read error: Operation timed out) [10:51] *** MMovie has joined #archiveteam [10:55] *** Ymgve has joined #archiveteam [10:56] *** Ymgve_ has quit IRC (Ping timeout: 506 seconds) [10:57] *** atomotic has joined #archiveteam [11:04] *** Ymgve has quit IRC (Ping timeout: 506 seconds) [11:46] *** atomotic has quit IRC 
(Quit: Textual IRC Client: www.textualapp.com) [11:55] *** Ymgve has joined #archiveteam [11:58] *** BlueMaxim has quit IRC (Read error: Operation timed out) [12:05] *** pizzaiolo has joined #archiveteam [12:10] *** Stilett0 has quit IRC (Read error: Connection reset by peer) [12:21] *** Sveklan has quit IRC (Ping timeout: 244 seconds) [12:39] *** bwn has quit IRC (Ping timeout: 244 seconds) [12:44] *** Stilett0 has joined #archiveteam [12:45] *** Stilett0 is now known as Stiletto [12:48] *** BartoCH has joined #archiveteam [12:49] *** bwn has joined #archiveteam [12:50] *** Burak has joined #archiveteam [13:04] *** Burak has quit IRC (Ping timeout: 244 seconds) [13:07] *** Burak has joined #archiveteam [13:23] *** kristian_ has joined #archiveteam [13:37] *** Svekla has joined #archiveteam [13:38] *** Burak has quit IRC (Ping timeout: 246 seconds) [13:41] *** JAA has quit IRC (Ping timeout: 268 seconds) [13:42] *** JAA has joined #archiveteam [14:05] *** Dahnak has quit IRC (Quit: Dahnak Quit) [14:11] *** Dahnak has joined #archiveteam [14:20] *** Dahnak has quit IRC (Quit: Dahnak Quit) [14:21] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [14:24] *** Dahnak has joined #archiveteam [14:43] Just saw this: https://www.reddit.com/r/DataHoarder/comments/6127bh/can_we_help_archiving_sourcefed/ [14:58] *** Petri152 has quit IRC (Read error: Operation timed out) [15:05] *** kristian_ has quit IRC (Quit: Leaving) [15:06] *** atomotic has joined #archiveteam [15:10] *** Petri152 has joined #archiveteam [15:10] *** VADemon has joined #archiveteam [15:16] *** atomotic has quit IRC (Read error: Connection timed out) [15:16] *** amfiko has joined #archiveteam [15:20] hey guys, I've been emailing with Reinier Kromopawiro, owner of javanenvansuriname.info, and a bunch of connected websites (like javanenindiaspora.net). He says he's taking down all of his websites "soon". 
They contain a bunch of articles related to Javanese, and especially Surinamese Javanese history, culture, etc. You might want to grab them before they're gone. [15:22] *** RichardG has quit IRC (Quit: Keyboard not found, press F1 to continue) [15:35] amfiko: Please give us a list of domains and we can save them with Archivebot [15:38] The sites are javanenvansuriname.info, javanenindiaspora.net, and surinamersindiapsora.net. There's an overview on the bottom of this page: http://javanenvansuriname.info/OVERZICHT_10_Sitemap-BanyuMili-Javanen-van-Suriname.html in case I mistyped (you can't copy text on the page) [15:39] Thanks. I'll add them to Archivebot [15:42] Great! I'd just hate for all that information to be lost. [15:42] !a http://javanenvansuriname.info/ [15:43] oh [15:51] something wrong? [15:54] No, yipdw just used the wrong channel (Archivebot is controlled through #archivebot) [15:59] *** amfiko has quit IRC (Quit: Page closed) [16:15] *** JAA has quit IRC (Quit: Page closed) [16:20] *** vitzli has quit IRC (Quit: Leaving) [16:22] *** bwn has quit IRC (Ping timeout: 960 seconds) [16:42] *** redlob has quit IRC (ZNC - http://znc.in) [16:48] *** RichardG has joined #archiveteam [16:49] *** redlob has joined #archiveteam [17:29] *** JAA has joined #archiveteam [17:38] arkiver: I saw in the IRC logs that several years ago (2013) you suggested archiving all torrents on Mininova. Did that ever take off? If no, now would be the time (obviously); if yes, now would be the time to grab anything that was uploaded in the meantime or not archived back then for another reason. [17:38] when is it going away [17:38] 4 April [17:39] I have time after the 31st [17:39] not before that unfortunately [17:39] 31st of March [17:41] I can't find any figure for the total size of all torrents on the website, but I'm guessing it's in the two-digit TB range. Way too large for me at the moment unfortunately. [17:42] torrent content, or torrent files?
[17:42] Content [17:43] I'm already archiving the .torrent files with wpull. (The ArchiveBot grab from a month ago is incomplete; over 10% of the mininova.org pages are 500 errors.) [17:43] ah good [17:44] Regarding that wpull of the website, I'm at 50k archived currently [17:44] (of an estimated 500k) [17:51] Um. If you feed the torrent files straight into the IA they'll do the rest and get the content inside [17:52] Oh, nice [17:53] Don't do that for this though [17:53] it's too many torrents [17:55] The torrent files/magnet links are already being grabbed by archivebot. Mininova uses multiple trackers, including 3rd party. Not sure why we would need to get the torrent contents. [17:58] They do? I only see one tracker in the torrent files, http://tracker.mininova.org/announce [18:00] So ArchiveBot is running another job? https://archive.org/details/falconk_archivebot_www_mininova_org_20170226 is incomplete as mentioned above. [18:02] Also, all torrents with Mininova as the only seeder will die regardless of alternative trackers. [18:11] *** bwn has joined #archiveteam [18:17] *** bwn has quit IRC (Read error: Operation timed out) [18:24] *** bwn has joined #archiveteam [18:55] *** pnJay has joined #archiveteam [18:57] *** Dark_Star has quit IRC (Ping timeout: 633 seconds) [18:58] *** Dark_Star has joined #archiveteam [19:25] *** SirCmpwn has quit IRC (Read error: Operation timed out) [19:26] *** SirCmpwn has joined #archiveteam [19:28] *** Fxza has joined #archiveteam [19:28] *** Fxza has quit IRC (Client Quit) [19:52] *** tammy_ has joined #archiveteam [19:56] hi, I'd like to archive these YouTube channels before they go dark and maybe get deleted: https://www.reddit.com/r/DataHoarder/comments/6127bh/can_we_help_archiving_sourcefed/ Is this something that archive.org could host, as it's confirmed they are closing? And what would be the best method for archiving an entire channel? Is there a sort of .warc format for YouTube channels?
Something that grabs comments and all. [19:58] tammy_, when is their channel going dark? [19:59] end of week will be the last video [19:59] a few channels are all owned by Discovery now, and they canceled them. list in the reddit post I linked. [20:01] arkiver, -^ Is this suitable for archivebot or do we need to do an actual project for multiple YT channels? [20:03] I'd be more concerned about the Archivebot instance running out of disk space [20:10] *** odemg has joined #archiveteam [20:25] you're going to be better off with youtube-dl for the videos. that won't get you the comments, of course [20:25] youtube-dl can recognize channels and do the right thing. youtube-dl + wpull + archivebot may or may not do the right thing; I never tried it [20:25] neither method is going to get you comments [20:26] although I suppose you could procedurally generate them with a markov chain from hate speech [20:26] close enough approximation [20:27] haha [20:27] the benefit to that suggestion, of course, is an infinite comment stream [20:28] I'm usually fastidious about completeness when archiving stuff, but I have to say I never gave much thought to youtube comments.
[20:29] *** SirCmpwn has quit IRC (Read error: Operation timed out) [20:32] With the new YouTube design that uses infinite scrolling for comments, Archivebot with --phantomjs should be able to capture them [20:32] Though I guess it wouldn't expand the full reply chain for comments with more than 2 replies [20:34] *** SirCmpwn has joined #archiveteam [20:50] *** sep332 has quit IRC (Quit: konversation out) [20:59] *** odemg has quit IRC (Remote host closed the connection) [21:00] *** odemg has joined #archiveteam [21:10] *** odemg has quit IRC (Remote host closed the connection) [21:11] *** odemg has joined #archiveteam [21:23] *** sep332 has joined #archiveteam [21:30] *** Stiletto has quit IRC (Ping timeout: 250 seconds) [21:34] *** godane has quit IRC (Read error: Operation timed out) [21:46] *** godane has joined #archiveteam [22:01] *** odemg has quit IRC (Remote host closed the connection) [22:05] *** Stilett0 has joined #archiveteam [22:09] *** maelstrom has joined #archiveteam [22:13] *** JAA has quit IRC (Quit: Page closed) [22:33] *** VoidWhisp has quit IRC (ZNC 1.6.2+cygwin2 - http://znc.in) [22:33] *** VoidWhisp has joined #archiveteam [22:42] *** odemg has joined #archiveteam [22:44] *** BlueMaxim has joined #archiveteam [22:59] *** VoidWhisp has quit IRC (ZNC 1.6.2+cygwin2 - http://znc.in) [22:59] *** VoidWhisp has joined #archiveteam [23:12] *** maelstrom has quit IRC (Read error: Operation timed out) [23:14] *** Dark_Star has quit IRC (Ping timeout: 245 seconds) [23:14] *** Jonison has quit IRC (Read error: Connection reset by peer) [23:15] *** BartoCH has joined #archiveteam [23:19] *** BartoCH has quit IRC (Ping timeout: 260 seconds) [23:22] *** maelstrom has joined #archiveteam [23:26] *** Dark_Star has joined #archiveteam [23:28] *** antiufo has joined #archiveteam [23:29] *** ndiddy has joined #archiveteam [23:29] Hello, I'm registering a new account on the wiki. May I know the secret word? 
[23:34] Not with these words [23:38] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD [23:39] antiufo: and what would you like to contribute? [23:43] The Facebook page. I noticed that it currently only mentions ways of exporting your own data (e.g. Download a copy of your data), but it doesn't mention possible efforts to back up other pages [23:44] pm! [23:46] *** antiufo has quit IRC (Quit: Page closed) [23:47] *** antiufo has joined #archiveteam [23:47] antiufo: did you get it? [23:47] one second, i'm using a web client [23:48] I've sent it again, check your private messages [23:48] *** antiufo_ has joined #archiveteam