[00:03] *** antomatic has joined #archiveteam
[00:05] *** antomati_ has quit IRC (Read error: Operation timed out)
[00:21] ok, the books are ready too
[00:25] *** db48x has quit IRC (Quit: ERC (IRC client for Emacs 25.1.1))
[00:52] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[00:55] *** lesderid has quit IRC (Ping timeout: 260 seconds)
[01:04] *** lesderid has joined #archiveteam
[01:34] *** db48x has joined #archiveteam
[02:07] *** wcarss has quit IRC (Remote host closed the connection)
[02:59] *** Silvan has joined #archiveteam
[03:00] *** SilSte has quit IRC (Read error: Operation timed out)
[03:16] *** RoanKatto has joined #archiveteam
[03:17] *** Morbus has quit IRC (Read error: Operation timed out)
[03:18] *** ndiddy has quit IRC ()
[04:10] *** bwn has quit IRC (Read error: Operation timed out)
[04:11] *** pizzaiolo has quit IRC (Remote host closed the connection)
[04:13] *** bwn has joined #archiveteam
[04:35] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[04:41] *** Sk1d has joined #archiveteam
[05:32] *** maelstrom has quit IRC (Quit: Leaving)
[05:34] *** RoanKatto has quit IRC (Read error: Operation timed out)
[05:43] *** Aranje has quit IRC (Quit: Three sheets to the wind)
[06:23] *** Nyx has quit IRC (Ping timeout: 260 seconds)
[06:38] *** BlueMaxim has joined #archiveteam
[06:56] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[06:56] *** BlueMaxim has joined #archiveteam
[07:14] *** Nyx has joined #archiveteam
[07:26] *** odemg has quit IRC (Remote host closed the connection)
[07:28] *** schbirid has joined #archiveteam
[07:42] *** odemg has joined #archiveteam
[07:43] *** JAA has joined #archiveteam
[07:50] xmc, guest__: I only grabbed the website. They stopped serving content at 08:28:03 CEST, and sadly my grab did not finish in time. I got to about 66.7k .torrent files (of 72055 total); not sure about the description pages, but probably also around that number.
[07:53] By the way, there was already an ArchiveBot grab at the end of February, but it seems that it didn't get everything due to 500 Internal Server Errors etc.
[07:53] I would've loved to grab the torrents as well, but that's too much data for me at the moment (about 3.5 TiB estimated).
[08:03] *** Jonison has joined #archiveteam
[08:17] *** Honno has joined #archiveteam
[08:18] *** schbirid2 has joined #archiveteam
[08:22] *** schbirid has quit IRC (Read error: Operation timed out)
[08:23] *** atomotic has joined #archiveteam
[08:57] *** username1 has joined #archiveteam
[09:02] *** schbirid2 has quit IRC (Read error: Operation timed out)
[09:16] *** schbirid2 has joined #archiveteam
[09:19] *** username1 has quit IRC (Read error: Operation timed out)
[09:21] JAA: i deleted the obvious spam and dupes out of the ebooks category, from the 8500+ torrents only 460 remained
[09:22] I didn't bother to grab the "games - windows" category at all
[09:37] *** username1 has joined #archiveteam
[09:43] *** schbirid2 has quit IRC (Read error: Operation timed out)
[09:56] *** schbirid2 has joined #archiveteam
[09:59] *** edsu has quit IRC (Quit: leaving)
[10:00] *** username1 has quit IRC (Read error: Operation timed out)
[10:00] *** edsu has joined #archiveteam
[10:01] *** pnJay has quit IRC (Leaving)
[10:05] *** hive-mind has quit IRC (Ping timeout: 260 seconds)
[10:06] *** hive-mind has joined #archiveteam
[10:24] *** username1 has joined #archiveteam
[10:27] *** schbirid2 has quit IRC (Read error: Operation timed out)
[10:43] *** atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[10:51] *** JAA has quit IRC (Quit: Page closed)
[10:53] *** schbirid2 has joined #archiveteam
[10:57] *** username1 has quit IRC (Read error: Operation timed out)
[11:06] *** atomotic has joined #archiveteam
[11:28] *** icedice has joined #archiveteam
[12:07] looks like my local news archive is at 186gb
[12:14] *** username1 has joined #archiveteam
[12:18] *** schbirid2 has quit IRC (Read error: Operation timed out)
[12:37] *** odemg has quit IRC (Remote host closed the connection)
[12:40] *** schbirid2 has joined #archiveteam
[12:44] *** Morbus has joined #archiveteam
[12:45] *** username1 has quit IRC (Read error: Operation timed out)
[12:53] *** pizzaiolo has joined #archiveteam
[13:07] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[13:08] *** username1 has joined #archiveteam
[13:11] *** schbirid2 has quit IRC (Read error: Operation timed out)
[13:14] *** odemg has joined #archiveteam
[13:16] *** odemg has quit IRC (Remote host closed the connection)
[13:27] *** schbirid2 has joined #archiveteam
[13:30] *** username1 has quit IRC (Read error: Operation timed out)
[13:31] *** icedice has quit IRC (Quit: Leaving)
[13:45] *** kniffy has quit IRC (Ping timeout: 240 seconds)
[13:48] *** username1 has joined #archiveteam
[13:51] *** schbirid2 has quit IRC (Read error: Operation timed out)
[13:56] *** pnJay has joined #archiveteam
[14:02] *** kniffy has joined #archiveteam
[14:02] *** Nume has joined #archiveteam
[14:02] hello
[14:06] I just got myself a huge (~140 GB) WARC file filled with fanfiction. Now I am just wondering how I should open it, since my browser runs out of memory really quickly. I apologize if this kind of question does not belong here, but thank you for your attention anyway.
[14:07] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[14:10] *** odemg has joined #archiveteam
[14:23] It seems you could open it in an external program like Web Archive Player, but it does not seem to open. I can see it in my processes list though, so is it just slow to start up since the WARC file is so huge?
[14:23] I am quite perplexed how to deal with this, since I only need a couple of stories found inside the file.
[14:24] *** schbirid2 has joined #archiveteam
[14:27] *** username1 has quit IRC (Read error: Operation timed out)
[14:35] I guess give webarchiveplayer some time
[14:35] 140 GB is big
[14:36] you can also browse it in the wayback machine
[14:36] also let's go to #archiveteam-bs
[14:40] alright, thank you
[14:43] one thing though about web archive player, can I run out of memory trying to open this file?
[14:48] *** username1 has joined #archiveteam
[14:52] *** schbirid2 has quit IRC (Read error: Operation timed out)
[15:10] *** schbirid2 has joined #archiveteam
[15:13] *** username1 has quit IRC (Read error: Operation timed out)
[15:30] *** oli has quit IRC (Ping timeout: 260 seconds)
[15:37] *** oli has joined #archiveteam
[15:38] *** username1 has joined #archiveteam
[15:41] *** schbirid2 has quit IRC (Read error: Operation timed out)
[16:03] *** schbirid2 has joined #archiveteam
[16:07] *** username1 has quit IRC (Read error: Operation timed out)
[16:21] altlabel beardicus Jonison midas midas1: please spread the @
[16:31] *** username1 has joined #archiveteam
[16:35] *** schbirid2 has quit IRC (Read error: Operation timed out)
[16:47] * guest__ gives an @ to xmc
[16:52] is there a project to archive codeplex?
[16:52] codeplex.com
[17:00] *** schbirid2 has joined #archiveteam
[17:02] *** username1 has quit IRC (Read error: Operation timed out)
[17:14] Working on it now. Will need to do it again in October.
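On the 140 GB WARC question from 14:06 to 14:43: one way to pull out only a couple of stories without loading the whole archive is to read it one record at a time. The sketch below is a minimal illustration using only the Python standard library, not how Web Archive Player works internally. The function names (`iter_warc_records`, `find_responses`, `make_record`) and the example.org URLs are made up for the example, and it assumes well-formed records with a correct `Content-Length`; a dedicated WARC library will handle more edge cases.

```python
import gzip
import io


def iter_warc_records(stream):
    """Yield (headers, payload) one record at a time from a binary stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank lines separate consecutive records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"expected a WARC version line, got {line[:40]!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # an empty line ends the header block
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Only this one record's payload is held in memory at a time.
        payload = stream.read(int(headers.get("content-length", 0)))
        yield headers, payload


def find_responses(stream, url_substring):
    """Yield (url, payload) for response records whose URL contains the text."""
    for headers, payload in iter_warc_records(stream):
        uri = headers.get("warc-target-uri", "")
        if headers.get("warc-type") == "response" and url_substring in uri:
            yield uri, payload


def make_record(warc_type, uri, body):
    """Build a tiny, simplified WARC record for the demo (illustration only)."""
    head = (
        "WARC/1.0\r\n"
        f"WARC-Type: {warc_type}\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + body + b"\r\n\r\n"


# Demo on an in-memory archive. A .warc.gz is commonly one gzip member per
# record, which gzip.open reads back as a single continuous stream.
sample = gzip.compress(
    make_record("request", "http://example.org/story/1", b"GET /story/1")
) + gzip.compress(
    make_record("response", "http://example.org/story/1", b"Chapter one")
)
with gzip.open(io.BytesIO(sample), "rb") as archive:
    for url, payload in find_responses(archive, "/story/1"):
        print(url, len(payload))
```

For a file on disk, the same loop would run over `gzip.open(path, "rb")` for a compressed archive or `open(path, "rb")` for an uncompressed one, where `path` is whatever the file is called. Memory use then depends on the largest single record, not on the size of the archive, though a full pass over 140 GB will still take a while.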
[17:35] *** hook54321 has quit IRC (Ping timeout: 244 seconds)
[17:36] *** tammy_ has quit IRC (Ping timeout: 244 seconds)
[17:37] *** tammy_ has joined #archiveteam
[17:46] *** hook54321 has joined #archiveteam
[17:57] *** odemg has quit IRC (Remote host closed the connection)
[18:02] *** hook54321 has quit IRC (Ping timeout: 244 seconds)
[18:04] *** tuluut has quit IRC (Ping timeout: 244 seconds)
[18:04] *** tuluut has joined #archiveteam
[18:04] *** JAA has joined #archiveteam
[18:05] *** bRick5772 has joined #archiveteam
[18:07] *** hook54321 has joined #archiveteam
[18:08] You may already have heard about Norte de Ciudad Juarez, the Mexican newspaper that shut down because one of their journalists was executed two weeks ago.
[18:08] "Cantú [Norte executive] told the [Washington] Post he planned to tell his staff on Monday that the digital version of the paper would also be shutting down." (from http://time.com/4723047/newspaper-mexico-close-norte-de-ciudad-juarez/ )
[18:08] Not sure whether that means "we're not publishing anything anymore" or "we'll take everything down", but it might be a good idea to grab http://nortedigital.mx/ just in case.
[18:13] Agreed
[18:43] *** pizzaiolo has quit IRC (Ping timeout: 245 seconds)
[18:47] *** pizzaiolo has joined #archiveteam
[18:53] *** odemg has joined #archiveteam
[18:54] *** Nume has left
[19:00] *** ndiddy has joined #archiveteam
[19:03] *** JensRex has quit IRC (Remote host closed the connection)
[19:04] *** JensRex has joined #archiveteam
[19:10] *** atomotic has joined #archiveteam
[19:12] http://sci-hub.cc/downloads/doi.7z http://sci-hub.cc/downloads/dois-researchgate.log
[19:13] *** username1 has joined #archiveteam
[19:14] 403 forbidden on both. ?
[19:15] both saved just fine for me
[19:16] schbirid2, what are these?
[19:18] *** schbirid2 has quit IRC (Read error: Operation timed out)
[19:18] Kaz, upload to IA?
[19:18] I have no idea what these are
[19:20] shared by sci hub twitter
[19:20] some lists of dois compared to researchgate iirc
[19:20] https://twitter.com/sci_hub
[19:24] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[19:28] *** Kriel has joined #archiveteam
[19:32] *** schbirid2 has joined #archiveteam
[19:35] *** username1 has quit IRC (Read error: Operation timed out)
[20:19] *** schbirid2 has quit IRC (Quit: Leaving)
[20:24] https://coinplay.io/ is shutting down. Unfortunately, it looks like most content is inaccessible or gone already now as most pages just redirect to the homepage. The blog's still online though and lists at least part of what was available on the store, it seems.
[20:27] *** Honno has quit IRC (Quit: Leaving)
[20:31] https://www.mathjax.org/cdn-shutting-down/ -- not sure if this is worth investigating. It's not easily possible to find everything hosted on their CDN.
[20:33] JAA, we can just archive the current CDN files, and the 404 resolver will solve it.
[20:33] *** pnJay has quit IRC (Quit: Leaving)
[20:35] rocode: I'm not sure what you mean. How would you get "the current CDN files"? There isn't any list as far as I can see, so those would need to be scraped from the web or something like that.
[20:36] JAA: Search google for links to cdn.mathjax.org, compile list, remove duplicates, feed into archivebot.
[20:37] Yeah, that would work.
[20:43] Last one for today: http://deadrising.org/ ("a minecraft version of the COD Zombies mode"; essentially a store and a small community forum) has merged into http://journeygaming.com/ and will disappear "in a few days" as of 30th of March, i.e. any second now.
[20:43] *** sep332 has joined #archiveteam
[20:45] *** Jonison has quit IRC (Read error: Connection reset by peer)
[20:45] *** sep332_ has quit IRC (Read error: Operation timed out)
[20:52] *** bRick5772 has quit IRC (Quit: Leaving.)
[20:53] I just saw that the Mininova grab is still running. The server is just responding with "Mininova.org is no more!" to everything since this morning (CEST), so you may want to stop it or skip all mininova.org URLs (unless ArchiveBot has special access or something?).
[20:54] iirc we don't. i'll clean it up
[20:55] *** icedice has joined #archiveteam
[21:00] *** pnJay has joined #archiveteam
[21:11] *** odemg has quit IRC (Remote host closed the connection)
[21:12] *** odemg has joined #archiveteam
[22:03] *** Mateon1 has quit IRC (Ping timeout: 245 seconds)
[22:04] *** Mateon1 has joined #archiveteam
[22:10] *** odemg has quit IRC (Remote host closed the connection)
[22:13] *** odemg has joined #archiveteam
[22:15] *** JAA has quit IRC (Quit: Page closed)
[22:17] *** odemg has quit IRC (Remote host closed the connection)
[22:24] *** tuluu_ has joined #archiveteam
[22:26] *** tuluu has quit IRC (Read error: Operation timed out)
[22:27] *** maelstrom has joined #archiveteam
[22:30] *** odemg has joined #archiveteam
[22:36] *** kcaj has joined #archiveteam
[22:45] *** pizzaiol1 has joined #archiveteam
[22:52] *** pizzaiolo has quit IRC (Remote host closed the connection)
[22:57] *** khaoohs has joined #archiveteam
[23:00] *** BlueMaxim has joined #archiveteam
[23:34] *** pizzaiol1 has quit IRC (Read error: Operation timed out)
[23:51] *** guest__ has quit IRC (Ping timeout: 268 seconds)
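The "compile list, remove duplicates" step suggested at 20:36 for cdn.mathjax.org could look roughly like the sketch below. It assumes the candidate links have already been collected by some other means (search results, scraped pages) and only covers filtering and deduplication. The function name `cdn_urls`, the sample links, and the choice to treat protocol-relative links as `http` are assumptions made for the example; how the resulting list is handed to ArchiveBot is not shown.

```python
from urllib.parse import urlsplit


def cdn_urls(candidates, host="cdn.mathjax.org"):
    """Return unique URLs on `host`, in first-seen order, without fragments."""
    seen = set()
    unique = []
    for raw in candidates:
        url = raw.strip()
        if url.startswith("//"):
            url = "http:" + url  # assumption: read protocol-relative links as http
        parts = urlsplit(url)
        if parts.hostname != host:
            continue  # skips blank entries and links to other hosts
        # Fragments never reach the server, so drop them before comparing.
        cleaned = parts._replace(fragment="").geturl()
        if cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique


# Illustrative input only; a real list would come from search results.
found = [
    "http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML",
    "http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML#top",
    "https://example.org/unrelated.js",
    "//cdn.mathjax.org/mathjax/latest/extensions/tex2jax.js",
    "",
]
print("\n".join(cdn_urls(found)))
```

The output is one URL per line, which is a convenient shape for a plain-text list. Query strings are kept on purpose, since different `config=` values may be served as different files.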