[00:16] *** Start has joined #archiveteam
[00:21] *** ndiddy has joined #archiveteam
[00:32] *** notjack has quit IRC (Ping timeout: 255 seconds)
[00:34] hi
[00:35] why is archive team's choice on the warrior still game trailers, when that project is finished
[00:35] there's nothing to grab currently
[00:36] and the rsync server needs to rest
[00:38] choice is now urlteam2
[00:54] did you guys hear that google is closing picasa
[00:54] i'm sure that will be fun :^)
[01:00] nvm, stuff's being transferred to google photos
[01:03] *** hive-mind has quit IRC (Ping timeout: 260 seconds)
[01:03] *** hive-mind has joined #archiveteam
[01:04] honestly I thought they did that years ago
[01:04] with g+
[01:05] *** xXx_ndidd has joined #archiveteam
[01:10] *** kyan has joined #archiveteam
[01:11] hey what's with archiving search results, is that even a thing?
[01:11] *** ndiddy has quit IRC (Read error: Operation timed out)
[01:11] with all the SEO pollution even
[01:21] *** xXx_ndidd is now known as ndiddy
[01:48] *** BubuAnabe has quit IRC (Ping timeout: 633 seconds)
[01:50] *** vitzli has joined #archiveteam
[02:06] *** vitzli_ has joined #archiveteam
[02:09] *** vitzli has quit IRC (Ping timeout: 246 seconds)
[02:28] *** schbirid2 has joined #archiveteam
[02:29] *** vitzli_ is now known as vitzli
[02:30] *** VADemon has quit IRC (Quit: left4dead)
[02:30] *** schbirid has quit IRC (Read error: Operation timed out)
[02:34] *** yipdw has quit IRC (Read error: Connection reset by peer)
[02:34] *** philpem has quit IRC (Ping timeout: 260 seconds)
[02:35] *** yipdw has joined #archiveteam
[02:57] *** espes__ has joined #archiveteam
[03:17] *** megaminxw has quit IRC (Quit: Leaving.)
[03:35] *** nickname_ has quit IRC (Ping timeout: 300 seconds)
[03:45] *** BubuAnabe has joined #archiveteam
[03:50] *** xXx_ndidd has joined #archiveteam
[03:51] *** megaminxw has joined #archiveteam
[03:56] *** ndiddy has quit IRC (Read error: Operation timed out)
[03:56] *** xXx_ndidd is now known as ndiddy
[03:59] *** WinterFox has quit IRC (Read error: Permission denied)
[04:00] *** cadbury has quit IRC (Read error: Operation timed out)
[04:01] *** cadbury has joined #archiveteam
[04:02] *** WinterFox has joined #archiveteam
[04:04] *** BubuAnabe has quit IRC (Ping timeout: 730 seconds)
[04:24] *** mutoso has quit IRC (Ping timeout: 252 seconds)
[04:34] *** vitzli has quit IRC (Leaving)
[05:05] *** espes__ has quit IRC (Read error: Operation timed out)
[05:39] *** Sk1d has quit IRC (Ping timeout: 250 seconds)
[05:41] *** wp494 has quit IRC (Read error: Connection reset by peer)
[05:46] *** Sk1d has joined #archiveteam
[05:46] *** robink has quit IRC (Ping timeout: 190 seconds)
[05:47] *** robink has joined #archiveteam
[05:56] *** robink has quit IRC (Ping timeout: 190 seconds)
[05:57] *** robink has joined #archiveteam
[05:59] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[06:00] *** Stiletto has joined #archiveteam
[06:25] *** wp494 has joined #archiveteam
[06:51] *** Jogie has quit IRC (Ping timeout: 250 seconds)
[07:04] *** Aranje has joined #archiveteam
[07:15] *** oldcad has left
[07:24] *** megaminxw has quit IRC (Quit: Leaving.)
[07:49] *** megaminxw has joined #archiveteam
[08:07] *** GLaDOS has quit IRC (Ping timeout: 260 seconds)
[08:08] *** GLaDOS has joined #archiveteam
[08:12] *** jut has joined #archiveteam
[08:13] *** megaminxw has quit IRC (Read error: Operation timed out)
[08:14] *** jspiros has quit IRC (Read error: Operation timed out)
[08:17] *** megaminxw has joined #archiveteam
[08:19] *** jspiros has joined #archiveteam
[08:21] http://techcrunch.com/2016/02/12/youtube-acquires-bandpage/ Do we have it?
[08:24] *** philpem has joined #archiveteam
[09:13] I'm getting a load of warrior failures in the rsync stage on fotolog
[09:13] https://www.irccloud.com/pastebin/LWkwgGvs/
[09:14] :/ it's making it way slower than it would be otherwise, all 6 workers are locked up on that error loop right now
[09:14] TheKiwi, the staging server that receives your files is quite overloaded at the moment :P
[09:14] heh
[09:14] I assumed it was something along those lines, but wanted to make sure it wasn't my side
[09:15] yeah np
[09:16] *** jut has quit IRC (Ping timeout: 492 seconds)
[09:16] ok it's time to cut some rsyncs down
[09:16] 09:15:46 up 239 days, 12:53, 3 users, load average: 181.65, 183.79, 185.30
[09:16] that's a... rather large number :D
[09:16] if you know someone who is running a gazillion instances, tell them they are not helping
[09:17] maybe this is all gametrailers
[09:18] yes, it is
[09:20] yeah, it's all multi-gigabyte GameTrailers videos
[09:20] unfortunately GameTrailers and fotolog share the same rsync module, so they're going to be subject to the same connection limit
[09:22] got a week before fotolog kicks it
[09:23] I don't know what I'm supposed to do with that information
[09:23] it's not like I can inject more IOPS inversely proportional to shutdown date
[09:23] a second rsync host would be very helpful
[09:25] we do have another one, but its disk array needs rebuilding and I still need to get the old stuff off of it
[09:26] do you have an estimate on the space required?
[09:26] I just got an idea
[09:27] I can't upload TO the docstoc collection BUT I can make my own collection
[09:27] I have no idea why it took me so long to figure this out
[09:27] well, I think I can?
[09:27] nope
[09:28] items in collections are exempt from things like spam checking (iirc) so creating them requires info@archive.org
[09:28] oh interesting I can shove it in some archive team collections but not others
[09:29] However, if you can just add an extra subject tag like "fordocstoc" the items can easily be added to another collection later
[09:29] *** Fletcher sets mode: +o yipdw
[09:34] well, all the items have "Docstoc Dry Dock" in their title so they should be discoverable
[09:34] let's see if shoving it into community texts works
[09:35] *** Tomcat_ has joined #archiveteam
[09:52] drydocstoc
[09:53] *** jut has joined #archiveteam
[09:53] *** jut_ has joined #archiveteam
[09:53] *** jut_ has left
[10:06] *** philpem has quit IRC (Ping timeout: 260 seconds)
[10:18] *** notjack has joined #archiveteam
[10:25] *** notjack has quit IRC (Ping timeout: 255 seconds)
[10:28] *** Tomcat_ has quit IRC (Remote host closed the connection)
[11:08] *** msgctl has joined #archiveteam
[11:32] *** GLaDOS has quit IRC (Read error: Operation timed out)
[11:32] *** GLaDOS has joined #archiveteam
[11:52] *** xekc has joined #archiveteam
[11:56] yipdw: if they have itemtype web it doesn't matter where they are uploaded
[12:03] *** msgctl has quit IRC (Quit: http://chat.efnet.org (EOF))
[12:34] *** VADemon has joined #archiveteam
[12:55] *** lunGunit has joined #archiveteam
[13:01] *** lunG has quit IRC (Read error: Operation timed out)
[13:41] *** mismatch has joined #archiveteam
[13:48] *** redlob has quit IRC (Quit: ZNC - http://znc.in)
[13:58] *** WinterFox has quit IRC (Remote host closed the connection)
[14:04] *** redlob has joined #archiveteam
[14:09] *** kyan has quit IRC (This computer has gone to sleep)
[14:13] *** vitzli has joined #archiveteam
[14:26] I'd like to get a new service up next week for videos
[14:27] Not all videos currently grabbed by youtube-dl are playable in the Wayback Machine, sometimes more files need to be saved than youtube-dl saves
[14:28] The service should also be able to upload videos as video items to IA besides grabbing them as WARCs
[14:29] It should also do discoveries for all videos of a user, playlists, etc. and grab all those videos
[14:30] What do you think of such a service?
[14:44] *** nickname_ has joined #archiveteam
[14:44] *** arkiver3 has joined #archiveteam
[14:44] *** VADemon has quit IRC (left4dead)
[14:45] *** megaminxw has quit IRC (Quit: Leaving.)
[14:51] Hm, I think FOS is timing out on rsync when resuming large (Gametrailers) files
[14:55] it's being hit with up to 20gb uploads
[14:56] (nods)
[14:56] Mm, I appreciate the problem.
[14:57] (and some of those uploads will be duplicates given that the outstanding items appeared to get reissued to warriors a few times)
[14:59] wonder if a received item search might be useful - e.g. if I could tell that 295233, 295547 and 295555 had already come back from other sources, i could just turn that warrior off and stop it smacking FOS all the time, trying to upload something that's already there
[14:59] [or, equally, leave it running if they haven't come back yet]
[15:17] *** Tomcat_ has joined #archiveteam
[15:33] *** nickname_ has quit IRC (Ping timeout: 300 seconds)
[15:35] *** nickname_ has joined #archiveteam
[15:48] *** nickname_ has quit IRC (Read error: Connection reset by peer)
[15:54] Soundcloud.com just (?) updated Terms of Use
[15:56] *** ultra1 has joined #archiveteam
[15:57] "You must not collect or attempt to collect personal data, or any other kind of information about other users, including without limitation, through spidering or any form of scraping" Yeah right.
[15:58] can anyone point me in the direction of the people running the Windows 3.11 archive? Sorry, I'm guessing this is probably the wrong channel, but it looks like some of you are involved with it
[15:58] SketchCow, ^^
[16:00] mostly just wondering about if I as a pleb user can upload content the emulator can run and if so, what's needed? Is it just the meta XML or is there more? I see some sqlite file too
[16:00] ultra1, http://digitize.archiveteam.org/index.php/Making_Software_Emulate_on_IA
[16:01] ultra1, they are derived
[16:01] aka, the archive makes it for you
[16:01] vitzli: ah, okay, thank you, that seems like exactly what I was looking for
[16:14] *** VADemon has joined #archiveteam
[16:17] Boop
[16:18] Lots of FOS discussion
[16:19] The load was 250, so the fact it's at 128 is at least an improvement
[16:19] If it's truly holding back fotolog grabbing, then I guess I better increase rsyncs
[16:21] wiki was 508ing earlier
[16:23] https://archive.org/details/exile3_ruined_world yay it works :)
[16:28] *** VADemon has quit IRC (Quit: left4dead)
[16:29] BandPage has been acquired by Youtube and has been losing users for a while.
[16:33] I increased RSYNC to 150
[16:46] *** Swizzle has joined #archiveteam
[16:55] arkiver: i like the video archiving idea
[17:18] *** VADemon has joined #archiveteam
[17:22] Load up to 165
[17:29] *** alberto has quit IRC (Read error: Operation timed out)
[17:51] ouch https://monitor.archive.org/weathermap/weathermap.html
[17:51] though that's outgoing
[17:58] *** xekc has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
[18:04] everybody be downloading windows 3.1
[18:07] *** GNUtoo-i1 has joined #archiveteam
[18:10] arkiver, as we discussed earlier, I'm all for the youtube project
[18:21] arkiver, sounds like a good idea. Could you include periodic scraping for new videos?
[18:23] *** Zei-Pii has joined #archiveteam
[18:23] yipdw: oh it's been a while since I saw the pipes so busy
[18:26] *** vitzli has quit IRC (Leaving)
[18:35] *** ats has quit IRC (Read error: Operation timed out)
[18:39] haha. i love the 500 error page I just got
[18:40] *** Swizzle has quit IRC (Read error: Operation timed out)
[18:49] go barbra! fight the horde of zombies breaking down the routers
[18:49] *** bunni has joined #archiveteam
[18:50] are there any current issues with the server? Over the past two days I've been getting io timeouts while trying to rsync.
[18:50] using the VM
[18:51] bunni, yeah the amount of connections has been dropped
[18:52] Every now and then I'll get a connection, upload a few MB, and then it times out and I spin for a while getting another connection.
[18:52] [sender] io timeout after 301 seconds -- exiting
[18:52] rsync error: timeout in data send/receive (code 30) at io.c(140) [sender=3.0.7]
[19:03] *** Swizzle has joined #archiveteam
[19:11] bunni: the rsync server and the network it is on are experiencing a lot of load right now
[19:16] yep - any reasonably large file is failing to resume because of the load
[19:17] [but obviously, a fair bit of the load is because everything is then retrying, over and over again]
[19:20] Any way to safely pause my upload and try to run it later without losing the data?
[19:21] *** espes__ has joined #archiveteam
[19:22] If you're running the warrior VM then you should be able to just pause the VM and resume it again later
[19:23] Alright, I wasn't sure if that would somehow screw with the tracker
[19:23] No, that won't be a problem. :)
[19:24] alright. One less client hammering on the server. Hope it helps.
[19:28] *** arkiver3 has quit IRC (Ping timeout: 252 seconds)
[19:35] *** espes__ has quit IRC (Ping timeout: 250 seconds)
[19:38] *** atomotic has joined #archiveteam
[19:41] Is downloading all of Sci-Hub a possibility for Archive Team, or is that too hot?
[19:41] Some Russians have hosted something like 20TB of science papers.
[19:41] Some legal entities are mad, and it probably won't last long.
[19:42] w0rp, we can do that in our sleep
[19:42] I think it's technically feasible. I'm just wondering about the legal aspect.
[19:44] w0rp: rather pointless, IA wouldn't host it
[19:44] But a (set of) mega-torrent(s) may be in order
[19:44] I think it would be nice if *others* had copies of it. Even if they don't surface for decades.
[19:45] Universities have copies, for that matter.
[19:46] *** oldcad has joined #archiveteam
[19:46] Most contracts signed by universities in the last decade for digital-only subscriptions include a (retroactive) copy of all the data in university premises.
[19:51] *** arkiver3 has joined #archiveteam
[19:54] *** jut has quit IRC (Read error: Connection reset by peer)
[19:57] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[20:25] *** mismatch has quit IRC (Remote host closed the connection)
[20:29] *** ats has joined #archiveteam
[20:31] *** mismatch has joined #archiveteam
[20:37] *** jmtd is now known as Jon
[20:46] *** kyan has joined #archiveteam
[20:58] *** RichardG has quit IRC (Ping timeout: 246 seconds)
[20:59] *** Zei-Pii has quit IRC (Read error: Operation timed out)
[21:52] *** philpem has joined #archiveteam
[22:06] *** kyan has quit IRC (Quit: Leaving)
[22:17] *** arkiver3 has quit IRC (Read error: Connection reset by peer)
[22:19] *** arkiver3 has joined #archiveteam
[22:27] *** WinterFox has joined #archiveteam
[22:41] *** Tomcat_ has quit IRC (Remote host closed the connection)
[22:45] *** megaminxw has joined #archiveteam
[23:00] *** arkiver3 has quit IRC (Ping timeout: 252 seconds)
[23:06] *** RichardG has joined #archiveteam
[23:26] *** kyan has joined #archiveteam
[23:42] *** espes__ has joined #archiveteam
[23:45] *** icedice has joined #archiveteam
[23:45] I've finally added Forrst, Invisionfree, and Megalodon.jp to the wiki:
[23:45] http://archiveteam.org/index.php?title=Forrst
[23:45] http://archiveteam.org/index.php?title=Invisionfree
[23:46] http://archiveteam.org/index.php?title=Megalodon.jp
[23:46] and I've also added them to http://archiveteam.org/index.php?title=Alive..._OR_ARE_THEY#Watchlist
[23:46] I'll add screenshots later
[23:48] Are any of these web archivers worth archiving?
[23:49] https://www.reddit.com/r/KotakuInAction/comments/30q92y/silverstring_media_might_be_up_to_no_good/cpuum5x
[23:49] I haven't heard of most of them