[00:00] *** schbirid2 has quit IRC (Read error: Operation timed out) [00:06] *** Stiletto has quit IRC () [00:09] *** kristian_ has joined #archiveteam-bs [00:17] *** schbirid has quit IRC (Ping timeout: 255 seconds) [00:26] *** nyany has quit IRC (Leaving) [00:29] *** schbirid has joined #archiveteam-bs [00:30] *** Stiletti has joined #archiveteam-bs [00:31] *** Stiletti is now known as Stiletto [00:59] All of my Roblox workers are either on 500 errors or rsync max connections(120) [01:04] *** kristian_ has quit IRC (Quit: Leaving) [01:22] *** nyany has joined #archiveteam-bs [01:46] *** j08nY has quit IRC (Remote host closed the connection) [01:56] *** schbirid has quit IRC (Ping timeout: 255 seconds) [02:02] chfoo, any way you can nudge your rsync connection max up a bit [02:04] oh wait it's on FOS [02:04] nvm [02:04] I'm dumb [02:09] *** schbirid has joined #archiveteam-bs [02:16] *** kristian_ has joined #archiveteam-bs [02:19] *** TheLovina has quit IRC (Read error: Operation timed out) [02:20] *** TheLovina has joined #archiveteam-bs [02:36] *** ld1 has quit IRC (Ping timeout: 260 seconds) [02:37] *** ld1 has joined #archiveteam-bs [03:03] *** kristian_ has quit IRC (Quit: Leaving) [03:22] http://www.archiveteam.org/ is returning 509 [03:36] *** schbirid has quit IRC (Read error: Operation timed out) [03:48] *** schbirid has joined #archiveteam-bs [03:51] *** pizzaiolo has quit IRC (Quit: pizzaiolo) [03:53] wait a bit [03:53] try again [04:11] *** Stiletto has quit IRC () [04:33] *** schbirid has quit IRC (Read error: Operation timed out) [04:45] *** schbirid has joined #archiveteam-bs [04:47] *** Sk1d has quit IRC (Ping timeout: 194 seconds) [04:52] *** Sk1d has joined #archiveteam-bs [05:03] been nearly 2 hours now [05:51] *** schbirid has quit IRC (Ping timeout: 255 seconds) [06:04] *** schbirid has joined #archiveteam-bs [06:21] *** Honno has joined #archiveteam-bs [06:25] *** ld1 has quit IRC (Ping timeout: 260 seconds) [06:27] *** ld1 has joined 
#archiveteam-bs [06:28] Is there a channel for Roblox? [06:57] nope, this is it [06:57] okay, thanks [07:23] archiveteam, running out of bandwidth since 2017 ;) [07:25] and rsync connections on FOS too [07:26] (BTW SketchCow, any chance of upping the limit a bit or is 120 all we're getting?) [08:02] that box is probably dying under that load anyway [08:05] possibly [08:05] might need another target that can handle much higher [08:22] *** j08nY has joined #archiveteam-bs [08:27] *** schbirid has quit IRC (Ping timeout: 255 seconds) [08:41] *** schbirid has joined #archiveteam-bs [08:44] *** tuluu has quit IRC (Ping timeout: 260 seconds) [09:14] *** kurt_ has quit IRC (Read error: Operation timed out) [09:14] *** Igloo_ has quit IRC (Read error: Operation timed out) [09:15] *** mgrytbak has quit IRC (Read error: Operation timed out) [09:15] *** tapedrive has quit IRC (Read error: Operation timed out) [09:16] *** tapedrive has joined #archiveteam-bs [09:16] *** kurt has joined #archiveteam-bs [09:16] *** Igloo has joined #archiveteam-bs [09:17] *** ItsYoda has quit IRC (Quit: rippppp to the yoda you used to know!) [09:17] *** mgrytbak has joined #archiveteam-bs [09:21] *** Hecatz has quit IRC (Ping timeout: 268 seconds) [09:23] *** ItsYoda has joined #archiveteam-bs [09:24] *** schbirid has quit IRC (Read error: Operation timed out) [09:25] *** Hecatz has joined #archiveteam-bs [09:34] *** zhongfu has quit IRC (Ping timeout: 260 seconds) [09:37] *** schbirid has joined #archiveteam-bs [09:38] *** Ravenloft has quit IRC (Ping timeout: 260 seconds) [09:38] *** zhongfu has joined #archiveteam-bs [09:55] *** schbirid2 has joined #archiveteam-bs [09:55] *** sun_shine has joined #archiveteam-bs [09:56] *** efsnable has joined #archiveteam-bs [09:57] I'm interested in archiving a pyramid scheme's website [09:58] *** schbirid has quit IRC (Read error: Operation timed out) [09:59] Said pyramid scheme has "virtual parties", the IDs of which are incremented integers. 
Over the past month the average is 1 party every 11.4 seconds. [10:00] The URLs to archive follow a pattern like http://example.scam/{user_id}/party/{party_id}/view [10:02] but a GET request to http://example.scam/party/{party_id}/view will return a 302 to the correct URL with the username [10:08] My question is somewhat about whether this is worth pointing ArchiveBot at and then secondarily how to best handle either a range or just a list of URLs covering a given time period. [10:13] *** DFJustin has quit IRC (Ping timeout: 260 seconds) [10:14] *** DFJustin has joined #archiveteam-bs [10:14] *** swebb sets mode: +o DFJustin [10:16] *** BlueMaxim has quit IRC (Read error: Operation timed out) [10:16] *** BlueMaxim has joined #archiveteam-bs [10:26] *** bitBaron has quit IRC (Read error: Operation timed out) [10:27] *** ja0Hai has quit IRC (Ping timeout: 260 seconds) [10:27] *** ja0Hai has joined #archiveteam-bs [10:44] *** zhongfu has quit IRC (Ping timeout: 260 seconds) [10:45] *** zhongfu has joined #archiveteam-bs [10:51] *** zhongfu has quit IRC (Read error: Connection reset by peer) [10:52] *** zhongfu has joined #archiveteam-bs [11:02] you can add a list of urls using !archiveonly < https://www.example.com/some-file.txt [11:02] https://archivebot.readthedocs.io/en/latest/commands.html#archiveonly-file [11:04] cc sun_shine [11:17] *** sun_shine has quit IRC (Ping timeout: 245 seconds) [11:50] im working on bringing up another rsync target [11:59] *** j08nY has quit IRC (Quit: Leaving) [12:11] *** BlueMaxim has quit IRC (Quit: Leaving) [12:14] *** username1 has joined #archiveteam-bs [12:17] *** schbirid2 has quit IRC (Read error: Operation timed out) [12:24] 120 is about all we can do. [12:24] That machine gets mega-hit all the time. 
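Editorial sketch for the party-URL question above: a minimal Python example, not anything posted in the channel, showing how a list of URLs for a range of party IDs could be generated as plain text, one URL per line, and how the quoted rate of one party every 11.4 seconds translates into a rough ID count for a time window. The function names and the ID range are hypothetical. The URL uses the redirecting form quoted in the log, which assumes the crawler follows the 302 to the per-user URL. The output would still have to be saved and hosted somewhere reachable before being passed to `!archiveonly < ...`.

```python
from typing import Iterator

# Average rate quoted in the channel: one party every 11.4 seconds.
SECONDS_PER_PARTY = 11.4

# Redirecting URL form quoted in the log; example.scam is the placeholder used there.
URL_TEMPLATE = "http://example.scam/party/{party_id}/view"


def estimate_party_count(window_seconds: float,
                         seconds_per_party: float = SECONDS_PER_PARTY) -> int:
    """Rough number of party IDs created during a window of the given length."""
    return round(window_seconds / seconds_per_party)


def party_urls(first_id: int, last_id: int) -> Iterator[str]:
    """Yield one URL per party ID in the inclusive range [first_id, last_id]."""
    for party_id in range(first_id, last_id + 1):
        yield URL_TEMPLATE.format(party_id=party_id)


def render_url_list(first_id: int, last_id: int) -> str:
    """Return the URL list as text, one URL per line, ready to save to a file."""
    return "".join(url + "\n" for url in party_urls(first_id, last_id))


if __name__ == "__main__":
    # Hypothetical ID range, purely for illustration.
    print(render_url_list(1000, 1002), end="")
    print("approx. parties per day:", estimate_party_count(24 * 60 * 60))
```

At the quoted rate a single day covers roughly 7,600 IDs, so a list for a full month would run to a couple of hundred thousand URLs; splitting it into several smaller files may be easier to handle.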
[12:29] *** schbirid2 has joined #archiveteam-bs [12:30] *** username1 has quit IRC (Read error: Operation timed out) [12:48] *** ld1 has quit IRC (Ping timeout: 260 seconds) [12:49] *** ld1 has joined #archiveteam-bs [12:51] My guy who usually gives me 10-15 new CD-ROMs with images every two weeks, stumbled on a collection of Russian Warez CDs. [12:51] We already have a lot of Russian Warez CDs, but this one, Triada, is complete and massive. [12:52] He has a stupid fast pipe to us, it's usually a case of uploading the materials pretty quickly. [12:52] But he's been at it over a day and a half. [12:52] 151 CDs and DVDs, and 180gb so far. [12:56] wow, that's pretty damn neat [13:25] *** TheLovina has quit IRC (Ping timeout: 1208 seconds) [13:35] *** Stiletti has joined #archiveteam-bs [13:42] I remember there was a way to view the job log (derive tasks etc.) of an item that was uploaded to IA. anyone know which URL I need to reach this? [13:46] it was a site with a couple tables with small, colored cells which contained links to the raw job logs etc. Maybe it was even external to archive.org, I'm not sure [13:55] nope, it's on the https://monitor.archive.org/ page [13:57] my bad, wrong box [13:58] *** username1 has joined #archiveteam-bs [14:00] but i think you might need admin rights for it [14:01] no, I have definitely seen these logs for some of my uploads before [14:02] Darkstar: archive.org/history/ ? [14:02] got a link to your upload? [14:02] yes, exactly. thanks @PurpleSym! 
[14:03] the only one i know is https://catalogd.archive.org/ :x [14:05] *** schbirid2 has quit IRC (Read error: Operation timed out) [14:05] *** username1 has quit IRC (Read error: Operation timed out) [14:08] *** schbirid has joined #archiveteam-bs [14:08] *** Stiletto has joined #archiveteam-bs [14:11] *** Stiletti has quit IRC (Ping timeout: 260 seconds) [14:21] I'll drop some concurrent on Roblox until another rsync target becomes available [14:24] Hm, who is this Jeff Kaplan and why does he mark perfectly valid KryoFlux dumps on archive.org as "spam"? ;-) [14:24] (not my upload though, so I'm probably not the right person to contact him about it) [14:25] dorkstar [14:26] Jeff is the dude @ archive.org [14:27] hi midas [14:27] remember me [14:28] u lesbo freak [14:28] midas: "the dude"? I thought that's Jason :) [14:29] efsnable: no, should i? [14:29] i think we have a soundcloud employee in here.. [14:30] or that guy that runs/ran twitpic [14:30] can't tell, they're all so, so angry [14:30] oh yeah the twitpic dude, he was really angry [14:30] i liked him [14:31] im gay [14:32] i don't care what you are. [14:32] i sell weed to elementary school kids [14:32] i dont care what you do either [14:33] they caught me [14:33] im going to prison soon [14:33] for 5 years [14:33] ok [14:33] midas r u a chick ? [14:34] can i suck u [14:35] *** yipdw sets mode: +b *!bossgt100@70.39.109.163 [14:35] *** efsnable was kicked by yipdw (efsnable) [14:37] Aw I'm too late with the popcorn [14:37] * mls *sulks* [14:42] ty yipdw [14:43] *** kittymeow has joined #archiveteam-bs [14:43] Darkstar: I poked jason [14:44] Does anyone know a good way of archiving a page that needs a login/cookie with an external site like archive.is or webcitation or wayback [14:44] kaplan is an (the only?) admin who has to go through all the tons of crap coming in daily, sometimes he makes a mistake [14:45] DFJustin: thanks. this is about samna-ami-kryoflux by the way. 
I didn't realize that there is actually a real person sifting through all the uploads all day long :) [14:45] Where sure a single person logged on can save it, but then there's no way to know if that person edited the page they archived to add or change stuff [14:45] so it'd be really good if there was anything like that that would act as a proxy or something and then archive it [14:46] While you are logged in [14:47] hmm can https://webrecorder.io/ do that? (I haven't tried) [14:48] example https://register.thesecretworld.com/account/paidservice/ctrl/offer the whole of the http://tswshop.funcom.com site is only visible while logged in, but it contains prices [14:48] thanks I'll try [14:50] *** ld1 has quit IRC (Ping timeout: 260 seconds) [14:51] *** ld1 has joined #archiveteam-bs [14:52] "Webrecorder MaintenanceWebrecorder is being upgraded!Please come back soon!" :( [14:52] you can run it on your own computers [14:53] https://github.com/webrecorder/webrecorder [14:54] *** godane has quit IRC (Quit: Leaving.) [14:54] *** PurpleSym sets mode: +o midas [14:55] I use https://addons.mozilla.org/addon/scrapbook for stuff like that it's really good.. but the point is if it's on client side instead of a remote site, it's hard to prove with controversial or money related stuff that the person archiving it didn't edit it before sharing the archive files [14:59] *** qw3rty3 has joined #archiveteam-bs [15:03] This could apply to a lot of political stuff on facebook etc too, more and more stuff is locked behind "you need an account to view this" as corporations get more confident of a monopoly that they know people will feel forced to do it [15:04] *** Zebranky has quit IRC (Ping timeout: 633 seconds) [15:04] *** Zebranky has joined #archiveteam-bs [15:07] oh boy, instantly the uploads start [15:08] *** schbirid has quit IRC (Read error: Operation timed out) [15:11] GLaDOS: Oh so you've noticed eh? 
;-) [15:11] well the second i added combine harvester as a target, about 150 pipelines connected [15:12] Isn't there a round robin or load balancing thingamabob? [15:13] there is in the tracker, but it only works if there's more than one target [15:13] before it was only FOS [15:14] Ah right, I see now (actually paying attention to the wall of text) [15:14] so hopefully that should keep things smooth [15:14] kittymeow: actually I think https://www.taricorp.net/2016/web-history-warc/ this might be interesting to you, it uses your firefox cookies to get data from sites that require login. [15:16] I don't know what the current threshold is on the tracker, but I have a guess it's enough to congest both rsync targets like easy looking at the avg upload size [15:16] well right now the limit is at 150 items/minute, i'm not sure why it's set at that so i'm leaving it [15:17] may i suggest #robloxd [15:18] By all means [15:21] *** Stiletto has quit IRC () [15:22] *** schbirid has joined #archiveteam-bs [15:24] *** schbirid2 has joined #archiveteam-bs [15:26] *** schbirid has quit IRC (Read error: Operation timed out) [15:27] JAA: Lots of `.woff` are being pulled. Maybe doubles that could be left out. 
[15:28] cc arkiver [15:31] *** qw3rty3 has quit IRC (Nettalk6 - www.ntalk.de) [15:34] *** schbirid2 has quit IRC (Ping timeout: 255 seconds) [15:40] *** pizzaiolo has joined #archiveteam-bs [15:41] *** pizzaiolo has left [15:46] *** schbirid2 has joined #archiveteam-bs [15:50] *** Stiletti has joined #archiveteam-bs [15:52] *** godane has joined #archiveteam-bs [15:52] *** svchfoo1 sets mode: +o godane [16:10] *** ld1 has quit IRC (Ping timeout: 260 seconds) [16:10] *** ld1 has joined #archiveteam-bs [16:36] *** Stiletti is now known as Stiletto [17:02] *** username1 has joined #archiveteam-bs [17:06] *** schbirid2 has quit IRC (Read error: Operation timed out) [17:18] also i pay an unknowable amount of money to host the tracker and a few archivebot pipelines :) [17:20] Huzzah [17:23] *** Retroity has joined #archiveteam-bs [17:31] *** pizzaiolo has joined #archiveteam-bs [17:32] *** Retroity has quit IRC (Quit: Page closed) [17:45] *** ReimuHaku has quit IRC (Ping timeout: 250 seconds) [17:51] *** ReimuHaku has joined #archiveteam-bs [17:57] *** ReimuHaku has quit IRC (Ping timeout: 245 seconds) [18:03] *** ReimuHaku has joined #archiveteam-bs [18:14] *** ndiddy-pi has joined #archiveteam-bs [18:14] *** ndiddy has quit IRC (Read error: Connection reset by peer) [18:14] *** ndiddy-pi is now known as ndiddy [18:17] *** ReimuHaku has quit IRC (Ping timeout: 245 seconds) [18:19] *** RichardG has quit IRC (Ping timeout: 370 seconds) [18:22] *** ReimuHaku has joined #archiveteam-bs [18:29] *** Aranje has joined #archiveteam-bs [19:02] *** ld1_ has joined #archiveteam-bs [19:05] *** ld1 has quit IRC (Ping timeout: 260 seconds) [19:13] *** ld1_ is now known as ld1 [20:24] *** username1 has quit IRC (Ping timeout: 255 seconds) [20:26] *** balrog has quit IRC (Ping timeout: 260 seconds) [20:37] *** username1 has joined #archiveteam-bs [20:44] *** username1 has quit IRC (Quit: Leaving) [20:45] *** Stiletto has quit IRC (Read error: Operation timed out) [21:27] *** godane has 
quit IRC (Ping timeout: 260 seconds) [21:40] *** godane has joined #archiveteam-bs [21:41] *** svchfoo1 sets mode: +o godane [22:18] *** svchfoo3 has quit IRC (Quit: Closing) [22:25] *** RichardG has joined #archiveteam-bs [22:39] *** SilSte has quit IRC (Read error: Operation timed out) [22:50] *** SilSte has joined #archiveteam-bs [22:54] *** Silvan has joined #archiveteam-bs [22:54] *** Silvan has quit IRC (Read error: Connection reset by peer) [22:55] *** SilSte has quit IRC (Read error: Operation timed out) [23:01] *** SilSte has joined #archiveteam-bs [23:24] "Anyways.. while I know the MAME effort has been going on a long time, and in most cases they have been left alone by the original manufacturer/IP Rights Holder, they really need to be careful with what they're doing now; these manufacturers still have big legal departments, are more than willing to go for the throat, and companies like Namco and Nintendo in particular don't fool around [23:24] with this stuff. Copying ROMs is one thing, but when you're cracking the custom ASICs and other security devices they designed into the game hardware to prevent more or less what they're actually trying to do? That's one step away from producing knock-offs of the actual game. It's been a long time since I was involved in the coin-op game industry, but I'm sure that there are still plenty [23:24] of countries that would welcome with open arms even arcade games from 10 to 20 years ago, being better than what they have right now." [23:25] i hope those rippers are at least doing a minimal job to stay anonymous [23:25] from https://arstechnica.com/gaming/2017/07/mame-devs-are-cracking-open-arcade-chips-to-get-around-drm/ [23:26] nothing pisses me off more than when, say, a ROM translation or fan project reskinning/new content creators get shut down with a C & D [23:26] because they didn't even like try to stay anonymous [23:49] *** zyphlar has joined #archiveteam-bs [23:59] *** Odd0002 has joined #archiveteam-bs