[00:06] *** icedice2 has quit IRC (Ping timeout: 260 seconds)
[00:10] *** icedice has joined #archiveteam-bs
[00:56] *** ola_norsk has quit IRC (Remote host closed the connection)
[01:10] *** tomatokin has quit IRC (Ping timeout: 360 seconds)
[01:12] *** drumstick has joined #archiveteam-bs
[01:23] *** kristian_ has joined #archiveteam-bs
[01:39] *** tar-xvf has joined #archiveteam-bs
[01:43] *** odemg_ has quit IRC (Read error: Operation timed out)
[02:09] *** schbirid has quit IRC (Ping timeout: 255 seconds)
[02:20] *** icedice has quit IRC (Ping timeout: 245 seconds)
[02:21] *** schbirid has joined #archiveteam-bs
[03:02] *** drumstick has quit IRC (Ping timeout: 248 seconds)
[03:06] *** drumstick has joined #archiveteam-bs
[03:07] *** Asparagir has joined #archiveteam-bs
[03:19] *** Asparagir has quit IRC (Asparagir)
[04:48] *** Stilett0 has quit IRC (Ping timeout: 264 seconds)
[04:50] Jason Scott, c/o Internet Archive, San Francisco, CA 94118
[04:50] Did jrwr just explain to lord nightmare who I am
[04:50] aaawww
[04:51] *** qw3rty112 has joined #archiveteam-bs
[04:51] I told him in a pm to email you SketchCow
[04:52] Mailbox at textfiles
[04:52] no, jason@textfiles.com or jscott@archive.org
[04:53] It was from the whois, Google was turning up empty for me, I'll add it to my notes
[04:53] Lord_Nigh: you around
[04:54] yes, I think I have those emails already
[04:54] Cool
[04:55] You are still my hero Mr Scott
[04:56] *** qw3rty111 has quit IRC (Read error: Operation timed out)
[04:57] 300 Funston Avenue address, i assume?
[04:58] I think since it was books, right to the internet archive for them
[04:59] 11:50 PM <BA1719@ SketchCow> Jason Scott, c/o Internet Archive, San Francisco, CA 94118
[05:15] *** Stilett0 has joined #archiveteam-bs
[05:38] yes, but there's no address within san francisco in that line
[05:41] Jason Scott, c/o Internet Archive, 300 Funston Avenue, San Francisco, CA 94118
[07:03] Sorry if my poking at WARC uploading was what prompted the discovery of the bug.
[07:04] (actually, I'm not sure if this is a "sorry", "you're welcome" kind of situation)
[07:08] *** Pixi has quit IRC (Ping timeout: 255 seconds)
[07:12] *** Pixi has joined #archiveteam-bs
[07:14] *** kristian_ has quit IRC (Quit: Leaving)
[07:30] *** Specular has joined #archiveteam-bs
[07:34] *** Pixi has quit IRC (Quit: Pixi)
[07:35] *** Pixi has joined #archiveteam-bs
[08:59] *** odemg_ has joined #archiveteam-bs
[09:01] *** schbirid has quit IRC (Quit: Leaving)
[09:02] *** tar-xvf has quit IRC (Read error: Operation timed out)
[09:05] *** drumstick has quit IRC (Read error: Operation timed out)
[09:05] *** drumstick has joined #archiveteam-bs
[09:27] Is there any whitelisted archiving service besides waybackmachine's 'save page now'?
[09:30] ArchiveBot
[09:34] *** ZexaronS has joined #archiveteam-bs
[09:36] SketchCow: Is there no room for a compromise here? Hide the “unauthorized” WARCs until the user confirms he understands they might be fake? Show a big red banner including user and collection name?
[10:21] *** kimmer12 has joined #archiveteam-bs
[10:27] *** kimmer1 has quit IRC (Read error: Operation timed out)
[10:52] *** kimmer1 has joined #archiveteam-bs
[10:54] *** kimmer13 has joined #archiveteam-bs
[11:00] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[11:01] *** kimmer12 has joined #archiveteam-bs
[11:02] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[11:05] *** kimmer13 has quit IRC (Ping timeout: 633 seconds)
[11:16] *** kimmer1 has joined #archiveteam-bs
[11:19] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[11:20] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[11:21] *** dashcloud has joined #archiveteam-bs
[11:39] *** drumstick has quit IRC (Ping timeout: 248 seconds)
[11:42] *** tomatokin has joined #archiveteam-bs
[11:51] *** kimmer12 has joined #archiveteam-bs
[11:55] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[11:56] *** kimmer1 has joined #archiveteam-bs
[12:00] *** kimmer13 has joined #archiveteam-bs
[12:02] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[12:05] I'd love to see that compromise. I have been archiving myself because waybackmachine fails quite often.
[12:06] *** BnAboyZ has quit IRC (Quit: The Lounge - https://thelounge.github.io)
[12:06] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[12:06] *** dashcloud has quit IRC (Read error: Connection reset by peer)
[12:08] *** dashcloud has joined #archiveteam-bs
[12:11] *** kimmer13 has quit IRC (Ping timeout: 633 seconds)
[12:12] *** BnAboyZ has joined #archiveteam-bs
[12:17] I'm surprised not many people here are talking about it. Isn't this a really big deal for amateur archivers?
[12:18] we mostly just use archivebot
[12:18] *** kimmer1 has joined #archiveteam-bs
[12:28] *** BlueMaxim has quit IRC (Quit: Leaving)
[12:37] hadn't really considered manipulated WARCs before, do many try uploading them?
[12:46] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[12:51] *** kimmer1 has joined #archiveteam-bs
[12:59] *** kimmer12 has joined #archiveteam-bs
[13:02] *** kimmer1 has quit IRC (Ping timeout: 632 seconds)
[13:03] *** kimmer1 has joined #archiveteam-bs
[13:06] *** kimmer12 has quit IRC (Read error: Operation timed out)
[14:04] *** kimmer12 has joined #archiveteam-bs
[14:04] Everyone is adorable.
[14:05] Here's the problem.
[14:05] We budgeted for 1pb of disk space last year
[14:05] We used 2pb
[14:05] At some point, it'll be noticed that "just folks" are slamming thousands of WARCs into the opensource uploads and they were getting into the wayback.
[14:05] We don't delete data
[14:06] But we may only whitelist a set that comes through an authorized channel.
[14:06] To be honest, it wasn't supposed to be accepting them before.
[14:06] Also, WAY too many people, once they realize they can upload "anything" do an excellent job of deciding 200-1tb collections are great to have "just because" and suddenly we're youtube.bak
[14:06] That's all. We'll see how it plays out
[14:10] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[14:15] sounds like a significant problem. Whitelisting or getting approved doesn't seem like a bad step, assuming legit archiving efforts can still get through.
[14:18] SketchCow: Leave it to us nerds to archive too much
[14:19] *** kimmer1 has joined #archiveteam-bs
[14:24] *** kimmer13 has joined #archiveteam-bs
[14:26] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[14:30] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[14:37] *** sep332 has joined #archiveteam-bs
[14:39] *** tomatokin has quit IRC (Ping timeout: 360 seconds)
[14:52] *** kimmer13 has quit IRC (Ping timeout: 633 seconds)
[14:54] *** kimmer1 has joined #archiveteam-bs
[14:59] *** kimmer12 has joined #archiveteam-bs
[15:05] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[15:07] *** dashcloud has quit IRC (Ping timeout: 250 seconds)
[15:09] *** kimmer1 has joined #archiveteam-bs
[15:13] *** dashcloud has joined #archiveteam-bs
[15:15] *** kimmer12 has quit IRC (Read error: Operation timed out)
[15:19] I mean SketchCow, Vid.me total was something like 600TB
[15:20] there are more and more sites closing that are like that
[15:20] we only ended up getting ~200TB due to limitations
[15:25] I think the total was 1.4PB after de-duping, according to the staff guy. Pretty sure some of that chunk would have been Youtube mirrors as well since they offered an import ability (and just shared content in general across some channels). Crazy.
[15:27] Ya
[15:27] since youtube is going down the shitter with videos being removed
[15:34] *** icedice has joined #archiveteam-bs
[15:36] jrwr: how was the AWS bill in the end :p
[15:37] no idea
[15:37] it's STILL online
[15:37] but not
[15:42] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[15:50] *** Specular has quit IRC (Leaving)
[15:52] *** kimmer1 has joined #archiveteam-bs
[16:30] *** kimmer12 has joined #archiveteam-bs
[16:33] *** jschwart has joined #archiveteam-bs
[16:36] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[16:57] *** kimmer1 has joined #archiveteam-bs
[17:03] *** kimmer12 has quit IRC (Ping timeout: 633 seconds)
[17:27] kimmer1: fix your connection
[17:29] *** kimmer12 has joined #archiveteam-bs
[17:34] *** kimmer13 has joined #archiveteam-bs
[17:35] *** kimmer1 has quit IRC (Ping timeout: 633 seconds)
[17:38] *** kimmer12 has quit IRC (Read error: Operation timed out)
[17:44] *** icedice has quit IRC (Quit: Leaving)
[17:53] *** kimmer1 has joined #archiveteam-bs
[17:54] *** astrid has joined #archiveteam-bs
[17:55] *** swebb sets mode: +o astrid
[18:02] *** kimmer13 has quit IRC (Ping timeout: 633 seconds)
[18:15] *** ZexaronS has quit IRC (Quit: Leaving)
[19:17] *** ndiddy_ has quit IRC ()
[19:31] someone archive https://twitter.com/OrrinHatch/status/945375067927490560
[20:20] *** RichardG has quit IRC (Read error: Connection reset by peer)
[20:21] *** RichardG has joined #archiveteam-bs
[20:31] SketchCow: we are up to 1997-01-31 with tagesschau evening news videos i got
[20:52] Does anyone know of a website which currently has CloudFlare's I'm Under Attack mode activated? (That's the "Checking your browser before accessing X" message.)
[20:53] i got to love the fact that installing dead rising 4 needs a 42gb update
[20:58] *** schbirid has joined #archiveteam-bs
[20:59] i'm very sure we are fucking screwed with backing up current games
[21:07] *** jschwart has quit IRC (Quit: Konversation terminated!)
[21:12] *** Stilett0 is now known as Stiletto
[21:17] *** Mateon1 has quit IRC (Ping timeout: 260 seconds)
[21:17] *** Mateon1 has joined #archiveteam-bs
[21:25] *** dd0a13f37 has joined #archiveteam-bs
[21:29] *** dashcloud has quit IRC (Read error: Operation timed out)
[21:30] *** dashcloud has joined #archiveteam-bs
[21:30] *** MrDignity has quit IRC (Read error: Connection reset by peer)
[21:34] *** icedice has joined #archiveteam-bs
[21:51] *** icedice has quit IRC (Quit: Leaving)
[22:04] JAA: archive.is zip downloads have it enabled permanently
[22:04] iirc
[22:05] or whatever their current TLD is
[22:07] Ah sweet, thanks.
[22:08] The website uses .fo, but the downloads are on .today.
[22:09] Oh, website's available on .is as well, but HTTP redirects to .fo.
[22:09] Whatever.
[22:09] I guess he wants to spread it out
[22:10] Isn't "I'm under attack" when they want you to complete a captcha?
[22:11] Ew, I get the captcha on those downloads from my server.
[22:11] No, attack mode is that message I believe.
[22:12] https://support.cloudflare.com/hc/en-us/articles/200170076-What-does-I-m-Under-Attack-Mode-do-
[22:12] bisnode.se gave me a captcha just now
[22:14] joepie91: You can't enable it for specific parts of the site, it has to do with caching
[22:14] Their zip downloads aren't cached, but the main page probably is
[22:17] Yes, you can.
[22:18] Huh? Since when?
[22:18] No idea.
[22:19] *** MrDignity has joined #archiveteam-bs
[22:26] godane: https://thepiratebay.org/torrent/17957059/Dead_Rising_4-BALDMAN_(Inclu_Update_1)
[22:26] i'm doing this on xbox one s
[22:27] my pc is not powerful enough to play it anyways
[22:28] For archival purposes pc is better
[22:40] *** ArgyroNet has joined #archiveteam-bs
[22:44] *** drumstick has joined #archiveteam-bs
[22:44] *** ArgyroNet has left
[22:48] Interesting. CF changed the code for their attack mode challenge slightly at some point in the past few months.
[22:48] *** svchost03 has quit IRC (Ping timeout: 360 seconds)
[22:49] Nothing that changes how it works though.
[22:51] Actually looks like a small bugfix.
[22:59] ..are you trying to break cloudflare?
[22:59] Yes.
[23:00] Well, succeeding, mostly. ;-)
[23:00] ah lord
[23:04] I ported joepie91's parser to Python and am just implementing the final missing parts to get it working. I'll then look into how it can be integrated into wpull.
[23:15] *** BlueMaxim has joined #archiveteam-bs
[23:23] *** Mateon1 has quit IRC (Remote host closed the connection)
[23:24] *** Mateon1 has joined #archiveteam-bs
[23:34] TIL that tr sucks at handling multi-byte characters.
[23:57] Yay, it seems to work correctly. :-)
[23:58] 100 test cases gave the same results as the Node interpreter.