[00:15] *** w0rp_ has joined #archiveteam-bs
[00:16] *** w0rp has quit IRC (Read error: Operation timed out)
[00:16] *** w0rp_ is now known as w0rp
[01:28] *** VADemon has quit IRC (Quit: left4dead)
[01:29] *** Ravenloft has joined #archiveteam-bs
[02:17] *** Boppen has quit IRC (Ping timeout: 194 seconds)
[02:27] alright, back - certainly happy to talk to folks about what makes sense/doesn't for such a setup
[02:27] *** Boppen has joined #archiveteam-bs
[02:27] I'd want to make sure it's something I can commit to long-term
[03:02] *** ndiddy has quit IRC (Read error: Connection reset by peer)
[03:49] *** BlueMaxim has joined #archiveteam-bs
[04:01] *** RichardG has quit IRC (Ping timeout: 260 seconds)
[04:03] *** RichardG has joined #archiveteam-bs
[04:39] *** RichardG has quit IRC (Read error: Connection reset by peer)
[04:40] *** RichardG has joined #archiveteam-bs
[04:44] *** Stiletto has quit IRC (Read error: Connection reset by peer)
[04:45] *** Stil3tt0 has joined #archiveteam-bs
[05:30] *** Stil3tt0 is now known as Stiletto
[05:54] *** Sk1d has quit IRC (Ping timeout: 194 seconds)
[05:57] *** BlueMaxim has quit IRC (Read error: Operation timed out)
[05:58] *** BlueMaxim has joined #archiveteam-bs
[06:00] *** Sk1d has joined #archiveteam-bs
[06:05] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[06:06] *** BlueMaxim has joined #archiveteam-bs
[06:39] *** ravetcofx has quit IRC (Read error: Operation timed out)
[06:43] *** ravetcofx has joined #archiveteam-bs
[07:10] *** nicolas17 has quit IRC (how did it get so late again?)
[07:41] *** wp494_ has joined #archiveteam-bs
[07:43] *** wp494 has quit IRC (Ping timeout: 506 seconds)
[07:45] *** Honno has joined #archiveteam-bs
[07:59] so i'm about halfway through medium.com 2016 sitemap dumps
[08:08] *** GE has joined #archiveteam-bs
[08:16] *** schbirid has joined #archiveteam-bs
[08:24] *** BlueMaxim has quit IRC (Ping timeout: 250 seconds)
[08:25] *** BlueMaxim has joined #archiveteam-bs
[08:29] *** fie has quit IRC (Quit: Leaving)
[08:33] *** fie has joined #archiveteam-bs
[08:39] *** BlueMaxim has quit IRC (Read error: Connection reset by peer)
[08:40] *** BlueMaxim has joined #archiveteam-bs
[09:30] *** Boppen has quit IRC (Ping timeout: 194 seconds)
[09:35] *** GE has quit IRC (Quit: zzz)
[09:40] *** Boppen has joined #archiveteam-bs
[09:55] *** GE has joined #archiveteam-bs
[10:07] *** GE has quit IRC (Remote host closed the connection)
[10:40] *** pizzaiolo has joined #archiveteam-bs
[11:41] *** BlueMaxim has quit IRC (Quit: Leaving)
[11:44] *** pizzaiolo has quit IRC (Ping timeout: 260 seconds)
[11:44] *** GE has joined #archiveteam-bs
[13:04] *** rocode has quit IRC (Ping timeout: 246 seconds)
[13:04] *** rocode has joined #archiveteam-bs
[13:09] I thought the password was changed?
[13:20] *** GE has quit IRC (Remote host closed the connection)
[13:28] *** RichardG has quit IRC (Ping timeout: 244 seconds)
[14:06] *** wp494_ is now known as wp494
[14:48] *** GE has joined #archiveteam-bs
[14:54] *** Ravenloft has quit IRC (Ping timeout: 506 seconds)
[15:41] *** Stiletto has quit IRC (Read error: Operation timed out)
[15:42] *** nicolas17 has joined #archiveteam-bs
[15:48] Hi.
[15:49] ello
[15:58] *** RichardG has joined #archiveteam-bs
[16:22] SketchCow: It might be worth noting that the mediawiki version had a couple security issues and is almost a year old (the branch is now EOL). Especially it had issues with banned users' tokens/sessions remaining active after a ban, but this doesn't explain how they broke through the captcha/password thing. I am not sure if you have time to update it now (and whether that would be a good idea when someone is attacking
[16:22] the page during the update) but it probably would be a good idea to do that at some point in the future.
[16:33] *** Stil3tt0 has joined #archiveteam-bs
[16:33] And he just registered another account, SketchCow.
[16:35] *** VADemon has joined #archiveteam-bs
[16:35] Interesting.
[16:36] *** Honno has quit IRC (Ping timeout: 370 seconds)
[16:38] I see the problem.
[16:38] (I didn't change the real LocalSettings.php, just a backup)
[16:38] Now both are changed, properly.
[16:40] So this latest one was on me.
[16:47] *** pizzaiolo has joined #archiveteam-bs
[16:48] *** SmileyG has joined #archiveteam-bs
[16:48] *** SmileyG has quit IRC (Client Quit)
[16:51] *** pizzaiolo has quit IRC (Remote host closed the connection)
[17:09] *** kyounko|2 has joined #archiveteam-bs
[17:09] *** pizzaiolo has joined #archiveteam-bs
[17:11] anyone has any idea why this would be deleted? https://web.archive.org/web/20170117170305/https://www.fsf.org/news/fsf-announces-a-major-overhaul-of-free-software-high-priority-projects-list
[17:11] *** kyounko has quit IRC (Ping timeout: 246 seconds)
[17:12] pizzaiolo: give it a day - it might have just been unpublished because some kind of edits had to be made
[17:12] I see
[17:13] if it's still gone tomorrow and no new post has appeared in its place (potentially at a different URL), that'd be the point to start asking questions :P
[17:14] I did an archivebot just to be sure
[17:16] works for me
[17:16] that is, I can access the wayback link and the original link fine
[17:17] maybe it was an IA hiccup
[17:17] probably
[17:29] pizzaiolo: oh wait, you meant that it didn't work on the wayback? I thought you meant the original
[17:29] wayback one worked fine for me also
[17:30] both work here too
[17:31] *** pizzaiolo has quit IRC (Ping timeout: 250 seconds)
[17:39] maybe I'm just lazy but my inbox treatment goes like this
[17:39] wait for inbox to hit 2500 unread
[17:39] mark all as read
[17:39] if you can find someone who actually enjoys taking action on automated emails, I guess that sort of thing can work
[17:40] or who has an unusually organized inbox and filter setup
[17:40] oh I guess Nemo isn't in here
[17:40] oh well
[18:01] *** RichardG has quit IRC (Read error: Operation timed out)
[18:01] *** RichardG_ has joined #archiveteam-bs
[18:05] If you get that many automated emails that you don't care about, you may want to tune your subscriptions :p
[18:15] or filters :)
[18:40] This is adorable
[18:54] i should like to get on with shoving gitorious into IA soon
[18:55] As a server?
[18:55] i'd like the repositories to also live in IA
[18:56] i'm going to keep hosting it in its current form
[18:56] but i do wish to send over a crapton of git-bundle files
[18:57] (current gitorious hosting isn't five nines, it's barely one nine)
[18:57] I'm barely one nine
[19:43] *** Honno has joined #archiveteam-bs
[19:44] *** ndiddy has joined #archiveteam-bs
[19:50] *** Honno has quit IRC (Ping timeout: 370 seconds)
[19:55] *** ranma has quit IRC (Read error: Operation timed out)
[19:56] *** pikhq has quit IRC (Read error: Operation timed out)
[20:06] I think I may have just murdered the tracker
[20:08] what's wrong?
[20:11] No HTTP response received from tracker. The tracker is probably overloaded. Retrying after 60 seconds...
[20:12] hm
[20:12] urlteam or one of the standard projects?
[20:13] ftp-gov
[20:13] cpu doesn't seem to be pegged at all
[20:13] happens every now and then
[20:16] *** pizzaiolo has joined #archiveteam-bs
[20:17] *** pizzaiolo has quit IRC (Remote host closed the connection)
[20:17] *** pizzaiolo has joined #archiveteam-bs
[20:33] *** kyounko has joined #archiveteam-bs
[20:35] *** kyounko|2 has quit IRC (Read error: Operation timed out)
[20:36] Rewrote the FTPGOV thumbnail/desc script to not do anything if the work is already done, setting it up for automatic running soon
[20:39] Yep, looks solid.
[20:46] *** pikhq has joined #archiveteam-bs
[21:08] *** Swizzle has joined #archiveteam-bs
[21:43] http://www.bbc.co.uk/news/world-us-canada-38659068
[21:43] can someone archivebot that please
[21:46] done
[21:47] thanks :)
[21:47] non-voiced users are allowed to use archivebot, by the way
[21:47] for !ao commands
[21:48] *** BlueMaxim has joined #archiveteam-bs
[21:55] Yay
[21:55] I had a nice chat with the guy mirroring data.gov
[21:55] He's writing custom grabbers, etc
[22:12] *** schbirid has quit IRC (Quit: Leaving)
[22:19] haha Twilio sent me this in a promo email
[22:19] "AWS Lambda, otherwise known as “The Thing That Saves You from Hours of Infrastructure Management”,"
[22:19] sure, once you invest days to get your fucking code to play nice with it
[22:19] lol
[22:20] setting up the lambda and API gateway and all the IAM permissions around them is impossible without extra tooling around it
[22:20] it replaces hours of management with others
[22:20] maybe they have so many developers and ops people that the time spent getting it to work stably goes to zero with some sort of math
[22:21] there is no cloud, it's just someone else's computer
[22:21] there is no serverless, it's just someone else's container
[22:23] there is no free, it's just someone else's money
[22:23] the promo email continues by providing an example that uses AWS Lambda, claudia.js, and Twilio to build a chat bot
[22:23] There is no closed debate, just bigger fires and me sitting cross-legged on your car watching
[22:24] I mean I *guess* it's a better example than, say, word counting with mapreduce
[22:24] watch me leverage 60 years of processor technology to BIN WORDS
[22:24] bro we've got the *best* mapreduce, it's fully mongoDB compliant and everything
[22:25] yipdw: did you see Fizzbuzz Enterprise Edition?
[22:25] someone should do the microservices edition
[22:26] I did
[22:26] I can't understand it
[22:26] I looked through it a while and couldn't find the code that actually does something
[22:27] exactly :p
[22:27] microservice fizzbuzz would be interesting
[22:27] fizzbuzz, but with nonguaranteed results
[22:27] eventually-consistent fizzbuzz
[22:27] So, I'm about to give Archivebot a lot of work.
[22:27] Ideally, it won't be too terrible
[22:28] if it is we'll adapt
[22:28] i hope to bring a big honkin new pipeline online within eight hours
[22:28] actually, on a positive note, I like how dear imgui used fizzbuzz as a UI demonstrator https://github.com/ocornut/imgui/blob/master/imgui_demo.cpp#L1134
[22:29] cat *.tsv | wc -l
[22:29] 6042989
[22:29] Don't flip out yet, I'm about to reduce it
[22:29] is that one row per URL
[22:29] yes, but not unduped yet
[22:29] Hmu if you need emergency pipelines
[22:30] oh ok
[22:30] 6,042,989 is a lot but split it up by like 1,000,000 and it gets not so bad
[22:30] ha ha
[22:30] I love you think of it as "frown a little, nod"
[22:31] https://i.imgur.com/EucIfYY.gif
[22:31] it's a vestigial gut instinct from when people asked me "can our systems take this load"
[22:31] the conditioned response is "haha fuck you no"
[22:31] but usually it ends up being "oh that wasn't so bad"
[22:31] De-duping now. Also, some of these are, like http://www.irs.gov
[22:32] so there's this weird conflict
[22:32] Can archivebot be handed a list?
[22:32] I should just readthedocs
[22:32] Oh there it is
[22:32] cat *.tsv | sed 's/.*http/http/g' | cut -f 1 -d" " | sort -u | wc -l
[22:32] 4046716
[22:33] Only 4 million!!
[22:33] it can, and there is a !a < mode, vs. !ao <
[22:33] though !a < with big sites can really load down a pipeline
[22:33] since there's a list we can just use that to refer to things that need to be retried
[22:33] I'm going to do these in sets, let me figure it out.
[22:51] I wish there was a way of thanking users on the wiki
[22:51] if they're in irc you can talk to them here
[22:51] or you can write on their user page
[22:55] wp494: ^
[22:55] assuming that applied to wp494
[23:02] yeah that's me
[23:11] thanks wp494
[23:11] xmc: the thank you feature is really convenient tho
[23:11] it's probably a mediawiki extension
[23:12] it's one of the most obnoxious things i've ever seen in a webforum
[23:12] why? it encourages participation
[23:12] it tends to be implemented in an obnoxious way
[23:13] xmc: how so?
[23:13] i'm not going to debate this. i think it's obnoxious, you clearly don't.
[23:13] It solves the dilemma of off-topicking "thanks" messages and encouraging the author so he can see how much impact s/he had
[23:13] this is fine
[23:13] although i'm not used to it too
[23:14] xmc: I don't mean to be antagonizing, I was curious about your reasoning
[23:14] it's not reasoning, it's a feeling
[23:16] ^ it feels social-networkish?
[23:16] *** GE has quit IRC (Remote host closed the connection)
[23:16] *** Swizzle has quit IRC (Quit: Leaving)
[23:19] *** Stil3tt0 has quit IRC (Ping timeout: 260 seconds)
[23:22] The "like" or "reaction" button has saved so much on spam, but it does get pretty stupid when they try to implement it as a mood/tag system. "funny" "sad" etc.
[23:22] Runs into the SlashDot problem, Insightful/Thoughtful/Funny etc.
[23:23] rocode: that's not how mediawiki implemented it though
[23:23] I was replying to VADemon, the webforum implementation. How did mediawiki solve it?
[23:24] I would say the issue on archivebot isn't the load but huge jobs are more failure-prone
[23:24] splitting it up increases the load in terms of how many resources you're using, but decreases the likelihood of a failure
[23:24] rocode: you see an edit diff and there's an option to thank the user
[23:24] on a single pipelines
[23:24] that's all
[23:24] -s
[23:26] Sometimes threads actually should be bumped. The most non-annoying way I've seen is implemented on many Bulletin forums: "Thanks" button with a counter and hidden list of participants. Anyway I agree with you
[23:29] so i have to see if i can finish rev3 geekbeattv collection
[23:29] they don't have pages anymore but the rss feed and videos are still up
[23:56] *** pizzaiolo has quit IRC (Read error: Operation timed out)
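
Editor's note on the 22:29 to 22:33 exchange: the de-duplication one-liner and the "do these in sets" plan could be sketched roughly as below. This is only an illustration, not what was actually run. The function names (`extract_url`, `unique_urls`, `chunked`) are hypothetical, and the batch size is whatever the operator chooses (the chat floats something like 1,000,000). The extraction step mirrors the `sed 's/.*http/http/g' | cut -f 1 -d" "` part of the pipeline, including its quirk of keeping only the text from the last "http" on a line.

```python
from typing import Iterable, Iterator, List


def extract_url(line: str) -> str:
    """Approximate `sed 's/.*http/http/g' | cut -f 1 -d" "` for one line."""
    line = line.rstrip("\n")
    # The greedy `.*` in the sed expression matches up to the LAST "http",
    # so everything before that final occurrence is dropped.
    start = line.rfind("http")
    if start != -1:
        line = line[start:]
    # cut -f 1 -d" " keeps the text before the first space
    # (or the whole line when there is no space).
    return line.split(" ")[0]


def unique_urls(lines: Iterable[str]) -> List[str]:
    """Approximate `sort -u`: de-duplicate, then sort."""
    return sorted({extract_url(line) for line in lines})


def chunked(urls: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]
```

Each batch from `chunked(unique_urls(lines), size)` could then be written to its own list file and submitted separately, which fits the later point in the log that smaller jobs are less failure-prone than one huge job.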