[00:47] what happened to torrentbytes?
[00:49] after a recent experience with VBA at work, I think I know why there was so much VB crap, and why it was so popular now
[00:59] it's surprisingly easy to slap together blocks of example/sample code and have them work together, even if you know little to nothing about the language. This leads to horribly written monstrosities that become key pieces of someone's workflow.
[01:02] torrentbytes went offline for server upgrades
[01:02] it was going to be taken offline completely before
[01:02] i grabbed the forums anyways cause of this
[03:00] dashcloud: I've had to deal with similar monstrosities in Excel
[03:01] dashcloud: however, I've come to think that maybe the right way to "fix" that is to not eliminate the monstrosity, because it clearly works in some degree; rather, the right fix is to augment the monstrosity (at arm's length) with indexing, backup, query, etc.
[03:01] dashcloud: or maybe that's just applicable to healthcare, I have no idea :P
[03:02] so, I have that experience because I actually created a VBA script to handle a specific set of tasks in Word that I couldn't find any free program to do
[03:04] I haven't done programming of any kind in many years, and I've never touched VBA before, yet with the help of "The Internet" and a little knowledge, I cobbled/hacked/etc a reasonably good script that solves the problem
[06:47] "AWS is the overwhelming market share leader, with more than five times the compute capacity in use than the aggregate total of the other fourteen providers in this Magic Quadrant,"
[06:47] included HP, GoGrid, SoftLayer, Fujitsu, Virtustream, Tier 3, and Joyent in the "niche player" section, and Dimension Data, Savvis, and Terremark in the top left "challenger" section.
[06:47] also rackspace and microsoft
[06:48] I can tell that they have "more than five times the compute capacity in use", because EC2 instances are always slow as fuck
[06:48] unfortunately spot instances are cheap as hell so I keep using them :(
[06:48] it's lucky I don't apply the same philosophy to my food, or I'd be poisoning myself at McDonald's
[09:03] GLaDOS: ?
[09:04] I can't log in to the box :O
[09:04] did you break it or ban me? :D
[09:06] ..it went down literally just then.
[09:07] "This dedicated server ks3099601.kimsufi.com is expired"
[09:07] OH LOOK AT THAT, SAME ISSUE AS LAST TIME
[09:08] ffs
[09:08] kick off this time :! D:
[09:17] "According to The Wall Street Journal, Alibaba.com is going for an IPO with a value of $70 Billion! Could this be an investment opportunity in Yahoo's stock, which holds 23% of Alibaba?"
[09:18] Hahaha, worthless stock.
[09:18] Also, "It's listed as active in our system. Are you unable to use it?"
[09:18] Your system is WRONG.
[09:19] Nice response time though.
[10:40] i'm grabbing groklaw.net
[10:43] Why? According to the cdx search that site has deep coverage going back to when it launched
[10:43] Why not?
[10:43] Doesn't hurt.
[10:44] i know may know way i grabbed things by year
[10:44] i got a read error at byte
[10:45] then it does one retry and stops downloading
[10:45] any way to stop that?
[10:50] also i need comments to be flat
[10:50] so i can grab them all
[10:55] By any chance, is anyone here capable of building regular expressions (regexps)?
[10:55] if you have sample data
[10:55] omf_: click on a story here: http://web.archive.org/web/20130102154112/http://groklaw.net/
[10:56] none of the links are in wayback
[10:56] they're going to the liveweb
[10:57] omf_: also this will explain things better for you: http://web.archive.org/web/20130102154111/http://groklaw.net/robots.txt
[10:58] anyways i need some help with how to get flat comments so we can get them all
[10:58] I'm trying to write a rewrite/replace rule that will replace the first / occurring in the URI with _ (but not the / root) like: http://hostname/prod/TEXT-234234-HD_3423423_34234/ -> http://hostname/prod_TEXT-234234-HD_3423423_34234/
[10:59] omf_: ^
[10:59] I tried something like.. "rewrite ^/([^/]*)/([^/]*)$ /$1_$2 break;" - that didn't work though
[11:02] is it always going to be a dir that you are renaming?
[11:03] I think so, yes
[11:03] might as well update that there, considering the last topic ;D
[11:04] even better
[11:08] (/)[\w-]+/$
[11:08] then replace the captured / with _
[11:08] 2013-08-20 07:07:59 (184 KB/s) - Read error at byte 27074/38322 (Connection reset by peer). Retrying.
[11:08] fucking stopped again
[11:09] i need help so when it does this it will keep going
[11:12] omf_: I'm not quite sure how I'd use that :o
[11:12] Are you going to use the regex over a file you have?
[11:13] I'm using proxy_pass to another application
[11:13] Integration work :|
[11:14] App expects to get a url like http://hostname/prod_TEXT-234234-HD_3423423_34234/ - while the idiotic portal has links like http://hostname/prod/TEXT-234234-HD_3423423_34234/
[11:14] So I'm not doing anything besides rewriting the URL replacing the first instance of / in the URI
[11:53] can anyone help me?
[11:54] i need to figure out how to grab all comments off of groklaw.net?
[12:03] all i know is there post method is used and the site refresh to a article.php
[12:33] WHAT, groklaw is going away??
what's happening to the world
[14:37] looks like %0D everywhere in my pdf list of groklaw
[14:37] :-(
[14:41] it just never dies
[14:41] HELP
[14:41] godane: it looks like a straightforward spidering will work on groklaw; it doesn't do anything fancy
[14:41] godane: perhaps you're hitting it too fast
[14:43] yipdw: i thought that too but then i get a byte error retry
[14:43] if i get that it will end
[14:43] so i have to stop the byte error retry
[14:43] and i don't think a wait will fix it
[14:43] also remember i have crap wifi
[14:49] huh
[14:49] "The information on Groklaw is not intended to constitute legal advice. While Mark is a lawyer and he has asked other lawyers and law students to contribute articles, all of these articles are offered to help educate, not to provide specific legal advice. They are not your lawyers."
[14:49] who is Mark?
[14:50] oh
[14:50] Mark Webbink
[14:50] never mind
[14:50] i put a 0.2 wait on my script
[14:51] i'm hoping for no byte error crap
[14:51] if that doesn't work, try something with a real line
[14:51] or a VPS, etc.
[14:52] there are pages only accessible with an account.
[14:52] fyi
[14:54] " Sorry, creation of new accounts has been temporarily disabled "
[14:54] well then
[14:54] they had problems with trolls :/
[14:54] do you have one?
[14:54] maybe if you email pj she'll create an account for you
[14:54] http://www.urbanterror.info/news/423-git-repository-hacked/
[14:54] I do
[14:55] maybe you should run godane's grab
[14:56] wget $website --mirror --warc-file=$website-$(date +%Y%m%d) --warc-cdx --reject-regex="(#|comment.php)" --warc-max-size=1G -H --domains=$website -w 0.2 -E -o wget.log
[14:56] website="groklaw.net"
[14:56] FYI, --mirror doesn't imply --page-requisites
[14:57] you probably want that in there too
[14:58] anything else before i start running it again
[14:58] actually, I found another possibility
[14:58] maybe we can just email her and ask her if she can donate a copy of all site data to IA
[14:59] I'll do that
[14:59] ok
[14:59] still going to see about mirroring it
[14:59] sure
[14:59] yipdw: I'm pretty sure she would agree to that
[15:00] though not user data obviously
[15:07] it happened again
[15:07] byte error
[15:08] balrog: right, we're just looking for public data that's blocked by robots.txt
[15:08] AFAIK, that includes comments
[15:08] on groklaw?
[15:08] yeah
[15:08] and pdfs
[15:08] unless I misread that robots file
[15:08] yes, those too
[15:08] those are important
[15:08] well
[15:08] having a user account allows you to view all comments on one page
[15:08] I can't find a good PGP public key for her, heh
[15:08] email her and ask for it
[15:09] I'm not sure it makes sense to ask for a public key over a channel that's assumed compromised
[15:09] I just won't encrypt this; it's not sensitive (yet)
[15:10] by compromised I mean "someone could take it over"
[15:10] I searched the keyserver and found two old 1024-bit keys
[15:10] which I wouldn't trust
[15:10] I found a few, yeah
[15:10] one's expired, the other isn't set to expire
[15:11] you can email here mykolab address
[15:11] her*
[15:11] yeah, I'm sending to both
[15:18] sent
[17:07] |ω・)
[17:14] :)
[17:14] hahah omf_ well said
[17:14] official line, back that shit up
[17:15] I am thinking groklaw might "come back" in the future
[17:15] I hope so.
[17:15] They were supposed to close a bunch of times before and it didn't happen
[17:16] someone else might create a new site
[17:16] nod
[17:16] but that won't... save the existing stuff
[17:16] not that i ever really follow it, it confuses and scares the fuck outta me
[17:20] having to decide what porn to put on the external: first world single's problem
[18:35] I'm going to take a break from IRC for a few days, if you really need me contact me on XMPP (joepie91@dukgo.com) or via e-mail (admin@cryto.net)... but only if it's important in some way
[18:47] balrog: I got a response from PJ
[18:47] what did she say?
[18:47] balrog: the reason why comments are blocked is because the majority of Groklaw members voted to keep them out of e.g. the LoC
[18:47] you can use PM if you want
[18:48] or if you have xmpp, I do xmpp with otr
[18:48] oh, you want the whole email?
[18:48] ok
[18:48] I don't quite trust irc
[18:48] unverified OTR isn't much better :P
[18:48] but sure
[18:48] I'm yipdw@member.fsf.org on XMPP
[20:12] Commentary on my first ArchiveTeam diagram? --> http://picpaste.com/pics/Pz81z7Mx.1377022875.png
[20:12] What other processes would benefit from a chart like this?
[20:41] looks cool
[20:41] maybe a start of an archiving project for a big site
[20:42] someone says xx is dying need to save, people start investigating. it seems eventually some kind of leader emerges who is writing the code to get it started
[20:43] I have quite a bit of that documented here --> http://pad.archivingyoursh.it/p/atpodcast
[20:45] a graph is nice :P
[20:46] Yeah a flow chart would make sense as a format
[20:46] I am putting it on my list
[21:39] so looks like episodes 293 to 306 of labrats.tv was on rev3
[21:39] i will add rev3 and revision3 to the keywords of those episodes
[21:40] the creator will be labrats.tv
[21:40] here is the itunes files: http://revision3.com/feed/show/labratsland/mp4-large/itunes
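Note on the rewrite question from [10:58] to [11:14]: the pattern tried at [10:59] probably fails because the sample URI ends in a slash, so the second `[^/]*` group can never reach `$`. Below is a minimal Python sketch of that reasoning, not anything posted in the channel. The `REVISED` pattern and the `join_first_segments` helper are hypothetical names introduced here for illustration, and it is only an assumption that the same pattern would carry over to the nginx `rewrite` directive unchanged.

```python
import re

# Pattern quoted in the log at [10:59]. Both groups exclude "/", so a
# path with a trailing slash cannot match all the way to "$".
ORIGINAL = re.compile(r"^/([^/]*)/([^/]*)$")

# Hypothetical revision: require a non-empty first segment and let the
# second group take the rest of the path, trailing slash included.
REVISED = re.compile(r"^/([^/]+)/(.*)$")


def join_first_segments(path: str) -> str:
    """Replace the first slash after the root slash with an underscore.

    Paths with no second slash are returned unchanged.
    """
    return REVISED.sub(r"/\1_\2", path, count=1)


sample = "/prod/TEXT-234234-HD_3423423_34234/"
print(ORIGINAL.match(sample))       # None: the trailing slash blocks "$"
print(join_first_segments(sample))  # /prod_TEXT-234234-HD_3423423_34234/
```

Only the first slash after the root is touched, so any deeper path segments stay as they are.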