[00:12] HI GANG
[00:12] Hi, SketchCow!
[00:12] FOS is being looked at today.
[00:13] Right, chucking this one out there, at the possible risk of getting laughed at. There's an art site called FurAffinity which is.. shall we say having some "administrative issues".
[00:13] Can someone step forward and WARC the living shit out of Furaffinity?
[00:13] They've got new hardware and rumour has it they're installing it, though given their previous track record I'm willing to bet they'll botch it in some way that'll nuke user data.
[00:13] http://www.furaffinity.net/
[00:14] Main thing that'd be interesting is art - that's an incrementing ID starting at 1. URL syntax http://www.furaffinity.net/view/%d/
[00:15] Sorry -- URL syntax is http://www.furaffinity.net/full/%d/
[00:15] Parse the HTML for "//d.facdn.net/[^"]*" and grab that file too (that's the image, story or whatever itself)
[00:16] Problems you will encounter: FA uses a fairly aggressive IP-based block system. It's not rate limiting, it's an *IP ban*.
[00:16] According to Dragoneer: "Well, this just became a recent issue after we changed our parameters to try to stop leachers and scripters by drawing back the amount of reasonable requests that the average human would put in, and multiplied it enough where a full house of people accessing FA still wouldn't trigger it. "
[00:16] (Dragoneer is FA's lead admin / site owner)
[00:16] philpem, That is not too weird a site to save. We got 4chan stuff and someone got a major My Little pony site too
[00:17] omf_, cool :)
[00:17] The other issue: some art isn't visible unless you're logged in. You'll get an error message. The big two are "submission doesn't exist" (someone deleted it) and "you're not allowed to view this"
[00:18] The second one pops up if the submitter set their profile to "private" or if the submission is rated higher than "general audiences" and you're not logged in with a suitably-endowed account.
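(Editor's note: the URL scheme and the HTML pattern quoted at [00:15] can be sketched roughly as below. This is a minimal illustration and not anyone's actual grab script. The function names `submission_url` and `extract_cdn_urls` are hypothetical, the `http:` scheme prepended to the protocol-relative CDN link is an assumption, and the sample HTML is made up. It does no fetching, so it does nothing about the IP ban or the login-only submissions mentioned above.)

```python
import re

# URL template quoted in the log at [00:15]; IDs are said to start at 1.
FULL_VIEW_URL = "http://www.furaffinity.net/full/%d/"

# Pattern quoted in the log, with the dots escaped so they match literally.
CDN_PATTERN = re.compile(r'//d\.facdn\.net/[^"]*')


def submission_url(submission_id):
    """Build the full-view page URL for one numeric submission ID."""
    return FULL_VIEW_URL % submission_id


def extract_cdn_urls(html, scheme="http"):
    """Return the distinct CDN file URLs found in a page, in order.

    The links are protocol-relative, so a scheme is prepended
    (assumption: plain http).
    """
    urls = []
    for match in CDN_PATTERN.findall(html):
        url = "%s:%s" % (scheme, match)
        if url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Made-up page fragment, only to show the shape of the match.
    sample = '<img src="//d.facdn.net/art/someuser/1234.example.png">'
    print(submission_url(1))
    print(extract_cdn_urls(sample))
```

(A real crawl would feed each fetched page into `extract_cdn_urls` and queue the results for download alongside the page itself.)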
[00:18] Basically you need to turn the "safe for work" filter off and set the "age" field to something plausible but over 21.
[00:18] philpem, any idea how big the site is or the number of pics?
[00:19] Last quoted figure for the contents of the CDN was "several terabytes". Not sure how recent that is.
[00:21] Their siteop published a few "submission maps" on his profile: http://www.furaffinity.net/user/yak
[00:21] One pixel per ID, white for 'deleted', black for 'general audiences', blue/red for higher content ratings
[00:22] So in theory you could parse one of those (most recent is 2012-01-11, there have been a lot of submissions since then) and get an idea of how many of the IDs are actually in use
[00:22] There are also "journals" - http://www.furaffinity.net/journal/%d/ - again, IDs appear to start at 1 and increment from there.
[00:23] Spoofing the login is a piece of cake if you have an account - code here: https://bitbucket.org/philpem/libfurator
[00:24] just need user/passwd of a valid account
[00:24] They've changed the site templates so 'download submission' doesn't work any more, but login/logout should be fine.
[00:29] Anyway, I'm off to bed shortly (it's 1:30AM here). Will be online again from ~7pm BST tomorrow.
[00:30] Yo from 1:30am over here
[02:07] SketchCow, would it be alright if I just shoved all the ign sites into one item that can be split up later?
[02:26] can someone wget-warc http://www.ambrosiasw.com/ ?
[02:26] just got a tip that its dying soon
[02:26] him weird
[02:26] hm*
[02:54] hey, does anyone have any tips on how to find obscure stuff? I'm looking for a sonar recording apparently released at some point by the Swedish navy of a Russian submarine during the cold war
[02:54] with all the channels etc
[02:57] use other search engines, search non-english sites- google translate can usually get you good enough button/option names. There's also torrent sites.
[07:12] omf_: No. I'd really like them separate.
[07:13] omf_: In a little while, we'll have FTP back in the guise of FOS, and we can do it that way.
[07:14] I keep getting backed off on the ias3 uploads
[07:15] grabbing now balrog_
[07:32] --- fortressofsolitude.textfiles.com ping statistics ---
[07:32] 20844 packets transmitted, 0 packets received, 100% packet loss
[07:33] D:
[07:54] O_O
[08:15] (。々°)
[08:24] OK, weasels, back online in about 15 hours.
[08:25] * BlueMax chirps, or whatever a weasel does
[08:26] Drive stock prices down through rumor to cover a short
[08:45] hdevalenc: this is probably the closest you're going to get without contacting someone http://www.nyteknik.se/nyheter/fordon_motor/fartyg/article266486.ece http://www.nyteknik.se/multimedia/dokument/article200688.ece/BINARY/Lyssna+p%C3%A5+ub%C3%A5tsljudet+h%C3%A4r
[09:34] Oh finally, migrated away from arcti.ca
[10:44] Taking the pastebin down for a bit.
[11:06] Quick, archive GLaDOS's pastebin!
[11:06] :)
[11:43] Well, time to take an archive of a wiki I host then kill it.
[11:43] * GLaDOS is migrating servers
[13:07] Lied, back
[13:07] So yeah, $70 for the Virgin Lounge in Heathrow.
[13:07] Good purchase.
[13:09] Fancy.
[13:17] Yeah
[13:17] Well, I'm old. I like to try things.
[13:18] This clubhouse has a tailor/shoe shiner, a deli (complimentary), showers, bar (complimentary), and an office/library.
[13:18] In the library? Taschen books. Tons of them.
[13:18] That sounds rather nice.
[13:18] I just leafed through a $4000 Muhammad Ali book I never thought I'd touch, so that's something.
[13:19] I've yet to be in any of the airline business lounges, so meh.
[13:19] The one in Perth seems rather small from below, though.
[13:20] Glados: they did have to expand the qantas one downstairs
[13:20] Yeah, last time I was there, the renovation/expansion was ongoing.
[13:20] I wonder what it's like now..
[13:41] well the interview seemed to go ok :o
[13:42] SketchCow: how is the UK?
[13:43] Like the US but with less pickup trucks
[13:43] :D
[13:43] Anyway, anything up Archive Team I need to know about?
[13:43] I know our Posterous Fear is increasing.
[13:43] I'm still waiting on FOS to return.
[13:44] We improved posterous speed
[13:44] ..we did?
[13:44] Not sure if it's enough, but we had kennethre and alard testing last night UK time.
[13:44] This is what I get for being too lazy to change IRC servers.
[13:44] The limit is/was 1000.
[13:45] Previously it was 250 and kennethre was easily maxing it
[13:45] Is there still some leeway for more people?
[13:45] Wait, I can check that.
[13:46] Oh, easily.
[13:46] yeah I think posterous gets bogged down still somewhat, but at least it was going faster last night
[13:47] wtf claims back down to 100 or so, I guess kennethre has all the big jobs now. Time to shut up or go to #preposterus
[14:43] prpx: thanks a ton!
[18:20] * philpem wanders in...
[19:34] SmileyG: I suspect that kennethre's downloaders were getting blocked.
[19:39] :<
[19:41] At least, that would explain why he's very active for a while, then stops.
[19:42] nod
[19:42] however, useragent? :/
[19:42] He's using a normal user agent.
[19:42] (I don't know how busy 'our' posterous servers are at the moment.)
[19:44] ahok
[19:56] Just tried: the ArchiveTeam posterous servers are really slow.
[20:55] alard: exactly :)
[20:56] just did another restart
[20:56] maybe i should just keep doing those
[20:56] They're not restarting by themselves?
[21:00] alard: they restart once a day automatically, but i assume they all get blocked by then
[21:00] i to use all my threads up on the archiveteam useragent
[21:00] like 200
[21:01] kennethre: Maybe you could have them exit after a few items. (There's an option in run-pipeline for that.)
[21:01] --max-items N
[21:23] someone please invalidate my formspring and posterous jobs; my box can't seem to handle the load.
[21:42] alard: oh interesting...