#archiveteam 2013-04-11,Thu

Time Nickname Message
00:12 SketchCow HI GANG
00:12 philpem Hi, SketchCow!
00:12 SketchCow FOS is being looked at today.
00:13 philpem Right, chucking this one out there, at the possible risk of getting laughed at. There's an art site called FurAffinity which is.. shall we say having some "administrative issues".
00:13 SketchCow Can someone step forward and WARC the living shit out of Furaffinity?
00:13 philpem They've got new hardware and rumour has it they're installing it, though given their previous track record I'm willing to bet they'll botch it in some way that'll nuke user data.
00:13 SketchCow http://www.furaffinity.net/
00:14 philpem Main thing that'd be interesting is art - that's an incrementing ID starting at 1. URL syntax http://www.furaffinity.net/view/%d/
00:15 philpem Sorry -- URL syntax is http://www.furaffinity.net/full/%d/
00:15 philpem Parse the HTML for "//d.facdn.net/[^"]*" and grab that file too (that's the image, story or whatever itself)
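[Editor's sketch, not part of the original log. A minimal Python illustration of the two steps philpem describes: build the /full/ page URL for a numeric ID, then pull the d.facdn.net links out of that page's HTML. The function names are hypothetical, the dots in the pattern are escaped so they match literally, and prepending "http:" to the scheme-relative links is an assumption. Fetching, login and IP-ban handling are deliberately left out.]

```python
import re

# URL template quoted in the log (the corrected /full/ form).
FULL_PAGE_TEMPLATE = "http://www.furaffinity.net/full/%d/"

# The pattern from the log, with the dots escaped so they match literally.
CDN_PATTERN = re.compile(r'//d\.facdn\.net/[^"]*')


def submission_url(submission_id):
    """Return the full-view page URL for a numeric submission ID."""
    return FULL_PAGE_TEMPLATE % submission_id


def find_cdn_urls(html, scheme="http:"):
    """Return the distinct CDN links found in a page, in order of appearance.

    The links in the page are scheme-relative ("//d.facdn.net/..."), so a
    scheme is prepended; "http:" is an assumption, not something the log states.
    """
    urls = []
    for match in CDN_PATTERN.findall(html):
        url = scheme + match
        if url not in urls:
            urls.append(url)
    return urls
```

[Usage: call submission_url(n) for each ID, fetch the page with whatever downloader is in use, and pass the HTML to find_cdn_urls() to get the file URLs to grab as well.]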
00:16 philpem Problems you will encounter: FA uses a fairly aggressive IP-based block system. It's not rate limiting, it's an *IP ban*.
00:16 philpem According to Dragoneer: "Well, this just became a recent issue after we changed our parameters to try to stop leachers and scripters by drawing back the amount of reasonable requests that the average human would put in, and multiplied it enough where a full house of people accessing FA still wouldn't trigger it."
00:16 philpem (Dragoneer is FA's lead admin / site owner)
00:16 omf_ philpem, That is not too weird a site to save. We got 4chan stuff and someone got a major My Little pony site too
00:17 philpem omf_, cool :)
00:17 philpem The other issue: some art isn't visible unless you're logged in. You'll get an error message. The big two are "submission doesn't exist" (someone deleted it) and "you're not allowed to view this"
00:18 philpem The second one pops up if the submitter set their profile to "private" or if the submission is rated higher than "general audiences" and you're not logged in with a suitably-endowed account.
00:18 philpem Basically you need to turn the "safe for work" filter off and set the "age" field to something plausible but over 21.
00:18 omf_ philpem, any idea how big the site is or the number of pics?
00:19 philpem Last quoted figure for the contents of the CDN was "several terabytes". Not sure how recent that is.
00:21 philpem Their siteop published a few "submission maps" on his profile: http://www.furaffinity.net/user/yak
00:21 philpem One pixel per ID, white for 'deleted', black for 'general audiences', blue/red for higher content ratings
00:22 philpem So in theory you could parse one of those (most recent is 2012-01-11, there have been a lot of submissions since then) and get an idea of how many of the IDs are actually in use
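[Editor's sketch, not part of the original log. One way the "submission map" idea could be turned into counts, assuming the image has already been decoded into a sequence of (R, G, B) tuples, one per ID. The exact colour values are assumptions: pure white is treated as deleted, pure black as general audiences, and anything else as a higher rating. Padding pixels past the last ID, if the maps have any, would be miscounted.]

```python
from collections import Counter

# Assumed exact colours; the log only says "white", "black" and "blue/red".
WHITE = (255, 255, 255)
BLACK = (0, 0, 0)


def classify_pixel(rgb):
    """Map one pixel to a coarse submission state."""
    if rgb == WHITE:
        return "deleted"
    if rgb == BLACK:
        return "general"
    return "restricted"  # blue/red in the log: higher content ratings


def summarize_map(pixels):
    """Count submission states for an iterable of (R, G, B) pixels."""
    counts = Counter(classify_pixel(tuple(pixel)) for pixel in pixels)
    return {
        "deleted": counts["deleted"],
        "general": counts["general"],
        "restricted": counts["restricted"],
        "in_use": counts["general"] + counts["restricted"],
    }
```

[Usage: decode the map with any image library, feed the pixel sequence to summarize_map(), and read "in_use" as a rough count of the IDs still occupied as of the map's date.]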
00:22 philpem There are also "journals" - http://www.furaffinity.net/journal/%d/ - again, IDs appear to start at 1 and increment from there.
00:23 philpem Spoofing the login is a piece of cake if you have an account - code here: https://bitbucket.org/philpem/libfurator
00:24 philpem just need user/passwd of a valid account
00:24 philpem They've changed the site templates so 'download submission' doesn't work any more, but login/logout should be fine.
00:29 philpem Anyway, I'm off to bed shortly (it's 1:30AM here). Will be online again from ~7pm BST tomorrow.
00:30 SketchCow Yo from 1:30am over here
02:07 omf_ SketchCow, would it be alright if I just shoved all the ign sites into one item that can be split up later?
02:26 balrog can someone wget-warc http://www.ambrosiasw.com/ ?
02:26 balrog just got a tip that its dying soon
02:26 balrog him weird
02:26 balrog hm*
02:54 hdevalenc hey, does anyone have any tips on how to find obscure stuff? I'm looking for a sonar recording apparently released at some point by the Swedish navy of a Russian submarine during the cold war
02:54 hdevalenc with all the channels etc
02:57 dashcloud use other search engines, search non-english sites- google translate can usually get you good enough button/option names. There's also torrent sites.
07:12 SketchCow omf_: No. I'd really like them separate.
07:13 SketchCow omf_: In a little while, we'll have FTP back in the guise of FOS, and we can do it that way.
07:14 omf_ I keep getting backed off on the ias3 uploads
07:15 SmileyG grabbing now balrog_
07:32 SketchCow --- fortressofsolitude.textfiles.com ping statistics ---
07:32 SketchCow 20844 packets transmitted, 0 packets received, 100% packet loss
07:33 SmileyG D:
07:54 BlueMax O_O
08:15 omf_ （。々°）
08:24 SketchCow OK, weasels, back online in about 15 hours.
08:25 * BlueMax chirps, or whatever a weasel does
08:26 SketchCow Drive stock prices down through rumor to cover a short
08:45 prpx hdevalenc: this is probably the closest you're going to get without contacting someone http://www.nyteknik.se/nyheter/fordon_motor/fartyg/article266486.ece http://www.nyteknik.se/multimedia/dokument/article200688.ece/BINARY/Lyssna+p%C3%A5+ub%C3%A5tsljudet+h%C3%A4r
09:34 GLaDOS Oh finally, migrated away from arcti.ca
10:44 GLaDOS Taking the pastebin down for a bit.
11:06 ersi Quick, archive GLaDOS's pastebin!
11:06 ersi :)
11:43 GLaDOS Well, time to take an archive of a wiki I host then kill it.
11:43 * GLaDOS is migrating servers
13:07 SketchCow Lied, back
13:07 SketchCow So yeah, $70 for the Virgin Lounge in Heathrow.
13:07 SketchCow Good purchase.
13:09 GLaDOS Fancy.
13:17 SketchCow Yeah
13:17 SketchCow Well, I'm old. I like to try things.
13:18 SketchCow This clubhouse has a tailor/shoe shiner, a deli (complimentary), showers, bar (complimentary), and an office/library.
13:18 SketchCow In the library? Taschen books. Tons of them.
13:18 GLaDOS That sounds rather nice.
13:18 SketchCow I just leafed through a $4000 Muhammad Ali book I never thought I'd touch, so that's something.
13:19 GLaDOS I've yet to be in any of the airline business lounges, so meh.
13:19 GLaDOS The one in Perth seems rather small from below, though.
13:20 trs80 Glados: they did have to expand the qantas one downstairs
13:20 GLaDOS Yeah, last time I was there, the renovation/expansion was ongoing.
13:20 GLaDOS I wonder what it's like now..
13:41 SmileyG well the interview seemed to go ok :o
13:42 SmileyG SketchCow: how is the UK?
13:43 SketchCow Like the US but with less pickup trucks
13:43 SmileyG :D
13:43 SketchCow Anyway, anything up Archive Team I need to know about?
13:43 SketchCow I know our Posterous Fear is increasing.
13:43 SketchCow I'm still waiting on FOS to return.
13:44 SmileyG We improved posterous speed
13:44 GLaDOS ..we did?
13:44 SmileyG Not sure if it's enough, but we had kennethre and alard testing last night UK time.
13:44 GLaDOS This is what I get for being too lazy to change IRC servers.
13:44 SmileyG The limit is/was 1000.
13:45 SmileyG Previously it was 250 and kennethre was easily maxing it
13:45 GLaDOS Is there still some leeway for more people?
13:45 GLaDOS Wait, I can check that.
13:46 GLaDOS Oh, easily.
13:46 SmileyG yeah I think posterous gets bogged down still somewhat, but at least it was going faster last night
13:47 SmileyG wtf claims back down to 100 or so, I guess kennethre has all the big jobs now. Time to shutup or go to #preposterus
14:43 hdevalenc prpx: thanks a ton!
18:20 * philpem wanders in...
19:34 alard SmileyG: I suspect that kennethre's downloaders were getting blocked.
19:39 SmileyG :<
19:41 alard At least, that would explain why he's very active for a while, then stops.
19:42 SmileyG nod
19:42 SmileyG however, useragent? :/
19:42 alard He's using a normal user agent.
19:42 alard (I don't know how busy 'our' posterous servers are at the moment.)
19:44 SmileyG ahok
19:56 alard Just tried: the ArchiveTeam posterous servers are really slow.
20:55 kennethre alard: exactly :)
20:56 kennethre just did another restart
20:56 kennethre maybe i should just keep doing those
20:56 alard They're not restarting by themselves?
21:00 kennethre alard: they restart once a day automatically, but i assume they all get blocked by then
21:00 S[h]O[r]T i to use all my threads up on the archiveteam useragent
21:00 S[h]O[r]T like 200
21:01 alard kennethre: Maybe you could have them exit after a few items. (There's an option in run-pipeline for that.)
21:01 alard --max-items N
21:23 balrog_ someone please invalidate my formspring and posterous jobs; my box can't seem to handle the load.
21:42 kennethre alard: oh interesting...
