#archiveteam 2012-07-12,Thu


Time Nickname Message
00:12 🔗 omf_ There are files out there that could be directly downloaded
03:48 🔗 BlueMax techcrunch.com/2012/07/11/hashable-the-app-that-aimed-to-replace-business-cards-to-shut-down-on-july-25/
04:26 🔗 yipdw BlueMax: Hashable does have some content, but it's available only if you sign in
04:26 🔗 yipdw a lot of profiles are restricted to an "inner circle" (ha)
04:26 🔗 yipdw actually
04:26 🔗 yipdw http://hashable.com/#!/stephencaggiano <-- can anyone tell me if you can see any user data on this without signing up and logging in?
04:27 🔗 chronomex seems to send me to the front page
04:27 🔗 chronomex or something frontpagey
04:27 🔗 yipdw hm
04:27 🔗 chronomex 0 user data
04:27 🔗 yipdw I guess if we want to archive it then we'll have to all make fake accounts again
04:28 🔗 yipdw I can't believe this site has leaderboards
04:28 🔗 chronomex haha, seriously?
04:28 🔗 yipdw yeah
04:28 🔗 yipdw this is kinda sad
04:28 🔗 yipdw seriously
04:28 🔗 yipdw this is fucking Persona
04:28 🔗 yipdw for business
04:28 🔗 chronomex "guys guys I have the best idea" "what is it?" "gamify business meetings!" "ooooooo"
04:28 🔗 yipdw Atlus should either be flattered or embarrassed
04:29 🔗 chronomex who's atlus?
04:29 🔗 yipdw Hashable: SOCIAL LINKS GO
04:29 🔗 yipdw Atlus USA
04:29 🔗 yipdw they make the Persona series, amongst a ton of other games
04:29 🔗 chronomex does not mean anything to me
04:29 🔗 chronomex anyway
04:30 🔗 yipdw well, for more info, http://en.wikipedia.org/wiki/Shin_Megami_Tensei:_Persona
04:30 🔗 chronomex I basically have net-negative gamer and anime credentials
04:31 🔗 yipdw I tend to learn about this stuff by osmosis
04:31 🔗 chronomex sure
04:31 🔗 chronomex I'm pretty good at avoiding culture
04:31 🔗 chronomex so
04:31 🔗 chronomex I'm going to be away from home for a week
04:31 🔗 chronomex shut down computers or leave on?
04:31 🔗 yipdw shut down
04:32 🔗 chronomex but then I'd have to reopen everything
04:32 🔗 yipdw and make sure Wake-on-LAN works
04:32 🔗 chronomex heh
04:32 🔗 chronomex I don't have any of that set up
04:32 🔗 chronomex and they all have crypto root anyway
04:32 🔗 chronomex woop woop woop off-topic siren
04:33 🔗 arrith1 chronomex: dropbear
04:33 🔗 BlueMax chronomex, hibernate?
04:33 🔗 arrith1 all the convenience of no encrypted root, with all the insecurity of evil maids
04:33 🔗 chronomex BlueMax: well, I normally would but they've been up so long the kernels have been upgraded on-disk and so it wouldn't come up properly
04:34 🔗 arrith1 reboot then hibernate
04:34 🔗 chronomex I'm leaving in 20 minutes ...
04:34 🔗 chronomex I think I'll just shut them down now
04:34 🔗 * chronomex master of planning
04:34 🔗 yipdw that's where all your skill points went
05:35 🔗 Schbirid i was not successful in mirroring planetphillip.com
05:36 🔗 Schbirid he might be trapping bots with some dirs
05:36 🔗 Schbirid i got 404s, 500s etc
05:36 🔗 Schbirid gtg
05:45 🔗 Lord_Nigh http://www.trustedsec.com/july-2012/yahoo-voice-website-breached-400000-compromised/
05:46 🔗 Lord_Nigh only yahoo voice accounts affected, not yahoo email/yim/groups/etc
05:50 🔗 shaqfu I thought he took all of planetphillip offline
05:59 🔗 arrith1 http://burnbit.com/torrent/206849/yahoo_disclosure_txt
05:59 🔗 arrith1 torrent of the linked yahoo voice file being generated, ~20min to go
09:11 🔗 willwill Hashable to shut down on July 25 http://techcrunch.com/2012/07/11/hashable-the-app-that-aimed-to-replace-business-cards-to-shut-down-on-july-25/
09:15 🔗 SketchCow Again, no warning. Daaamn
09:16 🔗 BlueMax lol I linked that hours ago
09:16 🔗 BlueMax Hey SketchCow what's your opinion on the Ouya
09:17 🔗 SketchCow I invested.
09:17 🔗 SketchCow But big deal, it's an amusing $100 bet.
09:18 🔗 SketchCow I've spent $100 on a dinner that wasn't good
09:18 🔗 BlueMax SketchCow that's the exact reason I did that too
09:18 🔗 BlueMax $150 for a two-controller 1080p emulator box.
09:18 🔗 BlueMax Who can fairly complain?
09:18 🔗 BlueMax And if it gets XBMC and fairly good Android games, all the better.
09:27 🔗 SketchCow http://hashable.com/connections/index?eid=1BZTLZIAX7B60
09:28 🔗 SketchCow http://hashable.com/connections
09:28 🔗 SketchCow haha
09:35 🔗 BlueMax ?
09:51 🔗 ersi BlueMax: #archiveteam-bs for off-topic things >_>
09:51 🔗 BlueMax Oh shush ersi :P
09:54 🔗 SmileyG_ i love the way it still has a sign up link...
11:01 🔗 omf_ Schbirid, did you just wget/curl planetphillip? Use any special options? I will just build something that he cannot detect.
11:03 🔗 omf_ Maybe I am too old school but I have been writing scrapers since the late 1990s. The goal was always to get all the data as easily as possible. Usually by going undetected. I understand last minute grabs and these places expect it
11:04 🔗 omf_ So let's say he is running a honey pot or blackhole for bots
11:04 🔗 omf_ or even using some fancy piece of software on the logs
11:05 🔗 omf_ I would try spoofing my UA to spider the site and build a link list.
11:05 🔗 omf_ I would then take a statistically significant section of that data and try it out in Firefox automated via selenium and Perl
11:06 🔗 omf_ With the data in random order and random intervals and random amounts of pages getting opened
11:06 🔗 omf_ sometimes just 1, sometimes 3
11:07 🔗 omf_ I could then identify the trap and skip around it
11:07 🔗 omf_ usually I find them hidden in navigation blocks
11:07 🔗 omf_ AT is doing massive data mining and these are just methods for getting the data
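The pacing omf_ describes above (a sample of the link list, visited in random order, at random intervals, one to three pages at a time) can be sketched roughly as follows. This is not omf_'s tooling: he mentions Firefox driven by selenium and Perl, while this is plain Python that only builds the visit schedule and opens nothing. The function name plan_visits, the delay bounds, and the example.com URLs are all assumptions made for illustration.

```python
import random


def plan_visits(links, sample_size, min_delay=5.0, max_delay=60.0, seed=None):
    """Build a randomized visit schedule from a link list.

    Returns a list of (delay_seconds, batch) pairs, where each batch holds
    1 to 3 URLs to open together after waiting delay_seconds.
    The delay bounds are hypothetical defaults, not values from the log.
    """
    rng = random.Random(seed)
    links = list(links)
    # random.sample picks a subset and returns it in random order.
    sample = rng.sample(links, min(sample_size, len(links)))

    plan = []
    i = 0
    while i < len(sample):
        batch_size = rng.randint(1, 3)        # sometimes just 1, sometimes 3
        batch = sample[i:i + batch_size]      # the last batch may be shorter
        delay = rng.uniform(min_delay, max_delay)
        plan.append((delay, batch))
        i += len(batch)
    return plan


if __name__ == "__main__":
    # Hypothetical link list standing in for the spidered one.
    links = [f"http://example.com/page/{n}" for n in range(20)]
    for delay, batch in plan_visits(links, sample_size=8, seed=1):
        print(f"wait {delay:.1f}s, then open {len(batch)} page(s)")
```

Whatever actually fetches the pages (a browser driver, wget, anything else) would consume this schedule, sleeping for each delay before opening the batch. Pages that then return errors or lead nowhere are candidates for the trap links omf_ expects to find in navigation blocks.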
11:08 🔗 SmileyG wish I knew this stuff :(
11:09 🔗 omf_ it is easier to be good at it nowadays
11:09 🔗 omf_ the tools are far more advanced and the tracking systems are not even close to catching up.
11:09 🔗 omf_ google has the best
11:10 🔗 omf_ one of the meme sites has a funny little trick for stopping robots as well
11:30 🔗 SmileyG maybe we should go to -bs?
11:31 🔗 SmileyG but I'm interested in what they do.
15:57 🔗 Schbirid omf_: "time wget -w 2 -e robots=off -m -a planetphillip.com_20120711.log -nv --adjust-extension --convert-links --page-requisites --content-disposition --warc-file=planetphillip.com_20120711 http://www.planetphillip.com/"
15:57 🔗 Schbirid with a user agent clearly identifying me since i am a nice guy
15:57 🔗 Schbirid it felt like the grab was giving the server problems, since i got 500s randomly
15:59 🔗 omf_ well there is a 30 second crawl delay listed in the robots.txt. He could have the server set up to throttle too many requests in a 30 second period
15:59 🔗 omf_ it is a possibility
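For reference, a minimal sketch of checking a robots.txt crawl delay against a grab's wait time, using Python's standard urllib.robotparser. The log does not quote the site's actual robots.txt, so SAMPLE below is a made-up file that only reproduces the 30 second value omf_ mentions. The helper name crawl_delay_from_text is likewise an assumption. The 2 second figure is the -w 2 from Schbirid's wget command.

```python
from urllib.robotparser import RobotFileParser


def crawl_delay_from_text(robots_txt, user_agent="*"):
    """Return the Crawl-delay (in seconds) that applies to user_agent,
    or None if the robots.txt text does not set one."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


# Hypothetical robots.txt; only the 30 second delay comes from the log.
SAMPLE = """User-agent: *
Crawl-delay: 30
"""

if __name__ == "__main__":
    wget_wait = 2  # seconds, matching "-w 2" in the wget command above
    delay = crawl_delay_from_text(SAMPLE)
    if delay is not None and wget_wait < delay:
        print(f"grab waits {wget_wait}s but robots.txt asks for {delay}s")
```

With -e robots=off, wget ignores robots.txt, so a check like this would have to run separately before picking a -w value. Whether the 500s came from server-side throttling is only the possibility omf_ raises, not something the log confirms.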
16:24 🔗 sexfilmle http://www.sexfilmler.com see how turkish girls get fucked!
16:25 🔗 Lord_Nigh girls? i thought most turkish porn was gay porn to get people out of mandatory army service
16:25 🔗 Lord_Nigh regardless...
16:25 🔗 mistym I didn't really think the mechanics were that different.
16:57 🔗 SketchCow Nooooo, my sex film contact
17:46 🔗 SketchCow Under "who gives a shit", I'll be back in town properly this weekend after HOPE (hanging with Chronomex, actually) and all next week I'm archiveteam fulltime, properly.
17:48 🔗 SketchCow Our logo artist for Archiveteam Warrior is hard at work, that'll be ready soon.
18:33 🔗 omf_ SketchCow, you mentioned a while back you had some updated s3 scripts. Are they ready for public consumption?
18:59 🔗 godane SketchCow: I'm up to october 12 2011 of gbtv
18:59 🔗 godane i'm also uploading 'The List' episode that's going to be here: http://archive.org/details/GBTV_10_13_2011
21:07 🔗 godane SketchCow: http://www.engadget.com/2012/07/12/sinking-social-news-site-digg-bought-for-500k-by-nyc-firm-betaw/
21:07 🔗 godane we may need to do a just in time grab of digg.com
21:12 🔗 yipdw heh woww
21:12 🔗 yipdw I remember when Digg was hot shit
21:28 🔗 omf_ yeah digg was good before they realized they had no idea how to make money
21:28 🔗 omf_ I am glad to see them die
22:41 🔗 Nemo_bis 
23:33 🔗 Perverzo Hi
23:33 🔗 Perverzo :D
23:47 🔗 instence http://hardware.slashdot.org/story/12/07/12/2219257/a-million-year-hard-disk
