#archiveteam 2013-01-24,Thu

Time Nickname Message
18:03 🔗 SketchCow Which gallery
18:04 🔗 schbiridi http://www.ratedesi.com/albumrecentpics.php <- NSFW penises, nothing worse though
18:04 🔗 SketchCow Yes, someone should immediately grab this.
18:04 🔗 SketchCow Do we have an effective way to grab vbulletin?
18:05 🔗 schbiridi nice, looks like it: http://archiveteam.org/index.php?title=VBulletin
18:06 🔗 schbiridi that gallery is separate though and i think so are the user profiles.
18:06 🔗 schbiridi gotta go, take care
18:06 🔗 SketchCow Let's start with what makes sense.
18:06 🔗 SketchCow WHO WANTS TO DOWNLOAD RATEDESI.COM
18:08 🔗 edsu SketchCow: who listens to info@archive.org ?
18:08 🔗 SketchCow It's a general mailbox that allows whoever's on duty to route questions to the right internal person.
18:09 🔗 SketchCow xk_id: Be aware, you'll violate the TOS to do it. We're all for it, but your advisor needs to get in on it, sadly.
18:09 🔗 SketchCow Unless of course you're doing this freestyle, then just do it
18:19 🔗 alard SketchCow: https://github.com/ArchiveTeam/3frame-grab
18:19 🔗 balrog_ https://twitter.com/eevblog/status/294389522836381696
18:19 🔗 SketchCow Great
18:23 🔗 alard (Actually, that's not a good strategy to get 3frames. It doesn't include numbers.)
18:25 🔗 xk_id SketchCow: no, I have a supervisor. Thank you
18:25 🔗 xk_id I'll speak to him
18:25 🔗 xk_id wow. How come scholars never discuss this issue?
18:25 🔗 xk_id I've never seen it mentioned in any articles from my area
18:28 🔗 chronomex anonymisation is a very hard problem, btw
18:28 🔗 chronomex people keep messing it up in all sorts of ways
18:28 🔗 chronomex viz., the aol searches dump
18:28 🔗 xk_id In my case, it's pretty easy.
18:28 🔗 chronomex what are you working with?
18:28 🔗 xk_id because I need to crawl an online social network, and extract the social graph. No nodes will have usernames/names
18:29 🔗 xk_id Just making my own. Finished coding the worker and I'm starting to look into distributing it over EC2
18:29 🔗 edsu SketchCow: underscor is helping me out over in #internetarchive now, so I think I'm sorted
18:31 🔗 SketchCow I saw and I saw him hijacking internal chat to get this going, so yes.
18:31 🔗 SketchCow But info@archive.org would have worked too.
18:31 🔗 SketchCow xk_id: Sociologists have a massive amount of mores and issues regarding this. And rulesets.
18:32 🔗 SketchCow xk_id: The problem is that we moved into programmatic research, that is, the ability of programs and other observational items to go into general computing platforms, without those rules following. So it's easy to scrape something but TOSes get in the way.
18:34 🔗 xk_id I'm really surprised scholarly literature does not mention this issue
18:37 🔗 alard xk_id: You probably already know about these? http://snap.stanford.edu/data/
18:37 🔗 * xk_id nods
18:37 🔗 xk_id I want to make my own dataset
18:37 🔗 xk_id It is more worthwhile :)
18:38 🔗 xk_id but SNAP (and the others) are my backup plan
18:38 🔗 alard It's always good to do it yourself.
18:38 🔗 alard There's also our http://archive.org/details/friendster-dataset-201107 and http://archive.org/details/friendster-groups-201107
18:39 🔗 xk_id oh, cool. don't you need an account for accessing the friendster network?
18:41 🔗 alard This is from before it changed into a gaming site.
18:42 🔗 xk_id alard: that's a very interesting dataset. has it been used so far?
18:42 🔗 xk_id I didn't know it's a gaming site now
18:42 🔗 alard xk_id: Not that I know of. I tried to get it listed on that snap site, sent them an email but never got a response.
18:43 🔗 alard They have a Friendster dataset, but it's much smaller. (And that for a repository of "large" datasets. Ha.)
18:43 🔗 xk_id academics are a bit cliquey too i think
18:49 🔗 SketchCow Well yeah
18:58 🔗 alard xk_id: What kind of research are you doing?
19:00 🔗 edsu SketchCow: i will remember info@archive.org for the future, sorry if I subverted the normal procedure there
19:28 🔗 SketchCow It's not a big deal, I'm just telling you the easiest way to ensure stuff gets handled. I subvert the process 12 times a day
19:42 🔗 chronomex heheh
20:04 🔗 edsu SketchCow: nice :)
21:34 🔗 godane so all 2007 episodes of tekzilla are uploaded now
21:48 🔗 SketchCow I've been integrating as fast as I can.
21:48 🔗 SketchCow How's the new toy?
21:58 🔗 godane good
21:58 🔗 godane i have use it in windows
21:58 🔗 godane for some reason slitaz can't detect it
22:15 🔗 godane so i'm also mirroring thefeed images from my thefeed articles dump
22:16 🔗 balrog_ what's the model again?
22:16 🔗 balrog_ Plustek OpticBook 3800?
22:17 🔗 balrog_ or 4800?
22:22 🔗 godane 4800
22:40 🔗 SketchCow 4800
22:40 🔗 SketchCow godane: Go to http://www.hamrick.com/ and grab the trial software
22:42 🔗 godane i have it
22:43 🔗 godane i tried vuescan on linux and it didn't detect the scanner
22:44 🔗 godane i think i just have to update my slitaz-tank distro
22:44 🔗 balrog_ I don't see any OpticBooks in http://www.hamrick.com/vuescan/vuescan.htm#plustek
23:18 🔗 SketchCow Twitter is shutting down Posterous.
23:18 🔗 SketchCow Archive Team ahoy
23:18 🔗 SketchCow And I thought it was going to be a quiet fuckin' year
23:22 🔗 SketchCow Well, explore anyway, no official date set yet.
23:22 🔗 SketchCow http://posterous.uservoice.com/knowledgebase/articles/56001-acquisition-faq
23:23 🔗 chronomex posterous? fuck
23:23 🔗 chronomex I don't see anything about shutdown there
23:24 🔗 chronomex I mean it hints at it
23:24 🔗 chronomex but that was in march
23:26 🔗 SketchCow http://socialnewsdaily.com/7309/posterous-not-accepting-new-accounts-twitter-reveals-nothing/
23:27 🔗 chronomex weird.
