Time |
Nickname |
Message |
18:03
🔗
|
SketchCow |
Which gallery |
18:04
🔗
|
schbiridi |
http://www.ratedesi.com/albumrecentpics.php <- NSFW penises, nothing worse though |
18:04
🔗
|
SketchCow |
Yes, someone should immediately grab this. |
18:04
🔗
|
SketchCow |
Do we have an effective way to grab vbulletin? |
18:05
🔗
|
schbiridi |
nice, looks like it: http://archiveteam.org/index.php?title=VBulletin |
18:06
🔗
|
schbiridi |
that gallery is separate though and i think so are the user profiles. |
18:06
🔗
|
schbiridi |
gotta go, take care |
18:06
🔗
|
SketchCow |
Let's start with what makes sense. |
18:06
🔗
|
SketchCow |
WHO WANTS TO DOWNLOAD RATEDESI.COM |
18:08
🔗
|
edsu |
SketchCow: who listens to info@archive.org ? |
18:08
🔗
|
SketchCow |
It's a general mailbox that allows whoever's on duty to route questions to the right internal person. |
18:09
🔗
|
SketchCow |
xk_id: Be aware, you'll violate the TOS to do it. We're all for it, but your advisor needs to get in on it, sadly. |
18:09
🔗
|
SketchCow |
Unless of course you're doing this freestyle, then just do it |
18:19
🔗
|
alard |
SketchCow: https://github.com/ArchiveTeam/3frame-grab |
18:19
🔗
|
balrog_ |
https://twitter.com/eevblog/status/294389522836381696 |
18:19
🔗
|
SketchCow |
Great |
18:23
🔗
|
alard |
(Actually, that's not a good strategy to get 3frames. It doesn't include numbers.) |
18:25
🔗
|
xk_id |
SketchCow: no, I have a supervisor. Thank you |
18:25
🔗
|
xk_id |
I'll speak to him |
18:25
🔗
|
xk_id |
wow. How come scholars never discuss this issue? |
18:25
🔗
|
xk_id |
I've never seen it mentioned in any articles from my area |
18:28
🔗
|
chronomex |
anonymisation is a very hard problem, btw |
18:28
🔗
|
chronomex |
people keep messing it up in all sorts of ways |
18:28
🔗
|
chronomex |
viz., the aol searches dump |
18:28
🔗
|
xk_id |
In my case, it's pretty easy. |
18:28
🔗
|
chronomex |
what are you working with? |
18:28
🔗
|
xk_id |
because I need to crawl an online social network, and extract the social graph. No nodes will have usernames/names |
18:29
🔗
|
xk_id |
Just making my own. Finished coding the worker and I'm starting to look into distributing it over EC2 |
18:29
🔗
|
edsu |
SketchCow: underscor is helping me out over in #internetarchive now, so I think I'm sorted |
18:31
🔗
|
SketchCow |
I saw and I saw him hijacking internal chat to get this going, so yes. |
18:31
🔗
|
SketchCow |
But info@archive.org would have worked too. |
18:31
🔗
|
SketchCow |
xk_id: Sociologists have a massive amount of mores and issues regarding this. And rulesets. |
18:32
🔗
|
SketchCow |
xk_id: The problem is that we moved into programmatic research, that is, the ability of programs and other observational items to go into general computing platforms, without those rules following. So it's easy to scrape something but TOSes get in the way. |
18:34
🔗
|
xk_id |
I'm really surprised scholarly literature does not mention this issue |
18:37
🔗
|
alard |
xk_id: You probably already know about these? http://snap.stanford.edu/data/ |
18:37
🔗
|
* |
xk_id nods |
18:37
🔗
|
xk_id |
I want to make my own dataset |
18:37
🔗
|
xk_id |
It is more worthwhile :) |
18:38
🔗
|
xk_id |
but SNAP (and the others) are my backup plan |
18:38
🔗
|
alard |
It's always good to do it yourself. |
18:38
🔗
|
alard |
There's also our http://archive.org/details/friendster-dataset-201107 and http://archive.org/details/friendster-groups-201107 |
18:39
🔗
|
xk_id |
oh, cool. don't you need an account for accessing the friendster network? |
18:41
🔗
|
alard |
This is from before it changed into a gaming site. |
18:42
🔗
|
xk_id |
alard: that's a very interesting dataset. has it been used so far? |
18:42
🔗
|
xk_id |
I didn't know it's a gaming site now |
18:42
🔗
|
alard |
xk_id: Not that I know of. I tried to get it listed on that snap site, sent them an email but never got a response. |
18:43
🔗
|
alard |
They have a Friendster dataset, but it's much smaller. (And that for a repository of "large" datasets. Ha.) |
18:43
🔗
|
xk_id |
academics are a bit cliquey too i think |
18:49
🔗
|
SketchCow |
Well yeah |
18:58
🔗
|
alard |
xk_id: What kind of research are you doing? |
19:00
🔗
|
edsu |
SketchCow: i will remember info@archive.org for the future, sorry if I subverted the normal procedure there |
19:28
🔗
|
SketchCow |
It's not a big deal, I'm just telling you the easiest way to ensure stuff gets handled. I subvert the process 12 times a day |
19:42
🔗
|
chronomex |
heheh |
20:04
🔗
|
edsu |
SketchCow: nice :) |
21:34
🔗
|
godane |
so all 2007 episodes of tekzilla are uploaded now |
21:48
🔗
|
SketchCow |
I've been integrating as fast as I can. |
21:48
🔗
|
SketchCow |
How's the new toy? |
21:58
🔗
|
godane |
good |
21:58
🔗
|
godane |
i have to use it in windows |
21:58
🔗
|
godane |
for some reason slitaz can't detect it |
22:15
🔗
|
godane |
so i'm also mirroring thefeed images from my thefeed articles dump |
22:16
🔗
|
balrog_ |
what's the model again? |
22:16
🔗
|
balrog_ |
Plustek OpticBook 3800? |
22:17
🔗
|
balrog_ |
or 4800? |
22:22
🔗
|
godane |
4800 |
22:40
🔗
|
SketchCow |
4800 |
22:40
🔗
|
SketchCow |
godane: Go to http://www.hamrick.com/ and grab the trial software |
22:42
🔗
|
godane |
i have it |
22:43
🔗
|
godane |
i tried vuescan on linux and it didn't detect the scanner |
22:44
🔗
|
godane |
i think i just have to update my slitaz-tank distro |
22:44
🔗
|
balrog_ |
I don't see any OpticBooks in http://www.hamrick.com/vuescan/vuescan.htm#plustek |
23:18
🔗
|
SketchCow |
Twitter is shutting down Posterous. |
23:18
🔗
|
SketchCow |
Archive Team ahoy |
23:18
🔗
|
SketchCow |
And I thought it was going to be a quiet fuckin' year |
23:22
🔗
|
SketchCow |
Well, explore anyway, no official date set yet. |
23:22
🔗
|
SketchCow |
http://posterous.uservoice.com/knowledgebase/articles/56001-acquisition-faq |
23:23
🔗
|
chronomex |
posterous? fuck |
23:23
🔗
|
chronomex |
I don't see anything about shutdown there |
23:24
🔗
|
chronomex |
I mean it hints at it |
23:24
🔗
|
chronomex |
but that was in march |
23:26
🔗
|
SketchCow |
http://socialnewsdaily.com/7309/posterous-not-accepting-new-accounts-twitter-reveals-nothing/ |
23:27
🔗
|
chronomex |
weird. |