[12:26] someone archive https://twitter.com/ProbablyOnion2
[12:26] http://krebsonsecurity.com/2014/05/teen-arrested-for-30-swattings-bomb-threats/
[12:59] schbirid: done, archivebot
[13:05] yay
[13:11] harddrive will arrive this week
[13:11] and ill ship it to you
[13:16] yaaay
[13:16] !
[14:25] midas: can you also fire archivebot on http://forums.sugarcrm.com/ ?
[14:26] sure
[14:27] running
[16:43] Hey, maniacs.
[16:43] So, we are uploading 82gb of radio crap into the archive.
[16:43] It's really in weird shape.
[16:43] If anyone wants access to do metadata, that'd be welcome.
[16:44] I also realize nobody has time for metadata
[16:44] I think a project for this summer for me is coming up with some way to do metadata automatically that's vaguely useful.
[16:51] ivan`: I am not sure if I read that correctly. Has the website owner disallowed members from logging in to archive the site, or do we just need login information?
[17:07] yeah, enjoy that upload.... at least the file names are _Semi_ useful.
[18:14] You like RPG fanzines? Let's be honest... probably not. Well too bad, because I've got another 200 uploading as we speak.
[18:14] dag
[18:25] Hey heyyyyyyyyyyyyyyyyyyy
[18:25] http://rubyforge.org/
[18:25] What's the story - did we grab this thing?
[18:27] I don't know, but I'll throw archivebot on it and get what we can
[18:32] I just did
[18:33] yup... saw that
[18:36] SketchCow: https://twitter.com/SirTerryWrist/status/412908014316707840 heh
[18:36] SketchCow: Can I bother you later to move a few hundred items into folkscanomy_games... once my current upload is done?
[19:08] Sure.
[19:08] Also, I got fed up with us uploading a bunch of crazy shit
[19:08] So I wrote something that pulls keywords.
[19:08] it does.... OK. It does better than a kid I'm hiring for $9.95/hr to do it
[19:10] I should note this kid does not exist and I have not just destroyed a young person's dreams
[19:11] https://archive.org/details/1981-03-compute-magazine
[19:11] See?
The "Subject", i.e. keywords.
[19:15] That's awesome... now you need to share.
[19:17] I think he just did share
[19:18] Also, my upload is now done. Everything in here can probably go: https://archive.org/search.php?query=uploader%3A%22aeakett%40gmail.com%22%20AND%20subject%3A%22roleplaying%20game%22%20AND%20NOT%20collection%3A%22folkscanomy_games%22%20AND%20NOT%20subject%3A%22podcast%22%20AND%20NOT%20collection%3A%22archiveteam%22&page=1
[19:19] exmic: nah, he just shared the output... I wanna see inside of the black box.
[19:20] o
[19:21] Obviously, I'm finding edge cases are exploding.
[19:21] Well, the creation of the subjects into the item are not a big black box.
[19:21] That is, I just use the internetarchive python interface and do a ia metadata --modify="subject:SUBJECTTEXT" itemname
[19:21] So, that saves me time.
[19:22] But I'm using a keyword generator that I found on git
[19:22] * SadDM waits with bated breath...
[19:23] Shhh
[19:23] https://github.com/ox-it/spindle-code
[19:23] Is that enough of the black box for you?
[19:24] lol... probably, yeah.
[19:30] Now set so if there's subjects set it won't overwrite.
[19:30] Now I will run it against an entire run of magazines.
[19:31] If this works vaguely well, it will be especially good for the items that have never and will never have love.
[19:32] It'll never be perfect.
[19:32] https://archive.org/details/computer-power-user-magazine-v13i12 but that's a nice set.
[19:33] it's pretty neat though... definitely a good start on stuff that I don't have time to actually read.
[19:35] root@teamarchive0:/0/keywords# for each in `cat rammer.txt`
[19:35] > do
[19:35] > bash keyblart "$each"
[19:35] > done
[19:36] The fact that THAT will generate a "reasonable" collection of keywords from the items, put them in, have them eventually end up as a keyword index for that collection?
[19:36] That works for me.
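Editor's note: the "don't overwrite existing subjects" step described at [19:30] could look roughly like the sketch below. This is not the keyblart script from the log, which was never shared. The function names `build_subject_update` and `apply_keywords` are hypothetical, joining keywords with ";" into one string is an assumption that mirrors the single SUBJECTTEXT value in the `ia metadata` command, and the keyword generation itself (spindle-code) is left out entirely.

```python
from typing import Iterable, Mapping, Optional


def build_subject_update(metadata: Mapping[str, object],
                         keywords: Iterable[str]) -> Optional[dict]:
    """Return a metadata patch, or None when nothing should be written.

    Hypothetical helper: skips items that already have a subject, so
    hand-written subjects are never overwritten.
    """
    if metadata.get("subject"):
        return None
    # Strip whitespace, drop blanks, and de-duplicate while keeping order.
    cleaned = list(dict.fromkeys(k.strip() for k in keywords if k.strip()))
    if not cleaned:
        return None
    # Assumption: one ";"-separated string, like subject:SUBJECTTEXT in the log.
    return {"subject": ";".join(cleaned)}


def apply_keywords(identifier: str, keywords: Iterable[str]) -> bool:
    """Write keywords to an archive.org item unless it already has subjects."""
    # Imported lazily so build_subject_update works without the library.
    from internetarchive import get_item

    item = get_item(identifier)
    patch = build_subject_update(item.metadata, keywords)
    if patch is None:
        return False
    item.modify_metadata(patch)  # needs credentials set up via `ia configure`
    return True
```

Keeping the decision in a pure function means the edge cases can be checked without touching archive.org; only `apply_keywords` talks to the network.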
[19:37] https://archive.org/details/computer-power-user-magazine-v13i11
[19:37] Keyword: "Moulin Rouge"
[19:37]
[19:37] -_-
[19:38] >hard hat
[19:44] https://archive.org/search.php?query=collection%3Acomputer_power_user&sort=-publicdate
[19:44] There it is populating.
[19:44] Not bad.
[22:22] rocode: I don't know anything about it
[22:32] ivan`: Thanks. I will see if I can find out more.