#archiveteam 2014-05-12,Mon

Time Nickname Message
12:26 🔗 schbirid someone archive https://twitter.com/ProbablyOnion2
12:26 🔗 schbirid http://krebsonsecurity.com/2014/05/teen-arrested-for-30-swattings-bomb-threats/
12:59 🔗 midas1 schbirid: done, archivebot
13:05 🔗 schbirid yay
13:11 🔗 midas1 harddrive will arrive this week
13:11 🔗 midas1 and ill ship it to you
13:16 🔗 schbirid yaaay
13:16 🔗 schbirid !
14:25 🔗 Nemo_bis midas: can you also fire archivebot on http://forums.sugarcrm.com/ ?
14:26 🔗 midas sure
14:27 🔗 midas running
16:43 🔗 SketchCow Hey, maniacs.
16:43 🔗 SketchCow So, we are uploading 82gb of radio crap into the archive.
16:43 🔗 SketchCow It's really in weird shape.
16:43 🔗 SketchCow If anyone wants access to do metadata, that'd be welcome.
16:44 🔗 SketchCow I also realize nobody has time for metadata
16:44 🔗 SketchCow I think a project for this summer for me is coming up with some way to do metadata automatically that's vaguely useful.
16:51 🔗 rocode ivan`: I am not sure if I read that correctly. Has the website owner disallowed members from logging in to archive the site, or do we just need login information?
17:07 🔗 Smiley yeah, enjoy that upload.... at least the file names are _Semi_ useful.
18:14 🔗 SadDM You like RPG fanzines? Let's be honest... probably not. Well too bad, because I've got another 200 uploading as we speak.
18:14 🔗 exmic dag
18:25 🔗 SketchCow Hey heyyyyyyyyyyyyyyyyyyy
18:25 🔗 SketchCow http://rubyforge.org/
18:25 🔗 SketchCow What's the story - did we grab this thing?
18:27 🔗 SadDM I don't know, but I'll throw archivebot on it and get what we can
18:32 🔗 SketchCow I just did
18:33 🔗 SadDM yup... saw that
18:36 🔗 rocode SketchCow: https://twitter.com/SirTerryWrist/status/412908014316707840 heh
18:36 🔗 SadDM SketchCow: Can I bother you later to move a few hundred items into folkscanomy_games... once my current upload is done?
19:08 🔗 SketchCow Sure.
19:08 🔗 SketchCow Also, I got fed up with us uploading a bunch of crazy shit
19:08 🔗 SketchCow So I wrote something that pulls keywords.
19:08 🔗 SketchCow it does.... OK. It does better than a kid I'm hiring for $9.95/hr to do it
19:10 🔗 SketchCow I should note this kid does not exist and I have not just destroyed a young person's dreams
19:11 🔗 SketchCow https://archive.org/details/1981-03-compute-magazine
19:11 🔗 SketchCow See? The "Subject", i.e. keywords.
19:15 🔗 SadDM That's awesome... now you need to share.
19:17 🔗 exmic I think he just did share
19:18 🔗 SadDM Also, my upload is now done. Everything in here can probably go: https://archive.org/search.php?query=uploader%3A%22aeakett%40gmail.com%22%20AND%20subject%3A%22roleplaying%20game%22%20AND%20NOT%20collection%3A%22folkscanomy_games%22%20AND%20NOT%20subject%3A%22podcast%22%20AND%20NOT%20collection%3A%22archiveteam%22&page=1
19:19 🔗 SadDM exmic: nah, he just shared the output... I wanna see inside of the black box.
19:20 🔗 exmic o
19:21 🔗 SketchCow Obviously, I'm finding edge cases are exploding.
19:21 🔗 SketchCow Well, putting the subjects into the item is not a big black box.
19:21 🔗 SketchCow That is, I just use the internetarchive python interface and do a ia metadata --modify="subject:SUBJECTTEXT" itemname
19:21 🔗 SketchCow So, that saves me time.
19:22 🔗 SketchCow But I'm using a keyword generator that I found on git
19:22 🔗 * SadDM waits with bated breath...
19:23 🔗 SketchCow Shhh
19:23 🔗 SketchCow https://github.com/ox-it/spindle-code
19:23 🔗 SketchCow Is that enough of the black box for you?
19:24 🔗 SadDM lol... probably, yeah.
19:30 🔗 SketchCow Now it's set so that if there are subjects already set, it won't overwrite them.
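[Editor's sketch] The "don't overwrite existing subjects" rule described above, wrapped around the `ia metadata --modify="subject:SUBJECTTEXT" itemname` command quoted at 19:21, might look roughly like the following. This is not SketchCow's script, which is not shown in the log. The function name, the `existing_metadata` dict (assumed to be the item's metadata as already fetched by some other step), and the semicolon used to join several keywords are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def build_subject_command(
    identifier: str,
    keywords: List[str],
    existing_metadata: Dict[str, object],
    separator: str = ";",  # assumption: keywords are joined into one subject string
) -> Optional[List[str]]:
    """Return the argv for an `ia metadata` call, or None if nothing should be written."""
    # Skip items that already have any subject set, so hand-written
    # metadata is never replaced by generated keywords.
    if existing_metadata.get("subject"):
        return None

    # Drop empty or whitespace-only keywords from the generator output.
    cleaned = [k.strip() for k in keywords if k.strip()]
    if not cleaned:
        return None

    subject_text = separator.join(cleaned)
    # Same command shape as quoted in the log:
    #   ia metadata --modify="subject:SUBJECTTEXT" itemname
    return ["ia", "metadata", f"--modify=subject:{subject_text}", identifier]
```

The returned list could be handed to `subprocess.run`; returning `None` instead of running anything keeps the skip decision easy to check without touching a real item.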
19:30 🔗 SketchCow Now I will run it against an entire run of magazines.
19:31 🔗 SketchCow If this works vaguely well, it will be especially good for the items that have never and will never have love.
19:32 🔗 SketchCow It'll never be perfect.
19:32 🔗 SketchCow https://archive.org/details/computer-power-user-magazine-v13i12 but that's a nice set.
19:33 🔗 SadDM it's pretty neat though... definitely a good start on stuff that I don't have time to actually read.
19:35 🔗 SketchCow root@teamarchive0:/0/keywords# for each in `cat rammer.txt`
19:35 🔗 SketchCow > do
19:35 🔗 SketchCow > bash keyblart "$each"
19:35 🔗 SketchCow > done
19:36 🔗 SketchCow The fact that THAT will generate a "reasonable" collection of keywords from the items, put them in, have them eventually end up as a keyword index for that collection?
19:36 🔗 SketchCow That works for me.
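[Editor's sketch] The shell loop pasted at 19:35 reads item identifiers from `rammer.txt` and runs `bash keyblart` once per identifier. A rough Python equivalent is sketched below, assuming the list file holds whitespace-separated identifiers and that `keyblart` takes the identifier as its only argument (the script itself is not shown in the log). The function names and the list of failures returned at the end are additions for illustration, not something the original loop does.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List


def read_identifiers(list_path: str) -> List[str]:
    """Read identifiers the way `for each in \\`cat file\\`` sees them."""
    # With the default IFS, the shell splits the file on any whitespace,
    # so spaces, tabs, and newlines all separate identifiers.
    return Path(list_path).read_text().split()


def run_keyword_script(identifier: str, script: str = "keyblart") -> int:
    """Run the per-item script and return its exit code."""
    # Assumption: the script is invoked as `bash keyblart IDENTIFIER`.
    return subprocess.run(["bash", script, identifier]).returncode


def process_all(
    identifiers: Iterable[str],
    handler: Callable[[str], int] = run_keyword_script,
) -> List[str]:
    """Call handler on every identifier; return those with a non-zero result."""
    failures = []
    for identifier in identifiers:
        if handler(identifier) != 0:
            failures.append(identifier)
    return failures
```

A run over the whole list would then be `process_all(read_identifiers("rammer.txt"))`, with the returned identifiers being the ones worth retrying.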
19:37 🔗 SketchCow https://archive.org/details/computer-power-user-magazine-v13i11
19:37 🔗 SketchCow Keyword: "Moulin Rouge"
19:37 🔗 SketchCow <face>
19:37 🔗 SketchCow -_-
19:38 🔗 rocode >hard hat
19:44 🔗 SketchCow https://archive.org/search.php?query=collection%3Acomputer_power_user&sort=-publicdate
19:44 🔗 SketchCow There it is populating.
19:44 🔗 SketchCow Not bad.
22:22 🔗 ivan` rocode: I don't know anything about it
22:32 🔗 rocode ivan`: Thanks. I will see if I can find out more.