#archiveteam-bs 2014-06-08,Sun


Time Nickname Message
00:09 🔗 Famicoman all of it?
00:28 🔗 balrog joepie91: http://www.anonnews.org/press/item/820/comments/ didn't have it?
00:29 🔗 balrog it used to be at http://www.archiveteam.org/archives/edramatica/ED_archive.zip
00:32 🔗 joepie91 balrog: well yes, "used to e"
00:32 🔗 joepie91 be *
00:32 🔗 joepie91 I've run across a number of broken links on archiveteam.org
00:32 🔗 joepie91 which is simultaneously funny and kinda bad
00:56 🔗 nico so we should run !a http://www.archiveteam.org/ more often on #archivebot
01:06 🔗 ivan` joepie91: archivebot has it
01:06 🔗 ivan` maybe not the old version you want
01:06 🔗 ivan` https://encrypted.google.com/search?q=archivebot+encyclopediadramatica+site%3Aarchive.org&btnG=Search
01:09 🔗 DFJustin http://web.archive.org/http://www.archiveteam.org/archives/edramatica/ED_archive.zip
03:16 🔗 godane so i'm mirroring msnbc news pages from wayback machine
03:16 🔗 godane crazy code to make it happen from cdx: cat cdx*msnbc.com*news*1* | grep 'asp?cp1=1 ' | grep 'text/html 200' | sed 's| http|/http|g' | sed 's| text/html.*||g' | sed 's|.* ||g' | sed 's|:80||g' | sed 's|http://msnbc.com|http://www.msnbc.com|g' | sort | uniq > urls.txt
03:17 🔗 yipdw yeah
03:17 🔗 yipdw there comes a point where shell is no longer the best option :P
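[Editor's sketch] The pipeline godane pasted at 03:16 could be written roughly as follows in Python. This is a minimal sketch, not godane's actual tooling. It assumes the common space-separated CDX layout in which field 2 is the timestamp, field 3 the original URL, field 4 the MIME type and field 5 the status code. The function name `wayback_paths` is made up for illustration.

```python
def wayback_paths(cdx_lines):
    """Return sorted, de-duplicated 'timestamp/url' strings for msnbc news pages.

    Assumes space-separated CDX fields:
    urlkey timestamp original mimetype statuscode ...
    """
    paths = set()
    for line in cdx_lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # too short to be a capture record
        timestamp, original, mime, status = fields[1:5]
        # Same filter as: grep 'text/html 200'
        if mime != "text/html" or status != "200":
            continue
        # Same filter as: grep 'asp?cp1=1 ' (URL ends with that query string)
        if not original.endswith("asp?cp1=1"):
            continue
        # Same rewrites as the sed steps: drop the port, force the www host
        original = original.replace(":80", "")
        original = original.replace("http://msnbc.com", "http://www.msnbc.com")
        paths.add(f"{timestamp}/{original}")
    return sorted(paths)


if __name__ == "__main__":
    import fileinput

    # Usage (hypothetical file names): python cdx_urls.py cdx-files... > urls.txt
    for path in wayback_paths(fileinput.input()):
        print(path)
```

Splitting each record into fields makes the filters explicit, so a URL that merely contains "text/html 200" somewhere in its query string would not slip through the way it could with grep.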
03:39 🔗 yipdw https://www.fanfiction.net/s/9571902/1/The-Truth
03:39 🔗 yipdw whoa
03:40 🔗 yipdw Edward Snowden/Hetalia Axis Powers crossover
04:15 🔗 joepie91 yipdw: there are no limits to what can be found on hte interwebs
04:15 🔗 joepie91 the *
04:15 🔗 joepie91 ivan`: not the same stuff
04:15 🔗 joepie91 I mean, that webecology backup -was- integrated into the new site
04:16 🔗 joepie91 but it's not the same data :p
04:27 🔗 godane so i got a 22 min video from dateline in 1998 about beef
04:27 🔗 vantec well, it was what's for dinner
06:37 🔗 joepie91 how come this is not being updated anymore? https://archive.org/details/freemusicarchive
06:43 🔗 joepie91 SketchCow: underscor: if I were to write a client for IA, what should I set as the default maximum concurrent download and upload limit?
07:12 🔗 SketchCow Why would you write a client?
07:12 🔗 SketchCow We already have one.
07:12 🔗 SketchCow You could look at it and see if improvements or features are needed.
07:18 🔗 exmic but making it work with fortran is so much work
07:30 🔗 SketchCow https://pypi.python.org/pypi/internetarchive
07:30 🔗 SketchCow We've done a million uploads with it
08:00 🔗 joepie91 SketchCow: I mean a graphical client, where uploading to IA is one of the features
08:00 🔗 joepie91 not just a library
08:00 🔗 joepie91 it's something I've been working on for a while to automate some processes here
08:01 🔗 joepie91 hence wondering how many concurrent transfers are acceptable
08:01 🔗 joepie91 (also, SketchCow, I've actually been providing some feedback / bug reports on that library already :)
08:05 🔗 SketchCow That's the one.
08:05 🔗 SketchCow I would say, ask Jake then.
08:05 🔗 SketchCow jake@archive.org
08:07 🔗 joepie91 alright, thanks
08:10 🔗 SketchCow Also, the answer to "why hasn't _____ been updated on archive.org" is ALWAYS "because there are 8 people responsible for maintaining collections"
08:10 🔗 SketchCow So unless an outside person is maintaining/co-maintaining the collection, fix-ups come in waves
08:11 🔗 SketchCow Across years, sometimes
08:25 🔗 godane i'm close to 1000 videos for 2000 clips from nbcnews
08:25 🔗 godane *for year 2000
08:46 🔗 joepie91 damnit gmail
08:46 🔗 joepie91 where did my "you don't have a subject" warning go
08:46 🔗 joepie91 SketchCow: I see
08:46 🔗 SketchCow So I've been working on script-based ways to shore up our stuff.
08:47 🔗 SketchCow Because when the new UI kicks in it will DEFINITELY show gaps and slowdowns in additions.
08:48 🔗 joepie91 what kind of stuff should I be thinking about?
08:48 🔗 SketchCow In what context
08:48 🔗 joepie91 thinking of*, sorry
08:48 🔗 joepie91 like, what kind of stuff is to be shored up
08:48 🔗 joepie91 (my brain is on low-power mode today)
08:50 🔗 SketchCow Help me understand what's going on, again. You hinted but I was busy.
08:50 🔗 SketchCow Quit your job, intend to do "stuff" for a year.
08:50 🔗 SketchCow With IA being one of the beneficiaries of this time.
08:50 🔗 SketchCow Is that right?
08:50 🔗 joepie91 oh, that was a different context actually
08:50 🔗 joepie91 this was more a generic question of "what do you mean with 'stuff' in <@SketchCow> So I've been working on script-based ways to shore up our stuff."
08:51 🔗 joepie91 but yes, the above is also correct
08:51 🔗 joepie91 (though I'll have to see how the fundraiser idea works out before I commit to anything)
08:51 🔗 SketchCow What I am talking about scripting isn't an archiveteam thing. It's a me and the archive thing.
08:51 🔗 joepie91 well yes, but I'm curious what kind of stuff it entails :P
08:51 🔗 SketchCow Many items don't have cover images. Many don't have keywords, etc.
08:51 🔗 joepie91 aha
08:51 🔗 joepie91 right
08:52 🔗 SketchCow Many have no metadata of any kind. Intend to work on that.
08:52 🔗 joepie91 SketchCow: I'd been pondering about this a bit, but idk if this might simply already be on the roadmap: would wikifying metadata not be an option?
08:52 🔗 SketchCow That is an ugly situation.
08:52 🔗 SketchCow We worked together on that one solution, but I've had zero time to work with your code.
08:53 🔗 SketchCow Yanking metadata into a wiki wholesale, and then we edit and I oversee it flying back in, could be good.
08:53 🔗 SketchCow That's the best compromise we can have.
08:53 🔗 joepie91 well, the idea I was thinking of was more inline wikified editing - so that a user with an account on IA could just edit metadata from an item page itself (excluding 'protected' items)
08:53 🔗 joepie91 but not sure how technically feasible
08:53 🔗 SketchCow There will never, never, ever be, at least within the span of years, a case where you click on something at IA and people do editing in a wiki fashion.
08:54 🔗 joepie91 what's the reasoning behind that?
08:54 🔗 SketchCow It's baked into the organization at the moment.
08:54 🔗 SketchCow I mean, you want to go ahead and tell me why it's great, go ahead, make yourself feel better. But I can see it won't happy anytime soon.
08:54 🔗 joepie91 right, but I'm quite curious whether that's just a time/attention constraint issue, or an inherent conceptual problem with wikifying
08:54 🔗 SketchCow Happy?
08:54 🔗 joepie91 er
08:54 🔗 SketchCow Conceptual problem.
08:54 🔗 joepie91 conceptual problem that people have with *
08:54 🔗 joepie91 right
08:54 🔗 SketchCow Combined with time/attention.
08:56 🔗 joepie91 SketchCow: completely unrelated quesiton, do you guys at IA have a spamfilter that triggers on empty subject lines? because I accidentally sent my email to jake without a subject, and apparently my gmail setting to warn me about that has magically vanished
08:56 🔗 SketchCow My end-run is the closest we'll have.
08:56 🔗 joepie91 question *
08:57 🔗 SketchCow I have not the slightest idea.
08:57 🔗 SketchCow I do know we have a spam issue.
08:57 🔗 SketchCow I don't use the IA mail system.
08:57 🔗 joepie91 alright, we'll see if I get a response then
08:57 🔗 joepie91 right :P
08:57 🔗 joepie91 I suppose that if you have a spam issue, it's not a terribly trigger-happy filter (if any at all), so my mail will probably go through fin
08:57 🔗 joepie91 fine *
08:57 🔗 SketchCow I am all for us using the parallel wiki idea.
08:58 🔗 joepie91 SketchCow: can you elaborate on how you'd see that working, in a technical sense?
08:58 🔗 exmic metadata goes in
08:59 🔗 exmic metadata comes out
08:59 🔗 exmic can't explain that
08:59 🔗 joepie91 lol
08:59 🔗 SketchCow We did a prototype a while ago.
08:59 🔗 SketchCow Sort of - you wrote a post bot but I've been busy.
08:59 🔗 joepie91 well obviously, but the idea I got was that SketchCow meant using a standard wiki system (a la mediawiki), at which point the question is "how do you turn the wiki page back into useful metadata without making the page a pain to edit"
08:59 🔗 joepie91 re: exmic
09:00 🔗 SketchCow * collection chosen
09:00 🔗 SketchCow * metadata of all items is pulled into wiki under a set, with each item a page
09:00 🔗 SketchCow * editttttt
09:00 🔗 SketchCow * push all of it back
09:00 🔗 SketchCow ----
09:00 🔗 SketchCow On a page:
09:00 🔗 SketchCow metadata pair becomes == METADATA NAME ==
09:00 🔗 SketchCow Followed by metadata.
09:01 🔗 SketchCow Obviously there is some trickery from the ingestor to pull things in.
09:01 🔗 SketchCow Obviously there is potential for things to go wrong, or for issues with newbs making a mess
09:01 🔗 SketchCow Obviously it's not the fast fast fast fast shut the fuck up it's fast keep going world of, say, Wikipedia.
09:01 🔗 SketchCow Which... I hate.
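[Editor's sketch] The page layout SketchCow describes above (each metadata pair becomes a "== METADATA NAME ==" heading followed by its value) could round-trip roughly like this. It is only an illustration under assumptions: values are plain strings, field names contain no "==", and the function names and sample item are invented here. It is not the prototype or post bot mentioned in the channel, and it does not talk to a wiki or to archive.org.

```python
import re

# A heading line such as "== title ==" marks the start of one metadata field.
HEADING = re.compile(r"^==\s*(.+?)\s*==$")


def metadata_to_wikitext(metadata):
    """Render a dict of metadata as one wiki section per field."""
    sections = [f"== {name} ==\n{value}\n" for name, value in metadata.items()]
    return "\n".join(sections)


def wikitext_to_metadata(text):
    """Parse the sections back into a dict (the 'push it back' step)."""
    metadata = {}
    name = None
    body = []
    for line in text.splitlines():
        match = HEADING.match(line.strip())
        if match:
            if name is not None:
                metadata[name] = "\n".join(body).strip()
            name = match.group(1)
            body = []
        elif name is not None:
            body.append(line)
    if name is not None:
        metadata[name] = "\n".join(body).strip()
    return metadata


# Hypothetical item metadata, for illustration only.
item = {"title": "Example item", "subject": "example; keywords"}
page = metadata_to_wikitext(item)
restored = wikitext_to_metadata(page)
```

Text that an editor types above the first heading is ignored on the way back, which is one small guard against the "newbs making a mess" case; anything stricter would have to live in the ingestor.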
11:43 🔗 nico 05:40 yipdw> Edward Snowden/Hetalia Axis Powers crossover
11:44 🔗 nico i really should try to restart the ffnet archiving project
11:50 🔗 nico https://github.com/FlatRockSoft/
14:10 🔗 SadDM SketchCow: is the code for your keyword generator posted anywhere?
14:11 🔗 SadDM I know you're using https://github.com/ox-it/spindle-code/ and https://pypi.python.org/pypi/internetarchive, but what about the glue and baling twine that holds them together?
14:23 🔗 godane some good news on the martin yan's chinatowns torrents
14:24 🔗 godane i got upload 2 and upload 4 last night
14:25 🔗 godane so now i got about 30 episodes of it
17:03 🔗 ersi Hmm~ got a USB stick that shows up in dmesg as a SCSI removable disk (like usual) that gets a device (/dev/sdb).. but I can't mount it and if I `dd` from it, it says "dd opening /dev/sdb no medium found" :/
17:04 🔗 ersi Any ideas on how to retrieve data from it?
17:39 🔗 nico ersi: borked usb stick?
17:39 🔗 nico does cfdisk /dev/sdb return something real?
18:31 🔗 SketchCow SadDM: My keyword generator is VERY weaksauce
18:32 🔗 SketchCow If you want it, I can provide it
18:32 🔗 SketchCow Obviously you need write control on the item for it to work.
18:43 🔗 SketchCow SadDM: http://fos.textfiles.com/keyworder.zip
18:44 🔗 SketchCow You need internetarchive (the python program) installed
19:36 🔗 SadDM SketchCow: anything I'd cobble together would also be weaksauce... you've just saved me the trouble
19:39 🔗 SadDM gah! *BOOM* goes the zip file
20:16 🔗 DFJustin http://www.reading.ac.uk/news-and-events/releases/PR583836.aspx
20:41 🔗 DFJustin https://www.youtube.com/watch?v=d0mg9DxvfZE
23:41 🔗 balrog kanzure_: good question, I dunno. I'd think that people who do photographic printed circuit board production might know.
23:41 🔗 balrog this is for diybio?
23:59 🔗 kanzure_ balrog: yes, sort of
