#archiveteam 2012-12-20,Thu


Time Nickname Message
09:04 🔗 hiker3 Are there examples on how to do a proper mirror of a website to a warc?
09:39 🔗 SketchCow On the wget page of archiveteam.org.
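Editor's note: for readers without the wiki page at hand, below is a minimal sketch of what a wget mirror that also writes a WARC might look like. It is not taken from the archiveteam.org wget page. The helper name `build_wget_warc_command`, the example URL, the WARC name, and the one-second wait are all illustrative assumptions; the wget options themselves (`--mirror`, `--page-requisites`, `--wait`, `--warc-file`, `--warc-cdx`) are standard in wget 1.14 and later.

```python
import shlex


def build_wget_warc_command(url, warc_name, wait_seconds=1):
    """Return an argv list for a wget mirror that also records a WARC.

    Hypothetical helper for illustration only; it builds the command
    but does not run it.
    """
    return [
        "wget",
        "--mirror",                   # recursive download with timestamping
        "--page-requisites",          # also fetch images, CSS, and scripts
        f"--wait={wait_seconds}",     # pause between requests to be polite
        f"--warc-file={warc_name}",   # wget appends .warc.gz to this name
        "--warc-cdx",                 # also write a CDX index file
        url,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before running it in a shell.
    command = build_wget_warc_command("http://example.com/", "example.com-mirror")
    print(shlex.join(command))
```

The list can be passed to `subprocess.run` once the options suit the site being mirrored; the wiki page SketchCow mentions remains the better source for recommended settings.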
10:52 🔗 Nemo_bis So, to convert those ugly cbr files I ended up using KRename and PeaZip.
13:44 🔗 SketchCow Why did you rename them?
13:44 🔗 SketchCow archive can take .cbr and .cbz now
13:51 🔗 Nemo_bis SketchCow: because I had no idea.
13:51 🔗 Nemo_bis SketchCow: do _images.cbr work too? The content is not tidy at all.
13:57 🔗 Nemo_bis Ah, the _images is not even needed. http://blog.archive.org/2012/05/24/uploading-images-for-text-items/
13:57 🔗 Nemo_bis Sigh, I had even read that post I think. :/
14:04 🔗 SketchCow Yes
14:04 🔗 SketchCow I had them put in cbr and cbz support to save time
14:04 🔗 SketchCow The system unpacks them and makes it work
14:07 🔗 SketchCow Also, you're likely doing some magazines double.
14:07 🔗 SketchCow It's easier in the future just to shove these files at me.
14:09 🔗 SketchCow You got the attention of one of the developers from your uploading efforts, so congrats on that.
14:10 🔗 SketchCow http://archive.org/details/your-computer-magazine
14:10 🔗 SketchCow For example.
14:11 🔗 Nemo_bis I did check them before downloading.
14:11 🔗 Nemo_bis Let's see what happened in this case.
14:14 🔗 Nemo_bis Hm no idea.
14:15 🔗 Nemo_bis Most of the work is metadata, not downloading; so if you just want me to send you files you'd better directly download them. :)
14:15 🔗 Nemo_bis There's a list at http://archiveteam.org/index.php?title=Magazines_and_journals in case you lack ideas. :p
14:16 🔗 Nemo_bis I'm very sorry for YourComputer. :/
14:22 🔗 Nemo_bis SketchCow: I stopped the current upload of YourComputer but I can't stop derives. There shouldn't be many more duplicates, except among darkened stuff which I can't search.
14:38 🔗 Nemo_bis A couple collections I'm really eager to upload are "Meccano magazine" (1916-1981) and "Zzap!" (Italian version). :)
14:42 🔗 SketchCow Yes, those are much more relevant.
14:43 🔗 SketchCow by the way: the amount I like that we have piratebay links up on archiveteam.org: zero
14:45 🔗 Nemo_bis Better than negative?
14:45 🔗 Nemo_bis Linking is not illegal
14:46 🔗 Nemo_bis SketchCow: how many issues do you have in https://archive.org/details/microhobby-magazine ?
14:47 🔗 * Nemo_bis trying to check better for duplicates now, not only search.
14:48 🔗 SketchCow Yes, thank you. "Linking is not illegal"
14:49 🔗 SketchCow I cruise the pirate bay and a bunch of other scanning efforts that are public and I put them into archive.org constantly.
14:49 🔗 SketchCow So that action is pretty redundant.
14:50 🔗 Nemo_bis Which action?
14:50 🔗 SmileyG download ALL the torrents.
14:50 🔗 SketchCow I'd like to see us come up with a system for submitting metadata improvements to collections on archive.org. The culture of the place (and logic) dictate a wikipedia-like maintenance of the metadata will never come forward, but coming up with a way to submit metadata so I can manually shove it in would be very helpful.
14:52 🔗 Nemo_bis That would surely be wonderful.
14:53 🔗 godane SketchCow: thanks for the attack of the show collection
14:54 🔗 SketchCow How the hell are you staying on top of me doing that?
14:54 🔗 SketchCow It doesn't notify you, does it?
14:54 🔗 Nemo_bis https://archive.org/search.php?query=jscott&sort=-publicdate ?
14:56 🔗 SketchCow https://archive.org/details/railwaymodeller
14:56 🔗 SketchCow Yes, I realize it's not hard to FIND OUT what I do - it's more a question of how fast one would track what I'm doing.
14:56 🔗 godane i have digit magazine
14:57 🔗 godane the items would have to be called something like digit-india-magazine-v#i#
14:57 🔗 godane also not all have covers
14:58 🔗 Nemo_bis SketchCow: it depends on how often one presses F5 I guess. :D
15:00 🔗 godane i have a lot of waiting tasks just for editing meta data
15:03 🔗 Nemo_bis Anyway SketchCow, most of what I uploaded is from a private Italian tracker, in general I agree that it's nothing special but it doesn't harm uploading some stuff I bump into while looking for other things.
15:03 🔗 Nemo_bis (It doesn't harm unless I upload duplicates of course. :"( )
15:03 🔗 SketchCow yeah, and that's fine. I'm just saying that riding pirate bay is not the best way to go - I'm already doing that as part of my paid-for job
15:04 🔗 Nemo_bis SketchCow: that's why I put the list on the wiki, of course if you manage to do that stuff directly (or decide that it shouldn't be done) it will be much better (faster and better done).
15:05 🔗 SketchCow Yes, but I'm saying the entire "pull items from TPB" action isn't necessary to track.
15:05 🔗 Nemo_bis I was mostly looking for Italian stuff to avoid spending hundreds of euros on buying magazines which someone already put on torrents. :p
15:05 🔗 SketchCow Statistically, I will go through all of the magazine and honestly even document and large-size torrents
15:05 🔗 Nemo_bis What do you mean, aren't todos useful?
15:06 🔗 SketchCow todos are useful in a roundabout sense
15:06 🔗 Nemo_bis uh?
15:06 🔗 SketchCow To-dos are useful when you are working on a set of items and want to close it down, or have a set of items multiple people are handling.
15:07 🔗 SketchCow And under the "pull items from The Pirate Bay to potentially put on archive.org", I have that one handled.
15:07 🔗 SketchCow "Pull items from private italian trackers", obviously you have that handled.
15:08 🔗 Nemo_bis Some (admittely few) of those items weren't so trivial to find, so you might have missed them.
15:08 🔗 Nemo_bis *admittedly
15:08 🔗 SketchCow Weren't so trivial to find.... on the pirate bay?
15:08 🔗 SketchCow Jesus, someone uploaded 1,441 newspapers. http://archive.org/details/narberthcivicassociation
15:09 🔗 Nemo_bis Meaning with very obscure name, no description, foreign language and so on
15:09 🔗 SmileyG https://archive.org/details/RedfishMagazine << on the right, does it always just list the file type, or can I have the names appear (i.e. I'm doing something wrong?)
15:10 🔗 SketchCow Let's bring this whole thing to #internetarchive
15:10 🔗 Nemo_bis But yes, for TPB most I can do is saving you some boring clicks/browsing, which is not bad though if I find something interesting?
15:11 🔗 SketchCow I am not indicating how strongly I do not like The Pirate Bay links on archiveteam.org
15:11 🔗 Nemo_bis Should those be replaced with titles without links?
15:12 🔗 Nemo_bis Sent to you as suggestion by email so that you can do when you have time/if they're worth it?
15:12 🔗 SketchCow I am always up for working with someone on suggested projects.
15:12 🔗 SketchCow if you find what you think are hidden gems, I always appreciate a mail. I get those a lot.
15:12 🔗 SketchCow I can absorb a torrent faster than nearly anyone, and have scripts to inject those items into archive.org very, very fast.
15:13 🔗 SketchCow metadata's a separate issue - I have a goal to make it a collaborative software issue.
15:13 🔗 SketchCow This is, again, us not in #internetarchive
15:14 🔗 SketchCow SmileyG: Please get over there too.
15:27 🔗 SketchCow http://www.flickr.com/photos/textfiles/sets/72157632295912594/with/8290464381/
15:27 🔗 SketchCow By the way.
15:29 🔗 Nemo_bis SketchCow: all the way by truck?
15:30 🔗 SketchCow No, halfway we switched to a food cart
15:30 🔗 Nemo_bis Ah, good, they're much faster than trains.
15:36 🔗 SmileyG 788MPH?!
15:36 🔗 * SmileyG shuts up now
