#archiveteam 2014-04-17,Thu


Time Nickname Message
20:45 🔗 ivan` http://www.openculture.com/2014/04/free-british-pathe-puts-over-85000-historical-films-on-youtube.html
20:46 🔗 ivan` actually I guess they've been there for two years now
20:51 🔗 SketchCow Yeah, somehow this happened
20:57 🔗 Nemo_bis but I see no licenses
21:17 🔗 Atluxity now 9 years since youtube started
21:49 🔗 danneh_ http://www.theverge.com/2014/4/17/5624890/dropbox-acquires-loom
22:11 🔗 SketchCow https://archive.org/details/generalmanual_000000025
22:11 🔗 SketchCow Threw one of godane's manuals up.
22:11 🔗 SketchCow Did a little metadatering.
22:19 🔗 godane if we are just going to name them generalmanual_$number then i could do that
22:19 🔗 SketchCow Just upload them. I'm going to end up doing stuff regardless.
22:19 🔗 SketchCow I'll be doing a duplication kill, for example.
22:20 🔗 godane also i planned on making the file name and item name the same so the image shows up in search
22:23 🔗 SketchCow Of the files you've uploaded so far, 2518 are unique and 794 are dupes.
22:23 🔗 SketchCow by the way
22:24 🔗 godane oh
22:24 🔗 SketchCow It's trivial to find dupes, so it's no big deal at all.
22:26 🔗 godane if you can give me the script i could just remove the dupes then
22:31 🔗 SmileyG fdupes is beautiful, heh
22:31 🔗 SmileyG but i bet this works on the IA's filesystem?
22:33 🔗 SketchCow http://superuser.com/questions/386199/how-to-remove-duplicated-files-in-a-directory godane
22:33 🔗 SketchCow I used the bash 4.x one
22:33 🔗 SketchCow Obviously you change echo "rm $file"
22:33 🔗 SketchCow I actually made a directory called DUPES
22:34 🔗 SketchCow and made that line into -> mv $file DUPES
22:36 🔗 SmileyG hmmm yikes
22:36 🔗 SmileyG http://freecode.com/projects/fdupes
22:39 🔗 SketchCow Both work for this.
22:43 🔗 schbirid nice http://www.pietrobattiston.it/ddupes
22:43 🔗 SmileyG nod.
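
[Editor's sketch] The duplicate sweep discussed above (find files with identical content, then move the extras into a DUPES directory instead of deleting them) could look roughly like the following. This is not the bash 4.x script from the superuser.com answer and not fdupes; it is a Python stand-in for the same idea. The function names, the choice of SHA-256, and the rule that the first file in sorted name order is the one kept are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def move_dupes(directory: Path, dupes_dir_name: str = "DUPES") -> list[Path]:
    """Move files whose content was already seen into a DUPES subdirectory.

    Only regular files directly inside `directory` are examined. They are
    visited in sorted name order, so the first name in each group of
    identical files stays put (an assumption, not something the chat
    specifies). Returns the original paths of the files that were moved.
    """
    dupes_dir = directory / dupes_dir_name
    seen: dict[str, Path] = {}   # digest -> first file with that content
    moved: list[Path] = []
    for path in sorted(p for p in directory.iterdir() if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            dupes_dir.mkdir(exist_ok=True)
            shutil.move(str(path), str(dupes_dir / path.name))
            moved.append(path)
        else:
            seen[digest] = path
    return moved
```

Calling move_dupes(Path("manuals")) on a hypothetical "manuals" directory would leave one copy of each distinct file in place and collect the rest under manuals/DUPES, where they can be reviewed before anything is removed.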
22:57 🔗 SketchCow http://samsungdesign.tumblr.com/
22:57 🔗 SketchCow Now that is the single funniest thing I will see today
23:00 🔗 SmileyG lol
23:21 🔗 SketchCow hahah
23:21 🔗 SketchCow https://twitter.com/kirk_hadley/status/456934723802398720
23:34 🔗 godane i may just upload the pdfs directly to IA
23:35 🔗 SketchCow Why not FTP them to me?
23:35 🔗 godane cause you're using the name generalmanual_000000025
23:35 🔗 godane is that just some test item or are they all going to be named that
23:35 🔗 godane *like that
23:35 🔗 SketchCow Because I was doing a test item?
23:36 🔗 SketchCow Godane, I am completely capable of making intelligent decisions.
23:36 🔗 SketchCow I chose a name that wouldn't cause later issues, on a single object, to see:
23:36 🔗 godane i know
23:36 🔗 SketchCow 1. If we could derive from it
23:36 🔗 SketchCow 2. How the system would interpret it
23:36 🔗 SketchCow 3. How the OCR would handle it.
23:37 🔗 godane i figured it was a test item but was not sure
23:37 🔗 SketchCow But I need to know what your plan is if you're going to do something, because 18,000 items are a lot.
23:38 🔗 godane i still have no idea other than something like generalmanual_000000025
23:40 🔗 godane my only other problem is that i noticed that other pdf magazines have spaces in the filenames
23:42 🔗 godane i'm only complaining cause stuff like vacuumtubemanuals could have had the gif images in search
23:42 🔗 godane that's all
23:44 🔗 DFJustin honestly making it use the first gif if there isn't one matching the filename would be a trivial fix for whoever has access to the ia frontend code
23:46 🔗 godane i know but it's my thing with some stuff
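
[Editor's sketch] The fallback DFJustin describes (use the GIF whose name matches, otherwise the first GIF in the item) can be written as a small selection function. This is purely illustrative: it is not the Internet Archive's frontend code, the function name is hypothetical, and "matching" is assumed here to mean that the GIF's base name equals the item name.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def pick_search_image(item_name: str, filenames: list[str]) -> str | None:
    """Pick the GIF to show in search results for an item (hypothetical rule).

    Prefer a GIF whose base name equals the item name; if none matches,
    fall back to the first GIF in the order the files are listed.
    Returns None when the item has no GIF at all.
    """
    gifs = [f for f in filenames if f.lower().endswith(".gif")]
    if not gifs:
        return None
    for f in gifs:
        if PurePosixPath(f).stem == item_name:
            return f
    return gifs[0]
```

With this rule, an item whose files are named differently from the item itself would still get an image in search, which is the case godane raises about vacuumtubemanuals.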
23:48 🔗 godane SketchCow: anyways i'm just going to continue to upload to you
23:50 🔗 godane i'm finding very odd stuff like Smart-Pot Programmable Crock Pot Owner's Guide next to Toshiba Combination Flat Color TV and VCR/DVD Player manuals
23:50 🔗 DFJustin I guess it's just anything sold by amazon that had a manual associated
23:51 🔗 godane that's my guess too
23:51 🔗 SketchCow I am sure of it.
23:51 🔗 SketchCow The GIF images in search thing....
23:51 🔗 SketchCow How do I put this.
23:51 🔗 SketchCow You are rearranging the sea shells to make a pretty little sand castle
23:51 🔗 SketchCow and there is a tsunami coming
23:52 🔗 SketchCow The Internet Archive experience and site will be ... a lot different this year and into next.
23:52 🔗 SketchCow I've seen the mockups and I'm involved
23:52 🔗 SketchCow It's going to be notable
23:52 🔗 godane ok
23:52 🔗 godane also i found another manual site
23:53 🔗 godane there are tons of pdf manuals in the content.etilize.com/user-manual/ path
