[20:45] http://www.openculture.com/2014/04/free-british-pathe-puts-over-85000-historical-films-on-youtube.html
[20:46] actually I guess they've been there for two years now
[20:51] Yeah, somehow this happened
[20:57] but I see no licenses
[21:17] now 9 years since youtube started
[21:49] http://www.theverge.com/2014/4/17/5624890/dropbox-acquires-loom
[22:11] https://archive.org/details/generalmanual_000000025
[22:11] Threw one of godane's manuals up.
[22:11] Did a little metadatering.
[22:19] if we are just going to name them generalmanual_$number then i could do that
[22:19] Just upload them. I'm going to end up doing stuff regardless.
[22:19] I'll be doing a duplication kill, for example.
[22:20] also i planed on putting making the file name and item name the same so image in search shows up
[22:23] Of the files you've uploaded so far, 2518 are unique and 794 are dupes.
[22:23] by the way
[22:24] oh
[22:24] It's trivial to find dupes, so it's no big deal at all.
[22:26] if you can give me the script i could just remove the dupes then
[22:31] fdupes is beautiful, heh
[22:31] but i bet this works on the IA's filesystem?
[22:33] http://superuser.com/questions/386199/how-to-remove-duplicated-files-in-a-directory godane
[22:33] I used the bash 4.x one
[22:33] Obviously you change echo "rm $file"
[22:33] I actually made a directory called DUPES
[22:34] and made that line into -> mv $file DUPES
[22:36] hmmm yikes
[22:36] http://freecode.com/projects/fdupes
[22:39] Both work for this.
[22:43] nice http://www.pietrobattiston.it/ddupes
[22:43] nod.
[22:57] http://samsungdesign.tumblr.com/
[22:57] Now that is the single funniest thing I will see today
[23:00] lol
[23:21] hahah
[23:21] https://twitter.com/kirk_hadley/status/456934723802398720
[23:34] i may just upload the pdfs directly to IA
[23:35] Why not FTP them to me?
[23:35] cause your using the name generalmanual_000000025
[23:35] is that just some text item or are they all going be named that
[23:35] *like that
[23:35] Because I was doing a test item?
[23:36] Godane, I am completely capable of making intelligent decisions.
[23:36] I chose a name that wouldn't cause later issues, on a single object, to see:
[23:36] i know
[23:36] 1. If we could derive from it
[23:36] 2. How the system would interpret it
[23:36] 3. How the OCR would handle it.
[23:37] i figured it was a test item but was not sure
[23:37] But I need to know what your plan is if you're going to do something, because 18,000 items are a lot.
[23:38] i still have no idea other then some thing like generalmanual_000000025
[23:40] my only other problem is that i noticed that other pdf magazines have spaces in the filenames
[23:42] i only complaining cause stuff like vacuumtubemanuals could have had the gif images in search
[23:42] thats all
[23:44] honestly making it use the first gif if there isn't one matching the filename would be a trivial fix for whoever has access to the ia frontend code
[23:46] i know but its my thing with some stuff
[23:48] SketchCow: anyways i'm just going to continue to upload to you
[23:50] i'm finding very odd stuff like Smart-Pot Programmable Crock Pot Owner's Guide next to Toshiba Combination Flat Color TV and VCR/DVD Player manuals
[23:50] I guess it's just anything sold by amazon that had a manual associated
[23:51] thats my guess too
[23:51] I am sure of it.
[23:51] The GIF images in search thing....
[23:51] How do I put this.
[23:51] You are rearranging the sea shells to make a pretty little sand castle
[23:51] and there is a tsunami coming
[23:52] The Internet Archive experience and site will be ... a lot different this year and into next.
[23:52] I've seen the mockups and I'm involved
[23:52] It's going to be notable
[23:52] ok
[23:52] also i found another manual site
[23:53] there are tons of pdf manuals in content.etilize.com/user-manual/ path
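
Editor's note on the duplicate check discussed between 22:23 and 22:34: the log mentions a bash 4.x script from the linked superuser answer, changed so that duplicates are moved into a DUPES directory instead of being removed. That script is not reproduced in the log, so what follows is only a rough Python sketch of the same idea, not the script that was actually run. The function names, the choice of SHA-256, and the flat-directory assumption are all illustrative.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def move_duplicates(directory, dupes_name="DUPES"):
    """Move byte-identical duplicates in a flat directory into a subdirectory.

    Hypothetical helper: files are visited in sorted name order, the first
    file with a given digest is kept in place, and later files with the same
    digest are moved into `dupes_name`. Returns the names of moved files.
    """
    root = Path(directory)
    dupes_dir = root / dupes_name
    seen = {}   # digest -> name of the first file seen with that content
    moved = []
    for path in sorted(p for p in root.iterdir() if p.is_file()):
        digest = file_digest(path)
        if digest in seen:
            dupes_dir.mkdir(exist_ok=True)
            shutil.move(str(path), str(dupes_dir / path.name))
            moved.append(path.name)
        else:
            seen[digest] = path.name
    return moved
```

Moving rather than deleting, as described at 22:33 and 22:34, keeps the operation reversible until the DUPES directory has been checked by hand. The sketch only looks at the top level of one directory and assumes DUPES does not already hold files with the same names.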