#wikiteam 2013-10-21,Mon


Time Nickname Message
02:33 🔗 JRWR I would like to thank you all for your efforts, and Im here to make a proactive offer of an archive project, I run the webservers over at PCGamingWiki.com, And I would like to see it archived before anything were to ever happen to it [x-posted from #archiveteam]
04:00 🔗 JRWR Anyone around?
04:06 🔗 chfoo i'm around but i think everyone else is away
04:13 🔗 chfoo JRWR: have you taken a look at http://archiveteam.org/index.php?title=WikiTeam yet? for now, you can be proactive by making a dump and uploading to archive.org.
04:14 🔗 chfoo then ask for the item to be moved to under the wikiteam collection
04:17 🔗 chfoo this collection -> https://archive.org/details/wikiteam
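[A sketch of chfoo's first step, making a dump with WikiTeam's dumpgenerator.py: `--xml` grabs full page histories and `--images` grabs every uploaded file. The api.php URL is PCGamingWiki's; the printed command is meant to be run from a WikiTeam checkout.]

```python
# Build the WikiTeam dump command for one MediaWiki site.
cmd = [
    "python", "dumpgenerator.py",
    "--api=http://pcgamingwiki.com/w/api.php",
    "--xml",     # complete XML history of every page
    "--images",  # all uploaded files plus their description pages
]
print(" ".join(cmd))
```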
04:24 🔗 JRWR chfoo Now uploading dump of PCGamingWiki
04:25 🔗 JRWR I wonder if there is a good way to automate the same 20 wikis we own :)
04:26 🔗 chfoo how are you uploading? this might be useful: there's this https://pypi.python.org/pypi/internetarchive
04:26 🔗 JRWR ftp
04:27 🔗 JRWR just 7zip'd the images
04:37 🔗 JRWR wow thats taking forever, but at one point, I would like to do monthly backups for you guys, 200k pages strong across 20+ domains
04:41 🔗 chfoo yeah, i think using the s3-like interface would be the best way to go in the future but someone should clarify
04:42 🔗 chfoo also, im not sure if the ftp option already asked you, but if it requires a parent collection, just select texts (or any other public ones) for now
04:44 🔗 JRWR I think I broke it
04:44 🔗 JRWR yep, I broke it
04:45 🔗 chfoo what happened?
04:46 🔗 JRWR I uploaded the files to the FTP and I tried to claim the files, and it timed out, wont let me make it again, and doesnt show in my uploaded list
04:48 🔗 chfoo that doesn't sound good.
04:53 🔗 chfoo i can't really offer more suggestions other than try again, or use the web interface or the s3 interface
05:04 🔗 JRWR Im trying the S3 method
05:12 🔗 chfoo if it doesn't work, i hope it doesn't sour your experience. someone should be back later.
05:22 🔗 JRWR hrm, chfoo what would be a good example of the IA upload command
05:23 🔗 chfoo JRWR: i've only had experience using scripts for something else but this script uses curl: https://github.com/ArchiveTeam/archiveteam-megawarc-factory/blob/master/upload-one#L74
05:24 🔗 chfoo the full doc is at https://archive.org/help/abouts3.txt
05:26 🔗 chfoo if you are using curl or wget, its probably best to use the most minimal headers and then add the metadata later
05:28 🔗 chfoo if you are wondering about the minimal fields, you can use the web interface and select a dummy file to upload, then the list of options will appear that need to be filled in
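[A minimal sketch of the S3-style upload chfoo points at, following the headers documented in archive.org/help/abouts3.txt and the advice to keep metadata minimal: `x-amz-auto-make-bucket` creates the item on first PUT, and `texts` is the stopgap collection/mediatype chfoo suggested until an admin moves the item under wikiteam. The item identifier, filename, and keys here are placeholders; the script only prints the curl command.]

```python
item = "pcgamingwiki_com-20131021"            # hypothetical item identifier
filename = "pcgamingwiki_com-history.xml.7z"  # hypothetical dump archive
access_key, secret_key = "IA_ACCESS", "IA_SECRET"  # from archive.org/account/s3.php

headers = {
    "authorization": f"LOW {access_key}:{secret_key}",
    "x-amz-auto-make-bucket": "1",            # create the item on first PUT
    "x-archive-meta-mediatype": "texts",      # chfoo: "just select texts ... for now"
    "x-archive-meta-collection": "opensource",
}
url = f"http://s3.us.archive.org/{item}/{filename}"

cmd = ["curl", "--location", "--upload-file", filename]
for key, value in headers.items():
    cmd += ["--header", f"{key}:{value}"]
cmd.append(url)
print(" ".join(cmd))
```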
05:43 🔗 chfoo i have to sleep now, but hopefully you'll get it working
05:58 🔗 JRWR even tho I have shell, the dump scripts seem to do the job better
06:33 🔗 CHANFIX2 1 client should have been opped.
06:40 🔗 JRWR poor Citadel
13:50 🔗 JRWR Anyone Home?
13:56 🔗 ersi Sure..
14:07 🔗 ersi also, I don't think it's a wise idea to run trivial scripts as root :)
14:08 🔗 Nemo_bis JRWR: I just added some docs to https://code.google.com/p/wikiteam/wiki/NewTutorial
14:09 🔗 Nemo_bis JRWR: launcher.py and uploader.py are rather silly, they want URLs and archives to be as they expect or they just sulk
14:10 🔗 JRWR hrm
14:10 🔗 JRWR well I used launcher
14:10 🔗 JRWR it worked well
14:10 🔗 JRWR no it didnt... no 7zs
14:10 🔗 JRWR odd
14:11 🔗 Nemo_bis ouch, gotta check that
14:11 🔗 JRWR Fuck it, can I get a example ia upload command so I can do this right by hand
14:11 🔗 Nemo_bis you can just do 7z a $dir.7z $dir
14:11 🔗 Nemo_bis then run the uploader.py again
14:12 🔗 Nemo_bis (for each of those wikis)
14:12 🔗 JRWR oh my
14:12 🔗 JRWR its about 5GB in total
14:15 🔗 Nemo_bis JRWR: of images or XML?
14:16 🔗 Nemo_bis If you have limited bandwidth, feel free to use 7z -mx=9 instead, you'll easily reduce those to few MB if it's XML
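[Nemo_bis's by-hand recipe as a loop: pack each dump directory with `7z a $dir.7z $dir` (adding `-mx=9` to squeeze XML down to a few MB on slow uplinks), then run WikiTeam's uploader.py again over the finished archives. The directory name below is a hypothetical example of one wiki's dump; the script only prints the commands to run.]

```python
dump_dirs = ["pcgamingwiki_com-20131021-wikidump"]  # one entry per wiki

# "7z a $dir.7z $dir", with maximum compression for the XML
commands = [f"7z a -mx=9 {d}.7z {d}" for d in dump_dirs]
commands.append("python uploader.py")  # then re-run the uploader, per Nemo_bis
for c in commands:
    print(c)
```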
