#wikiteam 2012-04-11,Wed

Time Nickname Message
12:49 🔗 ersi bleh, totally need an easy way to add a new wiki
12:49 🔗 ersi editing the wiki sucks.. ironically :)
12:49 🔗 ersi I'll put this here and hope I'll remember it for later http://www.tuhs.org/wiki/The_Unix_Heritage_Society
13:44 🔗 Nemo_bis ersi, add a new wiki where?
13:46 🔗 ersi to some kind of list
13:46 🔗 ersi or make it known to 'the team'
14:26 🔗 Nemo_bis tsk -NC
14:26 🔗 Nemo_bis meh from within function "SiteStatsUpdate::cacheUpdate". MySQL returned error "1054: Unknown column 'ss_active_users' in 'field list' (localhost)".
14:27 🔗 Nemo_bis 36 pages
14:33 🔗 Nemo_bis ersi, downloading
14:37 🔗 ersi neat
14:40 🔗 Nemo_bis ersi, done
14:41 🔗 Nemo_bis now let's wait for underscor to produce the script for archive.org upload and it will get into the bunch
15:05 🔗 emijrp lol, i didn't expect these file sizes http://airto.hosted.ats.ucla.edu/wiki/index.php?title=Special:ListFiles&sort=img_size&limit=50&desc=1
15:11 🔗 Nemo_bis I have a wiki with 6 GB of images
15:11 🔗 Nemo_bis 467 wikis downloaded btw
15:12 🔗 emijrp you want a place in the hall of hardcore wiki archivists, uh?
15:13 🔗 Nemo_bis nah
15:13 🔗 Nemo_bis I only want to steal my ISP all the bandwidth I can.
15:13 🔗 Nemo_bis Upload bandwidth is easy with p2p, downloading constantly quite hard.
15:14 🔗 emijrp if you dont pay your bill, you are dividing by zero, so you get the optimus stolen bandwidth
15:14 🔗 emijrp INFINITE.
15:15 🔗 Nemo_bis heh
15:15 🔗 Nemo_bis that's my uny's bandwidth
15:15 🔗 Nemo_bis *uni
15:15 🔗 Nemo_bis btw it's a bit silly to 7z 6 GB of images
15:16 🔗 Nemo_bis or even worse a collection of thousands of PDF (of dubious copyright status I'd say)
15:18 🔗 emijrp Did you hear that Internet Archive crawls the entire web?
15:19 🔗 Nemo_bis emijrp, yeah, in fact I was saying that those should be made better available, maybe in a directory, and then happily derived too
15:19 🔗 Nemo_bis emijrp, I can safely assume that a 32 B 7z has something wrong, delete it and rerun the dump?
15:19 🔗 Nemo_bis Can't I.
15:20 🔗 emijrp just remove the 7z
15:21 🔗 Nemo_bis yep
15:21 🔗 Nemo_bis emijrp, is 1.2 KiB reasonable? let's check
15:21 🔗 emijrp maybe a wiki with a wrong api
15:21 🔗 emijrp or empty
15:23 🔗 Nemo_bis 70 pages and no xml
15:23 🔗 Nemo_bis 34 lines of 2012-04-10 09:04:48: Error while retrieving the full history of "Main_Page". Trying to save only the last revision for this page
15:25 🔗 emijrp special:export issues
15:25 🔗 emijrp shit happens
15:27 🔗 Nemo_bis emijrp, what should we do then?
15:28 🔗 emijrp obviously, a tiny percent of wikis will fail
15:28 🔗 emijrp dont care
15:28 🔗 Nemo_bis Perhaps we need a script to check that there's something within the 7z I upload. Or just upload everything, even a list of titles is useful.
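
A check like the one Nemo_bis asks for could be sketched roughly as follows. This is only an illustration, not part of the WikiTeam scripts: the function name `suspicious_archives` and the 2048-byte threshold are assumptions chosen to catch cases like the 32 B and 1.2 KiB archives mentioned above. It does not open the archive; it only looks at the file size and the 7z signature bytes.

```python
import os

# First six bytes of every 7z file (the format's signature).
SEVENZ_MAGIC = b"7z\xbc\xaf\x27\x1c"

# Assumed threshold, not from the log: anything smaller is flagged for review.
MIN_BYTES = 2048


def suspicious_archives(directory, min_bytes=MIN_BYTES):
    """Return (name, reason) pairs for .7z files that look empty or broken."""
    flagged = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".7z"):
            continue
        path = os.path.join(directory, name)
        size = os.path.getsize(path)
        with open(path, "rb") as handle:
            magic = handle.read(len(SEVENZ_MAGIC))
        if magic != SEVENZ_MAGIC:
            flagged.append((name, "not a 7z file"))
        elif size < min_bytes:
            flagged.append((name, "only %d bytes" % size))
    return flagged
```

A flagged archive is not necessarily useless (even a list of titles has value, as noted above), so the result is better treated as a list to look at than a list to delete.
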
16:44 🔗 ersi Why the frack are people 7zipping a bunch of images?
18:08 🔗 Nemo_bis ersi, because we're 7z everything together. Very good for xml, less useful for images.
18:09 🔗 ersi Yeah.. but.. you know.. why
18:10 🔗 Nemo_bis ersi, because 7z'ing the xml etc. and then tar'ing the 7z archive with the image directory is more code to write?
18:10 🔗 Nemo_bis Dunno, ask emijrp. :D
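
For what it's worth, the tar step Nemo_bis describes is short with Python's standard `tarfile` module. A minimal sketch, assuming the XML has already been compressed to a .7z by some other step; the function name `bundle` and the `images` entry name are hypothetical and not taken from the WikiTeam code.

```python
import os
import tarfile


def bundle(archive_7z, images_dir, out_tar):
    """Store an existing .7z and an image directory in one uncompressed tar."""
    # Mode "w" writes a plain tar with no compression, so already-compressed
    # images are stored as they are rather than compressed a second time.
    with tarfile.open(out_tar, "w") as tar:
        tar.add(archive_7z, arcname=os.path.basename(archive_7z))
        tar.add(images_dir, arcname="images")  # directories are added recursively
    return out_tar
```

The trade-off is one more container format for downloaders to unpack, in exchange for not spending CPU time compressing files that barely shrink.
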
18:44 🔗 emijrp from a fast count, about 10 wikis die every day around the web
18:45 🔗 emijrp 13,000 died since 2009, (Andrew Pavlo list)
18:58 🔗 Nemo_bis well, maybe they were moved and we don't know where
18:58 🔗 Nemo_bis we should rerun the crawler to know
19:07 🔗 Nemo_bis hm, some python process consuming 2+ GiB of memory
20:41 🔗 ersi 'the crawler'? Which one?
20:46 🔗 emijrp http://www.cs.brown.edu/~pavlo/mediawiki/
20:46 🔗 ersi ah, alright
20:47 🔗 emijrp what do you think about archiving all the pages that Wikipedias delete?
20:47 🔗 emijrp There is a DeletionPedia for en: but it is inactive, and two German DeletionPedias that look active.
20:48 🔗 ersi it's a really good idea
20:49 🔗 underscor how would you go about that?
20:49 🔗 underscor grab anything that has a rfd?
20:49 🔗 emijrp yes..
20:50 🔗 emijrp There are speedy deletions, just crap, so it is deleted quickly.
20:50 🔗 emijrp And deletion discussions for low notable topics. These are deleted after a week or so.
20:51 🔗 emijrp just crap = test edits, ISFDOSJIFIOSDJOFJAPSF , spam, links, etc
20:51 🔗 underscor yeah
20:51 🔗 underscor but non-notable things I'd like to preserve
20:51 🔗 Nemo_bis steal a staff account password and screenscrape the deletion archive on all wikis
20:51 🔗 underscor hahahaha
20:51 🔗 underscor ransom ariel
20:51 🔗 Nemo_bis or ask kindly to someone with shell access
20:51 🔗 underscor you know, the biggest thing to preserve, imo, are deleted media
20:52 🔗 underscor but I guess most of that is deleted for a reason?
20:52 🔗 emijrp copyvio
20:52 🔗 underscor yeah
20:52 🔗 emijrp and out of educational scope
20:52 🔗 underscor or pron
20:53 🔗 emijrp the only problem is pages that attack people
20:53 🔗 emijrp if we archive all, we are going to archive those dangerous pages
20:53 🔗 ersi well, I'd rather have us 'dark' out such items
20:53 🔗 ersi and still keep it
20:53 🔗 emijrp it is the worst problem DeletionPedias face
20:54 🔗 Nemo_bis well, the worst are supposedly oversighted
20:54 🔗 Nemo_bis so if you browse the standard deletion archive you won't find them
20:54 🔗 emijrp http://wikiindex.org/Deletionpedia and See also
20:55 🔗 emijrp i have a copy of deletionpedia
20:55 🔗 emijrp downloading pluspedia
20:57 🔗 emijrp marjorie-wiki fails
20:58 🔗 Nemo_bis emijrp, I've already downloaded deletionpedia, you know
20:58 🔗 Nemo_bis It's horribly slow IIRC
20:58 🔗 emijrp cool
20:58 🔗 Nemo_bis of course redownloading doesn't har,
20:58 🔗 Nemo_bis *harm
20:58 🔗 emijrp yes, slow and buggy, i had to repair my dump to exclude some broken <page> tags
20:59 🔗 Nemo_bis yeah
20:59 🔗 Nemo_bis but where is my dump
20:59 🔗 emijrp in a CD in a box under my bed
20:59 🔗 emijrp lul
20:59 🔗 Nemo_bis that sounds plausible
21:01 🔗 Nemo_bis http://archive.org/details/wiki-deletionpedia.dbatley.com
21:01 🔗 Nemo_bis took 3 months
21:02 🔗 emijrp mine was faster
21:05 🔗 emijrp but histories contain impossible dates, i remember 2012 dates in 2011
21:05 🔗 emijrp i dont know why
21:06 🔗 Nemo_bis hm
21:06 🔗 Nemo_bis Probably a broken import/upload script?
21:07 🔗 Nemo_bis I suppose they alter history a bit and something goes wrong sometimes.
21:07 🔗 Nemo_bis emijrp, wanna upload your version to that item?
21:08 🔗 emijrp it is the same probably, DeletionPedia has not uploaded new pages since 2010 or so
21:10 🔗 Nemo_bis even better then, our dumps will be broken in different ways :)
21:10 🔗 Nemo_bis but broken versions of the same thing
21:15 🔗 emijrp with the sidebar trick, the documentation is now much better http://code.google.com/p/wikiteam/wiki/AvailableBackups
21:16 🔗 Nemo_bis Yes.
21:16 🔗 Nemo_bis I love how Google Code managed to make wiki syntax even more complex. :p
21:27 🔗 emijrp seeya
