[00:11] https://bugs.launchpad.net/calibre/+bug/885027
[00:16] huh, 7zip is unable to extract a 10GB .bz2 I created with lbzip2
[00:17] lbzip2 -t reports no errors
[00:21] 7zip 9.30 alpha as well
[00:21] (can't extract)
[00:24] I have yet to find a conformance test for building warc files. I find this troubling since there is no way to guarantee cross program support
[00:47] What's the most important project now? Yahoo Messages is done, isn't it?
[00:47] Marcelo, it depends on if you just want to run the warrior or are willing to do a little cli work
[00:48] I'm trying to run from Android :p
[00:49] I'll take the "little cli work".
[00:50] i found another kevin spacly interview
[00:50] you familiar with using wget?
[00:52] Well, I did use it sometimes.
[00:55] Well the two non-warrior projects currently are thephoenix which is backing up a newspaper that has been discontinued
[00:55] and ispygames which is a bunch of gaming sites being shutdown soon
[00:56] the current warrior projects are: formspring, posterous, and urlteam
[00:59] I just took the open solaris project off the current projects page
[01:00] The sites are down and all that is left is uploading what we got
[03:33] Has anyone mirrored crunchbase before?
[04:51] are there any efforts to archive formspring underway?
[04:52] i haven't much hard drive space free
[04:52] but it's all text so it shouldn't be too large
[04:54] oh, i just noticed we have a dedicated channel for it on the wiki
[04:54] disregard my question
[05:00] wow, you guys are really organized now
[05:00] far cry from when we were archiving geocities :)
[05:06] Yes, a few of us keep reminding people to update the wiki and people are.
[05:07] We just started this page in Feb and it is coming along nicely http://www.archiveteam.org/index.php?title=Clown_hosting
[05:59] glitch is still going at 122mb warc.gz
[06:50] just made a page for reddit on the wiki
[06:54] why? Reddit was already being tracked as part of the Fire Drill
[06:55] We have been tracking it since 2011
[06:58] I've noticed plenty of other pages such as Wikipedia and wordpress that currently seem fine
[06:58] IIRC, wikipedia's also on fire drill too
[07:00] and reddit was also in the navbox, but had no page for it
[07:00] Wiki pages for stuff that is not dying is just a waste. If you look through the wiki we have tons of stale pages that are not useful. We are working on pruning the wiki so only relevant pages exist
[07:00] archiveteam does not delete
[07:00] except spam
[07:01] by pruning I mean moving to a archive area so people know not to edit
[07:01] The last conversation about this was last week I think. We just do not have a good mechanism for bulk marking and moving pages
[07:05] real helpful, Windows, to restart while warrior is running :/
[07:07] and this is why automatic updates were a shit idea
[07:07] (in the sense that your computer could reboot without warning)
[07:07] I always have them set to notify but don't download
[07:11] Now uploading a metric assload of CD-ROMs
[07:11] I hadn´t realised Windows could be so impatient
[07:11] PC Plus CD seems to be the big winner for this round.
[07:33] SketchCow, link me once you're done
[11:46] * RachelCro http://getworld.uk.to -Online Multiplayer Game-
[11:50] ios actuvate,
[11:50] or even, Ops, activate!
[12:28] http://archiveteam.org/index.php?title=Geocities the external mirror link "http://www.geociti.es/" is broken/spam fwiw (I can't edit the page)
[12:28] errrr
[12:28] hmmm either hijacked or gone away
[12:29] removed the link for now
[12:49] "The Linux Game Tome, one of the most important websites related to video gaming in GNU/Linux, will shut down on the 13th of April, according to a news post published on the website.
[12:49] The maintainers of The Linux Game Tome will make available a dump of the games database, so that anyone interested can cook up a new and updated version of the website, and a worthwhile effort will be considered for a transfer of ownership of the domain.
[12:49] happypenguin.org however it's slashdotted atm
[12:51] I heard http://etherpad.wikimedia.org/ will be killed soon
[12:51] No announcement yet, probably will just vanish
[12:52] source?
[12:52] s3cret
[12:54] Damnit, I only just started using it as well.
[12:55] Like, 5 seconds ago.
[12:56] GLaDOS: that must be the reason then
[12:56] :c
[12:56] Just because I use 100 amazon instances to browse the web..
[13:00] so that's why I can't catch up
[13:14] http://www.codingninja.co.uk/just-say-no/
[13:18] heh
[13:19] wp494: may just be a request actually, real status probably being discussed elsewhere https://bugzilla.wikimedia.org/show_bug.cgi?id=45312
[13:36] Zombywuf: #archiveteam-bs for non archival stuff
[13:37] Gah, pasted that in the wrong window
[13:37] Sorry
[13:37] aight, just a general notice
[16:29] http://linux.slashdot.org/story/13/03/26/065223
[16:35] I will take care of Linux Game Tome once it comes back online.
[16:36] I am glad the site is going away. It was always out of date and whenever I emailed them info about a page nothing got updated
[16:37] If you are trying to promote linux gaming it is best to not look like a 1990s rejected website
[16:37] shit steam has done more for linux than this site ever did
[16:38] Yeah, I see they mention the 1999-era source code was never updated.
[16:39] I am all for promoting open source but people need to recognize the concept of a brand and that image plays an important part. Just because you can see past things like that does not mean the majority of the world can.
[16:41] Just look at all the promo shit SketchCow does for Archive Team.
[16:45] Back to archive news. glitch is still downloading
[16:45] They did not take the site down straight away so we will get more of it
[16:47] The warc is 177mb, bigger than the "backup" that already exists
[16:51] Glitch has announced they are NOT taking down the forums.
[16:51] So get the dupe, we'll put it up, and huzzah
[17:47] I have a compressed 5GB HTTrack of happypenguin.org from 2012-09-06
[17:47] (and yes I've started using warc for some things)
[17:48] ivan`, Care to share? :)
[17:51] omf_: sure, sec
[17:53] thanks
[18:02] omf_: https://ludios.org/tmp/happypenguin.org-and-dune2k.com.7z.torrent
[18:23] It was going now it looks down
[18:24] still going here, 27KB/s is up, just throttled by paranoid uTP
[18:35] Item tewbacca: Step 8 of 16 Tracker confirmed item 'tewbacca'.
[18:35] (as soon as a formspring item reaches completion, it suddenly consists of twice as many steps?) :)
[18:42] :D
[20:16] uh oh identi.ca may shut down? http://status.net/2013/03/26/no-more-new-registrations-on-identi-ca
[20:31] It looks like they are going through a major software update
[20:31] yeah
[20:40] Besides SketchCow who are the other wiki admins?
[20:44] omf_, http://archiveteam.org/index.php?title=Special:ListUsers&group=sysop
[20:44] SketchCow: Do you think you could give the usermerge right to sysops as well?
[20:48] okay. So we know we have stale pages that need to be archived. Should we just build up a list and have the pages locked? Is there a better solution?
[20:49] We could take those pages out of the wiki and just host them statically on github for the future. The only downside I see is we lose displaying the page history
[20:49] why do they need to be removed from the wiki?
[20:49] I'm afraid I don't understand the urgency
[20:50] put a completed banner at the top
[20:55] I am willing to bet there are only 15-20 wiki pages that are really relevant and maintained. How do having all these other stale pages around help
[20:56] How does having the wiki grow bigger make it easier to maintain
[20:56] they don't impose an extra burden
[20:56] next
[20:57] wrong.
[20:57] If someone spams up all the stale pages we keep for historical reasons who is going to clean that out
[20:57] how does having pages of old shit make search results more relevant
[20:58] what kind of hypocrite are you, to propose that archiveteam delete its own history?
[20:58] Did I say delete
[20:59] stop putting words in other peoples mouths. You have a tendency to do that
[20:59] omf_ | We could take those pages out of the wiki
[20:59] and did you read the part about backing up to github before that? Selective memory much
[21:00] The problem is called 'The Paradox of Choice' The more we flood the users with options, the less happy and more stressed they become. This is one of the cornerstones of what make a good search engine as well as a relevant website
[21:01] Is everything important on the same level?
[21:02] I disagree with your basic premise, arguing the details is purposeless
[21:03] I like how when you don't get your way you just become dismissive. Grow up. The wiki is not in the best shape as has been said by myself and others over and over and over again. Streamlining it which is what this process is called is a good thing
[21:04] yes, the wiki could use a lot of love
[21:04] removing content is not the way to do that
[21:05] You missed the point again.
[21:05] pray tell, what is the point
[21:06] The point once again, is that there is stale material most of which is historical. Should that be editable so people can spam it up? No. The previous conversations on this subject have talked about locking and moving content so it cannot be editing. No deletion and nothing is removed since it would still be linked to.
[21:07] I'd be happy adding a banner to the top of old projects and locking them, sure
[21:07] Is there a way with this software to make an index page off of that kind of banner?
[21:08] yes, you could make a category "completed projects" if one doesn't exist already
[21:09] http://www.archiveteam.org/index.php?title=Category:Rescued_Sites
[21:09] shocker of shockers
[21:09] but that doesn't cover the pages we have of just info like the httrack page
[21:10] Something more general like 'Historical Info' or something. I am terrible with names
[21:10] hm
[21:16] So we need an admin to create the category, label a few pages and see how it works out.
[21:16] if I'm not mistaken, anyone can create categories and label pages
[21:17] oh I did not know about the category thing
[21:17] https://www.mediawiki.org/wiki/Help:Categories
[21:17] yeah, mediawiki is pretty flexible
[21:19] mostly you only need admin access for things that aren't easy to revert
[21:19] The long term goal I see is people show up, we give them the software and info they need and they go off and save shit for us
[21:19] Yeah I have never had to admin a wiki so I know very little about what features are normal
[21:36] I couldn't get Special:AllPages to work on our wiki
[21:38] ?
[21:39] looks all ok
[21:39] what is the url you used?
[21:41] http://archiveteam.org/index.php?title=Special:AllPages
[21:41] "/Sites using MediaWiki (English)to Rudimentary Info On Getting That Toilet Corrected" :D
[21:52] meh, so much spam
[21:52] what's the purpose of deleting accounts when we don't even delete pages in main namespace :)
[21:52] yeah it is worse than I thought
[21:55] Is marking pages with http://archiveteam.org/index.php?title=Category:Deleteme the way to go for spam
[21:58] omf_: you might want to move them to a different namespace, i.e. rename them to Spam:Old_Page_Name or whatever
[21:58] I think that'd be easier to delete
[22:00] no
[22:00] you create a redirect and it's double work
[22:02] Nemo_bis, So are categories the way to go? {{delete}} for example
[22:04] if you're not sysop, yes
[22:04] I created that template IIRC
[22:06] ok cool
[22:16] I started labeling pages with that
[22:37] ye gods - I stop a project on one host so I can use a different one instead ... last night. It's still running.
[22:50] This debate is awesome.
[22:50] However, we're not deleting the old pages on the Wiki
[22:55] Even spam