[00:03] we've been through this once already, when Microsoft killed the XBLA on the Xbox. [00:03] it's just that no one cared about the seven or eight Bejeweled clones and casino games it hosted. [00:10] acridAxid: don't throw me in a pond, i'm not a witch, but i predict that it's going to happen again in a few years for the 360 [00:11] i've been thinking that for a while, too [00:11] it has a much, much wider install base and there are many beloved games on XBLA [00:12] or even the separate indie thing, whatever that's called [00:13] i have a lot of unplayed steam games that i'm working through before i buy any new ones, but the ones i /do/ buy i ensure that they have drm free versions available [00:13] or discs. [00:13] it's going to be a tremendous problem in 10, 20 years. [00:23] *** Sellyme has joined #archiveteam [00:32] *** mistym has quit IRC (Remote host closed the connection) [00:33] *** cbb2 has joined #archiveteam [00:37] *** cbb has quit IRC (Read error: Operation timed out) [00:44] *** cbb2 is now known as cbb [01:08] *** mistym has joined #archiveteam [01:18] *** lytv has quit IRC (Read error: Operation timed out) [01:20] *** lytv has joined #archiveteam [01:32] *** primus104 has quit IRC (Leaving.) 
[01:41] *** kyan has quit IRC (Read error: Operation timed out) [01:45] *** kyan has joined #archiveteam [01:50] *** kyan_ has joined #archiveteam [01:52] *** kyan has quit IRC (Ping timeout: 258 seconds) [01:53] *** thefox is now known as OtherFox [01:55] *** mistym has quit IRC (Remote host closed the connection) [02:05] *** kyan_ is now known as kyan [02:47] *** mistym has joined #archiveteam [03:04] *** dserodio has quit IRC (Excess Flood) [03:08] *** dserodio has joined #archiveteam [03:12] *** kyan has quit IRC (Read error: Operation timed out) [03:18] *** Nertsy has quit IRC (Quit: Nertsy) [03:22] *** Nertsy has joined #archiveteam [03:44] *** cbb has quit IRC (Quit: cbb) [04:08] RedType: i worry less about PC games, because it seems somehow, somewhere, someone will have a copy [04:09] it's XBLA, PSN and things of that nature that I worry about [04:09] though I suppose people have copies there, too, don't they [04:12] digital copies are much easier to lose track of than physical copies, sadly [04:12] and often dependent on remote things of some sort [04:13] perhaps somebody should go on a packet capturing excursion of some sort.. [04:26] the pirate scenes for xbla and psn used to be pretty booming [04:26] you could probably find big collections if you looked in the right places [04:31] Everyone OK? [04:31] Sounds like someone wants to go on a hunt. [04:36] PlayStation Mobile is actually dying in a few months [04:37] Bunch of unique games and crapware stuff about to be lost. Bit of a shame really [04:37] :( [04:37] But y'know. Fuck that data. [04:38] funnily enough I probably have the biggest archive of them. I've been doing videos on PlayStation Mobile games since the thing came out, I have like 60 games partially documented on video [04:42] I can't see any way to back them up in any way though. 
Some worked on Android but support for that was removed and a bunch of games only work on PlayStation Vita specifically, which has zero way to actually pirate the games to archive them. [05:05] *** zenguy_pc has quit IRC (Read error: Operation timed out) [05:27] *** mistym_ has joined #archiveteam [05:34] *** mistym has quit IRC (Read error: Operation timed out) [05:48] *** zenguy_pc has joined #archiveteam [05:52] *** zenguy_pc has quit IRC (Read error: Operation timed out) [06:34] *** zenguy_pc has joined #archiveteam [06:45] *** mistym_ has quit IRC (Remote host closed the connection) [06:51] the worst part of it is these are the indie games, "user created" stuff especially for Sony's platforms [06:54] *** X-Scale has quit IRC (Ping timeout: 240 seconds) [06:55] copy data from local storage? [06:56] it has to be stored somewhere locally to play it, but it'd be a pain in the ass to extract [07:28] *** Ymgve has joined #archiveteam [07:36] *** dashcloud has quit IRC (Read error: Operation timed out) [07:44] *** dashcloud has joined #archiveteam [08:19] *** dashcloud has quit IRC (Read error: Operation timed out) [08:24] *** BlueMaxim has quit IRC (Read error: Operation timed out) [08:26] *** mutoso has quit IRC (Read error: Operation timed out) [08:27] *** dashcloud has joined #archiveteam [08:32] *** www2 has quit IRC (Read error: Operation timed out) [08:35] *** OtherFox has quit IRC () [08:40] *** mutoso has joined #archiveteam [08:46] *** www2 has joined #archiveteam [08:54] *** rolfb_ has joined #archiveteam [08:54] *** rolfb_ is now known as rolfb [09:23] *** primus104 has joined #archiveteam [09:37] Official SonyMicrosofTendo Policy On Video Game Preservation: [09:37] "Well, y'see... we kind of already /have/ your money?" [09:37] "And running servers is so hard." [09:38] "So very hard." [09:38] [End Statement] [09:38] 'Bye!' [09:39] 'Enjoy those consoles we sold you!' 
[09:58] *** schbirid has joined #archiveteam [10:01] *** johtso has joined #archiveteam [10:04] at least piracy helps with archival of some of the game libraries. so if, say, the original wii shop channel servers were taken down, much of the content there would be available online [10:07] In 2000 years time, scholars will look back and speculate as to why so many internationally successful software titles from numerous disparate producers all bore the common insignia "Cracked by Honktronics" [10:09] LOL [10:12] *** rolfb has quit IRC (Leaving...) [10:35] *** codinghor has joined #archiveteam [10:36] thanks for your help archiving http://oldforums.gearboxsoftware.com/ via http://archivebot.at.ninjawedding.org:4567/beta [10:36] I already donate $25/month to archive.org but I donated $100 as thanks for this effort in particular. SPIDERING IS SERIOUSLY HARD [10:36] donation paypal Confirmation number: 5A847350TV1008345 -- thanks again [10:38] *** codinghor has quit IRC (Client Quit) [10:39] *** techapj has joined #archiveteam [10:42] Fuck yeah [11:32] *** X-Scale has joined #archiveteam [11:44] *** londoncal has joined #archiveteam [12:03] *** primus104 has quit IRC (Leaving.) [12:07] *** signius has quit IRC (Ping timeout: 306 seconds) [12:14] :) [12:20] *** signius has joined #archiveteam [12:48] *** sankin has joined #archiveteam [13:43] *** primus104 has joined #archiveteam [13:44] *** dashcloud has quit IRC (Read error: Operation timed out) [13:47] hey [13:50] *** dashcloud has joined #archiveteam [14:00] *** nertzy2 has joined #archiveteam [14:06] *** nertzy has quit IRC (Read error: Operation timed out) [14:08] *** primus104 has quit IRC (Leaving.) [14:09] *** londoncal has quit IRC (Leaving...) [14:15] https://layervault.com [14:15] "LayerVault will be shutting down on April 11, 2015." [14:27] *** mutoso has quit IRC (Read error: Connection reset by peer) [14:28] *** Start has quit IRC (Disconnected.) 
[14:36] *** mistym has joined #archiveteam [14:51] *** signius has quit IRC (Read error: Operation timed out) [14:53] *** rolfb has joined #archiveteam [14:57] *** Emcy has quit IRC (Ping timeout: 512 seconds) [15:05] *** signius has joined #archiveteam [15:12] *** dashcloud has quit IRC (Read error: Operation timed out) [15:19] *** dashcloud has joined #archiveteam [15:20] *** Start has joined #archiveteam [15:32] *** Gfy has quit IRC (Quit: bye) [15:34] *** dashcloud has quit IRC (Read error: Operation timed out) [15:36] *** Emcy has joined #archiveteam [15:36] *** Gfy has joined #archiveteam [15:36] *** WubTheCap has joined #archiveteam [15:37] Not sure if anyone posted this, but Terry Pratchett died and his website front page was nuked. The audio pages etc yet seem to be there. [15:37] http://www.pjsmprints.com/ [15:37] Navigation from http://www.pjsmprints.com/audio/index.html [15:37] It got hammered with requests pretty hard just now [15:38] Front page in case you can't view it: https://archive.today/PNP01 [15:42] *** dashcloud has joined #archiveteam [15:51] *** Start has quit IRC (Disconnected.) [15:52] *** mistym_ has joined #archiveteam [15:58] *** mistym has quit IRC (Read error: Operation timed out) [16:06] *** mistym_ has quit IRC (Remote host closed the connection) [16:20] *** dashcloud has quit IRC (Read error: Operation timed out) [16:21] *** mistym has joined #archiveteam [16:24] *** dashcloud has joined #archiveteam [16:29] *** primus104 has joined #archiveteam [16:34] nuked, or just too popular for its host? [16:35] they are very different things [16:41] *** qwebirc19 has joined #archiveteam [16:42] *** AgentDrTr has joined #archiveteam [16:42] *** qwebirc19 has quit IRC (Client Quit) [16:42] xmc: Literally nuked, index.html was replaced [16:42] with what? 
[16:43] The death announcement [16:43] well that makes some sense [16:43] And now it was too popular for the host so 302 to localhost [16:43] you're misusing the word [16:43] 'nuked' means they deleted everything [16:43] ok? [16:44] now, in larger news [16:44] http://arstechnica.com/information-technology/2015/03/google-to-close-google-code-open-source-project-hosting/ [16:44] it was literally nuked, pjsmprints.com has nuclear fallout [16:44] if you go there in Google Chrome with the Geiger Counter extension it'll start ticking [16:46] oh jesus why can't google just read-only Google Code [16:47] wth [16:50] *** dashcloud has quit IRC (Read error: Operation timed out) [16:50] You'd think a multibillion dollar company could do that much [16:51] you shouldn't expect a multibillion dollar company to do the sane, obvious thing [16:52] True, true. [16:53] *** dashcloud has joined #archiveteam [16:53] SketchCow: can you poke whoever at Google might be receptive to putting Google Code into a read-only state [16:53] I mean, it's going to be that way, but I mean indefinitely [16:56] google is turning into the new yahoo [16:56] Yagoog [16:56] Kenshin: Google is already the new Yahoo. [17:05] *** godane has quit IRC (Quit: Leaving.) [17:07] *** londoncal has joined #archiveteam [17:09] so that nobody else has to [17:09] i just wrote a scraper [17:09] to find projects [17:09] running it now [17:20] xmc: hi, did you get my email the other day? 
[17:21] hi yes [17:21] i've been slammed [17:21] that happens :) [17:21] have evaluated where to put things, am putting together a response [17:21] sounds good [17:21] i have a home for it [17:21] so [17:21] that's good [17:22] yup [17:22] we're preparing to email users soon, was wondering if you wanted a mention [17:22] i'm going to be spending a few hours on the road this afternoon (it is now 10:30am) and hoping to draft an email back to you from in the car [17:23] ok, we're not in any rush, i'm guessing we shouldn't start moving data before mid-May or later [17:24] anyway, have a nice trip and we'll talk later [17:34] seems that the google code search engine maxes out when it hits like 970 [17:35] so scraping all the tags i've found as well [17:40] *** Start has joined #archiveteam [17:41] p.s. if anyone has any ideas for finding projects that are not scraping the google code search results, go for it [17:43] *** Start has quit IRC (Client Quit) [17:44] *** filippo has joined #archiveteam [17:46] oh shit, wikiteam is on google code [17:46] *** primus104 has quit IRC (Leaving.) [17:48] no longer [17:48] *** Start has joined #archiveteam [17:49] okay, i just looked at special:linksearch on the wiki [17:51] current count: 9229 projects and growing [17:51] *** yan has joined #archiveteam [17:53] *** lag2 has joined #archiveteam [17:55] *** dashcloud has quit IRC (Read error: Operation timed out) [17:57] *** rolfb has quit IRC (Leaving...) [17:58] Pomf.se's deathwatch: "I still need to do some planning around this, I might get some hours in and get some money but nontheless the financial situation is bad and has always been." [17:58] "I will continue to run it for as long as I can, and when I can't anymore I will disable uploading and tell the users that the files are available for a month or so and that they can continue using uguu and/or any Pomf clone because Pomf is shutting down due to lack of money." 
[17:59] Is there any idea how much longer the urlteam seesaw tracker is going to be down for? Seems a shame to have all these scrapers spinning their wheels :( [17:59] *** dashcloud has joined #archiveteam [18:02] *** AgentDrTr has quit IRC () [18:04] i killed mine off after seeing the error [18:04] will resume it once it gets back up [18:04] https://partyvan.eu/transparency/emails/2015-03-12-pomfse-deathwatch.txt [18:07] *** SadDM has joined #archiveteam [18:07] *** swebb sets mode: +o SadDM [18:17] *** Emcy has quit IRC (Ping timeout: 512 seconds) [18:26] *** Emcy has joined #archiveteam [18:35] johtso: No. [18:35] johtso: It'll be up as soon as possible. [18:37] *** Start has quit IRC (Disconnected.) [18:43] By the way, pjsmprints has been up 200 ok for the last 45 minutes. Nobody started an archivebot task yet, but the site's still slow. [18:45] *** Start has joined #archiveteam [18:46] *** techapj has quit IRC () [18:46] *** Start has quit IRC (Read error: Connection reset by peer) [18:46] *** Start has joined #archiveteam [18:59] *** mistym has quit IRC (Remote host closed the connection) [19:01] *** Start has quit IRC (Disconnected.) [19:07] *** BlueMaxim has joined #archiveteam [19:09] *** bsmith093 has quit IRC (Remote host closed the connection) [19:12] *** mistym has joined #archiveteam [19:16] [17:44] http://arstechnica.com/information-technology/2015/03/google-to-close-google-code-open-source-project-hosting/ [19:16] well, that was only a matter of time... [19:17] *** Start has joined #archiveteam [19:17] yeah, i don't know why anyone wouldn't just use github. Is there a new project going to be started for that site? 
[19:17] londoncal: github is not exactly the gold standard of repository hosting either :) [19:18] lol at the github fanclub [19:18] *** Start has quit IRC (Read error: Connection reset by peer) [19:18] it's laughably feature-incomplete [19:18] and they have their priorities very very wrong [19:18] guess this is #archiveteam-bs material though [19:18] I mean, it's better than Google Code, but that's not hard [19:18] right [19:18] thought this was -bs [19:19] Im not saying its amazing, its just better (and normally the defacto, i.e., it has more projects). [19:20] well, you did say "why anyone wouldn't just use github", and this is why ;) [19:20] but yeah, londoncal: join #archiveteam-bs [19:21] haha, ok ok, clearly people like arguing a lot here :p [19:29] "Hi, I'm Vint Cerf. Deleting things is wrong. Erasing our digital history is wrong. We should keep everything. Except Google Code, obviously." [19:29] "Kick that shit right to the curb, no delay." [19:30] "Come listen to my plans for a Digital Tableau, which is a poncy thing that everybody else thought of ages ago but I'll pretend is new." [19:30] *** Start has joined #archiveteam [19:30] the fuck is a digital tableau [19:31] *** Start has quit IRC (Read error: Connection reset by peer) [19:31] Something I made up in the same vein as Vint's _real_ suggestion of "Digital vellum", i.e. an emulator [19:32] Because it's no good keeping files if modern software won't read them, etc [19:32] *** TheFifthH has joined #archiveteam [19:32] *** TheFifthH has left [19:32] So instead of storing a Wordperfect document you store a complete virtual machine to run WP and Windows and DOS and then load the file into that [19:33] okay, so, basically the thing that the Internet Archive is already doing [19:33] yes okay [19:33] * antomatic nods [19:33] 'cept more space-inefficient [19:33] * antomatic nods again [19:34] But it's Digital vellum! 
[19:34] And therefore more better [19:34] i'm glad he's spreading that message, if he has to make up some silly word-pair to reach a certain audience then that's fine [19:35] *** Start has joined #archiveteam [19:35] *** Start has quit IRC (Remote host closed the connection) [19:35] It's like he doesn't care if we ever have any free disc space ever again. (sobs) [19:35] *** Start has joined #archiveteam [19:36] *** nertzy2 has quit IRC (Quit: Leaving) [19:37] if anyone cares, i've found 64422 projects so far [19:37] google code, that is [19:37] nominal "projects", probably 6000% of that is spam or we [19:38] nice [19:38] *** primus104 has joined #archiveteam [19:38] Also, one emulator ('vellum') per file is ridiculous [19:38] *** TheFifthH has joined #archiveteam [19:38] Alright, if the software is /really/ obscure [19:38] but otherwise, pointless [19:39] I feel like most of the time, there is probably a reasonable way to convert things [19:39] * antomatic nods [19:39] *** mistym has quit IRC (Remote host closed the connection) [19:39] i think if you're going to go to that kind of effort to make stuff available for the future, let's start with "don't fucking store stuff in proprietary formats" for a fraction of the cost [19:40] yes [19:41] *** londoncal has quit IRC (Remote host closed the connection) [19:41] i really don't think reading markdown is going to be a problem for people in the year 2215 [19:41] and not just because the human race will have wiped itself out by then [19:43] *** mistym has joined #archiveteam [19:43] *** nertzy has joined #archiveteam [19:43] *** n00b469 has joined #archiveteam [19:44] Hello all. [19:44] * winr4r waves at n00b469 [19:44] I've just seen that Google Code is shutting down: http://google-opensource.blogspot.co.uk/2015/03/farewell-to-google-code.html [19:44] yes [19:44] true dat [19:44] n00b469: i've already written an exploring crawler [19:44] awesome! 
[19:45] [applause] [19:45] I'm actually here about that [19:45] TheFifthH: excellent [19:46] Seems something called flossmole has a large list of GC projects: https://code.google.com/p/flossmole/downloads/detail?name=gcProjectLinks2012-Nov.txt.bz2&can=2&q=google+code [19:47] And, does someone of you here use Adium on OS X? [19:47] excellent [19:47] I cannot connect to the efnet.org IRC channel with it... [19:47] http://flossdata.syr.edu/data/gc/ [19:47] It's not up to date, but their source repository appears to contain tools to spider them [19:47] I'm here using the web interface. [19:48] n00b469: what error do you get [19:48] TheFifthH: good stuff [19:48] tbh it doesn't matter if it's out of date [19:50] i doubt many/any new ones have been created since whenever they created that dataset [19:50] winr4r: no error, Adium just keeps "connecting"... [19:51] n00b469: I'd not recommend using Adium for IRC [19:51] *** habi has joined #archiveteam [19:51] it's very much an IM client (like Pidgin is) and IRC support is awkward at best [19:52] n00b469 is now known as habi :) [19:52] joepie91: It works remarkably well and pools all my IM and chat activities into one program. [19:53] winr4r: It seems that I just had to wait for some minutes, then it connected. 
[19:53] *** n00b469 has quit IRC (Quit: Page closed) [19:54] habi: "just had to wait for some minutes" is not a good sign :P [19:54] I know - and also not really helpful for debugging :) [19:54] *** rolfb has joined #archiveteam [19:55] habi: irc.efnet.org will sometimes redirect you to a server that will refuse to answer your request [19:55] no idea why, maybe it's some IP-based block [19:55] anyway keep trying and eventually you'll find one that works [19:56] efnet doesn't have network-wide ban policies iirc [19:56] the hard part is finding one that works and doesn't netsplit every 32.7 seconds [19:56] or not in practice anyway [19:56] irc.teksavvy.ca is pretty good [19:56] TheFifthH: the flossmole data gave 10616 links, thanks [19:56] Glad to be of service. :) [19:57] winr4r: https://code.google.com/ [19:57] links on the right [19:57] yipdw: ah, ok, something like a load-balancer? [19:57] oh, huh [19:57] lol. [19:57] winr4r: https://code.google.com/hosting/search?q=&projectsearch=Search+projects [19:57] joepie91: Limited to known labels, and about 1000 results per label [19:57] empty search query [19:57] works [19:57] habi: yeah [19:58] extract project names, refine results by searching for keywords in those project names [19:58] rinse repeat [19:58] eventually you'll end up with a near-complete list [19:58] joepie91: yes [19:58] joepie91: welcome to what i did about two hours ago [19:58] lol [19:58] daymn.. google code! [19:58] fair enough [19:58] :P [19:58] * closure wishes he had 2 spare seconds to rub together [19:58] currently on the second iteration through tags [19:58] *** mistym has quit IRC (Remote host closed the connection) [19:59] current count: 75,465 [19:59] winr4r: went through freecode yet? 
[19:59] and freshcode [20:00] (freshcode == http://freshcode.club/, it's a still-operational freecode clone) [20:00] I can't help but notice that Chris Dibona was around for sourceforge, and now is sending out this google code EOL [20:00] joepie91: nope [20:00] winr4r: maybe also scrape sourceforge projects (they have full indexes, do they not?) for google code URLs on the frontpage, to catch any "we've moved to google code" links [20:00] joepie91: then I can just add irc.teksavvy.ca as a server instead of irc.efnet.org? [20:01] habi: yes [20:01] well in theory anyway [20:01] not /all/ efnet hosts resolve [20:01] but I think this one does [20:01] I'll try and be back in a moment :) [20:01] *** habi has quit IRC (Quit: Leaving.) [20:01] joepie91: not yet [20:01] winr4r: just giving you some ideas for the future :D [20:02] maybe google code needs a channel? [20:02] google code needs a pun [20:02] joepie91: you should help me! [20:02] #googleexplode ? [20:02] *** habi has joined #archiveteam [20:02] winr4r: too much work to do, already backlogged, and my bank account is at 0 :( [20:02] #scroogled? [20:02] Well, hello there :) [20:02] so can't put any time into this right now [20:02] habi: wb [20:02] joepie91: gotcha [20:02] wb habi [20:03] great! [20:03] (and when I say 0 I literally mean 0) [20:03] joepie91: yeah same [20:04] winr4r: one to add to the list: https://code.google.com/p/plowshare/ [20:04] anyone have an idea for a channel name for google code? [20:04] for some reason i want to help with this one [20:04] winr4r: also, scrape wikipedia dumps for google code links? 
[20:04] joepie91: got it [20:04] Sue__: \o/ [20:05] *** Sue__ is now known as Sue_ [20:06] joepie91: done that already [20:06] oh, they actually added an "export to github" button [20:06] heh [20:06] yes [20:06] well that's something [20:06] that's pretty cool of them [20:07] at least Google does a better shutdown than Yahoo would [20:07] :P [20:07] driving a bulldozer through the data centre is a better shutdown than yahoo would do [20:08] I wonder if that requires you to just be logged in to github [20:08] :) [20:08] if it does I'm writing something to log into github and click all the fucking buttons [20:08] yipdw: haha [20:08] see how long it takes for github to ban the account [20:09] github.com/GoogleCode [20:09] aw damn someone took it [20:10] yipdw: https://github.com/GoogleExplode [20:10] that works [20:10] hm [20:10] does the export button also work with SVN [20:10] because iirc Google Code supports SVN (and Mercurial?) [20:10] joepie91: the export function converts svn and hg to git [20:10] https://code.google.com/p/box2d/ <-- yes [20:11] and yeah, it's github oauth based, so [20:11] Box2D is a particularly weird case AFAICT [20:11] its canonical repo *is* the Google Code one [20:13] *** mistym has joined #archiveteam [20:14] ? [20:14] yipdw: isn't that the case for most [20:15] joepie91: maybe, but a lot of projects I've come across also maintain their google code repo alongside a gitbucketlab etc. one and say "go to gitbucketlab" [20:15] Start: GoogleConco(r)de would be quite irreverent I think... [20:15] gitbucketlaborious [20:15] the orious was lost [20:15] :) [20:16] yipdw: but yeah, there's a lot of stuff still exclusively on GC [20:16] does freshcode have a data dump available? 
[20:16] i don't want to write my fucking third scraper today [20:16] Anyways, I gotta run, see you all soon (now that I can connect to IRC with Adium) [20:16] oh I get it Google is GCing GC [20:16] hahaha [20:16] http://google-opensource.blogspot.nl/2015/03/farewell-to-google-code.html [20:17] google close google code [20:17] www2: we're on it [20:18] winr4r: http://freshcode.club/feed [20:18] ? [20:18] *** Start has quit IRC (Read error: Connection reset by peer) [20:18] *** Start has joined #archiveteam [20:18] oh [20:18] no problem [20:18] not complete feeds [20:18] hrm [20:18] tschou zäme [20:18] *** Sanqui has quit IRC (west.us.hub irc.mzima.net) [20:18] *** habi has quit IRC (Quit: Leaving.) [20:19] irc channel idea: #googleclosed [20:19] #googlecodered [20:20] wtf kind of weird github clone is this: http://fossil.include-once.org/freshcode/index [20:20] #googlecodeblue [20:20] Even better [20:20] #googlebloat [20:20] *** Sanqui has joined #archiveteam [20:20] *** sankin has quit IRC (Leaving.) [20:22] i've put #googlecodeblue on the wiki [20:22] *** Start has quit IRC (Client Quit) [20:27] Great [20:27] *** Start has joined #archiveteam [20:30] *** sankin has joined #archiveteam [20:40] *** bsmith093 has joined #archiveteam [20:40] *** bsmith093 has quit IRC (Client Quit) [20:41] *** BlueMaxim has quit IRC (Ping timeout: 306 seconds) [20:42] *** BlueMaxim has joined #archiveteam [20:50] ffs google [20:50] FFS [20:51] *** londoncal has joined #archiveteam [20:53] lol [20:54] johtso: Should be up and running again according to chfoo in #urlteam :) [20:54] ah great! [20:57] *** sankin has quit IRC (Leaving.) [21:00] *** patrickod has joined #archiveteam [21:14] is 6 the maximum workable concurrency for running the urlteam grabber from one ip address? [21:16] *** godane has joined #archiveteam [21:19] *** Start has quit IRC (Disconnected.) [21:29] *** godane has quit IRC (Quit: Leaving.) 
[21:32] *** schbirid has quit IRC (Leaving) [21:42] *** lag2 has quit IRC (Read error: Connection reset by peer) [21:53] *** Ravenloft has quit IRC (Ping timeout: 606 seconds) [21:56] *** BlueMaxim has quit IRC (Ping timeout: 512 seconds) [22:02] *** godane has joined #archiveteam [22:08] *** cbb has joined #archiveteam [22:14] SketchCow: Trovebox project is going to start within the next few hours [22:14] According to the owner they have 6 million photos [22:15] I'm estimating it to be 3 - 10 TB [22:15] I'd have a better estimate if we run for a day [22:32] *** BlueMaxim has joined #archiveteam [22:48] *** dashcloud has quit IRC (Read error: Operation timed out) [22:52] *** dashcloud has joined #archiveteam [22:55] *** BlueMaxim has quit IRC (Ping timeout: 512 seconds) [22:56] *** BlueMaxim has joined #archiveteam [23:04] *** rolfb has quit IRC (Leaving...) [23:05] *** godane has quit IRC (Ping timeout: 260 seconds) [23:06] *** yan has quit IRC (Bye!) [23:17] *** godane has joined #archiveteam [23:20] joepie91: fossil is a vcs with its own builtin webserver, but i've not seen it styled like that before [23:24] *** SilSte has quit IRC (Ping timeout: 240 seconds) [23:27] *** dashcloud has quit IRC (Read error: Operation timed out) [23:28] *** Caber has joined #archiveteam [23:29] hello all, are you interested in the archive of a dutch forum for young (non)christian people about faith? [23:29] it has been there for over 10 years, has seen great numbers [23:30] heated discussions between different denominations [23:31] and a team that consisted of different people all the time, they found a format to talk about christianity while disagreeing - also the normal forum humor etc. [23:31] *** dashcloud has joined #archiveteam [23:32] Caber: does an archive already exist? [23:32] or is it to be made [23:32] joepie91: not yet, but I might contact its staff [23:32] Caber: have a link to the forum for me? 
[23:32] I think they might be willing to cooperate now that a closure is imminent [23:33] http://forum.credible.nl/ [23:33] looks small enough, can probably just put it into archivebot [23:33] that'd produce a known-valid archive, so to say [23:33] with all the metadata included [23:33] hold on [23:33] I might be able to get a SQL dump? - of which I need to filter the staff sections and private messages etc. [23:34] Caber: are all the threads public? [23:34] or are there categories that are only visible when logged in [23:34] we're talking 500 000 posts :P [23:34] sure, that's not that much [23:35] :P [23:35] Caber: but yeah, are all 'public' categories visible to guests, or? [23:35] let me check - I guess most is visible for everyone [23:35] staff allowed LGBT people to discuss, but you had to email to get access, let's not archive those [23:36] I need to check, can you hold for now [23:36] Caber: sure [23:37] yes, there is a subforum that is only visible for logged-in people [23:37] *** Caber is now known as caber [23:37] does it need to be archived? [23:38] or does it have private contents [23:40] depends on your definition of private - my first impression is that it has been like this to prevent it from turning up in google - people asking to pray for certain things etc. (not my personal style, but for the historian in 50-80 years it can tell a lot about digital christianity) [23:40] caber: right, I guess it can't hurt to make two copies [23:41] one through archivebot of all the guest-accessible stuff [23:41] and one via the staff, like an SQL dump or something [23:41] if that can be obtained [23:41] are you ok with me contacting the owners to inform them first? (we can still guerrilla archive) but it seems to me that is the gentlemanly way of archiving it? 
[23:41] feel free to contact them about an SQL dump or somesuch, but I'd still feed it into archivebot now anyway :) [23:42] I'd rather make sure there's a copy and have angry people later [23:42] than not having a copy at all [23:42] btw, you're dutch? [23:42] mhmm [23:45] expected end-of-life: 30 may 2015 [23:45] *** X-Scale has quit IRC (Ping timeout: 240 seconds) [23:45] yep, saw the thread [23:46] it's in archivebot [23:46] good [23:46] caber: http://archivebot.at.ninjawedding.org:4567/ [23:46] you can follow it there [23:49] in my opinion we should archive the 'just-register-and-be-able-to-see'-parts as well, but I am an information freedom extremist (why I spend money on tor exit relays) - so I guess someone else should make that call [23:50] now mailing the owners for possible SQL-dump [23:50] caber: the same opinion is generally shared around here :P [23:50] but it's not something one can easily do in archivebot [23:50] because it requires cookies [23:50] and afaik archivebot doesn't do that yet [23:50] ok [23:50] so that'd require a manual wget/wpull run [23:51] *** Start has joined #archiveteam [23:59] could this go with the meta-information? not archived is subforum "Geloofsleven" ("Life of Faith"), which the forum describes as follows: "Is something troubling you and would you like people to pray for it? Post it here. This is also the right place for general topics about prayer or other personal questions of faith. This subforum is invisible to guests." this subforum has 851 threads / 27898 messages
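For the "manual wget/wpull run" mentioned above, one possible shape of such a run is sketched below: a cookie-carrying GNU wget invocation that also writes a WARC file. This is a hedged illustration only and nothing here was run against the forum. The file names `cookies.txt` and `credible-forum` are hypothetical, the cookie file is assumed to be a Netscape-format export from a logged-in browser session, and whether the login-only sections should be grabbed at all is the open question from the discussion.

```python
import shlex


def build_wget_command(start_url, cookies_file, warc_name, wait_seconds=1):
    """Assemble a wget argument list for a logged-in, WARC-producing crawl."""
    return [
        "wget",
        "--mirror",                        # recursive download with timestamping
        "--page-requisites",               # also fetch images, CSS and scripts
        f"--wait={wait_seconds}",          # pause between requests to spare the host
        f"--load-cookies={cookies_file}",  # reuse an existing login session (assumed file)
        f"--warc-file={warc_name}",        # write <warc_name>.warc.gz next to the mirror
        start_url,
    ]


# Hypothetical file names; only the forum URL comes from the log.
cmd = build_wget_command(
    "http://forum.credible.nl/", "cookies.txt", "credible-forum"
)
print(shlex.join(cmd))
```

Building the argument list separately from running it keeps the command easy to review before anything is sent to the site; it could then be passed to `subprocess.run(cmd, check=True)` on a machine with a WARC-capable wget installed.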