[00:33] Wyatts: you can still store that full filename as metadata on the file
[00:33] see e.g. https://archive.org/download/cdrom-marumaru-c-magazine-1995/cdrom-marumaru-c-magazine-1995_files.xml
[01:29] greader-grab needs some downloaders, everyone got CAPTCHAed after I stupidly imported twitter feeds again
[01:29] I removed them now
[01:36] someone needs to make it an official warrior project
[02:14] http://instantserver.io/
[02:14] Ha HA ha
[02:38] heh
[02:39] it even works
[02:46] thanks SketchCow, this has unjammed greader-grab
[02:47] Only works for 35 minutes!!!
[02:48] that's a very long time
[02:49] is Wyatts archiving doujinshi sites or something?
[02:49] I appear to have stumbled into doing something sort of like that.
[03:02] About to put all of No-Intro up.
[03:02] Ha HA ha
[03:14] ssh ubuntu@my-instantserver "wget https://ludios.org/tmp/greader_instantserver.sh -O greader_instantserver.sh && bash ./greader_instantserver.sh"
[03:29] also if you click the gear you get a "get 5 more minutes" button
[03:40] also, it only has 614 MiB of RAM.
[03:43] ALL of it, SketchCow? Even the Nintendo stuff?
[03:43] ivan`: er, where does ~/.local/bin/run-pipeline come from on your instantserver setup?
[03:45] Coderjoe: pip install --user
[03:47] oh. I don't use pip much, so I didn't know that "seesaw" was not the value for --user, but the package to be installed
[03:47] ah
[03:47] and I didn't know seesaw was in the cheese shop
[03:56] I'm to the point with the manga uploads that I've created a metalanguage to indicate what to do with things.
[03:56] so "c 1 v 1" will make it call it "Chapter 1 Volume 1"
[03:56] Less typing
[03:56] Makes a difference
[03:59] Actually a lot of sites use c1v001
[03:59] 36G No-Intro-Collection_2013-06-14.zip
[03:59] du -sh No-Intro-Collection_2013-06-14.zip
[04:00] time to put up another 6,000 konachan images
[04:47] http://archive.org/details/No-Intro-Collection_2013-06-14
[04:51] oh, cool, IA now says "(this item is currently being modified/updated with a "derive" task)"
[04:52] damn all of no-intro is a ware
[05:01] Oh right, so DFJustin, any thought on what should I rename all these things to? They don't really have chapters, most of them, or filenames with much in the way of Latin text.
[05:02] What happens to unicode filenames that get uploaded anyway?
[05:06] Good to see you in the channel, wyatt
[05:07] SketchCow: Yeah, good to be back. You can blame Smiley for that, btw.
[05:09] Oh yes, as to the question from earlier, has anyone run across a way to automate downloading from Mediafire?
[05:09] it url-escapes them https://ia601701.us.archive.org/18/items/PC98_Games_1813
[05:10] I think plowdown can download files from Mediafire.
[05:10] Plowshare*
[05:11] URL escapes...Hmm, that's a bit suboptimal.
[05:12] and then for text-type items the links on the item page link to a form that's only url-escaped once instead of twice like it needs to be and thus 404s https://archive.org/details/detectivezak_gmail_201304
[05:12] GLaDOS: Despite the CAPTCHA?
[05:13] Captcha solving is built into plowshare
[05:13] DFJustin: What part is 404ing there?
[05:13] the 2nd doc link on the side
[05:14] Oh, I see.
[05:14] GLaDOS: Awesome, Time to install it and take it for a test drive.
[05:20] Oh holy mother of god, I found another one. It's like they link to similar sites or something. ~_~
[05:21] The aptly-named 同人誌作家ごとに一括無料ダウンロード.com/
[05:21] hah
[05:21] oh for god's sake more characters Courier New doesn't have
[05:22] Also, thank god for adblock.
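(Editor's note on the 05:12 remark about links being url-escaped once instead of twice. The sketch below is only an illustration of that idea: it assumes the stored file name is itself the percent-escaped form, so a working link has to escape the percent signs a second time. The function names and the sample file name are hypothetical and not taken from archive.org.)

```python
from urllib.parse import quote, unquote


def escape_once(name: str) -> str:
    """Percent-escape a file name as UTF-8 bytes (hypothetical helper)."""
    return quote(name, safe="")


def escape_twice(name: str) -> str:
    """Escape the already-escaped name again, so each '%' becomes '%25'."""
    return quote(escape_once(name), safe="")


# Hypothetical sample name; any non-ASCII file name behaves the same way.
name = "同人誌.zip"
once = escape_once(name)
twice = escape_twice(name)

# A server that decodes the URL path once turns `twice` back into `once`,
# which matches a stored name that is literally the escaped form.
print(once)
print(twice)
print(unquote(twice) == once)
```

If the stored name really is the escaped string, a link built from `once` decodes to the raw Unicode name and misses the file, which would be consistent with the 404 described above.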
[05:23] um that front page has some highly dubious content
[05:23] Like I said, thank god for adblock
[05:24] Also, it has direct downloads.
[05:26] Oh, reading closer, it looks like they're moving there, instead.
[05:27] Okay, so they iced all the mediafire downloads and moved from fc2 to their own domain.
[05:30] http://www.deadfrog.us/entry.html?id=6310
[05:30] http://archive.org/details/MESS-0.149.BIOS.ROMs
[05:30] Not bad, that's a bit more than I have, for now.
[05:33] But I have more volumes! Ha!
[06:53] Hi - I'm not sure if this is the right place to ask - I noticed Jason Scott announced that the entire NOINTRO rom collection is on archive.org - what are terms of downloading this? -- or, how come it's available there without any kind of backlash?
[06:58] Well, this is vexing. wget is not digesting this unicode domain name well. Any tips? I'm trying to get index pages so I can slurp up delicious zips.
[06:59] Wyatts: you might have to encode unicode domains with idna
[07:00] Oh right, there is that.
[07:01] wget...doesn't seem to have an option to do it, either.
[07:01] python -c 'print u"☃.com".encode("idna")'
[07:01] this is how I recall doing it before, but it doesn't seem to work
[07:02] Ahah! Found an online thing and tried it; good call.
[07:02] you can also paste the URL into your browser and copy the xn--...com thing
[07:02] My browser shows the URL as 同人誌作家ごとに一括無料ダウンロード.com, so I'm not sure what you're talking about...
[07:03] But this will be sufficient, thanks.
[07:04] if I paste ☃.com into Chrome, I get http://xn--n3h.com/ in the address bar
[07:04] heh, that's the longest idna domain I've seen
[07:04] Yeah, Firefox doesn't seem to do that.
[07:05] network.IDN_show_punycode -> true
[07:05] Oh, I found a longer one a few minutes ago.
[14:10] Morning!
[14:14] Hiya!
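(Editor's note on the 06:59 to 07:04 IDNA exchange. A minimal Python 3 sketch of the same conversion, using the standard library's built-in "idna" codec; the helper name is hypothetical. The Python 2 one-liner quoted above may have failed because of how the terminal's bytes were decoded inside the u"" literal, but that is a guess.)

```python
def to_ascii_hostname(hostname: str) -> str:
    """Convert a Unicode host name to its ASCII (xn--) form.

    Uses Python's built-in "idna" codec; hypothetical helper name.
    """
    return hostname.encode("idna").decode("ascii")


# The snowman domain from the chat; Chrome showed the same xn-- form.
print(to_ascii_hostname("☃.com"))  # xn--n3h.com
```

The resulting ASCII host name can then be handed to wget in place of the Unicode one.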
[14:15] o/
[14:16] Uploading Manga at the speed of light, also uploading this other guy's software - ExoDOS, an attempt to gather up a bunch of DOS programs
[14:17] < ruairi> Hi - I'm not sure if this is the right place to ask - I noticed Jason Scott announced that the entire NOINTRO rom collection is on archive.org - what are terms of downloading this?
[14:17] ruairi: If you decide to download it, make sure you have enough space on your disk drive
[14:18] SketchCow, ExoDOS is a freaking legend, get everything he does
[14:18] Is he
[14:18] I hope he gets around to doing that Win31x project he was thinking of
[14:18] I think I did.
[14:19] His DOS collection that's on PleasureDome and (previously) UG is just amazing. So yeah. Don't mind me fanboying.
[14:22] 109G eXoDOSAdv
[14:22] 18G eXoDOSRPG
[14:22] 47G eXoDOSStr
[14:22] 90G eXoDOSAct
[14:22] 91G eXoDOSSim
[14:23] sweet, sweet DOS games.
[15:28] SketchCow: Thanks for your response! I'm still baffled how this can be in the public domain though...
[15:28] SketchCow: also, yes - the ExoDOS stuff is full on, very thorough.
[15:35] ruairi: I wouldn't worry about it- as they say, easier to ask forgiveness than permission (and now it's archived for all of time)
[15:49] dashcloud: since UG has gone, it puts at risk the preservation of PlayStation and other ISO collections, could they potentially be uploaded too?
[15:58] if you've actually got the collections, I'd talk with SketchCow so arrangements could be made- if you don't have them, the answer's probably (unless you mean PS2 gen or newer- if that went up, it wouldn't publicly accessible)
[16:20] dashcloud: I haven't - but there is an exiled Underground Gamer IRC channel out there
[16:22] dashcloud: espernet #ug
[16:46] hi folks, I'd like to draw your attention to this IndieGoGO campaign: http://www.indiegogo.com/projects/timbuktu-libraries-in-exile/x/7939
[16:46] In 2012, under threat from fundamentalist rebels, a team of archivists, librarians, and couriers evacuated an irreplaceable trove of manuscripts from Timbuktu at great personal risk. The manuscripts have been saved from immediate destruction, but the danger is not over. A massive archival effort is needed to protect this immense global heritage from loss.
[17:23] are they releasing the documents in PD?
[17:24] they're not the authors
[17:26] actually I meant more releasing at all
[17:26] not sure if it's that one or a similar project, but when I checked out they required you to be affiliated with a research institution to get access to the documents
[17:50] before you can release the documents, you first have to make sure they're around long enough to do something with
[17:51] they've been moved from a hot, dry climate to one with high moisture and hot temperatures, and are stored in the same boxes they were transported out of the country with
[17:54] here's an AMA (Ask Me Anything) they did on Reddit recently: http://www.reddit.com/r/IAmA/comments/1faiis/we_are_abdel_kader_and_stephanie_diakit%C3%A9_and_we/
[18:48] I mean this http://www.tombouctoumanuscripts.org/user/register/ - why aren't the documents simply publicly downloadable?
[19:03] scan the damn things already
[19:03] no i've not read the link
[20:05] so, if you're interested in why more of them haven't been scanned, here's the answer: http://www.reddit.com/r/IAmA/comments/1faiis/we_are_abdel_kader_and_stephanie_diakit%C3%A9_and_we/ca8eb4g
[20:12] Scannnnn alllllll
[20:18] dashcloud: I'm interested in why the ones that have already been scanned isn't just a click away
[21:18] "you cannot use a heat or light based digitizing process."
[21:18] "Everything has to be done by cold-circuit photographic process"
[21:19] um, isn't "photographic", by definition, a light-based process?
[21:23] in fact, the only results I get for a search on that are that reddit thread and the IGG campaign
[21:42] I assume they mean Cold cathode lighting so there is no heat issues
[21:55] Hmm, is there a switch or something to prevent wget from mangling all the filenames? Makes it hard to prevent duplication of what I already have.
[22:00] Or do I have to get creative with the bashing?
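(Editor's note on the 21:55 question. wget documents a --restrict-file-names option, and its nocontrol mode is worth checking in the manual for this case. If the files are already on disk, the sketch below shows one possible way to spot duplicates. It assumes the mangling is percent-escaping of UTF-8 bytes, which the log does not confirm; the function names and sample file names are hypothetical.)

```python
from urllib.parse import unquote


def normalize_name(name: str) -> str:
    """Undo percent-escaping so mangled and clean names compare equal."""
    return unquote(name)


def already_have(downloaded, existing):
    """Return the downloaded names that match an existing file once unescaped."""
    have = {normalize_name(n) for n in existing}
    return [n for n in downloaded if normalize_name(n) in have]


# Hypothetical example: one escaped duplicate and one genuinely new file.
existing = ["同人.zip"]
downloaded = ["%E5%90%8C%E4%BA%BA.zip", "new.zip"]
print(already_have(downloaded, existing))
```

Names that are not percent-escaped pass through `unquote` unchanged, so the same check also works on a directory that mixes clean and mangled names.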