[00:49] so, this is really awesome: http://www.chrismcovell.com/TapeDump_Controls.html - dump your NES cartridges using a program you load onto the NES with a dev cart, like the PowerPak, and then use the NES's audio output to dump the cart
[03:49] WOO NEWS UPDATES
[04:26] WOOO
[04:26] about what
[04:39] WOOOO
[05:03] WOOOOOOo
[05:38] can anyone explain why wayback takes so long: http://web.archive.org/web/20040430055952/http://www.techtv.com/callforhelp/shownotes/story/0,24330,3674545,00.html
[05:38] anything from, I want to say, Nov 2003 to the very end of the techtv.com domain is very slow
[05:38] in wayback
[05:56] ok, now it's taking 2+ min just to get to a page
[05:56] what the hell
[08:40] i figured out a way around the slowness of the wayback machine
[08:43] looks like if i drop the "l" in "html" it will display the page in 10 to 15 seconds when it tries to look up g4tv.com/videos
[09:01] heads up, the Posterous tracker needs to be stopped
[09:01] the pages are now just returning a "we're gone" page, so any further archiving is useless
[09:03] so how much of posterous did we get?
[09:04] the tracker says 5265044 of 887636 done, but I don't know how long people have been returning junk data
[09:04] so we got a very large portion, but not all
[09:04] that sucks
[09:05] we would have gotten it all if not for all the spam
[09:05] almost all of the pages on posterous were Markov-chain generated
[09:09] We're still going with posterous..
[09:10] the pages are useless though
[09:11] they just redirect to https://molliemax.posterous.com/bye.html
[09:11] did we get all of streetfiles?
[09:11] We have a backdoor..
[09:11] don't think so
[09:12] GLaDOS: ah! you legend :)
[09:12] we got a lot of it, but not all
[09:12] And I think streetfiles collapsed under the stress.
[09:12] ArchiveTeam needs a bat signal
[09:13] a call to arms, or shells, or whatever
[09:16] oh, the getfileN.posterous.com domains are down :c
[09:19] * nwh nods
[09:23] https://groups.google.com/forum/?fromgroups#!forum/archive-team
[09:23] There's the batsignal!
[09:24] no posts since 2012 though
[09:26] tweeting @archiveteam works
[09:26] getfile2.posterous.com works for me.
[09:26] getfile4, sorry.
[09:28] Oh, wait, it works with a normal UA, but not with ours.
[09:31] ..huh
[09:34] DFJustin: they don't tweet new projects though ;)
[09:37] nwh: #archiveteam-twitter
[09:38] swebb outputs all tweets that are directed at @archiveteam, contain a link to archiveteam.org, or contain the words "archive team"
[09:39] finnnneee.
[13:27] OK, go go go day
[13:27] <3
[13:30] godane: no, we didn't get all of streetfiles :/
[13:30] I see that the Archive Team wiki has had no spam for a week plus now!
[13:30] We had NO warning on streetfiles.
[13:30] My buddy tweeted me with less than 48 hours to go.
[13:30] We're not magicians.
[13:31] Also, the closure appears to be the result of a central political disagreement between admins.
[13:31] Because someone has http://streetfil.es/ and seems to be claiming they're bringing it back.
[13:32] right, i think we did a good job on streetfiles given that it didn't take the load too well
[13:43] "sorry we're dead"? really?
[13:45] ?
[13:45] streetfiles. that's their page now
[13:45] oh, "sorry we're dead."
[13:45] http://streetfiles.org/
[13:46] Oh, yes.
[13:46] Well, like I said, I'm about 90% sure this was internal politics.
[13:46] This was a total volunteer thing, and the community hit a breaking point.
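The [09:26]-[09:28] exchange suggests getfile4.posterous.com still answered a browser-like User-Agent while rejecting the project's own. A minimal sketch of that check, assuming nothing beyond plain HTTP; both UA strings below are illustrative stand-ins, not the actual strings the project used:

```python
# Hedged sketch of the User-Agent check behind the [09:28] observation:
# the host answered a browser-like UA but not the custom one.
import requests

URL = "http://getfile4.posterous.com/"
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36"  # stand-in
CUSTOM_UA = "ArchiveTeam"  # hypothetical stand-in for the project's UA

for ua in (BROWSER_UA, CUSTOM_UA):
    try:
        # allow_redirects=False so a "bye" redirect shows up as a 3xx status
        r = requests.get(URL, headers={"User-Agent": ua},
                         allow_redirects=False, timeout=30)
        print("%-55s -> HTTP %s" % (ua, r.status_code))
    except requests.RequestException as exc:
        print("%-55s -> failed: %s" % (ua, exc))
```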
[13:47] So this sort of petulant "bye dudes" is expected
[13:47] so i've been asleep, did anyone grab all that AOL music stuff?
[13:48] Not here
[13:48] i mean sure, it mostly looks like regurgitated press releases and all
[13:48] but still
[13:54] Mo Posterous, Mo Problems
[14:01] who backed up who not? who got dropped? which lot, sketchcow wit his name on the blimp.
[14:03] FYI for all - Mid-Atlantic Retro Computing Hobbyists appear to be running an Open House Workshop/Swap Meet Saturday May 18th and Sunday May 19th, I'm presuming at InfoAge in Wall, NJ
[17:43] so, to upload my nwnet.co.uk site dumps using the web uploader, what should I fill in for the metadata?
[17:53] also, should I just upload the WARCs, or the WARCs plus the regular downloaded files?
[17:55] warc plus cdx
[18:02] what metadata should I fill in?
[18:03] no idea
[18:03] tag with archiveteam
[18:10] so, I just started uploading and it's telling me there's a network problem - I'm pretty sure I'm not having network problems, so how do I fix this?
[18:12] also tag with webcrawl
[18:15] I can't, because I tried starting the upload and now it's stuck on "There is a network problem"
[18:19] so i have found 4600+ lost techtv videos
[18:21] ones you haven't found before, or total?
[18:21] these are the missing-id ones
[19:12] so, what collection do I put my upload into? community texts?
[19:17] yes
[19:17] godane: have you seen my link?
[19:17] uploading the cdx is unnecessary because IA derives one
[19:18] Right
[19:18] yes
[19:18] I should really make a collection for people to shove stuff into that can then be pulled over.
[19:18] i'm too busy to grab that right now
[19:18] The problem is it's not easy to declare things "web"
[19:18] yeah, there's a lot of warcs floating around outside the collection at the moment
[19:19] when/under what conditions are warcs entered into the wayback machine, btw?
[19:23] so, here's where my nwnet.co.uk grab is/will be located: http://archive.org/details/Nwnet.co.ukWebGrab
[19:23] Let me know when it is uploaded.
[19:23] Brewster made a quiet request to clean up collections and make them prettier.
[19:24] Standard make-it-work stuff. I'll be doing that soon, and others are too.
[19:24] I mean, it's kind of my job
[19:24] it is uploaded, I think - it says after a few minutes I should be able to refresh the details page and see it
[19:33] so it should be there now - I can see files
[19:47] I was looking through my other items on the external drive I currently have attached, and there's a single one for lightning.prohosting.com - is there a collection somewhere for that?
[20:21] SketchCow: There's an Upcoming collection that you could make.
[20:21] Everything is uploaded for that one.
[20:26] i just found out that my firefly mosquito parody is in computertechvideos
[20:27] :P
[21:13] SketchCow: if you're in a cleanup mood, there are some items to move to the wikiteam collection here: https://archive.org/search.php?query=wikiteam%20AND%20NOT%20collection%3Awikiteam%20AND%20collection%3Aopensource
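For the metadata questions earlier in the log ([17:43]-[19:17]), one concrete route is Internet Archive's S3-compatible upload API, which sets item metadata through x-archive-meta-* headers. A minimal sketch, assuming that API; the access keys and WARC filename are placeholders, and the collection value is only a guess at what "community texts" maps to:

```python
# Hedged sketch: uploading a WARC to archive.org with the tags suggested
# above (archiveteam, webcrawl) via IA's S3-compatible API. Identifier,
# filename, and keys are placeholders; see archive.org's S3 documentation
# for the authoritative header semantics.
import requests

ITEM = "Nwnet.co.ukWebGrab"       # item identifier mentioned in the chat
FILENAME = "nwnet.co.uk.warc.gz"  # hypothetical WARC filename

headers = {
    "authorization": "LOW ACCESS_KEY:SECRET_KEY",  # your IA S3 keys
    "x-amz-auto-make-bucket": "1",                 # create the item if missing
    "x-archive-meta-mediatype": "web",
    "x-archive-meta-collection": "opensource",     # guess: "community texts"
    "x-archive-meta-subject": "archiveteam;webcrawl",
}

with open(FILENAME, "rb") as f:
    r = requests.put("http://s3.us.archive.org/%s/%s" % (ITEM, FILENAME),
                     data=f, headers=headers)
r.raise_for_status()
```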
[21:20] "This week's Google Reader announcement has no impact on the Feed API. As in the past, we will continue to post explicitly whenever we have news about changes to this or other APIs."
[21:20] (I think I should start backing it up anyway, though)
[21:22] it is very simple, there is no login required, you just need to find every RSS/Atom feed that exists ;)
[21:25] Nemo_bis: I'm doing that, but I have to do it a slightly more annoying way
[21:25] Takes longer, but it's more that it takes longer to search, and what do I care
[21:29] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[21:30] yahoosucks
[21:30] thanks :)
[21:32] IF THEE BE A SPAMMER SUCH SADNESS THEE WILL KNOW
[21:33] I'm going to write a proper tool that grabs the /reader/api/0/stream/contents/feed/ things, should it be writing warc files?
[21:33] SketchCow: I have a number of warcs to be moved to some collection or other.
[21:36] I'm not sure anyone will want the warc files, as opposed to a big json blob that combines all the continuation-responses
[21:37] and nobody's going to be typing http://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fgoogleblog.blogspot.com%2Ffeeds... into web.archive.org
[21:39] heh
[21:39] perhaps it might be helpful to generate warc files with faked URLs that correspond to the post URLs in the feed data, though
[21:43] WARC-Made-Of-Lies: 1
[21:44] Nemo_bis: Done
[21:44] So
[21:45] Here's the philosophy, ivan`.
[21:45] If nothing else is to be grabbed, grab WARC. All can be built from WARC.
[21:45] In an ideal world, you grab WARC and you grab some easier-to-use contemporary format
[21:45] I can do that, assuming there is some Python code to write WARCs. I don't want to make a new HTTP connection for each request.
[21:46] In a super ideal world, and I mean we're talking two unicorns going down on you, you are building a separate new cross-platform interface to the data.
[21:46] But it all flows from those WARCs.
[21:46] okay
[21:46] Another angle: Internet Archive wants WARCs; they tolerate other formats because they like us
[21:46] :D
[21:46] In return, essentially infinite disk space
[21:47] Also, fair warning
[21:47] changed my diet this weekend
[21:47] I may be rather insane for a week
[21:48] https://github.com/internetarchive/warc - well, that's handy
[21:48] SketchCow: you might want to pop into #preposterus
[21:48] Vincent is around.
[21:53] yeah, saw.
[21:56] I have to let alard make the decisions on this one.
[21:58] anyone recognize these? http://archive.org/details/rescuecrawl
[21:59] They're both from underscor
[21:59] Interesting. An attempt to have less of a "You're in Archive Team" feel and more of an "Oh, if you have it."
[22:00] did you see that my nwnet upload is fully uploaded?
[22:01] heh, I can probably use wget-lua to decode the json and grab the continuation URL
[22:01] Hey, Punchfork guy, shut up
[22:01] That's good news.
[22:01] :D
[22:02] I am going to make the description on the front of archiveteam's collection on archive.org nicer too
[22:02] http://archive.org/search.php?query=uploader%3A%22djsmiley2k%40gmail.com%22&sort=-publicdate << most of the IGN/Gamespy stuff I've done
[22:02] So anything tagged archiveteam needs to go into a collection at some point.
[22:02] It's a pita to write the metadata for each one, so I scripted most of it....
[22:10] so, here's another item: https://archive.org/details/TouchatagWarc.warc - a site grab I did a while ago
[22:34] i'm going to reboot soon
[22:34] there are missing pages in both of my PC Novice scans
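A minimal sketch of the Reader grab discussed at [21:33]-[21:36]: fetch a feed's stream contents, follow continuation tokens, and fold everything into one JSON blob, reusing a single HTTP connection as asked for at [21:45]. It assumes the endpoint takes a URL-encoded feed URL plus n (batch size) and c (continuation) parameters and returns a "continuation" key while more batches remain; the WARC-writing half (e.g. via the internetarchive/warc library linked above) is left out:

```python
# Hedged sketch of the /reader/api/0/stream/contents/feed/ grab.
# Endpoint parameters (n=, c=) and the "continuation" response key are
# assumptions from the discussion above, not a verified spec.
import json
import urllib.parse

import requests

API = "http://www.google.com/reader/api/0/stream/contents/feed/"

def grab_feed(feed_url, batch=1000):
    """Fetch every batch for one feed and merge the continuation-responses."""
    session = requests.Session()  # keep-alive: no new connection per request
    items, continuation = [], None
    while True:
        params = {"n": batch}
        if continuation:
            params["c"] = continuation  # resume where the last batch ended
        resp = session.get(API + urllib.parse.quote(feed_url, safe=""),
                           params=params, timeout=60)
        resp.raise_for_status()
        data = resp.json()
        items.extend(data.get("items", []))
        continuation = data.get("continuation")
        if not continuation:  # no token means we have the whole stream
            return {"feed": feed_url, "items": items}

if __name__ == "__main__":
    # Hypothetical example feed; output is the "big json blob" from [21:36].
    blob = grab_feed("http://googleblog.blogspot.com/feeds/posts/default")
    with open("reader_dump.json", "w") as f:
        json.dump(blob, f)
```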