#archiveteam 2013-05-05,Sun

Time Nickname Message
00:49 🔗 dashcloud so, this is really awesome: http://www.chrismcovell.com/TapeDump_Controls.html Dumping your NES cartridges using the program which you've got onto the NES using a dev cart, like the PowerPak, and then use the NES's audio to dump your cart
03:49 🔗 GLaDOS WOO NEWS UPDATES
04:26 🔗 chronomex WOOO
04:26 🔗 chronomex about what
04:39 🔗 BlueMax WOOOO
05:03 🔗 SketchCow WOOOOOOo
05:38 🔗 godane can anyone explain why wayback takes so long: http://web.archive.org/web/20040430055952/http://www.techtv.com/callforhelp/shownotes/story/0,24330,3674545,00.html
05:38 🔗 godane anything from i want to say nov 2003 to very end of techtv.com domain is very slow
05:38 🔗 godane in wayback
05:56 🔗 godane ok now it's taking 2+ min just to get to a page
05:56 🔗 godane what the hell
08:40 🔗 godane i figured a way around the slowness of wayback machine
08:43 🔗 godane looks like if i drop the l in html it will display the page in 10 to 15 seconds, when it tries to look at g4tv.com/vidoes
09:01 🔗 nwh heads up, the Posterous tracker needs to be stopped
09:01 🔗 nwh the pages are now just returning a "we're gone" page, any further archiving is useless
09:03 🔗 godane so how much did we get of posterous?
09:04 🔗 nwh the tracker says 5265044 of 887636 done, but I don't know how long people have been returning junk data
09:04 🔗 nwh so we got a very large portion, but not all
09:04 🔗 godane that sucks
09:05 🔗 nwh we would have got it all if not for all the spam
09:05 🔗 nwh almost all of the pages on posterous were markov-chain generated
09:09 🔗 GLaDOS We're still going with posterous..
09:10 🔗 nwh the pages are useless though
09:11 🔗 nwh they just redirect to https://molliemax.posterous.com/bye.html
09:11 🔗 godane did we get all of streetfiles?
09:11 🔗 GLaDOS We have a backdoor..
09:11 🔗 Cameron_D don't think so
09:12 🔗 nwh GLaDOS: ah! you legend :)
09:12 🔗 Cameron_D we got a lot of it, but not all
09:12 🔗 GLaDOS And I think streetfiles collapsed under the stress.
09:12 🔗 nwh ArchiveTeam needs a bat signal
09:13 🔗 nwh a call to arms, or shells, or whatever
09:16 🔗 Cameron_D oh the getfileN.posterous.com domains are down :c
09:19 🔗 * nwh nods
09:23 🔗 GLaDOS https://groups.google.com/forum/?fromgroups#!forum/archive-team
09:23 🔗 GLaDOS There's the batsignal!
09:24 🔗 nwh no posts since 2012 though
09:26 🔗 DFJustin tweeting @archiveteam works
09:26 🔗 alard getfile2.posterous.com works for me.
09:26 🔗 alard getfile4, sorry.
09:28 🔗 alard Oh, wait, it works with a normal UA, but not with ours.
09:31 🔗 GLaDOS ..huh
09:34 🔗 nwh DFJustin: they don't tweet new projects though ;)
09:37 🔗 GLaDOS nwh: #archiveteam-twitter
09:38 🔗 GLaDOS swebb outputs all tweets directed at @archiveteam, that contain a link to archiveteam.org, or contains the words archive archive team
09:39 🔗 nwh finnnneee.
13:27 🔗 SketchCow OK, go go go day
13:27 🔗 SketchCow <3
13:30 🔗 flaushy godane: no, we didn't get all streetfiles :/
13:30 🔗 SketchCow I see that the Archive Team wiki has had no spam for a week plus now!
13:30 🔗 SketchCow We had NO warning on streetfiles.
13:30 🔗 SketchCow My buddy tweeted me with less than 48 hours to go.
13:30 🔗 SketchCow We're not magicians.
13:31 🔗 SketchCow Also, the closure appears to be the result of a central political disagreement between admins.
13:31 🔗 SketchCow Because someone has http://streetfil.es/ and seems to be claiming they're bringing it back.
13:32 🔗 flaushy right, i think we did a good job on streetfiles given that it didn't take the load too well
13:43 🔗 AKKuhn sorry we're dead? really?
13:45 🔗 SketchCow ?
13:45 🔗 AKKuhn streetfiles. that's their page now
13:45 🔗 AKKuhn oh, sorry we're dead.
13:45 🔗 GLaDOS http://streetfiles.org/
13:46 🔗 SketchCow Oh, yes.
13:46 🔗 SketchCow Well, like I said, I'm about 90% sure this was internal politics.
13:46 🔗 SketchCow This was a total volunteer thing, and the community hit a breaking point.
13:47 🔗 SketchCow So this sort of petulant "bye dudes" is expected
13:47 🔗 AKKuhn so i've been asleep, did anyone grab all that AOL music stuff?
13:48 🔗 SketchCow Not here
13:48 🔗 AKKuhn i mean sure, it mostly looks like regurged press releases and all
13:48 🔗 AKKuhn but still
13:54 🔗 SketchCow Mo Posterous, Mo Problems
14:01 🔗 AKKuhn who backed up who not? who got dropped? which lot, sketchcow wit his name on the blimp.
14:03 🔗 AKKuhn FYI for all - Mid Atlantic Retro Computing Hobbyists appear to be running an Open House Workshop/Swap Meet Saturday May 18th and Sunday May 19th, I'm presuming at InfoAge in Wall, NJ
17:43 🔗 dashcloud so, to upload my nwnet.co.uk site dumps using the web uploader, what should I fill in for the metadata?
17:53 🔗 dashcloud also, should I just upload the WARCs or the WARCs+ the regular downloaded files?
17:55 🔗 Smiley warc plus cdx
18:02 🔗 dashcloud what metadata should I fill in?
18:03 🔗 Smiley no idea
18:03 🔗 Smiley tag with archiveteam
18:10 🔗 dashcloud so, I just started uploading and it's telling me there's a network problem- I'm pretty sure I'm not having network problems, so how do I fix this issue?
18:12 🔗 omf_ also tag with webcrawl
18:15 🔗 dashcloud I can't because I tried starting the upload and now it's stuck on There is a network problem
18:19 🔗 godane so i have found about 4600+ lost techtv videos
18:21 🔗 dashcloud ones you haven't found before, or total?
18:21 🔗 godane these are the missing id ones
19:12 🔗 dashcloud so, what collection do I put my upload into? community texts?
19:17 🔗 Nemo_bis yes
19:17 🔗 Nemo_bis godane: have you seen my link?
19:17 🔗 DFJustin uploading cdx is unnecessary because IA derives one
19:18 🔗 SketchCow Right
19:18 🔗 godane yes
19:18 🔗 SketchCow I should really make a collection for people to shove stuff into that can then be pulled over.
19:18 🔗 godane i'm too busy to grab that right now
19:18 🔗 SketchCow The problem is it's not easy to declare things web
19:18 🔗 DFJustin yeah there's a lot of warcs floating around outside the collection at the moment
19:19 🔗 Nemo_bis when/at what conditions are warcs entered into the wayback machine, btw?
19:23 🔗 dashcloud so, here's where my nwnet.co.uk grab is/will be located: http://archive.org/details/Nwnet.co.ukWebGrab
19:23 🔗 SketchCow Let me know when it is uploaded.
19:23 🔗 SketchCow Brewster made a quiet request to clean up collections and make them prettier.
19:24 🔗 SketchCow Standard make-it-work stuff. I'll be doing that soon and others are too.
19:24 🔗 SketchCow I mean, it's kind of my job
19:24 🔗 dashcloud it is uploaded I think- it says after a few minutes I should be able to refresh the details page and see it
19:33 🔗 dashcloud so it should be there now- I can see files
19:47 🔗 dashcloud I was looking through my other items on the external I currently have attached, and there's a single one for lightning.prohosting.com - is there a collection somewhere for that?
20:21 🔗 alard SketchCow: There's an Upcoming collection that you could make.
20:21 🔗 alard Everything is uploaded for that one.
20:26 🔗 godane i just found out that my firefly mosquito parody is in computertechvideos
20:27 🔗 godane :P
21:13 🔗 Nemo_bis SketchCow: if you're in cleanup mood there are some items to move to wikiteam collection here https://archive.org/search.php?query=wikiteam%20AND%20NOT%20collection%3Awikiteam%20AND%20collection%3Aopensource
21:20 🔗 ivan` "This week's Google Reader announcement has no impact on the Feed API. As in the past, we will continue to post explicitly whenever we have news about changes to this or other APIs."
21:20 🔗 ivan` (I think I should start backing it up anyway though)
21:22 🔗 ivan` it is very simple, there is no login required, just need to find every RSS/atom feed that exists ;)
21:25 🔗 SketchCow Nemo_bis: I'm doing that, but I have to do it a slightly more annoying way
21:25 🔗 SketchCow Takes longer, but it's more it takes longer to search and what do I care
21:29 🔗 ivan` WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
21:30 🔗 chronomex yahoosucks
21:30 🔗 ivan` thanks :)
21:32 🔗 SketchCow IF THEE BE A SPAMMER SUCH SADNESS THEE WILL KNOW
21:33 🔗 ivan` I'm going to write a proper tool that grabs the /reader/api/0/stream/contents/feed/ things, should it be writing warc files?
21:33 🔗 Smiley SketchCow: I have a number of warc's to be moved to some collection or other.
21:36 🔗 ivan` I'm not sure anyone will want the warc files as opposed to a big json blob that combines all the continuation-responses
21:37 🔗 ivan` and nobody's going to be typing http://www.google.com/reader/api/0/stream/contents/feed/http%3A%2F%2Fgoogleblog.blogspot.com%2Ffeeds... into web.archive.org
21:39 🔗 chronomex heh
21:39 🔗 ivan` perhaps it might be helpful to generate warc files with faked URLs that correspond to the post URLs in the feed data, though
21:43 🔗 ivan` WARC-Made-Of-Lies: 1
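
A minimal sketch of the two ideas ivan` describes above: building the Reader stream URL from a feed URL, and folding several continuation responses into one JSON blob. The URL prefix and the percent-encoded form come from the URL quoted in the log. The function names and the response shape (a dict with an "items" list) are assumptions for illustration, not a confirmed description of the API.

```python
import json
from urllib.parse import quote

# Prefix taken from the URL quoted in the log.
READER_PREFIX = "http://www.google.com/reader/api/0/stream/contents/feed/"


def reader_stream_url(feed_url):
    """Build the stream URL for a feed (hypothetical helper).

    safe="" makes quote() percent-encode ':' and '/' as well,
    matching the http%3A%2F%2F... form seen in the log.
    """
    return READER_PREFIX + quote(feed_url, safe="")


def merge_pages(pages):
    """Combine continuation responses into one dict (hypothetical helper).

    Assumption: each page is a decoded JSON object whose posts
    live in an "items" list. Pages are merged in the order given.
    """
    merged = {"items": []}
    for page in pages:
        merged["items"].extend(page.get("items", []))
    return merged


if __name__ == "__main__":
    print(reader_stream_url("http://googleblog.blogspot.com/feeds"))
    # Made-up sample pages, only to show the merge.
    sample = [{"items": [{"id": "a"}]}, {"items": [{"id": "b"}]}]
    print(json.dumps(merge_pages(sample)))
```

This only covers the "big json blob" side of the discussion. It does not write WARC records, so it would sit alongside a WARC grab rather than replace one.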
21:44 🔗 SketchCow Nemo_bis: Done
21:44 🔗 SketchCow So
21:45 🔗 SketchCow Here's the philosophy, ivan`.
21:45 🔗 SketchCow If nothing else is to be grabbed, grab WARC. All can be built from WARC.
21:45 🔗 SketchCow In an ideal world, you grab WARC and you grab some easier-to-use contemporary format
21:45 🔗 ivan` I can do that. assuming there is some Python code to write WARC. I don't want to make a new HTTP connection for each request.
21:46 🔗 SketchCow In super ideal world, and I mean we're talking two unicorns going down on you, you are building a separate new cross-platform interface to the data.
21:46 🔗 SketchCow But it all flows from those WARCs.
21:46 🔗 ivan` okay
21:46 🔗 SketchCow Another angle: Internet Archive wants WARCs, they tolerate other formats because they like us
21:46 🔗 Smiley :D
21:46 🔗 SketchCow In return, essentially infinite disk space
21:47 🔗 SketchCow Also, fair warning
21:47 🔗 SketchCow changed my diet this weekend
21:47 🔗 SketchCow I may be rather insane for a week
21:48 🔗 ivan` https://github.com/internetarchive/warc well that's handy
21:48 🔗 Smiley SketchCow: you might want to pop in #preposterus
21:48 🔗 Smiley Vincent is around.
21:53 🔗 SketchCow yeah, saw.
21:56 🔗 SketchCow I have to let alard make the decisions on this one.
21:58 🔗 dashcloud anyone recognize these? http://archive.org/details/rescuecrawl
21:59 🔗 SketchCow They're both from underscor
21:59 🔗 SketchCow Interesting. An attempt to have a less "You're in Archive Team" and more "Oh, if you have it."
22:00 🔗 dashcloud did you see that my nwnet upload is fully uploaded?
22:01 🔗 ivan` heh, I can probably use wget-lua to decode the json and grab the continuation URL
22:01 🔗 SketchCow Hey, Punchfork guy shut up
22:01 🔗 SketchCow That's good news.
22:01 🔗 Smiley :D
22:02 🔗 SketchCow I am going to make the description on the front of archiveteam's collection on archive.org nicer too
22:02 🔗 Smiley http://archive.org/search.php?query=uploader%3A%22djsmiley2k%40gmail.com%22&sort=-publicdate << most of the IGN/Gamespy stuff I've done
22:02 🔗 Smiley So anything tagged archiveteam needs to go into a collection at some point.
22:02 🔗 Smiley It's a pita to write the metadata for each one, so I scripted most of it....
22:10 🔗 dashcloud so, here's another item: https://archive.org/details/TouchatagWarc.warc - a site grab I did a while ago
22:34 🔗 godane i'm going to reboot soon
22:34 🔗 godane missing pages in both of my pc novice scans