[00:00] Wow, something is demolishing the system's IO
[00:04] Here we go, this will either add tons of manuals... or make a huge mess
[00:04] HUGE! MESS! HUGE! MESS!
[00:05] haha
[00:06] Oh god, it is working.
[00:07] It's yanking each down, then blowing into archive.org
[00:07] Pure evil
[00:07] http://www.archive.org/search.php?query=collection%3Adec-manuals&sort=-publicdate
[00:07] watch it populate
[00:08] whoomph whoomph whoomph
[00:13] It's working, and it's gone through a lot of them already.
[00:13] I should be doing more local stuff here, though.
[00:13] So I am not adding your descriptions until later, winr4r
[00:13] that's fine!
[00:13] I need to complete cleaning my room and packing for 9 days of CA
[00:13] you're off tomorrow, right?
[00:13] yes, that
[00:14] http://www.archive.org/details/dec-Alphaserver800_sum
[00:14] yay
[00:15] :D
[00:32] californiaye
[00:38] http://www.archive.org/search.php?query=collection%3Adec-manuals&sort=-publicdate is populating
[00:39] computer history museum should setup a kiosk that pipes that stuff
[00:39] while they have some manuals on display you can thumb thru, this would be much better/more complete
[01:21] I agree
[01:21] We'll see how THAT goes
[01:40] Warez Scene Notice Collection (2006-2010)
[01:40] is that a complete collection ?
[01:41] I'm pretty sure warez has been going for longer than 2006 :P
[01:41] for notices in the time frame
[02:10] * lemonkey chuckles
[02:17] http://blogs.discovermagazine.com/80beats/2011/10/12/women-on-the-pill-may-choose-reliable-over-sexy-study-suggests/
[02:17] wrong window ignore :)
[02:21] I doubt it's a complete collection. It's someone's collection I was sent.
[02:23] i'll keep my eye out for an archive then
[02:27] http://scenenotice.org/index.php
[02:27] has a decent collection
[02:27] but there will most likely be dupes with the uploaded pack
[02:28] Oh no, dupes
[02:28] and it's missing all the .rars
[02:28] * SketchCow gets the gascan
[02:30] http://www.archive.org/details/synthmanuals-propellerhead
[02:36] SketchCow: thanks for tweeting the geeks on board vimeo link
[02:41] what other scene stuff are you interested in getting your hands on SketchCow ?
[02:53] This and that
[03:20] SketchCow: you could pull in www.bitsavers.org as well... lots of manuals there too
[03:20] earthquake in SF
[03:20] 4.2 same as earlier
[03:22] Dark_Star: Yeah I should get on that
[03:23] http://www.archive.org/details/bitsavers
[03:23] Oh wait
[03:23] http://www.textfiles.com/bitsavers/
[03:24] heh okay :)
[03:38] lemonkey: yeah, nice roll to that one
[03:52] no earthquake in seattle
[03:52] I'm feeling left out
[05:00] * BlueMax grabs chronomex and shakes him around a little
[05:01] on my birthday!
[05:01] :P
[05:01] * BlueMax puts a party hat on
[05:02] What, no hookers?
[05:02] I thought SketchCow would have ordered them ages ago
[05:02] I am celebrating my birthday with programming
[05:04] print("Happy Birthday To Me")
[05:04] oslt
[05:04] I'M NOT A PROGRAMMER
[05:06] I tried learning Java once and it went right over my head, same with C++
[05:07] m0lson, BlueMax: I have a predb dump that covers at least from 1996 to 2007...
[05:08] 10 PRINT "HAPPY BIRTHDAY"
[05:09] 20 PRINT "HOOKERS ON THE WAY"
[05:50] 126G YV-6550032-6573504
[05:50] 238G YV-6500014-6549992
[05:50] root@teamarchive-0:/2/FTP/sushi/4363/6500000/videos# du -sh *
[05:50] Just big numbers all around
[05:50] hm...
[05:51] Coming along nicely. Packing? Not so nicely.
[05:51] Editing is going well too.
[05:51] I want to send this back to the director tonight.
[05:52] packing? what is the destination?
[05:52] I improved the film, but I'm pulling it out before it gets to become basically my movie
[05:52] California, SF. 9 days.
[05:52] ah
[05:52] Did you just mail me?
[05:52] my drive should be arriving at IA tuesday
[05:52] yeah
[05:52] You kind of have a fast track here.
[05:53] E-mail me an address, I'll send the BBS Doc.
[05:54] wasn't sure if you were around. I suppose I could have asked
[05:54] or noticed my IRC window sooner
[05:59] Addressed, done
[06:00] it appears friendster.014400000-014499999.tar.xz has finished
[06:00] do you want get lamp back?
[06:06] Nahhh
[06:07] m0lson, BlueMax: database and a small update: http://www.megaupload.com/?d=56GUP3DS
[06:29] -----
[06:29] I was asked what we're doing about Google Buzz material. Anyone feel like looking at it?
[06:29] -----
[07:35] SketchCow: just a heads up, there's a lot of misinformation out there about the shutdown
[07:37] for example, google reader is getting a lot of its social stuff merged in
[07:40] but the current comment data may or may not, nobody knows
[07:54] That
[07:54] dam, thanks for the dump Coderjoe
[07:54] 's why he asked
[07:54] ersi: and i'm saying, not even people from google seem to know
[07:54] those are the original files i downloaded
[07:55] We're the kind of people who don't trust other people with data
[07:55] You know how the saying goes? Better safe(r) than sorry
[07:56] when the other people are going "i unno" and shrugging, you better not trust them
[07:58] And my point was that, indifferent to what people are going, you better not trust them.
[07:59] Hm, is most buzz material private to the posters' friends? How would one find people who've publicly shared?
[08:22] the first pre in that DB is from 1980-01-01 06:00:36
[08:22] so you have 1980 to 2007
[08:23] which is pretty awesome
[09:13] wiki is off
[09:15] Seems to be the MySQL database that isn't responding.
[09:15] I'd just wait, it'll probably come back up. It's run at a hosting company.
[09:17] I remember when I went through every page of that wiki and cleaned it up slightly
[09:17] I don't know wtf I was on
[09:17] "Spring cleaning syndrome"
[09:18] ick
[09:18] So what's been going on around here lately
[09:19] SketchCow's busy ingesting all our pirate booty into The Archives
[09:20] we've been talking/thinking about Google Buzz which is closing
[09:20] I heard about that
[09:20] If you need help with it let me know, I'll be around
[09:39] won't buzz stuff get migrated into +?
[09:40] oh, apparently not
[09:55] Who knows.
[12:44] hey
[12:45] http://go.to/ and all of its subdomains are apparently listed for sale on Sedo
[12:45] just saying
[12:55] aw man
[12:55] fucking shorteners
[12:57] should i add it to the wiki
[12:57] i'm cleaning it up right now
[13:01] sure
[13:01] http://archiveteam.org/index.php?title=Deathwatch <- done
[13:01] now adding go.to
[13:02] add it to the URL shortener list as well, if you got the time
[13:04] sure
[13:04] yeah
[13:04] i have lots of time
[13:05] added to deathwatch
[13:06] not sure where to add it in the URL shortener list... oh well
[13:06] about go.to, it has a format where no random URLs happen
[13:06] only self-specified ones
[13:06] so the only way to scrape it is by google queries
[13:09] i could try sending an email
[13:09] it's owned by myphotoalbum.com
[13:10] http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=site%3Ago.to
[13:10] yup, google gives about 41000 results
[13:10] now to find an API
[13:12] It's at http://www.archiveteam.org/index.php?title=Urlteam
[13:12] found it, updated
[13:14] "The API provides 100 search queries per day for free."
[13:14] and that's why i'll have to scrape google manually
[13:14] ah, nice info
[13:14] 100 search queries * up to 10 results / query
[13:14] 1000 results / day.
BOO
[13:20] hooray
[13:20] got it to filter links
[13:21] yup, works
[13:22] now i only need to make an automated app that takes the results
[13:22] "Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 40500.)".
[13:22] le fu
[13:57] it would probably be easier to bribe someone working in google
[13:59] what are you trying to scrape?
[14:04] go.to
[14:04] no codes
[14:04] only names
[14:04] i'm scraping go.to because all of its domains are for sale on Sedo
[14:04] all of them
[14:04] also one more thing to consider
[14:04] minecraft classic
[14:04] it had a map saving system which is apparently pretty faily nowadays
[14:04] and may be down
[14:06] and apparently is fully dead by now
[14:06] i have a backup of about 200 users' maps
[14:06] 200 fully random users
[14:06] i'd say it's about 3-4% of the total maps
[14:07] i'll locate and find it
[14:11] i'd recommend backing up minecraft versions too
[14:12] but notch himself stopped someone from doing it
[14:14] because notch is a fucking douche
[14:14] asiekier_, there are patchers that contain past versions
[14:14] WHATS FUCKING WRONG WITH OPTIONS, INSTEAD OF CHANGING THE WHOLE GAME
[14:14] ass
[14:14]
[14:14] Cameron_D thanks
[14:14] ersi i'm trying to find that save backup
[14:15] Um, yeah - I'm not ranting about that
[14:15] I meant that he changes the way shit works :( without any way to flip it back to that..
[14:15] but i'm worried it's gone
[14:17] i think it was on my old HDD - which is dead
[14:17] it was also on one of my freehosted websites - which is gone
[16:01] right
[16:01] coded a script for google-fuing links
[16:12] ok, running it
[16:19] did i just...
[16:19] HTTP request failed! HTTP/1.0 503 Service Unavailable
[16:26] added a 10-second delay
[16:27] whatever, 5 seconds instead
[16:27] should make it fast enough
[16:27] or not, it is fast enough...
just under 2 hours
[21:32] Google bans you quite fast, but they also unban you automatically after a while. I think ndurner was using 2 to 20 seconds delay between requests, plus exponential (2, 4, 8, 16, 32, ...) backoff when he got blocked for his google groups downloader
[21:34] ipv6!
[21:54] hi guys, there was a request in this article: http://libregraphicsworld.org/blog/entry/guitar-samples-in-gig-format-from-flame-studio-collection-shared for the collection being shared to get hosted on archive.org
[22:20] This connection is slow.
[22:20] I said, THIS CONNECTION FROM A PLANE TO IRC IS SLOW
[22:20] Someone get on it.
[22:20] a connection from a plane is slow? say it ain't so...
[22:22] WTF THE FUCK WHERE IS MY JETBACK
[22:22] Yes, that's right, I said WTF THhe Fuck
[22:23] I ordered some food, so if that comes, I'll be idle.
[22:23] I'm just doing some cleanups here and there, getting some mail out, flying in a plane.
[22:23] All the synths went up, lots of manuals, and it was so simple
[22:24] After a while it was no keypresses. 80 manuals went in with no intervention on my part.
[22:24] The scripts are getting better, although like most macros there's still a lot of customization on the front end.
[22:24] latency is high or throughput is low?
[22:26] Latency is pretty unpredictable from the plane.
[22:26] Sometimes fast, sometimes slow.
[22:26] interesting
[22:27] bufferbloat?
[22:27] that can cause latency spikes
[22:27] lag bubbles
[22:27] I think the plane is just using packets and is sending stuff along a radio channel.
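A minimal Python sketch of the approach described at 21:32: a randomized 2 to 20 second pause between requests, plus exponential backoff (2, 4, 8, 16, 32, ...) when a 503 comes back. This is not ndurner's actual downloader or the go.to script from earlier. The `fetch` callable, the function names, and the retry cap are all assumptions made up for illustration.

```python
import random
import time
from typing import Callable, Iterator, Tuple


def backoff_delays(base: float = 2.0, factor: float = 2.0,
                   cap: float = 600.0) -> Iterator[float]:
    """Yield 2, 4, 8, 16, 32, ... seconds, never more than `cap` (assumed limit)."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def polite_pause(low: float = 2.0, high: float = 20.0) -> float:
    """Pick a random pause between ordinary requests (the 2 to 20 second range)."""
    return random.uniform(low, high)


def fetch_with_backoff(fetch: Callable[[str], Tuple[int, str]], url: str,
                       max_retries: int = 6,
                       sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call the hypothetical fetch(url) -> (status, body), backing off on 503."""
    delays = backoff_delays()
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status != 503:
            return status, body
        if attempt < max_retries:
            # Blocked: wait 2s, then 4s, then 8s, ... before trying again.
            sleep(next(delays))
    raise RuntimeError(f"still blocked after {max_retries} retries: {url}")
```

Between successful requests the caller would do `time.sleep(polite_pause())`. Passing `sleep` in as a parameter keeps the retry logic testable without actually waiting.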
[22:27] well, internet traffic is packetized
[22:27] the internet wouldn't work otherwise
[22:28] I mean really encapsulated bursts of packets, Vinton
[22:28] but yes, probably the radio link is dropping some percentage of those packets and not telling you
[22:28] even wifi does that
[22:28] it's really annoying
[22:28] An archivist named Jenny asked if she could help, she's a digital preservation and records person, I suggested she go through our stuff, look for gaps or stuff we don't know about.
[22:29] A la what we got with our WARC format
[22:29] shiny
[22:31] I'll open another window and get another few terabytes uploaded.
[22:31] While on the plane. That'll be efficient.
[22:31] This time I have a car in SF
[22:31] And intend to use it.
[22:31] See people, do stuff.
[22:37] Hi. I saw 'warc', so I'll just jump in for a short note: I've made a new version of wget-warc, one that doesn't use the warctools library. It's much smaller, so I hope it has a better chance of being included in wget. It would be nice if you wouldn't have to install it as a separate extension. I just mailed the new version to the wget mailing list, so I'll see what their reaction is.
[22:44] how does it handle the warc stuff, custom code?
[23:06] * lemonkey waves to SketchCow in the sky over SF
[23:09] what did Google do or not do recently that moved them to untrustworthy for data?
[23:17] SketchCow: I unified the poetry.com stuff that you uploaded to archive.org
[23:17] Chances are very good this spontaneous reader upgrade will demolish comments and social features.
[23:17] That's the last last straw, but the murder of Wave after killing Etherpad, the death of buzz, the removal of google groups and those hundreds of gigabytes of files...
[23:18] The fact that they kill products over time, often in a 3-4 but occasionally 1-2 year lifespan
[23:18] This is all adding up very, very poorly.
[23:18] Brad is of course working on them having export functions everywhere, but regardless.
[23:18] SketchCow: but it doesn't include a lot of things from http://archiveteam.org/archives/.lulupoetry/
[23:19] db48x: Throw them in!
[23:19] ok
[23:20] I can swap the thing with the new pieces
[23:21] I've been working so hard on the archive.org stuff from batcave I haven't even looked at the archives.
[23:21] So that's why that was that.
[23:21] I've got several .tar processing jobs going as we speak. Easily 2 tb of Yahoo Video.
[23:25] oh, good
[23:25] I've already downloaded everything in .lulupoetry
[23:32] It wasn't much
[23:32] It compresses very well.
[23:32] 2 gigs :)
[23:33] See
[23:47] http://www.wired.com/underwire/2011/10/9-essential-geek-books/?pid=5167&viewall=true here's the first in a series
[23:54] http://twitter.com/leighalexander/status/127531825311653888
[23:54] Either I won there or I placed well.
[23:54] I need a command line program that does html entity substitution
[23:54] &quot; to ", etc
[23:55] There MUST be some crap for that in perl.
[23:55] MUST be
[23:55] yea
[23:55] Geeks are GENETICALLY DESIGNED to do that shit in perl
[23:55] heh
[23:55] the rest of the script is in shell though
[23:55] there needs to be a utility that does this
[23:55] lol
[23:56] I mean, we have sha1, base64, uuencode, etc
[23:56] Anywhere there's a list of possible responses to an incoming stream of text that has to follow an arbitrary and bizarre set of consistent standards, you can bet there's some perl in CPAN's asscrack that does it.
[23:56] oh, yea
[23:56] CPAN has everything
[23:56] So have the shell call a perl statement.
[23:56] that's the purpose of cpan
[23:56] Do it all the time.
[23:57]
[23:57] She has blonde hair, beautiful
[23:57] True Love by Brian E Hewins
[23:57] [db48x@celebdil unified]$ cat 003/000/000/003000000.txt
[23:57] I am well on my way to becoming archive.org's top uploader next to possibly prelinger.
[23:57] brown eyes, and endless love.
[23:57] [...]
[23:58] \Her name is "Cindy", and she is the love of my life;
[23:58] truly my best friend, my collie
[23:58] I accidentally / the whole goddamn poetry / and I am quite sad
[23:58] HAIKU BACK AT U
[23:58] heh
[23:59] hrm
[23:59] the three tarballs you posted to archive.org have 264113 poems in them
[23:59] 264113 < 14x10^6
[23:59] db48x: i can make you one real quick if you like
[23:59] i wrote most of it for work
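For the HTML entity substitution utility asked for at 23:54, here is a minimal sketch in Python rather than the perl suggested in the channel. It is not the tool offered at 23:59. It relies on `html.unescape` from the standard library. The script name `unescape.py`, the function names, and the convention of passing `-` to read standard input are assumptions for illustration.

```python
import html
import sys
from typing import List


def decode_entities(text: str) -> str:
    """Replace HTML entities (&quot;, &amp;, &#39;, ...) with their characters."""
    return html.unescape(text)


def main(argv: List[str]) -> None:
    """Decode each named file to stdout; the argument '-' means read stdin."""
    for name in argv:
        if name == "-":
            for line in sys.stdin:
                sys.stdout.write(decode_entities(line))
        else:
            with open(name, encoding="utf-8") as handle:
                for line in handle:
                    sys.stdout.write(decode_entities(line))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Since the rest of the script is in shell, it could be called as a filter, for example `python3 unescape.py - < 003000000.txt`. Decoding line by line assumes no entity is split across a line break.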