#archiveteam 2011-10-21,Fri


Time Nickname Message
00:00 🔗 SketchCow Wow, something is demolishing the system's IO
00:04 🔗 SketchCow Here we go, this will either add tons of manuals... or make a huge mess
00:04 🔗 SketchCow HUGE! MESS! HUGE! MESS!
00:05 🔗 winr4r haha
00:06 🔗 SketchCow Oh god, it is working.
00:07 🔗 SketchCow It's yanking each down, then blowing into archive.org
00:07 🔗 SketchCow Pure evil
00:07 🔗 SketchCow http://www.archive.org/search.php?query=collection%3Adec-manuals&sort=-publicdate
00:07 🔗 SketchCow watch it populate
00:08 🔗 chronomex whoomph whoomph whoomph
00:13 🔗 SketchCow It's working, and it's gone through a lot of them already.
00:13 🔗 SketchCow I should be doing more local stuff here, though.
00:13 🔗 SketchCow So I am not adding your descriptions until later, winr4r
00:13 🔗 winr4r that's fine!
00:13 🔗 SketchCow I need to complete cleaning my room and packing for 9 days of CA
00:13 🔗 winr4r you're off tomorrow, right?
00:13 🔗 winr4r yes, that
00:14 🔗 SketchCow http://www.archive.org/details/dec-Alphaserver800_sum
00:14 🔗 SketchCow yay
00:15 🔗 winr4r :D
00:32 🔗 lemonkey californiaye
00:38 🔗 SketchCow http://www.archive.org/search.php?query=collection%3Adec-manuals&sort=-publicdate is populating
00:39 🔗 lemonkey computer history museum should setup a kiosk that pipes that stuff
00:39 🔗 lemonkey while they have some manuals on display you can thumb thru, this would be much better/more complete
01:21 🔗 SketchCow I agree
01:21 🔗 SketchCow We'll see how THAT goes
01:40 🔗 m0lson Warez Scene Notice Collection (2006-2010)
01:40 🔗 m0lson is that a complete collection ?
01:41 🔗 BlueMax I'm pretty sure warez has been going for longer than 2006 :P
01:41 🔗 m0lson for notices in the time frame
02:10 🔗 * lemonkey chuckles
02:17 🔗 lemonkey http://blogs.discovermagazine.com/80beats/2011/10/12/women-on-the-pill-may-choose-reliable-over-sexy-study-suggests/
02:17 🔗 lemonkey wrong window ignore :)
02:21 🔗 SketchCow I doubt it's a complete collection. It's someone's collection I was sent.
02:23 🔗 m0lson i'll keep my eye out for an archive then
02:27 🔗 m0lson http://scenenotice.org/index.php
02:27 🔗 m0lson has a decent collection
02:27 🔗 m0lson but there will most likely be dupes with the uploaded pack
02:28 🔗 SketchCow Oh no, dupes
02:28 🔗 m0lson and it's missing all the .rars
02:28 🔗 * SketchCow gets the gascan
02:30 🔗 SketchCow http://www.archive.org/details/synthmanuals-propellerhead
02:36 🔗 lemonkey SketchCow: thanks for tweeting the geeks on board vimeo link
02:41 🔗 m0lson what other scene stuff are you interested in getting your hands on, SketchCow?
02:53 🔗 SketchCow This and that
03:20 🔗 Dark_Star SketchCow: you could pull in www.bitsavers.org as well... lots of manuals there too
03:20 🔗 lemonkey earthquake in SF
03:20 🔗 lemonkey 4.2 same as earlier
03:22 🔗 SketchCow Dark_Star: Yeah I should get on that
03:23 🔗 SketchCow http://www.archive.org/details/bitsavers
03:23 🔗 SketchCow Oh wait
03:23 🔗 m0lson http://www.textfiles.com/bitsavers/
03:24 🔗 Dark_Star heh okay :)
03:38 🔗 closure lemonkey: yeah, nice roll to that one
03:52 🔗 chronomex no earthquake in seattle
03:52 🔗 chronomex I'm feeling left out
05:00 🔗 * BlueMax grabs chronomex and shakes him around a little
05:01 🔗 chronomex on my birthday!
05:01 🔗 chronomex :P
05:01 🔗 * BlueMax puts a party hat on
05:02 🔗 BlueMax What, no hookers?
05:02 🔗 BlueMax I thought SketchCow would have ordered them ages ago
05:02 🔗 chronomex I am celebrating my birthday with programming
05:04 🔗 BlueMax print("Happy Birthday To Me")
05:04 🔗 BlueMax oslt
05:04 🔗 BlueMax I'M NOT A PROGRAMMER
05:06 🔗 BlueMax I tried learning Java once and it went right over my head, same with C++
05:07 🔗 Coderjoe m0lson, BlueMax: I have a predb dump that covers at least from 1996 to 2007...
05:08 🔗 winr4r 10 PRINT "HAPPY BIRTHDAY"
05:09 🔗 BlueMax 20 PRINT "HOOKERS ON THE WAY"
05:50 🔗 SketchCow root@teamarchive-0:/2/FTP/sushi/4363/6500000/videos# du -sh *
05:50 🔗 SketchCow 238G YV-6500014-6549992
05:50 🔗 SketchCow 126G YV-6550032-6573504
05:50 🔗 SketchCow Just big numbers all around
05:50 🔗 Coderjoe hm...
05:51 🔗 SketchCow Coming along nicely. Packing? Not so nicely.
05:51 🔗 SketchCow Editing is going well too.
05:51 🔗 SketchCow I want to send this back to the director tonight.
05:52 🔗 Coderjoe packing? what is the destination?
05:52 🔗 SketchCow I improved the film, but I'm pulling it out before it gets to become basically my movie
05:52 🔗 SketchCow California, SF. 9 days.
05:52 🔗 Coderjoe ah
05:52 🔗 SketchCow Did you just mail me?
05:52 🔗 Coderjoe my drive should be arriving at IA tuesday
05:52 🔗 Coderjoe yeah
05:52 🔗 SketchCow You kind of have a fast track here.
05:53 🔗 SketchCow E-mail me an address, I'll send the BBS Doc.
05:54 🔗 Coderjoe wasn't sure if you were around. I suppose I could have asked
05:54 🔗 Coderjoe or noticed my IRC window sooner
05:59 🔗 SketchCow Addressed, done
06:00 🔗 Coderjoe it appears friendster.014400000-014499999.tar.xz has finished
06:00 🔗 Coderjoe do you want get lamp back?
06:06 🔗 SketchCow Nahhh
06:07 🔗 Coderjoe m0lson, BlueMax: database and a small update: http://www.megaupload.com/?d=56GUP3DS
06:29 🔗 SketchCow -----
06:29 🔗 SketchCow I was asked what we're doing about Google Buzz material. Anyone feel like looking at it?
06:29 🔗 SketchCow -----
07:35 🔗 RedType SketchCow: just a heads up, there's a lot of misinformation out there about the shutdown
07:37 🔗 RedType for example, google reader is getting a lot of its social stuff merged in
07:40 🔗 RedType but the current comment data may or may not, nobody knows
07:54 🔗 ersi That
07:54 🔗 m0lson dam, thanks for the dump Coderjoe
07:54 🔗 ersi 's why he asked
07:54 🔗 RedType ersi: and i'm saying, not even people from google seem to know
07:54 🔗 Coderjoe those are the original files i downloaded
07:55 🔗 ersi We're the kind of people who don't trust other people with data
07:55 🔗 ersi You know how the saying goes? Better safe(r) than sorry
07:56 🔗 RedType when the other people are going "i unno" and shrugging, you better not trust them
07:58 🔗 ersi And my point was that, indifferent to what people are going, you better not trust them.
07:59 🔗 ersi Hm, is most buzz material private to the poster's friends? How would one find people who've publicly shared?
08:22 🔗 m0lson the first pre in that DB is from 1980-01-01 06:00:36
08:22 🔗 m0lson so you have 1980 to 2007
08:23 🔗 m0lson which is pretty awesome
09:13 🔗 emijrp wiki is off
09:15 🔗 ersi Seems to be the MySQL database that isn't responding.
09:15 🔗 ersi I'd just wait, it'll probably come back up. It's run at a hosting company.
09:17 🔗 BlueMax I remember when I went through every page of that wiki and cleaned it up slightly
09:17 🔗 BlueMax I don't know wtf I was on
09:17 🔗 ersi "Spring cleaning syndrome"
09:18 🔗 BlueMax ick
09:18 🔗 BlueMax So what's been going on around here lately
09:19 🔗 ersi SketchCow's busy ingesting all our pirate booty into The Archives
09:20 🔗 ersi we've been talking/thinking about Google Buzz which is closing
09:20 🔗 BlueMax I heard about that
09:20 🔗 BlueMax If you need help with it let me know, I'll be around
09:39 🔗 winr4r won't buzz stuff get migrated into +?
09:40 🔗 winr4r oh, apparently not
09:55 🔗 ersi Who knows.
12:44 🔗 asiekier_ hey
12:45 🔗 asiekier_ http://go.to/ and all of its subdomains are apparently listed for sale on Sedo
12:45 🔗 asiekier_ just saying
12:55 🔗 ersi aw man
12:55 🔗 ersi fucking shorteners
12:57 🔗 asiekier_ should i add it to the wiki
12:57 🔗 asiekier_ i'm cleaning it up right now
13:01 🔗 ersi sure
13:01 🔗 asiekier_ http://archiveteam.org/index.php?title=Deathwatch <- done
13:01 🔗 asiekier_ now adding go.to
13:02 🔗 ersi add it to the URL shortener list as well, if you got the time
13:04 🔗 asiekier_ sure
13:04 🔗 asiekier_ yeah
13:04 🔗 asiekier_ i have lots of time
13:05 🔗 asiekier_ added to deathwatch
13:06 🔗 asiekier_ not sure where to add it in the URL shortener list... oh well
13:06 🔗 asiekier_ about go.to, it has a format where no random URLs happen
13:06 🔗 asiekier_ only self-specified ones
13:06 🔗 asiekier_ so the only way to scrape it is by google queries
13:09 🔗 asiekier_ i could try sending an email
13:09 🔗 asiekier_ it's owned by myphotoalbum.com
13:10 🔗 asiekier_ http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=site%3Ago.to
13:10 🔗 asiekier_ yup, google gives about 41000 results
13:10 🔗 asiekier_ now to find an API
13:12 🔗 ersi It's at http://www.archiveteam.org/index.php?title=Urlteam
13:12 🔗 asiekier_ found it, updated
13:14 🔗 asiekier_ "The API provides 100 search queries per day for free."
13:14 🔗 asiekier_ and that's why i'll have to scrape google manually
13:14 🔗 ersi ah, nice info
13:14 🔗 asiekier_ 100 search queries * up to 10 results / query
13:14 🔗 asiekier_ 1000 results / day. BOO
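The quota arithmetic above can be worked through as a rough sketch. The figures are the ones quoted in the log (100 free queries per day, up to 10 results per query, about 41000 hits for site:go.to); the function name `days_needed` is made up for illustration, and the estimate assumes every query returns a full page of results.

```python
import math

QUERIES_PER_DAY = 100    # free API quota quoted in the log
RESULTS_PER_QUERY = 10   # "up to 10 results / query"
TOTAL_RESULTS = 41000    # approximate Google hit count for site:go.to


def days_needed(total, queries_per_day=QUERIES_PER_DAY,
                results_per_query=RESULTS_PER_QUERY):
    """Return how many days of quota are needed to page through `total` results."""
    per_day = queries_per_day * results_per_query  # 1000 results / day
    return math.ceil(total / per_day)


print(days_needed(TOTAL_RESULTS))  # prints 41
```

At roughly 41 days for a single site, the free API tier is clearly too slow for a deathwatch target.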
13:20 🔗 asiekier_ hooray
13:20 🔗 asiekier_ got it to filter links
13:21 🔗 asiekier_ yup, works
13:22 🔗 asiekier_ now i only need to make an automated app that takes the results
13:22 🔗 asiekier_ "Sorry, Google does not serve more than 1000 results for any query. (You asked for results starting from 40500.)".
13:22 🔗 asiekier_ le fu
13:57 🔗 Ymgve it would probably be easier to bribe someone working in google
13:59 🔗 Ymgve what are you trying to scrape?
14:04 🔗 asiekier_ go.to
14:04 🔗 asiekier_ no codes
14:04 🔗 asiekier_ only names
14:04 🔗 asiekier_ i'm scraping go.to because all of its domains are for sale on Sedo
14:04 🔗 asiekier_ all of them
14:04 🔗 asiekier_ also one more thing to consider
14:04 🔗 asiekier_ minecraft classic
14:04 🔗 asiekier_ it had a map saving system which is apparently pretty faily nowadays
14:04 🔗 asiekier_ and may be down
14:06 🔗 asiekier_ and apparently is fully dead by now
14:06 🔗 asiekier_ i have a backup of about 200 users' maps
14:06 🔗 asiekier_ 200 fully random users
14:06 🔗 asiekier_ i'd say it's about 3-4% of the total maps
14:07 🔗 asiekier_ i'll locate and find it
14:11 🔗 asiekier_ i'd recommend backing up minecraft versions too
14:12 🔗 asiekier_ but notch himself stopped someone from doing it
14:14 🔗 ersi because notch is a fucking douche
14:14 🔗 Cameron_D asiekier_, there are patchers that contain past versions
14:14 🔗 ersi WHATS FUCKING WRONG WITH OPTIONS, INSTEAD OF CHANGING THE WHOLE GAME
14:14 🔗 ersi ass
14:14 🔗 ersi </rantrage>
14:14 🔗 asiekier_ Cameron_D thanks
14:14 🔗 asiekier_ ersi i'm trying to find that save backup
14:15 🔗 ersi Um, yeah - I'm not ranting about that
14:15 🔗 ersi I meant that he changes the way shit works :( without any way to flip it back to that..
14:15 🔗 asiekier_ but i'm worried it's gone
14:17 🔗 asiekier_ i think it was on my old HDD - which is dead
14:17 🔗 asiekier_ it was also on one of my freehosted websites - which is gone
16:01 🔗 asiekier_ right
16:01 🔗 asiekier_ coded a script for google-fuing links
16:12 🔗 asiekier_ ok, running it
16:19 🔗 asiekier_ did i just...
16:19 🔗 asiekier_ HTTP request failed! HTTP/1.0 503 Service Unavailable
16:26 🔗 asiekier_ added a 10-second delay
16:27 🔗 asiekier_ whatever, 5 seconds instead
16:27 🔗 asiekier_ should make it fast enough
16:27 🔗 asiekier_ or not, it is fast enough... just under 2 hours
21:32 🔗 soultcer Google bans you quite fast, but they also unban you automatically after a while. I think ndurner was using 2 to 20 seconds delay between requests, plus exponential (2, 4, 8, 16, 32, ...) backoff when he got blocked for his google groups downloader
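A minimal sketch of the approach soultcer describes: a random 2 to 20 second pause between requests, plus a doubling delay (2, 4, 8, 16, 32, ...) while blocked. This is not ndurner's actual code. The names `backoff_delays` and `polite_fetch`, the retry limit, the 600-second cap, and treating HTTP 503 as the "blocked" signal (the status asiekier_ saw earlier) are all assumptions. `fetch` is a caller-supplied function returning `(status_code, body)`, so the sketch does not depend on any particular HTTP library.

```python
import random
import time


def backoff_delays(base=2.0, factor=2.0, max_delay=600.0):
    """Yield 2, 4, 8, 16, 32, ... seconds, never exceeding max_delay."""
    delay = base
    while True:
        yield min(delay, max_delay)
        delay *= factor


def polite_fetch(fetch, urls, min_delay=2.0, max_delay=20.0,
                 max_retries=6, sleep=time.sleep):
    """Fetch each URL with random pauses, backing off while blocked.

    fetch(url) is a hypothetical caller-supplied function that returns
    (status_code, body). URLs still blocked after max_retries retries
    are left out of the returned dict.
    """
    results = {}
    for url in urls:
        delays = backoff_delays()
        for attempt in range(max_retries + 1):
            status, body = fetch(url)
            if status != 503:          # assumption: 503 means "blocked"
                results[url] = body
                break
            if attempt < max_retries:  # no point sleeping after the last try
                sleep(next(delays))
        # Random pause between requests, as described in the log.
        sleep(random.uniform(min_delay, max_delay))
    return results
```

Passing `sleep` as a parameter keeps the pacing logic testable without actually waiting; in real use the default `time.sleep` applies.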
21:34 🔗 alard ipv6!
21:54 🔗 dashcloud hi guys, there was a request in this article: http://libregraphicsworld.org/blog/entry/guitar-samples-in-gig-format-from-flame-studio-collection-shared for the collection being shared to get hosted on archive.org
22:20 🔗 SketchCow This connection is slow.
22:20 🔗 SketchCow I said, THIS CONNECTION FROM A PLANE TO IRC IS SLOW
22:20 🔗 SketchCow Someone get on it.
22:20 🔗 Coderjoe a connection from a plane is slow? say it ain't so...
22:22 🔗 SketchCow WTF THE FUCK WHERE IS MY JETBACK
22:22 🔗 SketchCow Yes, that's right, I said WTF THhe Fuck
22:23 🔗 SketchCow I ordered some food, so if that comes, I'll be idle.
22:23 🔗 SketchCow I'm just doing some cleanups here and there, getting some mail out, flying in a plane.
22:23 🔗 SketchCow All the synths went up, lots of manuals, and it was so simple
22:24 🔗 SketchCow After a while it was no keypresses. 80 manuals went in with no intervention on my part.
22:24 🔗 SketchCow The scripts are getting better, although like most macros there's still a lot of customization on the front end.
22:24 🔗 db48x latency is high or throughput is low?
22:26 🔗 SketchCow Latency is pretty unpredictable from the plane.
22:26 🔗 SketchCow Sometimes fast, sometimes slow.
22:26 🔗 db48x interesting
22:27 🔗 db48x bufferbloat?
22:27 🔗 db48x that can cause latency spikes
22:27 🔗 db48x lag bubbles
22:27 🔗 SketchCow I think the plane is just using packets and is sending stuff along a radio channel.
22:27 🔗 db48x well, internet traffic is packetized
22:27 🔗 db48x the internet wouldn't work otherwise
22:28 🔗 SketchCow I mean really encapsulated bursts of packets, Vinton
22:28 🔗 db48x but yes, probably the radio link is dropping some percentage of those packets and not telling you
22:28 🔗 db48x even wifi does that
22:28 🔗 db48x it's really annoying
22:28 🔗 SketchCow An archivist named Jenny asked if she could help, she's a digital preservation and records person, I suggested she go through our stuff, look for gaps or stuff we don't know about.
22:29 🔗 SketchCow A la what we got with our WARC format
22:29 🔗 db48x shiny
22:31 🔗 SketchCow I'll open another window and get another few terabytes uploaded.
22:31 🔗 SketchCow While on the plane. That'll be efficient.
22:31 🔗 SketchCow This time I have a car in SF
22:31 🔗 SketchCow And intend to use it.
22:31 🔗 SketchCow See people, do stuff.
22:37 🔗 alard Hi. I saw 'warc', so I'll just jump in for a short note: I've made a new version of wget-warc, one that doesn't use the warctools library. It's much smaller, so I hope it has a better chance of being included in wget. It would be nice if you wouldn't have to install it as a separate extension. I just mailed the new version to the wget mailing list, so I'll see what their reaction is.
22:44 🔗 Coderjoe how does it handle the warc stuff, custom code?
23:06 🔗 * lemonkey waves to SketchCow in the sky over SF
23:09 🔗 dashcloud what did Google do or not do recently that moved them to untrustworthy for data?
23:17 🔗 db48x SketchCow: I unified the poetry.com stuff that you uploaded to archive.org
23:17 🔗 SketchCow Chances are very good this spontaneous reader upgrade will demolish comments and social features.
23:17 🔗 SketchCow That's the last last straw, but the murder of Wave after killing Etherpad, the death of buzz, the removal of google groups and those hundreds of gigabytes of files...
23:18 🔗 SketchCow The fact that they kill products over time, often in a 3-4 but occasionally 1-2 year lifespan
23:18 🔗 SketchCow This is all adding up very, very poorly.
23:18 🔗 SketchCow Brad is of course working on them having export functions everywhere, but regardless.
23:18 🔗 db48x SketchCow: but it doesn't include a lot of things from http://archiveteam.org/archives/.lulupoetry/
23:19 🔗 SketchCow db48x: Throw them in!
23:19 🔗 db48x ok
23:20 🔗 SketchCow I can swap the thing with the new pieces
23:21 🔗 SketchCow I've been working so hard on the archive.org stuff from batcave I haven't even looked at the archives.
23:21 🔗 SketchCow So that's why that was that.
23:21 🔗 SketchCow I've got several .tar processing jobs going as we speak. Easily 2 tb of Yahoo Video.
23:25 🔗 db48x oh, good
23:25 🔗 db48x I've already downloaded everything in .lulupoetry
23:32 🔗 SketchCow It wasn't much
23:32 🔗 SketchCow It compresses very well.
23:32 🔗 db48x 2 gigs :)
23:33 🔗 SketchCow See
23:47 🔗 dashcloud http://www.wired.com/underwire/2011/10/9-essential-geek-books/?pid=5167&viewall=true here's the first in a series
23:54 🔗 SketchCow http://twitter.com/leighalexander/status/127531825311653888
23:54 🔗 SketchCow Either I won there or I placed well.
23:54 🔗 db48x I need a command line program that does html entity substitution
23:54 🔗 db48x &quot; to ", etc
23:55 🔗 SketchCow There MUST be some crap for that in perl.
23:55 🔗 SketchCow MUST be
23:55 🔗 db48x yea
23:55 🔗 SketchCow Geeks are GENETICALLY DESIGNED to do that shit in perl
23:55 🔗 db48x heh
23:55 🔗 db48x the rest of the script is in shell though
23:55 🔗 db48x there needs to be a utility that does this
23:55 🔗 chronomex lol
23:56 🔗 db48x I mean, we have sha1, base64, uuencode, etc
23:56 🔗 SketchCow Anywhere there's a list of possible responses to an incoming stream of text that has to follow an arbitrary and bizarre set of consistent standards, you can bet theres some perl in CPAN's asscrack that does it.
23:56 🔗 db48x oh, yea
23:56 🔗 db48x CPAN has everything
23:56 🔗 SketchCow So have the shell call a perl statement.
23:56 🔗 chronomex that's the purpose of cpan
23:56 🔗 SketchCow Do it all the time.
23:57 🔗 db48x
23:57 🔗 db48x She has blonde hair, beautiful
23:57 🔗 db48x True Love by Brian E Hewins
23:57 🔗 db48x [db48x@celebdil unified]$ cat 003/000/000/003000000.txt
23:57 🔗 SketchCow I am well on my way to becoming archive.org's top uploader next to possibly prelinger.
23:57 🔗 db48x brown eyes, and endless love.
23:57 🔗 db48x [...]
23:58 🔗 db48x \Her name is &quot;Cindy&quot;, and she is the love of my life;
23:58 🔗 db48x truly my best friend, my collie
23:58 🔗 SketchCow I accidentally / the whole goddamn poetry / and I am quite sad
23:58 🔗 SketchCow HAIKU BACK AT U
23:58 🔗 db48x heh
23:59 🔗 db48x hrm
23:59 🔗 db48x the three tarballs you posted to archive.org have 264113 poems in them
23:59 🔗 db48x 264113 < 14x10^6
23:59 🔗 winr4r db48x: i can make you one real quick if you like
23:59 🔗 winr4r i wrote most of it for work
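A rough sketch of the kind of command-line HTML entity substitution tool db48x asks for, written in Python rather than the perl suggested in the channel. It relies on the standard library's `html.unescape`; the function name `unescape_lines` and the choice to take file paths as arguments are assumptions made for illustration, not anything winr4r or db48x actually wrote.

```python
import html
import sys


def unescape_lines(lines):
    """Yield each line with HTML entities (&quot;, &amp;, ...) decoded."""
    for line in lines:
        yield html.unescape(line)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Treat every command-line argument as a text file to decode to stdout.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            sys.stdout.writelines(unescape_lines(handle))
```

Saved under a hypothetical name such as `unescape.py`, it could be called from a shell script as `python3 unescape.py 003/000/000/003000000.txt`. On Linux, passing `/dev/stdin` as the path lets it sit in a pipeline.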
