Time |
Nickname |
Message |
00:32
π
|
DragonDon |
Greetings all |
00:40
π
|
mistym |
Hi DragonDon! |
00:40
π
|
DragonDon |
Hi mistym , how goes it? |
01:19
π
|
mistym |
DragonDon: Pretty good, thanks! What about you? (Sorry, totally missed your message!) |
01:49
π
|
godane1 |
SketchCow: how do you delete .pureftpd-upload files on archive.org? |
01:49
π
|
godane1 |
i'm trying to upload a maximum pc cd from june/july 2010 |
01:50
π
|
godane1 |
but my internet sucks |
01:52
π
|
godane1 |
can anyone help? |
01:58
π
|
Wyatt|Wor |
godane: Can you not use rsync? |
01:59
π
|
godane |
i'm using gtkftp |
01:59
π
|
godane |
*gftp |
02:04
π
|
Wyatt|Wor |
Wait, is that dotfile you mention for resuming the upload? |
02:11
π
|
godane |
i think so |
02:11
π
|
godane |
but gftp will not know to resume it |
02:11
π
|
godane |
anyways i'm 44% done |
02:12
π
|
Wyatt|Wor |
Weird, I thought gFTP could do that. |
02:12
π
|
Wyatt|Wor |
Try wput? |
02:12
π
|
godane |
it could if the file name is the same |
02:12
π
|
DragonDon |
mistym, no probs, I ended up heading off to a shower and getting dressed for the day :) |
02:13
π
|
godane |
i just want to know if i can remove the .pureftpd-upload files |
02:13
π
|
Wyatt|Wor |
I don't think the client matters as long as you have the dotfile to resume on the server-side |
02:13
π
|
godane |
so they're not uploaded to the archive.org cluster |
02:14
π
|
Wyatt|Wor |
It should just get removed when the upload completes. |
02:14
π
|
godane |
I'm reuploading the file |
02:15
π
|
godane |
it's not resuming any .pureftpd-upload |
02:15
π
|
godane |
so those files shouldn't get deleted |
02:15
π
|
godane |
but i want them to be |
02:18
π
|
Wyatt|Wor |
In the future, since you say you have a bit of a flaky connection, you should be able to enable resume in gFTP, or use a resume-capable client. |
02:29
π
|
godane |
it doesn't work if file name change |
02:29
π
|
godane |
file is MPC_Buildit.iso |
02:30
π
|
godane |
also, because of my flaky connection i don't know if resume is best, cause you could be getting a corrupted file |
02:31
π
|
godane |
again, don't care about resuming now |
02:31
π
|
godane |
i just want the .pureftpd-upload.* files deleted |
02:31
π
|
godane |
nothing else |
02:35
π
|
godane |
so again how do i remove .pureftpd-upload.* files? |
02:36
π
|
godane |
not crap about using the resume button |
02:36
π
|
godane |
cause MPC_BuildIt.iso != .pureftpd-upload.* |
02:37
π
|
godane |
fucking crap |
02:37
π
|
godane |
one of the .pureftpd-upload.* files did rename |
02:38
π
|
godane |
but uploaded a full 688mb iso just to get some 4121545648 image |
02:38
π
|
godane |
what the hell? |
02:41
π
|
godane |
now i'm resuming after the complete connection |
02:41
π
|
godane |
may have to resume again cause of there being some other .pureftpd-upload file |
02:44
π
|
chronomex |
.putrified |
02:45
π
|
godane |
i'm sorry that i'm pissed off |
02:45
π
|
godane |
but i may have uploaded 4 times as much data for one fucking iso |
02:47
π
|
Wyatt|Wor |
I'm guessing your FTP account doesn't have the privs needed to rm and you don't have a shell account? |
02:47
π
|
godane |
i just hope archive.org or gftp doesn't rename the other .pureftpd file cause i will just delete what i have |
02:47
π
|
godane |
i have standard upload account |
02:48
π
|
godane |
i can't ever put my isos in software cd shareware part of archive.org |
02:48
π
|
Coderjoe |
http://archive.org/post/267549/stale-pureftpd-upload-file-blocks-checkin |
02:51
π
|
Coderjoe |
wow |
02:51
π
|
Coderjoe |
a familiar name in that thread (not including tracey pooh) |
02:52
π
|
Wyatt|Wor |
haha, I see it. |
02:53
π
|
Wyatt|Wor |
The S3 method seems best then. |
02:54
π
|
Coderjoe |
it does allow for the easiest automation |
02:54
π
|
Coderjoe |
(pushing 4990 videos up using it right now) |
02:55
π
|
Wyatt|Wor |
Curious, though, Tracey specifies a preference for curl. I've always seen curl and wget as roughly equivalent-- what differences inform the decision to use one or the other? |
02:56
π
|
Coderjoe |
for the s3 interface? probably because it is the one that sam documented in the api doc |
02:56
π
|
Coderjoe |
http://archive.org/help/abouts3.txt |
03:00
π
|
Coderjoe |
and i guess that hard drive fix line is just part of the legend. i could have sworn i've seen times where it is there and others when it isn't |
03:00
π
|
mistym |
http://www.kickstarter.com/projects/120873716/your-world This TOTALLY deserves to have made it to Kickstarter |
03:00
π
|
mistym |
Extremely professional site too http://www.yourworldinc.com/ |
03:03
π
|
Wyatt|Wor |
I asked this yesterday, but different people seem active, so I'll try again. If someone is interested in backing up their community/content on IA, how best to go about that? |
03:04
π
|
Wyatt|Wor |
Well, archiving, not "backing up" |
03:08
π
|
shaqfu |
mistym: WoW+SL? |
03:10
π
|
shaqfu |
Oh, hah; reading through this, it's like the biggest pipe dream ever |
03:11
π
|
aggro |
What's wrong with straight-up WoW? |
03:11
π
|
aggro |
Every RPG I've ever played has the same basic structure anyway. |
03:12
π
|
aggro |
tanks, healers, dps, yadda yadda |
03:12
π
|
mistym |
shaqfu: Yeah, it's basically "here are some minor quibbles I have with WoW", not a concept for a game |
03:17
π
|
shaqfu |
And funding it on $1.1M on that scope, yikes |
03:17
π
|
shaqfu |
No wonder he's getting no money - it's an obvious bomb |
03:19
π
|
Wyatt|Wor |
Hahah "I am an idea man." |
03:23
π
|
godane |
finally uploaded: http://archive.org/details/Maximum_PC_CD_June_July_2010 |
03:24
π
|
balrog_ph |
shaqfu: That's an utter joke |
03:24
π
|
Coderjoe |
oh man. i just had a painful thought: building a 6502 emulator in minecraft's redstone circuits |
03:25
π
|
shaqfu |
balrog_ph: ? |
03:25
π
|
balrog_ph |
That kickstarter |
03:25
π
|
shaqfu |
Oh, yeah |
03:51
π
|
SketchCow |
Back |
03:51
π
|
SketchCow |
Morgan says hi. |
04:18
π
|
chronomex |
hi morgan |
05:06
π
|
winr4r |
morning |
05:18
π
|
dnova |
mornin |
05:18
π
|
shaqfu |
Mornin' |
05:32
π
|
SketchCow |
Borp |
05:33
π
|
winr4r |
morgan is a bad-ass name |
05:33
π
|
shaqfu |
And a historical unit of measure! |
05:34
π
|
winr4r |
"In genetics, a centimorgan (abbreviated cM) or map unit (m.u.) is a unit for measuring genetic linkage." <- how about that! |
05:35
π
|
SketchCow |
http://www.esquire.com/features/robert-caro-0512 |
05:35
π
|
shaqfu |
Wait, really? I was thinking of the Dutch measure of land |
05:35
π
|
SketchCow |
Read up on the crazy |
05:42
π
|
winr4r |
wow |
05:46
π
|
shaqfu |
...sheesh |
05:51
π
|
Coderjoe |
where's the pc world article again? |
05:52
π
|
Coderjoe |
ah. found it |
05:52
π
|
winr4r |
http://www.pcworld.com/article/253672/the_archive_team_rescues_user_content_from_doomed_sites.htmlw |
05:52
π
|
winr4r |
too slow |
05:52
π
|
winr4r |
heh |
05:52
π
|
winr4r |
-w btw |
05:53
π
|
winr4r |
(though apparently it makes no difference) |
05:55
π
|
Coderjoe |
(from the 35mm vs digital article) |
05:55
π
|
Coderjoe |
It certainly isn't. James Cameron's Avatar got the ball rolling back in 2009. The 3-D blockbuster could only be shown via digital projectors, and so the first wave of theaters upgraded in a hurry. |
05:55
π
|
Coderjoe |
bull shit. |
05:55
π
|
Coderjoe |
my local IMAX was showing it on dual 70mm strips |
05:55
π
|
Cameron_D |
ohey my name |
05:57
π
|
Coderjoe |
perhaps on the 35 side, that is true, but not universally |
05:57
π
|
SketchCow |
Cameron's original hope for Avatar was that it could be a 3D-only proposition, but however quickly cinemas scurried to update their capabilities, it wasn't quite quickly enough. The film is being shown in several formats, including conventional 2D. Whether audiences favour the 3D (and IMAX 3D) versions is a significant factor in how far Avatar will spearhead the 3D-ification of effects blockbusters to come. |
05:59
π
|
SketchCow |
That smacks strongly of the LA Weekly reporter finding "avatar only to be released digital" articles and not finding the "pushback causes some standard-issue formats to be released too" articles. |
06:00
π
|
winr4r |
yes, it does |
06:02
π
|
Coderjoe |
And then there was Valentine's Day. Instead of a 35mm print, the studio offered Belove either a DCP or a DVD of Breakfast at Tiffany's. |
06:02
π
|
Coderjoe |
hahaha |
06:02
π
|
Coderjoe |
because DVD is even half of what 35mm is |
06:03
π
|
SketchCow |
The DVD offer was awesome |
06:05
π
|
winr4r |
DVD? as in a standard definition DVD? |
06:05
π
|
SketchCow |
Yeah, nice offer |
06:05
π
|
SketchCow |
Who knows if that's real |
06:06
π
|
SketchCow |
That LA Weekly person didn't really double-source, it seems. |
06:06
π
|
SketchCow |
Bet they didn't even call the studio to check |
06:06
π
|
SketchCow |
I was more fascinated that DCP won out as the internal format |
06:11
π
|
winr4r |
"A few months later, in January, one of the companies that makes the raw, unprocessed film stock, Eastman Kodak, filed for bankruptcy." |
06:11
π
|
winr4r |
...but the film division was one of the parts of kodak that was actually profitable, so tell me how that's related |
06:12
π
|
chronomex |
shhhh |
06:12
π
|
winr4r |
(thank god, i'd die if ektar 100 went away) |
06:13
π
|
SketchCow |
Shhhhhh |
06:13
π
|
SketchCow |
This is not a great article. |
06:13
π
|
SketchCow |
It informs to a few basic structures of the industry that are worth knowing. |
06:14
π
|
SketchCow |
But it is ultimately weaksauce |
06:34
π
|
Wyatt|Wor |
How does the relationship between 35mm movie film and 35mm photo film work? Is it the same stuff, only the final print is on film stock vs. photo paper? |
06:35
π
|
SketchCow |
No |
06:35
π
|
SketchCow |
There's other stuff. |
06:35
π
|
winr4r |
same size, different sprockets, sometimes different emulsions |
06:36
π
|
Coderjoe |
same width, different sprockets, different direction of travel... |
06:36
π
|
Wyatt|Wor |
Okay, so it's not going to have the same grain characteristics? |
06:37
π
|
Coderjoe |
that's largely the emulsions |
06:37
π
|
winr4r |
Wyatt|Wor: the area of a photo from a 35mm still camera is greater |
06:38
π
|
Coderjoe |
different emulsions have different grain sizes |
06:38
π
|
winr4r |
so if the emulsions are the same, the 35mm movie camera will have more grain (but the emulsions aren't always the same) |
06:38
π
|
Wyatt|Wor |
Because it needs to handle the stresses of running it through the gears repeatedly? |
06:38
π
|
Wyatt|Wor |
(That's re: still camera area being greater) |
06:40
π
|
Coderjoe |
unless you're in the low end of the market and working with reversal stock, you generally do not run your camera footage through a projector. |
06:40
π
|
Coderjoe |
heck, in 35, your camera footage doesn't even have sound |
06:40
π
|
winr4r |
Wyatt|Wor: oversimplifying, but the "height" dimension of a still shot is used as the "width" of a 35mm movie shot |
06:41
π
|
Coderjoe |
35mm still has the film pass the shutter horizontally, while 35mm motion has it going past vertically |
06:41
π
|
winr4r |
yes, better way of putting it |
06:41
π
|
Coderjoe |
... unless you're using lucasfilm's rescued and adapted vistavision equipment |
06:45
π
|
Wyatt|Wor |
Okay, so this is all a lot more complicated than I even realised. |
06:46
π
|
Wyatt|Wor |
But basically, for me, what it boils down to is this: A photographer friend of mine once said a good rule of thumb was 35mm is roughly equivalent to a good 8MP CCD. Sound about right? |
06:46
π
|
winr4r |
Wyatt|Wor: an 8mp bayer-interpolated CCD? no |
06:46
π
|
chronomex |
now we're getting into internet-holy-wars territory |
06:46
π
|
winr4r |
yes |
06:47
π
|
winr4r |
we are |
06:47
π
|
Wyatt|Wor |
Analogue vs. Digital is a vast theatre of combat. Well, I'm sorry about that. |
06:48
π
|
winr4r |
they're very different things, they even resolve detail very differently |
06:49
π
|
winr4r |
in any case i'm hungry |
06:49
π
|
chronomex |
I REFUSE TO BELIEVE THAT THERE IS NO ONE STANDARD OF COMPARISON |
06:49
π
|
Wyatt|Wor |
Well anyway, my point with all this is this DCP thing looks like it maxes out at 4096×2160... Am I at least on the right track to have the impression that that's a bit of a step backward? |
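For scale, the pixel count behind the 4096×2160 figure quoted above can be worked out directly. This is only arithmetic on the numbers mentioned in the channel; it says nothing about how film and digital resolve detail, and the variable names are illustrative.

```python
# Pixel count of the largest DCP frame size mentioned in the discussion.
width, height = 4096, 2160

pixels = width * height
megapixels = pixels / 1_000_000

print(f"{width}x{height} = {pixels} pixels (~{megapixels:.2f} MP)")
```

That lands in the same range as the "8MP" rule of thumb raised earlier, which is presumably why the question came up.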
06:50
π
|
winr4r |
Wyatt|Wor: see "resolve detail very differently" above |
06:50
π
|
* |
chronomex convenes a subcommittee to define the ANSI Standard Pel |
06:50
π
|
* |
chronomex defines ANSI Standard Pel == ANSI Standard Film Grain |
06:53
π
|
shaqfu |
And this is why I don't go anywhere near video preservation |
06:53
π
|
winr4r |
Wyatt|Wor: the longer version of that being that film never really runs out of resolution, it gradually resolves details less distinctly as they get finer |
06:54
π
|
winr4r |
Wyatt|Wor: digital resolves things 100% sharply until you hit its resolution limit |
06:55
π
|
chronomex |
I love the smell of nerd jihad in the morning! it smells like VICTORY. |
06:55
π
|
chronomex |
winr4r: no. pels are *sampling* |
06:58
π
|
winr4r |
chronomex: honestly, i hate the whole "film vs digital" thing because 1) nobody will ever agree 2) everyone is wrong 3) the world has decided in favour of digital so you might as well argue about whether the titanic or icebergs were cooler |
07:00
π
|
winr4r |
let us eat pot noodles instead |
07:02
π
|
winr4r |
Wyatt|Wor: don't be |
07:02
π
|
* |
chronomex peers closely |
07:02
π
|
chronomex |
yep, pixels. |
07:02
π
|
winr4r |
whoops, i was accidentally scrolled up a tiny bit, thanks mouse |
07:02
π
|
Wyatt|Wor |
I get that I'm a rank amateur; I'm just interested in this conundrum where the resolution limitations of DCP seem like they'll cause edge cases where it can't hold up to film. |
07:02
π
|
winr4r |
so i was responding to something he said earlier |
07:02
π
|
shaqfu |
winr4r: The iceberg was cooler; no way the ship was anywhere near freezing |
07:03
π
|
winr4r |
need moar caffeine :/ |
07:03
π
|
winr4r |
shaqfu: haha |
07:03
π
|
Wyatt|Wor |
So where do icebergs go on the Cool Wall? |
07:14
π
|
winr4r |
bleh i think i am going to head back to bed |
07:14
π
|
Wyatt|Wor |
Sleep well |
08:04
π
|
Coderjoe |
http://archive.org/details/stage6-1351710 |
10:45
π
|
Wyatt|Wor |
Holy crap, it finished! It took about 5100 CPU Minutes, but that grep process finally finished! :D |
10:46
π
|
Wyatt|Wor |
morbid curiosity: 1 malloc() hell: 0 |
10:47
π
|
oli |
time to upgrade from that pentium pro 200? |
10:51
π
|
Wyatt|Wor |
oli, no, it's a bug in older versions of grep when you have a unicode locale. |
10:52
π
|
oli |
;P |
10:52
π
|
oli |
wasnt srs |
10:52
π
|
oli |
anyway what's up? |
10:52
π
|
Wyatt|Wor |
Not much. Going home soon. |
11:49
π
|
SmileyG |
rawwwwr |
11:52
π
|
SmileyG |
winr4r: well the icebergs were made of ice, so they were at least 0c |
11:52
π
|
SmileyG |
i'd think they were cooler than the titanic at any rate. |
12:01
π
|
Jaybird11 |
I'm the one who reported the probable shutdown of Q-audio.net to @textfiles and @archiveteam. Possibly one of several. |
12:02
π
|
Jaybird11 |
I have Q-audio posts up through 29479, the last numerically-indexed one. Well over 100 gigs. |
12:03
π
|
Jaybird11 |
That leaves out probably nearly a year of content, since he switched to Base36. |
12:04
π
|
Jaybird11 |
He is known to fight scrapers, including probably useragent blocking and IP blocking. File structure is quite simple. |
12:10
π
|
Nemo_bis |
If I 7z a files to a non-solid 7z archive with a different compression rate than the previously used one, will the new option be respected for the new files or not? |
12:10
π
|
Nemo_bis |
maybe alard knows |
12:12
π
|
SketchCow |
Jaybird11: He's never going to go for it. |
12:12
π
|
Jaybird11 |
Yeah I know. Worth the effort though anyway? |
12:12
π
|
SketchCow |
I mean, I'm happy to play the part of white knight and make it easy enough to do. |
12:13
π
|
SketchCow |
We can do a distributed attack |
12:13
π
|
SketchCow |
But he'll just nail those |
12:13
π
|
Jaybird11 |
Okay, here's the file structure info |
12:13
π
|
SketchCow |
I'd say don't dump it here unless it's short. |
12:14
π
|
Jaybird11 |
http://q-audio.net/i/XXX are info pages giving uploaded filename, timestamp submitted, size, etc. |
12:14
π
|
SketchCow |
I had no idea you had a good amount yourself. |
12:14
π
|
Jaybird11 |
http://q-audio.net/d/XXX are the files themselves. The server returns the XXX and not the real filename so you need the /i/XXX to find the filename. |
12:14
π
|
SketchCow |
Want me to give YOU an upload slot? :) |
12:15
π
|
Jaybird11 |
Sure. I don't have the info pages though, just the files. But the scraper I used did preserve the filenames. |
12:15
π
|
SketchCow |
Man, how crazy is it that I hear "over 100 gigs" and I go "oh! Well, just dump that shit here." |
12:15
π
|
Jaybird11 |
The files start numerically and go up through 29479. Then he switched to base36 and I never found a scraper to deal with that. |
12:16
π
|
SketchCow |
As if someone mentioned it was an attachment they could mail |
12:16
π
|
Jaybird11 |
The entire collection I think is over 300 gigs |
12:16
π
|
Jaybird11 |
The reason for the shutdown is, he's tired of Dreamhost and can't find storage as cheap |
12:17
π
|
Jaybird11 |
The scraper I used did not preserve the timestamps of the files so if you want those someone will have to scrape the info pages. |
12:17
π
|
Jaybird11 |
At least he gave us some warning rather than pulling the plug suddenly. |
12:18
π
|
SketchCow |
Big deal, if he's preventing download. |
12:19
π
|
Jaybird11 |
A distributed attack is probably the only way to even hope for a full archival. Problem is, I don't know how much time we have left, probably nobody does. |
12:20
π
|
Jaybird11 |
My collection of files is on Windows. If you'll want me to use Rsync, I'll need instructions for doing that on Windows or something. |
12:21
π
|
SketchCow |
OK, so two things. |
12:21
π
|
SketchCow |
1. My hope is that showing I'm wearing big boy pants and am willing to throw a few bucks his way will persuade him to do the command. |
12:21
π
|
Jaybird11 |
In case you didn't know, Q-audio is pretty much a twaud.io alternative this guy created when his Twitter client designed for the blind integrated audio upload support. |
12:22
π
|
SketchCow |
2. I am banking that he is basically against being randomly scraped by amateurs. |
12:22
π
|
SketchCow |
Oh, I am well aware of what this is, you made sure of that months ago. |
12:22
π
|
SketchCow |
At least somebody out there is looking out for his blind homeboys. |
12:23
π
|
Jaybird11 |
Ah good. I use the service myself. I knew this would probably happen someday. |
12:23
π
|
Nemo_bis |
Looks like 7z is smart enough. |
12:23
π
|
Jaybird11 |
I pushed a few of my friends to update the existing scraper to support base36, but nobody ever did as far as I know. |
12:25
π
|
SketchCow |
Coderjoe: I've added your next 311 videos to stage6. |
12:25
π
|
Jaybird11 |
On a related topic, is anyone proactively archiving Soundcloud or Audioboo? |
12:25
π
|
SketchCow |
Coderjoe: Took one command, just did it, so that's how easy it is. |
12:26
π
|
winr4r |
good afternoon |
12:29
π
|
Jaybird11 |
If we can't get his cooperation, one way to salvage at least a piece of Q-audio other than what I already grabbed would be to call out to people who have downloaded their favorite clips, or who still have stuff they've uploaded. I have most if not all of what I've uploaded myself, and in the case of things I made myself, I have it in lossless format to boot. |
12:33
π
|
Jaybird11 |
I think one reason he's against archiving this stuff is that probably a lot of it was sent in direct messages between two individuals. There's really no way to filter out those posts. Posts recorded within Qwitter and its forks start with tmp, so filtering those would probably get rid of a lot of private stuff. But probably not all, and there's always the chance of filtering out something which might mean something to someone years do |
12:35
π
|
SketchCow |
Right |
12:36
π
|
SketchCow |
I'm aware, that is in fact what's going to kill it. |
12:36
π
|
SketchCow |
I'm going in the front door here |
12:36
π
|
SketchCow |
I never think that works. |
12:39
π
|
Jaybird11 |
With your experience with Dreamhost, do you have any clues? If he cancels, does he have to sit out the rest of the month, then on May 1 it all goes boom? Or can he pull the plug on Dreamhost anytime he wants? This is all notwithstanding his ability to get sick and tired of amateur scrapers and rm -rf * the whole mess and be done with it. |
12:40
π
|
SketchCow |
He can boom at anytime |
12:40
π
|
SketchCow |
I don't think he will. |
12:40
π
|
SketchCow |
His personal policing of the scraping is adorable and unneeded |
12:40
π
|
Jaybird11 |
I think what prompted this was, last night a VPS was down and so was the control panel. I think he's had it with outages. |
12:42
π
|
Jaybird11 |
On the subject of distributed archival. I've always thought it would be neat to have a distributed system people could run that just sits there, doing whatever ArchiveTeam wants. Sort of an opt-in botnet if you will. People could specify soft and hard limits for disk and bandwidth they're willing to donate to the cause, and also see what projects are running and exclude any they don't want to participate in for some reason. |
12:43
π
|
winr4r |
Jaybird11: cow mentioned exactly the same thing in his talk at PDA :) |
12:43
π
|
SketchCow |
Yeah, we call it Archive@home |
12:44
π
|
SketchCow |
It's the logical next step for the universal tracker. |
12:44
π
|
SketchCow |
This EXACT moment, I'm just delighted we have the universal tracker. |
12:44
π
|
SketchCow |
Requires a little setup, but then whooooboy |
12:44
π
|
SketchCow |
I just wish we didn't have to burn so much goodwill on mobileme |
12:44
π
|
Jaybird11 |
I know there's a virtual appliance, but that's probably not accessible to the blind since it doesn't have any screen reader or anything, and you have to know what you want it to do. |
12:45
π
|
SketchCow |
No, you don't want in on that crap, yet |
12:45
π
|
SketchCow |
I suppose we could take a swing at making stuff more accessible, but we're not there yet. |
12:46
π
|
oli |
Archive@home? |
12:46
π
|
Wyatt |
universal tracker _is_ pretty slick. I was planning on setting it up for 8bc until that crashed into a mountain of scene drama or something. |
12:46
π
|
oli |
how about Archive@everydedicatedandcolocatedserverpossible |
12:46
π
|
Wyatt |
(I'm in touch with the people who ran it trying to get what dumps are available) |
12:46
π
|
SketchCow |
Archive@home is a parody reference to seti@home, the distributed look for shit in space client |
12:47
π
|
SketchCow |
Oh, that's right, 8bc exploded, didn't it. |
12:47
π
|
oli |
i know :p |
12:48
π
|
Wyatt |
SketchCow: Yeah, I'm still not sure what exactly happened, but I've reached out to 2xAA, and through him hopefully Jose will be cooperative. |
12:48
π
|
Jaybird11 |
Why I think we would need both soft and hard limits is this. So normal projects, let's say you set a cap of 1TB you're willing to spend on disk. But here comes some new emergency project. Oh look at this! (Insert name of wildly popular service) has been acquired by Yahoo! They're giving the users twenty-four hours to get their junk off or it all goes away! Now your hard limit kicks in, and you start going at this new project like craz |
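The soft/hard limit idea described above could be modelled roughly as follows. This is a minimal sketch, not an existing ArchiveTeam tool: the class name, field names, and the 2 TB hard cap are all hypothetical, and only the 1 TB soft cap comes from the message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DonationLimits:
    """Hypothetical per-volunteer disk limits, in bytes."""

    soft_bytes: int  # cap applied to normal projects
    hard_bytes: int  # cap applied when a project is flagged as an emergency

    def __post_init__(self) -> None:
        if not 0 <= self.soft_bytes <= self.hard_bytes:
            raise ValueError("need 0 <= soft_bytes <= hard_bytes")

    def remaining(self, used_bytes: int, emergency: bool) -> int:
        """Bytes the client may still accept for the current project."""
        cap = self.hard_bytes if emergency else self.soft_bytes
        return max(0, cap - used_bytes)


TB = 10**12
limits = DonationLimits(soft_bytes=1 * TB, hard_bytes=2 * TB)  # 2 TB is assumed

print(limits.remaining(used_bytes=1 * TB, emergency=False))
print(limits.remaining(used_bytes=1 * TB, emergency=True))
```

With the soft cap already reached, a normal project gets no more space, while an emergency project can still draw on the gap between the two caps. The same shape would work for bandwidth.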
12:48
π
|
winr4r |
wait, 8bc is gone? :/ |
12:48
π
|
winr4r |
i remember poking around there a little while back, loved it |
12:48
π
|
Wyatt |
winr4r: Thaaaat's how it's looking. And I was going to archive it after MobileMe, too. :( |
12:49
π
|
winr4r |
:/ |
12:49
π
|
SketchCow |
Right now, I'm just trying to get off batcave. |
12:49
π
|
winr4r |
Wyatt: hey, first pass of the screenshot bot has completed, btw |
12:49
π
|
SketchCow |
Once I'm off batcave, I'll be then trying to get another server off archive.org. |
12:49
π
|
SketchCow |
But batcave, he asked me THREE MONTHS AGO to get off |
12:49
π
|
winr4r |
now to figure out why some pages cause it to hang for no good reason |
12:49
π
|
SketchCow |
It's taken THAT LONG to work out the 20tb |
12:49
π
|
Wyatt |
winr4r: Those take a while, don't they? |
12:50
π
|
SketchCow |
How are those screenshots being generated, anyway. |
12:50
π
|
winr4r |
SketchCow: by a python script using python's webkit bindings running in an Xvfb |
12:50
π
|
SketchCow |
root@teamarchive-0:/2/FTP/tef# du -sh newsyc-03/ |
12:50
π
|
SketchCow |
213M newsyc-03/ |
12:51
π
|
SketchCow |
That looks a lot like someone did some sort of awesome grab of ycombinator. |
12:52
π
|
SketchCow |
tef: Wake up when you get a chance, I want to understand these files before I upload them. |
12:52
π
|
Jaybird11 |
Cow, I'd love to be able to upload my Q-audio collection. I've been a bit concerned that, as far as I know, except for the real thing, I have the only, or one of few, copies. |
12:53
π
|
SketchCow |
Yeah, since coming to work for the archive, it's scary how differently I think about the whole thing. |
12:53
π
|
SketchCow |
Jaybird11: Would an FTP account be better than an rsync? |
12:53
π
|
Wyatt |
How's the accessibility of Cygwin? |
12:53
π
|
Jaybird11 |
Probably, unless you can instruct me on Windows. |
12:54
π
|
tef |
which files ? |
12:54
π
|
Jaybird11 |
It's mostly console so pretty good as far as I know. Let me make sure I don't have an rsync. |
12:54
π
|
tef |
SketchCow: well, those files are captures of news.ycombinator's front page during the sopa blackout, at different times of the day iirc |
12:54
π
|
Wyatt |
So what's the story with batcave anyway? Why are we getting booted from it? |
12:54
π
|
SketchCow |
It's old style box |
12:55
π
|
Jaybird11 |
Nope, don't have rsync.exe. Does a good Windows port exist that I can just download and use? |
12:55
π
|
SketchCow |
They want to decommission the box and clear that rack. |
12:55
π
|
Wyatt |
Ah, getting decommissioned |
12:55
π
|
SketchCow |
Meanwhile, I'm on there like a tenacious old tenant who refuses to move |
12:55
π
|
tef |
winr4r: are you doing python pyqt stuff ? |
12:55
π
|
SketchCow |
It's a bit of stress for the admin but he's too nice to really confront me |
12:55
π
|
winr4r |
tef: it uses GTK |
12:55
π
|
SketchCow |
I would just do a straight transfer over to fos, but there's ironically not enough space. |
12:56
π
|
winr4r |
(note: it's someone else's script i've altered, i don't really know anything about pygtk either) |
12:56
π
|
tef |
winr4r: qt 4.8 uses webkit 2.2 so i'd recommend it over pygtk |
12:56
π
|
Wyatt |
So basically, we all owe the admin a pint. |
12:56
π
|
SketchCow |
Jaybird11: http://www.aboutmyip.com/AboutMyXApp/DeltaCopyDownloadInstaller.jsp |
12:57
π
|
SketchCow |
That is a port but may be a bit much |
12:57
π
|
tef |
SketchCow: actually if I recall correctly, the grabs of news.yc should be the front page and all links from that page |
12:58
π
|
tef |
so it should have the comments & the articles linked to |
12:58
π
|
SketchCow |
I have an idea. Wyatt, you work with jaybird in a private message to get his rsync going. |
12:58
π
|
Wyatt |
I'll see what I can do. |
13:00
π
|
Jaybird11 |
I've downloaded DeltaCopy. About to unzip. |
13:00
π
|
Jaybird11 |
say I'll be away from this window for a bit while I look at it |
13:00
π
|
winr4r |
tef: you're probably right, but webkit 1.x is what this box and Wyatt's VPS has |
13:00
π
|
Jaybird11 |
Yes, I am using a screen reader |
13:00
π
|
tef |
winr4r: ah cool |
13:01
π
|
tef |
winr4r: I was meaning to hook up my company's crawler to irc here but I've sorta not had the time yet |
13:01
π
|
winr4r |
tef: and i'm screenshotting fortunecity, i don't need to worry about any of the sites using features that only exist in webkit 2.x :P |
13:01
π
|
tef |
winr4r: :D |
13:01
π
|
tef |
winr4r: yeah there is also http://code.google.com/p/wkhtmltopdf/ |
13:02
π
|
Jaybird11 |
Okay, I have DeltaCopy installed. Is this basically a GUI Rsync? |
13:02
π
|
SketchCow |
It might be. |
13:03
π
|
SketchCow |
But a really basic one so I had hoped your scraper would work |
13:04
π
|
SketchCow |
root@teamarchive-0:/2# ls |
13:04
π
|
SketchCow |
ANYHUB BBC friendster FRIENDSTER-LOGS MANUALS SOPA-NEWSYC SYNTHMANUALS |
13:04
π
|
SketchCow |
archiveteamorg-dir.xml.xz BERLIOS FRIENDSTER GOOGLEGROUPS MOBILEME-SETS SPLINDER thenews |
13:04
π
|
SketchCow |
archiveteamorg-grp.xml.xz DNA friendster-grab.zip MAGAZINES SOPA-GRAB STUFF YAHOOVIDEO |
13:04
π
|
SketchCow |
Ok, so there we go, the sort of roundup of data that was on batcave. |
13:05
π
|
SketchCow |
Some of those are A TAD LARGE |
13:05
π
|
Wyatt |
Wow, Yahoo Video is still hanging around undigested? |
13:05
π
|
SketchCow |
A directory is. |
13:05
π
|
SketchCow |
root@teamarchive-0:/2/FRIENDSTER# du -sh . |
13:05
π
|
SketchCow |
1.7T |
13:06
π
|
Jaybird11 |
Sorry, I don't know how to do private messages in IRC. Do I have a server IP address or something to put into DeltaCopy? |
13:06
π
|
Jaybird11 |
say Or a hostname? It asks for a hostname and a virtual directory name |
13:07
π
|
Jaybird11 |
Also, sorry for typing say. I'm used to MUD/MOO systems where you actually have to type say before your text |
13:07
π
|
Wyatt |
Jaybird11: You can use /msg username or to open a ...I guess it's like a private channel with /query username |
13:11
π
|
Wyatt |
SketchCow: Where's he sticking this stuff? |
13:12
π
|
SketchCow |
fos.textfiles.com::qaudio |
13:12
π
|
SketchCow |
Oh man, this friendster thing is going to be a huge mess. :) |
13:14
π
|
winr4r |
what happened? |
13:14
π
|
Wyatt |
Well, we happened. |
13:14
π
|
winr4r |
http://lavender.fortunecity.com/powell/58/ |
13:14
π
|
winr4r |
also, will someone tell me if that loads for them? |
13:15
π
|
Wyatt |
Yes. |
13:15
π
|
winr4r |
this one quite dependably causes the script to crash |
13:15
π
|
winr4r |
okay |
13:15
π
|
winr4r |
well, not "crash", just sit there forever |
13:15
π
|
Ymgve |
that page...crashed my Opera |
13:15
π
|
Wyatt |
Wait. What? |
13:15
π
|
winr4r |
Ymgve: HM |
13:16
π
|
Ymgve |
for some reason it works now tho |
13:21
π
|
SketchCow |
OK, this is actually not as bad as I made it out. |
13:22
π
|
SketchCow |
I pulled out of mothballs the infrastructure for importing Friendster and once I did that, things are clicking into place. |
13:22
π
|
winr4r |
excellent :) |
13:25
π
|
SketchCow |
Most importantly, I had a program called The Renamerator which allows me to keep a consistent naming for these friendsters. |
13:25
π
|
SketchCow |
And this has shown a massive missing set of these files. |
13:25
π
|
SketchCow |
So that's good. |
13:26
π
|
SketchCow |
I think the next thing archiveteam wise is we need some programs written to pull down these files, do some hardcore analysis on them, and then upload those analysis files into the items. |
13:28
π
|
SketchCow |
OK, now to use the renamerator on the friendster files, all the rest have been tucked in. |
13:29
π
|
SketchCow |
This is how it works, for education: |
13:29
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 1300397206 2011-06-26 11:13 friendster.1000001-1009999.tar.bz2 |
13:29
π
|
SketchCow |
So, what's the tagline at the end.... like tar.gz or tar.bz2 |
13:29
π
|
SketchCow |
root@teamarchive-0:/2/FRIENDSTER# sh renamerator |
13:29
π
|
SketchCow |
tar.bz2 |
13:29
π
|
SketchCow |
------- |
13:29
π
|
SketchCow |
What's the middle piece, the XXXXXXXXX-XXXXXXXXX. |
13:29
π
|
SketchCow |
VVVVVVVVV-VVVVVVVVV |
13:29
π
|
SketchCow |
001000001-001009999 |
13:29
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 1300397206 2011-06-26 11:13 friendster.001000001-001009999.tar.bz2 |
13:30
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 1671043953 2011-06-26 12:28 friendster.1010000-1019999.tar.bz2 |
13:30
π
|
SketchCow |
So you see it took the file and helped me rename it so that leading zeros and length were consistent. |
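The renaming shown above (zero-padding both ends of the ID range to a fixed width) could be sketched like this. The real Renamerator is an interactive shell script whose source is not in the log, so this is only an approximation of its visible effect; the function name, the regular expression, and the default width of 9 (inferred from the `001000001-001009999` output) are assumptions.

```python
import re

# prefix . low-high . extension, e.g. friendster.1000001-1009999.tar.bz2
_RANGE_NAME = re.compile(r"(?P<prefix>.+?)\.(?P<lo>\d+)-(?P<hi>\d+)\.(?P<ext>.+)")


def pad_range_name(filename: str, width: int = 9) -> str:
    """Return filename with both range bounds zero-padded to `width` digits."""
    match = _RANGE_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognised name: {filename!r}")
    lo = match["lo"].zfill(width)
    hi = match["hi"].zfill(width)
    return f"{match['prefix']}.{lo}-{hi}.{match['ext']}"


print(pad_range_name("friendster.1000001-1009999.tar.bz2"))
print(pad_range_name("friendster.1010000-1019999.tar.bz2"))
```

Fixed-width names sort correctly as plain strings, which makes gaps in a set of range files easier to spot. The sketch only computes the new name and leaves the actual rename to the caller.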
13:30
π
|
* |
winr4r nods. |
13:30
π
|
BlueMax |
Good |
13:30
π
|
SketchCow |
I should say that last line was it showing me the next to do. |
13:31
π
|
SketchCow |
Anyway, this is helpful, since I have 79 of these to do. |
13:31
π
|
SketchCow |
Then I have to see about uploading them. |
13:36
π
|
winr4r |
are they huge? |
13:39
π
|
Jaybird11 |
Okay it looks like I have around 1984 files syncing. Wanted to do a test set first |
13:41
π
|
Jaybird11 |
This is going to take forever, my upload isn't the fastest |
13:54
π
|
SmileyG |
jackdaniels |
13:54
π
|
SmileyG |
yum yum |
13:54
π
|
SmileyG |
everyone should raise a glass |
13:56
π
|
* |
winr4r raises cup of tea |
13:56
π
|
emijrp |
hay guise |
13:57
π
|
winr4r |
hi emijrp |
14:08
π
|
alard |
So, if I see it correctly, there are "1NZ1".to_i(36) => 77725 new-style items on q-audio.net? |
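alard's expression is Ruby: `"1NZ1".to_i(36)` parses the string as a base-36 number. The same arithmetic can be sketched in Python with `int` and an explicit base. That q-audio.net item IDs are sequential base-36 counters is alard's reading of the site, not something confirmed in this log, and the helper names below are illustrative.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def base36_to_int(item_id):
    """Parse a base-36 string (case-insensitive) into an integer."""
    return int(item_id, 36)


def int_to_base36(n):
    """Render a non-negative integer as an upper-case base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(base36_to_int("1NZ1"))
```

If the IDs really are sequential, counting from 0 up to that value with `int_to_base36` would enumerate every candidate item name.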
14:09
π
|
alard |
Perhaps it shouldn't be that hard to download, if the front door stays shut. |
14:09
π
|
Jaybird11 |
It wouldn't be except that he's fighting scrapers. |
14:10
π
|
Wyatt |
I don't believe that's ever stopped us before. |
14:10
π
|
alard |
You could coordinate that. |
14:10
π
|
winr4r |
heh |
14:10
π
|
winr4r |
i don't get why he would be blocking scrapers |
14:10
π
|
alard |
One scraper at a time, at full speed, then someone else continues when it is blocked. |
14:10
π
|
Jaybird11 |
I assume he's paying for bandwidth and doesn't want everyone sucking it up |
14:10
π
|
winr4r |
Jaybird11: dreamhost is "unlimited" |
14:11
π
|
winr4r |
i think |
14:11
π
|
Wyatt |
If he's on Dreamhost, he's got at least a couple TB. They don't offer less, last I looked. |
14:13
π
|
alard |
Dreamhost might make it even easier: just hack in and rsync everything out. :) (Not the way to go, obviously, but if you look at the spam problem hacking Dreamhost can't be that hard.) |
14:14
π
|
closure |
"I have a complete archive of the Well" -- Waxy |
14:30
π
|
LucianT |
Testing. |
14:31
π
|
emijrp |
Testing. |
14:38
π
|
SketchCow |
We have forever |
14:45
π
|
BlueMax |
Testing our love |
14:47
π
|
Jaybird1 |
Testing. |
14:47
π
|
Jaybird1 |
Yup that works |
14:48
π
|
SketchCow |
http://www.youtube.com/watch?v=yVJnMj2oKfo |
14:48
π
|
Jaybird1 |
This is Jaybird11 using a different client, actually through a MOO |
14:51
π
|
Jaybird1 |
God my Q-audio rsync of the stuff I have is going to take forever |
15:02
π
|
SketchCow |
drwxr-xr-x 2 root root 4096 2012-04-15 06:27 FRIENDSTER-059000000 |
15:02
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 9720540573 2011-06-30 13:17 friendster.059950000-059959999.tar.bz2 |
15:02
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 9484202599 2011-06-30 16:00 friendster.059960000-059969999.tar.bz2 |
15:02
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 9747123742 2011-06-30 19:21 friendster.059970000-059979999.tar.bz2 |
15:02
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 9278985871 2011-06-30 22:07 friendster.059980000-059989999.tar.bz2 |
15:02
π
|
SketchCow |
-rw-r--r-- 1 jscott jscott 7637687509 2011-07-01 00:37 friendster.059990000-059999999.tar.bz2 |
15:02
π
|
SketchCow |
drwxr-xr-x 2 root root 4096 2012-04-15 07:59 FRIENDSTER-060000000 |
15:02
π
|
SketchCow |
drwxr-xr-x 2 root root 4096 2012-04-15 08:00 FRIENDSTER-065000000 |
15:03
π
|
SketchCow |
As you can see, it's a nice mix |
15:04
π
|
Jaybird1 |
I don't know how or I'd create a Q-audio page on the ArchiveTeam wiki |
15:07
π
|
winr4r |
is account creation still disabled, SketchCow? |
15:07
π
|
SketchCow |
Yeah |
15:07
π
|
SketchCow |
Maybe I'll fix that today |
15:07
π
|
SketchCow |
Right now, doing friendster, and going out to buy a camera for the defcon documentary. |
15:07
π
|
* |
winr4r nods |
15:07
π
|
winr4r |
SketchCow: because two isn't enough? :) |
15:08
π
|
SketchCow |
My two are for external interviews |
15:08
π
|
SketchCow |
for the main thing, I'm buying 5 |
15:08
π
|
winr4r |
Jaybird1: in any case, if jason fixes account creation later i'll happily do the page for you, if you tell me what you want to go on it |
15:08
π
|
winr4r |
five! |
15:09
π
|
winr4r |
DSLRs? |
15:10
π
|
Jaybird1 |
I guess just the standard ArchiveTeam info blob with URL, project status, etc. Obviously there's no tracker or source code. Later, if we are unable to get a full archive, we could put a callout to those who have private collections of their own or their favorite posts to come forward. |
15:10
π
|
SketchCow |
Nah, not dslrs. |
15:10
π
|
SketchCow |
Vixia thingy |
15:11
π
|
winr4r |
ah |
15:13
π
|
SketchCow |
8.4G FRIENDSTER-025000000 |
15:14
π
|
SketchCow |
First one dropping in |
15:15
π
|
Jaybird1 |
I killed the rsync for a sec to turn off compression of files. Since it's all audio, compression probably isn't accomplishing much and may even be slowing it down. I don't have access to transfer rates or percentages complete. |
15:15
π
|
SketchCow |
Exactly |
15:15
π
|
winr4r |
SketchCow: huzzah |
15:16
π
|
Wyatt |
Haha, if uploads to IA made musical notes, what would 8.4G of friendster sound like? |
15:16
π
|
SketchCow |
Jaybird1: So far you've uploaded 410mb. |
15:16
π
|
Jaybird1 |
That's it? Oh man my upload is really slow! |
15:17
π
|
winr4r |
that's slow? |
15:17
π
|
SketchCow |
It's seriously not that bad |
15:17
π
|
winr4r |
it actually took me all day to get 1.4gb of stuff to jason |
15:17
π
|
Jaybird1 |
I'd have thought I'd have a gig or two up by now. It's been a few hours |
15:22
π
|
winr4r |
Wyatt: are you around? |
15:23
π
|
Wyatt |
Aye? |
15:25
π
|
winr4r |
Wyatt: would you be so kind as to install xwd on the VPS? |
15:25
π
|
winr4r |
see if i can figure out wtf is going on with these sites consistently not loading |
15:29
π
|
Wyatt |
Done |
15:29
π
|
Wyatt |
I think. |
15:34
π
|
winr4r |
thanks! |
15:37
π
|
Jaybird1 |
Wow there are a lot of these tmp files with totally random and meaningless filenames. Unfortunately, because of the way the service works, even the guy who runs Q-audio doesn't have records of who uploaded each file, except for possible web server logs of IP addresses. |
15:52
π
|
Jaybird1 |
I'm going to go eat and do other stuff. Will stay connected though, and my Rsync will keep going. |
16:13
π
|
Nemo_bis |
rsync never eats? |
16:22
π
|
Wyatt |
Never. Especially after midnight. Moreover, getting rsync wet is ill-advised. |
16:57
π
|
SketchCow |
OK, off to the shopping |
16:59
π
|
winr4r |
SketchCow: don't forget my 5D mk III! |
19:34
π
|
alard |
Hi Insectoid. From twitter I gather that you're probably Mongoose_Q of Q-Audio, right? |
19:35
π
|
winr4r |
word, Insectoid |
19:36
π
|
Insectoid |
There we go. Yes sorted. I am |
19:36
π
|
alard |
Welcome! |
19:37
π
|
alard |
You probably want to speak to SketchCow / Jason Scott. |
19:37
π
|
balrog_ |
I'm curious, what are you involved with? :) |
19:37
π
|
winr4r |
he's out buying things atm |
19:37
π
|
Insectoid |
So first, there was Qwitter. It was pretty much the only Twitter client for the blind. |
19:38
π
|
Insectoid |
So then, I thought... Blind people, they'd probably like to use Twitter for voice clips! so, I created q-audio, a simple way of uploading voiceclips to share on twitter using the Qwitter client |
19:39
π
|
balrog_ |
…and people ended up using it for other stuff? |
19:39
π
|
Insectoid |
That was a few years ago using a Dreamhost VPS. Dreamhost has kind of gone to shit, I want to move away, q-audio is the only thing holding me here. It's 304 gigs. It's primarily copyrighted content at this point, 2/3 of it from a simple sql query (voice clips had temporary names created with the python tempfile module so are easy to find) |
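Insectoid's point that tempfile-generated names are easy to find can be sketched as a filename heuristic. This is an assumption-laden illustration, not the SQL query he ran: it assumes the clips kept the `tempfile` module's default `tmp` prefix followed by a short random run of 6 to 8 characters, and the function name `looks_like_tempfile_name` is hypothetical.

```python
import os
import re
import tempfile

# tempfile's default prefix ("tmp") followed by a short random run.
# The 6-8 character range is an assumption meant to cover the defaults
# of both Python 2 and Python 3; the real q-audio names are not shown.
_TEMP_STEM = re.compile(re.escape(tempfile.gettempprefix()) + r"[A-Za-z0-9_]{6,8}")


def looks_like_tempfile_name(filename):
    """Heuristic: does the name (minus extension) look tempfile-generated?"""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    return _TEMP_STEM.fullmatch(stem) is not None


for name in ["Track01.mp3", "Faixa_3.mp3"]:
    print(name, looks_like_tempfile_name(name))
```

Being a pattern match, this can misfire on a user-chosen name that happens to start with `tmp`, so it is better suited to a first pass than to a final split of the collection.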
19:40
π
|
Insectoid |
I thought I'd shut it down. A lot of people (well, 3) protested. so I'm here. |
19:41
π
|
winr4r |
how are you sure that it's primarily copyrighted? |
19:41
π
|
bsmith096 |
someone said they have a complete WELL archive ?!? |
19:41
π
|
winr4r |
(i don't doubt you, i'm curious) |
19:41
π
|
winr4r |
bsmith096: waxy.org |
19:41
π
|
Insectoid |
Filenames primarily |
19:41
π
|
winr4r |
Insectoid: ah |
19:42
π
|
closure |
bsmith096: waxy yes |
19:43
π
|
Jaybird1 |
I'd like to jump in here. For those not following the Twitter conversation, yes it's true that a good deal of it is probably copyrighted files with no business ever having been uploaded. But there is some real user-generated content there. |
19:44
π
|
winr4r |
on the upside, Insectoid gained an awesome and varied music collection! |
19:46
π
|
Insectoid |
(u'Thaeme_Mar |
19:46
π
|
Insectoid |
ioto_Feat_Heliao-os_Anjos_Choram.mp3',), (u'Faixa_3.mp3',), (u'Saint_Clements_Ch |
19:46
π
|
Insectoid |
oir_-_Saint_Clements_Carol.mp3',), (u'um_amor_para_recordar392.mp3',), (u'06._Mu |
19:46
π
|
Insectoid |
chos_Quieren.mp3',), (u'Track01.mp3',), (u'CoolSong-NormalSpeed.mp3',), (u'corte |
19:46
π
|
Insectoid |
.mp3',), (u'plach.mp3',), (u'novela1.mp3',), (u'hore_linda_tao_linda.mp3',), (u' |
19:46
π
|
Insectoid |
Linkin_Park_-_Live_In_Texas_-_With_You_HQ.mp3',), (u'107_-_COMING_AROUND_AGAIN.m |
19:46
π
|
Insectoid |
p3',), (u'radioactivo_-_barbie_q.mp3',)] |
19:46
π
|
Insectoid |
the last filenames out of the database as they currently stand |
19:46
π
|
Insectoid |
>>> |
19:48
π
|
winr4r |
Insectoid: yeah, i sympathise |
19:49
π
|
* |
SmileyG is always out of the loop |
19:49
π
|
SmileyG |
what you backed up? |
19:54
π
|
Jaybird1 |
`mat Use headphones |
20:02
π
|
Insectoid |
So ... Now what? |
20:02
π
|
winr4r |
Insectoid: wait for jason to get back, he'll arrange whatever with you |
20:02
π
|
Insectoid |
Ah okay :) |
20:03
π
|
winr4r |
he is out buying video cameras |
20:03
π
|
tsp___ |
Insectoid: IMO the first step is to disable uploads, so people can't put new stuff into it |
23:56
π
|
Nemo_bis |
SketchCow, what do I need to do to be able to upload or move items to the wikiteam collection? |
23:57
π
|
Nemo_bis |
I'll need to create several hundreds and it will be quite tedious to change them afterwards. |
23:58
π
|
chronomex |
be good idea to upload to own collection, then add it as a subcollection to archiveteam |