[00:01] Yeah.
[00:04] :o
[00:04] not that I'm a brony or anything >.>
[00:04] <.<
[00:04] >.<
[00:04] #bronyteam
[00:05] Zebranky: >:| http://www.bronycon.org/index.php/meet-staff
[00:05] haha
[00:06] We're migrating from Joomla to Wordpress shortly, sooooo
[00:06] oic
[00:06] It's fun, though a hell of a lot of work.
[00:07] Pretty neat to have phone calls with John de Lancie, though.
[00:07] I got to explain "trolling" to him last night.
[00:08] I'd be curious to hear his thoughts of Trekkies vs. Bronies
[00:09] I think that'll have to wait for the con. He's been to innumerable Trek cons at this point, but ponies? This'll be a first.
[00:10] Good point
[00:10] I forgot that this is the first real MLP con, other than small fan gatherings
[00:10] (someone call Morgan and Jason!)
[00:11] Zebranky: is adult artwork allowed in the artist alley? (just morbidly curious)
[00:13] I was going to ask "what the fuck happens at a brony con" but then I remembered that I've been suckered into attending anime cons and I had the same question
[00:13] I went to two (NYCC/AF), and the...energy is unforgettable
[00:13] chronomex: hehe
[00:13] I want to go to one, but just haven't had the means
[00:13] Strictly PG. We're trying to make nice with Hasbro.
[00:14] cool :)
[00:14] underscor: You're SF area, right?
[00:14] hasbro's been fantastic
[00:14] shaqfu: nope, DC
[00:14] underscor: sounds like someday you will be a real live archivist with a real live salary
[00:14] underscor: I'm surprised; there are a few down there
[00:15] Heh, well, I'd say Studio B/DHX have been fantastic, next The Hub, next Hasbro Studios, and finally Hasbro corporate.
[00:15] shaqfu: I'm freshly 18
[00:15] chronomex: Real live archivists get paid in what amounts to Monopoly money :(
[00:15] My parents would go through the trouble of notarizing the consent form
[00:15] wouldn't*
[00:15] shaqfu: hmmmm. that's a problem.
[00:15] wouldn't*
[00:15] hmm
[00:15] Consent forms for cons?!
[00:15] fucking quassel
[00:16] underscor: doesn't your bank have a notary?
[00:16] Almost all do, if you're under 18
[00:16] shaqfu: it's common for <18
[00:16] Oh, true
[00:16] chronomex: oh, yeah. but it's more than that
[00:16] they don't want me going to a con alone, etc
[00:16] but FUCK THAT I'M GOING TO VEGAS
[00:16] I figured there wasn't much of a barrier thanks to how many <18 are at anime cons
[00:17] underscor: Are you still involved in libraries/archives?
[00:17] yeah, definitely
[00:17] If you are, join ALA; you get *really* cheap VIP con tickets
[00:17] I work in collections at my local library system
[00:17] Like, NYCC for $10
[00:17] he's going to vegas to archive money
[00:17] well, "volunteer"
[00:18] it's an unpaid internship
[00:18] underscor: Get used to it :(
[00:18] and I'm discussing a job with brewster kahle of ia
[00:18] but that's VERY tentative at the moment
[00:18] (it'd be a part-time-while-I'm-in-school thing)
[00:19] Ah, okay
[00:19] Even then, a job at IA right out of high school, god damn
[00:19] hahaha
[00:19] I get the feeling I'll be hearing a lot more from you 5 years down the line
[00:20] it's probably the mind control drugs I put in his drink
[00:20] awwh, :D
[00:20] I hope so!~ :3
[00:20] Oh, speaking of your age, is that a liability with the Defcon filming?
[00:21] I ask because I know with AC casinos, you have to be 21 to go on the casino floor
[00:21] same in vegas
[00:21] I don't anticipate you gambling much, but just something to keep in mind
[00:21] afaik it's not an issue, based on the faq on defcon's website
[00:21] yea
[00:21] but SketchCow's fully aware of my age, so I assume he knows what he's doing
[00:21] Sounds good
[00:21] hahahaha
[00:21] you assume wrong
[00:22] :D
[00:22] All the "con" areas are all ages
[00:22] Hopefully you don't have to interview someone at a poker table
[00:22] so I should be okay
[00:22] There are parties and stuff that are bounced, but other than that
[00:22] pfft, I'll just get a fake ID from silkroad
[00:22] >.> <.<
[00:22] hahaha
[00:22] it's not hard to get trashed at defcon when you're 17
[00:23] so long as you aren't obnoxious about it
[00:23] Seems like the kind of place that attracts a young crowd
[00:24] :D
[00:24] chronomex sounds like he's experienced
[00:25] shhhhh
[00:25] I only drink while soldering
[00:27] hahahahahaha
[00:27] I've actually never had any alcohol in my life o.O
[00:27] o_o
[00:28] If I believe half of what my friends say, I'm the only one
[00:28] lol
[01:08] Well, parodius.com project is dead
[01:09] shaqfu: ?
[01:10] Can't wget the pages, and the admin has ethical opposition to archiving them
[01:11] Ethical opposition? Why so?
[01:11] "ethical opposition" ?
[01:11] he hates his users?
[01:11] He feels that it's up to the users to decide if they want their content saved or not
[01:11] He hates history?
[01:11] he hates the future?
[01:12] underscor: Are you religious?
[01:12] The only "ethical" point I could see is that of not wanting anything from your past to be archived and dug up later.
[01:12] And forcefully wget'ing everything ended poorly
[01:13] He cut you off?
[01:13] So we're jammed
[01:13] mistym: Total blacklist
[01:13] At least he didn't go all robots.txt, there are many versions on the wayback machine.
[01:13] Ew.
[01:13] (...maybe it's worth grabbing those before he retroactively robots.txts it.)
[01:13] Using 1GB of bandwidth was too much
[01:13] Sounds like a challenge to me.
[01:14] aggro: Unless there's some way to spoof normal active browsing, under fake agents, I've got nothing
[01:15] How does he intend to determine "normal active browsing" from a slow and steady crawl with fake agents? Looks like the site is staying up for a few months.
[01:15] aggro: That's what I was thinking of
[01:18] Never thought I'd manage to offend someone by asking to make a copy of data they're watching...
[01:19] Basically all I told him was that we're archiveteam (link) and we want to archive everything; what would be the best way to do this?
[01:20] Another example of better to ask forgiveness than permission :P
[01:20] I even invited him to pop in on IRC. :/
[01:23] " In fact, given our highly moral and ethical values, I'm a little surprised you'd even ask for this. I hope I'm misunderstanding your request, otherwise I'm actually a bit offended by it."
[01:23] I wonder if he's a fanfiction writer.
[01:25] aggro: I tried that first; went poorly :P
[01:25] And like a moron I deleted everything I grabbed since I wasn't using warc
[01:25] It sounds like he (she?) is more frustrated than anything. I'm assuming this site has been up for a really long time (1991?)
[01:26] I suppose we could point out that anything without robots.txt excluding Wayback is already archived
[01:27] How does heritrix handle forums?
[01:28] wget handled the cyberpunk forums wonderfully. (http://cyberpunkreview.com/forums/)
[01:28] It had no issue with the forums I grabbed
[01:29] aggro: Including linked files?
[01:29] Like images and such?
[01:29] Yeah.
[01:29] images and mp3s, etc.
[01:29] If it's embedded, yes
[01:29] (This is me plotting to get a better archive of Soundshock)
[01:30] Oh, Hmm...
[01:30] problem: retarded forums that require you to log in to see picture attachments
[01:30] No biggie really.
Wget should have support for auth cookies and post requests with the login data. Might require a bit of massaging though.
[01:31] Sidenote: If you know why wget segfaulted in http://archiveteam.org/index.php?title=Cyberpunkreview.com#Sunday_April_22_2012_Update <-- that case, let me know :P
[01:31] Wyatt: You can, but it looks like you have to be careful
[01:31] You'd have to tell it to span hosts (-H) but limit recursion
[01:31] Otherwise you'll probably grab the entire Internet
[01:31] haha
[01:32] Haha, that would be slightly suboptimal.
[01:32] Wyatt: better to be too comprehensive!
[01:32] I already did an httrack crawl of it a few months back, but I'm not sure how thorough and complete it is.
[01:32] Anyway, input on handling Parodius?
[01:33] Should we call it dead or think of some new approach?
[01:33] shaqfu: wait for the admin to get back to you
[01:33] winr4r: He did
[01:33] oh never mind, i just saw that
[01:33] He did, and with fervor :D
[01:34] I suppose we could wget...slowly
[01:34] wget does random waiting.
[01:34] Wyatt: Slowly so it doesn't eat his bandwidth
[01:35] wget -w $some-integer --random-wait $some-url
[01:35] wget only does random waiting with the --random-wait option
[01:35] plus all the other options of course.
[01:35] I'm curious what kind of crappy colo he's got where a paltry few gigs is a problem.
[01:36] and --random-wait uses the value from --wait
[01:36] Should I at least respond to this guy, or just leave it at that?
[01:36] one local company around here only offered speeds like 1Mbps or the like for outrageous rates
[01:37] Shit like that happens when you've got monopolies.
[01:37] this was a small local ISP
[01:37] Wyatt: I'd be curious to see what sort of ethical objection he has.
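The `--wait` / `--random-wait` combination discussed above can be sketched roughly as follows. This is only an illustration, not what anyone in the channel actually ran: the helper names `random_wait_bounds` and `build_polite_wget`, the example URL, and the choice of `--mirror`, `--page-requisites` and `--warc-file` are assumptions. The 0.5x to 1.5x range is how the wget manual describes `--random-wait`.

```python
import shlex


def random_wait_bounds(wait_seconds):
    """Return the (min, max) delay wget uses with --wait plus --random-wait.

    The wget manual describes the delay as varying between 0.5 and 1.5
    times the --wait value.
    """
    return (0.5 * wait_seconds, 1.5 * wait_seconds)


def build_polite_wget(url, wait_seconds, warc_name=None):
    """Build a wget argument list for a slow, randomised crawl (hypothetical helper)."""
    cmd = [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--page-requisites",   # also fetch embedded images, CSS, etc.
        f"--wait={wait_seconds}",
        "--random-wait",       # only has an effect together with --wait
    ]
    if warc_name:
        # Needs a wget build with WARC support (1.14 or later).
        cmd.append(f"--warc-file={warc_name}")
    cmd.append(url)
    return cmd


if __name__ == "__main__":
    # example.com is a placeholder, not the site under discussion.
    print(shlex.join(build_polite_wget("http://example.com/", 10, warc_name="example")))
    print(random_wait_bounds(10))
```

Building the argument list in a script rather than typing it by hand makes it easy to keep the same delay settings across several crawl jobs.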
[01:38] though their network connectivity likely came from the biggest local datacenter provider: US Signal
[01:38] We get signal
[01:38] aggro: Copyright
[01:38] He doesn't think that we should forcibly copy his user's public data
[01:39] I almost want to respond "fine, give me their addresses and we'll mail them deed-of-gift forms"
[01:39] Then I suppose he has a moral and ethical opposition to any big search provider scraping too?
[01:39] Also, I'm not aware of any copyright laws in the states that prohibit archival.
[01:39] you know what i love that?
[01:39] aggro: I'd respond "fine, we'll just grab pages whilst respecting robots.txt" but, again, bandwidth
[01:39] -that
[01:39] that one time where someone changed their ethics based on what some other person said to them in an email
[01:40] people have values and it's usually best to work around them than to change them
[01:40] Yeah, I don't see it changing
[01:40] Which is what we're talking about.
[01:40] shaqfu: which is why emailing him again won't accomplish anything
[01:41] So leave it as it is?
[01:41] And draw up battle plans?
[01:42] What, you think we should just let it die?
[01:42] not permanently, just for the moment
[01:43] Just until we figure out a way to grab it
[01:43] We will become parodiusitic organisms.
[01:43] wait till it's closer to the closing, then ask again
[01:43] dashcloud: I'm concerned that if we wait too long, users will have migrated off
[01:43] but isn't that the best of all worlds?
[01:43] True
[01:43] the users actually have their stuff, and it's being used
[01:43] Since then the data's still alive, somewhere
[01:44] maybe that's how you could approach it later- ask to get everyone's stuff that hasn't been claimed
[01:45] Feels risky.
[01:45] I don't like the idea of playing the waiting game.
[01:45] If you do wait, and he doesn't budge, then we're less time
[01:46] do you have a rough idea of how many pages exist?
[01:46] Relevant quote: " Each and every hosted person here has a right to define whether or not they want their content archived. Some may have robots.txt in place, others may not but might have other conditionals (some technical, some via footers/agreements on their page). Their data is their data; I am not the owner of their data. I cannot decide for them if they are comfortable with that."
[01:46] dashcloud: I have a list of domains I grabbed from the site and Google
[01:46] if you do, someone could figure out the minimum pages per day that could be done to still get them all in time
[01:47] dashcloud: A lot of the domains host forums
[01:47] And there's the FTP - still dunno what to do with that
[01:48] Whether to call it a loss or try and find what's public
[01:48] if it doesn't have anonymous access, there's nothing that can be done
[01:49] unless the lack of anonymous is for some misguided reason, and the un/pw are posted publicly on the website or forums
[01:49] but then you have to find that info in time
[01:49] That's a trick
[01:49] the whole situation is kind of depressing, especially since the notice is really good, and from that notice he seems like he wants to do the right thing for the users
[01:50] dashcloud: I was shocked
[01:50] "Their data is their data." True. And they leave it publicly available.
[01:50] If you don't want something archived, it would seem reasonable not to place it in a public space.
[01:51] I don't think most people have adjusted to what privacy is on the web- especially since before social networks, it was quite easy to be both public and private with your material
[01:52] From a legal standpoint, my understanding is that archival is permitted so long as the archive is not making a profit off of the material.
[01:53] sure, sounds reasonable
[01:53] whatever
[01:53] we've been over this
[01:54] chronomex: Hm?
[01:55] aggro: What's legal isn't what matters right now
[01:55] Then what's the conclusion? Slow crawl? Leave it?
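The "minimum pages per day" arithmetic suggested above is simple enough to sketch. The log gives no page count or shutdown date, so the figures in the usage lines are made-up placeholders and the function names are hypothetical.

```python
import math

SECONDS_PER_DAY = 86_400


def min_pages_per_day(total_pages, days_left):
    """Smallest whole number of pages per day that finishes before the deadline."""
    if days_left <= 0:
        raise ValueError("days_left must be positive")
    return math.ceil(total_pages / days_left)


def max_average_wait(total_pages, days_left):
    """Longest average delay (seconds) between requests that still finishes in time.

    Assumes a single crawler running around the clock, one request per page.
    """
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    return days_left * SECONDS_PER_DAY / total_pages


if __name__ == "__main__":
    # Placeholder numbers only: 90,000 pages and 90 days are not from the log.
    print(min_pages_per_day(90_000, 90))
    print(max_average_wait(90_000, 90))
```

The second function gives an upper bound for a `--wait` style delay; anything longer on average would not finish before the site goes away.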
[01:57] That's what I'm trying to gauge
[01:57] Just slow crawl it at a reasonable rate then. If anyone has access to a box with several IPs, shuffle the job across them.
[01:59] aggro: If you want to take the project up, go for it. I can't do much since, again, blacklisted
[02:00] I've never heard of the site before. Any place I can read up on it?
[02:00] on fortunecity, i used --wait=.5 --random-wait to try and reduce my risk of being noticed and blocked
[02:00] aggro: parodius.com
[02:00] It's not a huge host so don't expect to find much
[02:01] i was more interested in preserving the data than scoring high on the tracker
[02:01] Alright. So hosting company. Cool. I'll look into it. Has a wiki page been made yet?
[02:02] Nope
[02:02] It might be best to keep this one quiet, though
[02:02] I'm not sure it's in our best interest to make the admin even angrier
[02:03] indeed
[02:03] I think it's time for a new crawler
[02:03] If you're doing it, do it in stealth mode
[02:03] scripted WebKit engine
[02:03] DIFFERENTIATE THIS
[02:03] Fake agent, slow, delayed, etc
[02:04] yeah. The fortunecity scripts did all of that (except delay) iirc.
[02:04] you could hack such a thing up with a logging proxy + one of the scriptable browser things
[02:04] their colo and bandwidth through: http://corp.bayarea.net/
[02:04] --> we need a HTTP proxy that dumps everything into a .warc
[02:04] that'd be badass.
[02:04] chronomex: let's go into business
[02:04] hrm?
[02:05] oh
[02:05] that just sounded like a webapp
[02:05] Hm, is there any way to grab some data, hop on another proxy, and continue?
[02:05] and all webapps can be businesses even if you don't have a business plan
[02:05] haha
[02:05] Hell, if we're slow-crawling, tor might be useful...
[02:05] I wonder how much work it would take to write .warc from a http proxy.
[02:05] shaqfu: you should probably not do a fake agent
[02:05] Tor may work
[02:05] but Tor is also easy to block
[02:05] the exit node list is public
[02:05] Oh, true
[02:05] shaqfu: So, I find it interesting he makes the argument that archiving should be down to individual site owners, but then takes action to prevent you from archiving stuff from people who didn't set robots.txt
[02:06] my favorite user agent is "EAT DELICIOUS POOP (Googlebot; compatible)"
[02:06] mostly because it will stick out in a list of user agents
[02:06] dashcloud: Fake it to a browser
[02:06] Can it be randomised?
[02:07] That is, give it a list of user agent strings to pick from at random?
[02:07] You can set HTTP_PROXY in shell and just have a script change it based on a set.
[02:07] mistym: You'd think robots.txt not forbidding crawling would be implicit agreement, but w/e
[02:07] here you go: http://www.useragentstring.com/pages/Browserlist/
[02:07] shaqfu: This IS the internet
[02:08] I think if you just go slow you'll be fine
[02:08] So take it slow, use a spoofed agent, and possibly use a proxy
[02:08] Sounds like we've got a plan
[02:09] I wrote a comment ages ago on archiveteam's position on robots.txt.
[02:09] http://news.ycombinator.com/item?id=2531625
[02:09] recommended reading.
[02:10] "Sort of like a cat's attitude to playing fetch: You want it? You go get it. I've got more important things to do."
[02:10] LIKE SHUT MY SHIT DOWN
[02:10] right
[02:10] Haha, special snowflake comment
[02:10] I usually design my sites to be okay with crawling all the pages.
[02:11] i design for easy replication
[02:11] CouchDB ftw
[02:11] "You are not a beautiful or unique snowflake. You are the same decaying organic matter as everyone else."
[02:12] my /robots.txt: "# Go for it."
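The question above about picking a user agent at random from a list, and switching proxies from a set, could look something like this minimal sketch. The user agent strings and proxy addresses are deliberately fake placeholders, and the function names are hypothetical. Note that the channel was not unanimous on spoofing the agent at all.

```python
import random

# Placeholder values only; a real list would come from somewhere like the
# browser list linked in the log.
USER_AGENTS = [
    "ExampleBrowser/1.0 (placeholder A)",
    "ExampleBrowser/2.0 (placeholder B)",
    "ExampleBrowser/3.0 (placeholder C)",
]

# Hypothetical proxy endpoints on a reserved documentation address range.
PROXIES = [
    "http://192.0.2.10:8080",
    "http://192.0.2.11:8080",
]


def pick_user_agent(rng=random):
    """Choose one user agent string at random from the list."""
    return rng.choice(USER_AGENTS)


def crawl_environment(rng=random):
    """Return the settings for one crawl run.

    The http_proxy value is meant to be exported as an environment variable
    before the crawler starts, as suggested in the log.
    """
    return {
        "user_agent": pick_user_agent(rng),
        "http_proxy": rng.choice(PROXIES),
    }


if __name__ == "__main__":
    print(crawl_environment())
```

Passing in a seeded `random.Random` instance makes a run repeatable, which helps when working out which settings got a crawler blocked.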
[02:12] Y'know, I have to say
[02:12] Coming from the world of paper, which tends to be hyperconservative about acquisitions (due to the very real threat of legal action)
[02:13] It is so goddamn refreshing to say FUCK IT and do the right thing
[02:13] :D
[02:17] oh
[02:17] http://archiveteam.org/index.php?title=Software
[02:17] I didn't know we had that
[02:17] maybe someone should add all the tools on the github to it
[02:18] shaqfu: Haha, yes!
[02:20] selenium might be useful at some point as well
[02:35] http://archiveteam.org/index.php?title=Parodius_Networking
[02:35] Left mostly blank for now. Y'all can add as you see fit :)
[02:48] nitro2k01: not really, why?
[02:55] Since you said you haven't tried alcohol at all
[02:56] You're not missing much :P
[02:56] I'm reasonably sober, (drink maybe 2 times per year, at all) and I've never been really drunk
[02:56] Well, depends on precisely what you're drinking... but for the most part, it's all piss.
[02:56] As in, never been so drunk I haven't sobered while still awake
[02:58] And not religious, vtw
[02:58] *btw
[03:02] I'm not religious, just usually law abiding
[03:02] plus I haven't really seen a reason to drink it
[03:02] Just like smoking :P
[03:02] Socializing is about the single biggest reason.
[03:02] Depending on the context (work, clients, etc)
[03:03] being sober is way underrated
[03:03] lol
[03:03] actually does being pumped on caffeine count as sober?
[03:03] again, in high school, there aren't a lot of socialization contexts where drinking is part of it
[03:03] What about being high on dopamine?
[03:03] if not then i have actually not been sober in at least 5 years
[03:03] I mean, aside from wild parties
[03:04] but if there's alcohol, I leave.
[03:04] I'm not dealing with that shit
[03:04] fucking cops, all that noise
[03:04] Cops would be your biggest headache.
[03:04] underscor: aggressive drunk arseholes, too
[03:04] lol
[03:04] after x drinks: COME AT ME BRO!
[03:05] I have the feeling I'd be a terrible drunk
[03:05] like, the most awful you can imagine
[03:05] I don't know which "type" I'd be
[03:05] aggro: yes
[03:05] maybe SketchCow will help me find out in vegas!
[03:05] hahahahahahaha
[03:06] although, aggressive drunk person + tough sober person = drunk person going to hospital
[03:07] it doesn't just make you more aggressive, it means you can't hit worth a shit
[03:07] But you THINK you're Bruce Lee!
[03:07] haha
[03:07] aggro: yes
[03:08] Hangovers are what turned me off from heavy drinking. After my third I said fuck this shit.
[03:09] FINALLY found some coal in this fucking seed (minecraft)
[03:09] lol
[03:10] aggro: "I DEMAND MOUNTAINS!" has been a good seed for us.
[03:12] Has anyone else here used onenote?
[03:13] underscor: Encountered it briefly and on another occasion was given a demo by an avid user. It's pretty neat.
[03:13] I love it!
[03:13] it's so amazing
[03:13] I wouldn't be able to survive school without it :(
[03:13] We get like a ream or so of paper a month
[03:14] and I can laugh at all my friends when I have context aware search and OCR and freeform editing and stuff
[03:14] and they're pawing through 40 sheets of paper
[03:14] it's fantastic
[03:14] I've found that pretty much every user is an avid user
[03:14] There aren't really any casual onenote users
[03:21] I've tried to find an alternative on Linux; closest I've gotten is Basket, thus far.
[03:25] Yeah :(
[03:25] I just had to suck it up and leave my tablet as windows
[03:25] The touch support was dismal anyway, so it's not a big deal
[03:26] it's pretty much a dedicated onenote notebook machine
[03:27] What in the actual fuck: http://www.citypaper.net/blogs/nakedcity/Philadelphia-School-District-announces-its-dissolution-.html
[03:27] Time to archive the awkward 90s web design of public schools?
[03:28] What's that? Public schools failing? In MY America?
[03:28] That problem is so multifaceted I don't know how anyone plans to solve it.
[03:29] amurrrrica
[03:29] failing at public education
[03:30] I'm sure the archiving of those is already handled adequately by IA
[03:30] aggro: you don't, you think of alternatives
[03:31] You've got teachers underpaid, administrative staff overpaid, a teacher's union that's become so powerful no-one wants to go against it, etc etc
[03:31] that or don't have kids so it doesn't affect you, thumbsup.jpg
[03:31] property taxes bro :P
[03:31] go homeless
[03:32] 1) Go homeless 2) Do home teaching 3) ??? 4) PROFIT!
[03:32] lol
[03:32] I have considered that on some occasions. Really depends on what city you happen to be in though.
[03:32] Speaking of that, check out this documentary http://www.youtube.com/watch?v=lDcVrVA4bSQ&t=3m0s
[03:33] Compared to the rest of the world, us folks in the states have enormous homes.
[03:33] Which is strange, considering how little time is spent in them.
[03:34] Lots about the US is strange
[03:34] like me, I'm in the US
[03:34] and I'm strange
[03:34] Whatever doesn't kill you simply makes you...
[03:34] strange
[03:34] 8)
[03:42] people are strange
[03:42] \o/
[05:57] huh nesdev offers a straight-up full site download, no need to spider that at least ftp://ftp.parodius.com/pub/nesdev/nesdev_weekly.zip
[05:57] Hi gang.
[05:59] hi
[05:59] i'm aboyt to fall asleep on kb
[06:00] may produce several pages of the letter j if i do
[06:03] hi jason
[06:07] Hi.
[06:09] A group came out of nowhere to offer DEFCON a free documentary
[06:09] It was asked if I would consider working with them
[06:09] It's interesting, it energized me to go "FUCK NO"
[06:12] DEFCON asked you or they asked you?
[06:13] which is to say: DEFCON asked you to consider working with the free documentary team, or the free documentary team asked you?
[06:14] i mean, i could see the value in communicating with them, but "please stop doing this thing from which you're making money and do this very similar thing for free" is really cool
[06:27] fuck it nonetheless
[06:41] Ah
[06:41] I am being paid to make a DEFCON documentary
[06:41] Group wrote them to say "We'd love to make a free comprehensive defcon documentary"
[06:41] I was asked if I was interested in working with them in any way
[06:42] I'm not.
[06:42] Was DEFCON paid for via crowd, or via DEFCON?
[06:43] http://www.youtube.com/watch?feature=player_embedded&v=2mpOwtz2AaM
[06:43] Via DEFCON
[06:43] My fee got cashed two weeks ago. ;)
[06:44] Ah, snazzy
[06:44] (off-topic, how'd NYU's preservation week events go?)
[06:45] Well, I only went to this cute little meeting
[06:45] What I love is that there were occupy archivists there
[06:45] And they were very gung ho AS YOU WOULD EXPECT
[06:46] But during their presentation they mentioned how it's hard sometimes to hang out at the Occupy office because THOSE people NEVER stop talking about occupy
[06:46] That image is amusing to me
[06:46] Occupy has an office?
[06:46] Yes
[06:47] I thought it was run like the old Bolshevik army - vote on everything! - but I suppose that makes sense
[06:47] They have some tiny thing they're renting for organization and I guess there's piles of stuff there
[06:47] Well, you can certainly have vote on everything, but they do need SOME place you call to get bailed out or to store something or whatever
[06:47] Because obviously sitting in a park isn't working 100% for long-term storage
[06:47] Sensible
[06:48] Does the org make much paperwork?
[06:48] I'm not that involved/aware
[06:48] I wasn't sure if it was discussed much
[06:48] In fact, I could care less, ultimately - they're just another subculture with interesting things.
[06:49] Oh no, these were pretty straightforward presentations
[06:49] It's so adorable, they're all well-meaning students, some of archiving, some of whatever
[06:49] I was always kinda confused what the archivists did - I didn't know they actually *made* anything to hold onto, other than keeping news talk
[06:49] But they produced some actual useful shit
[06:49] http://activist-archivists.org/wp/?page_id=328
[06:49] Stuff like that
[06:50] Ah, okay. Keeping Occupy-related stuff usable, lest it end up like Egyptian media (which is not)
[06:50] http://archive.org/details/occupywallstreet
[06:50] I helped with this!
[06:50] But I also added a tea party one
[06:50] because I am bad
[06:52] Hey, it's accomplished more
[06:53] Glad to see NYU really got into it - didn't see much else today
[06:54] http://archive.org/details/cnn-transcripts-2000-2012
[06:54] Brilliant work by archive team member.
[06:56] 12 years of cable news transcripts? Future historians are going to want to build a statue to whoever did that
[07:01] He's finding more
[07:02] I noticed that the URL in the desc redirects elsewhere :(
[07:04] its http://transcripts.cnn.com/TRANSCRIPTS/
[07:04] need that /TRANSCRIPTS, seems like an idiot redirect to go back to home instead of just /TRANSCRIPTS
[07:04] Yeah
[07:07] http://www.library.ohiou.edu/subjects/commwiki/index.php/Broadcast_News_Transcript_%26_Video_Sources looks like that lexis nexis has lots of others too
[07:24] IA's continuing tv archive recordings include closed caption dumps, when available on the channel
[07:24] Yes
[07:24] Exactly
[07:38] Oh MAN, is FOS getting hammered. :)
[07:39] Poor FOS, it thought it would just be a crawler or convert .mp4 movies into a range of derivatives
[07:39] But now, it's being force-fed the foie gras of mobileme
[07:40] When MobileME is done, this gets blasted out of the speakers of archiveteam: http://www.youtube.com/watch?v=PEbJ4qLiMu0
[07:46] With pleasure
[07:46] SketchCow, what do you mean with "done"?
After those few selected users, alard said there are many more, not indexed by search engines etc.
[07:46] I don't understand if IA wants those too.
[07:46] Also, this happened: http://www.youtube.com/watch?v=RNuUgbUzM8U&t=1m43s
[07:46] Seriously? There are more?
[07:47] * NovaKing loves that song
[07:48] That is insane double-dutch
[07:48] I played you the dubstep mix, this is the original
[07:48] And has an official video
[07:48] ya, i know of dj fresh
[07:48] i also like 'louder'
[07:48] What I mean with "done" is "done" - the project ends
[07:48] Like friendster - incomplete. Or like Splinder - complete.
[07:48] We'll see which it ends up being.
[07:49] Did you see my response to Rick Prelinger asking how archivists deal with not saving everything?
[07:49] Not yet.
[07:49] "I'm curious how #archivists employed in fulltime jobs adjust to constraints that prevent them from collecting everything they might want to." - Prelinger
[07:50] "I imagine everything is on fire under our terrible sun and every time we collect something people gratefully cry" -me
[07:52] http://www.zeldman.com/2012/04/24/content-strategy-double-header-a-list-apart-349/
[07:53] In case you want to see the kind of lady I'm engaged to.
[08:01] http://bits.blogs.nytimes.com/2012/04/24/harvard-releases-big-data-for-books/
[08:10] SketchCow, any idea on the license?
[08:10] I hope Harvard will follow best practices as Karlsruhe and others.
[08:11] I'm not sure how I missed you were engaged, but congratulations.
[08:11] No idea.
[08:12] I've been engaged since last April.
[08:12] I don't discuss it much
[08:12] Who need another person with a target on their back
[08:12] I was married for 10 years, she wasn't mentioned much either.
[08:13] Yeah, didn't know that either. I guess I just don't know much beyond your public self. I understand your rationale, though.
[08:14] She has Sockington these days.
[08:42] http://theoatmeal.com/comics/state_web_spring
[08:47] One has to know when to stop and call something done. I believe that's often the hardest part of doing something. It's certainly valid for my day job; You always want to test software more, in any way, but you have to focus on what's the most value (ie covering as much as possible in as short time as possible). also known as "focusing on not over-doing it"
[08:51] quite
[08:51] or, "no, you don't need that feature"
[08:53] the number of times i've been told that i need to add this because this structured data is not quite expressive enough and they want to do
[08:55] some can live with tech debt, some can't
[08:55] which also ties into what SketchCow linked to: http://www.alistapart.com/articles/content-modelling-a-master-skill/
[08:56] I'll give it a read
[09:01] i mean i wouldn't mind adding it, but if i gave in to adding every single field, rendering option, and etc i'm asked to do, 1) the admin would be horrendously complicated with a wall of options for each thing 2) i'd be blamed for 1)
[09:04] (example: i was told that for a single business entry in the database, STRUCTURED data, it should render differently depending on which category was used to access it, and the listing within that category should also have a different title depending on category
[09:05] hey guys, i'll just make you a bunch of static files and send you a fucking HTML editor next time
[09:06] sorry, off-topic, other than that rachel lovinger is totally right (with my added proviso "don't always listen to clients")
[09:10] Client don't generally know what they want, and try to design instead of listening to the designer.
[09:10] in my experience.
[09:11] Oh they know what they want, they just don't know how to express it, much less how to design and program it :P
[09:11] I disagree ;)
[09:11] if they knew what they want, you could show them and they'd go "Yes, but with X", and when you added X they'd be happy.
[09:11] but they aren't they then go "Oh now I'm not so sure.... maybe it needs more Y?"
[09:12] ....and in that moment, the client tries to be a designer.
[09:12] they are no longer asking you to design something
[09:12] they are telling you how they think the design should change to meet their (unknown) requirement
[09:16] i can even deal with that
[09:16] Or not? Maybe I'm wrong...
[09:16] i stopped working with clients a long time ago :D
[09:17] winr4r: As long as you know its happening then its ok.
[09:17] changing requirements don't bother me, wanting structured data to act like unstructured data does
[09:17] winr4r: yeah sounds like your screwed :D
[09:17] * SmileyG_ sobs at the idea of all the non-redundant systems he oversees
[17:34] SketchCow: What're you trying to do w/ ffmpeg?
[17:36] Just make a fucking 512x384 .flv that isn't a bitrate of six million k a second.
[17:36] And now tell me to use the -b option!
[17:36] Because I do that, and it goes "Ah! You want 610k/s! Well excellent. I'll begin encoding at 1M/sec now"
[17:36] It's like talking to someone at a movie theater popcorn line
[17:38] Erk, yeah. Bitrate in particular is "fun" that way.
[17:38] Try -vb 610k
[17:39] It's now a race between handbrake and ffmpeg
[17:39] -b used to be taken as a shorthand for always meaning video bitrate. Which they realized is heck of vague, but changing the behaviour didn't help when everyone was doing it the old way for years.
[17:39] Nope, still doing 1 meg
[17:40] It's just so fucking frustrating
[17:40] This shouldn't be so hard.
[17:40] I realize some of this is because the MpegFF guy is a toolbag
[17:42] You need a second pass unfortunately :/
[17:42] -b:v 610k? (which still may not be EXACTLY the bitrate you want... it's sitting around 740k/s when I do that. Fun.)
[17:42] fucking video crap
[17:43] ^ that
[17:45] b:v unrecognized option
[17:46] But let's not keep going in this channel, it's 100% not archiveteam related.
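The "second pass" suggestion above, combined with the `-b:v 610k` option, might be scripted along these lines. This is a sketch under assumptions rather than a known fix for the build used in the log (which rejected `b:v`): the file names are placeholders, the helper `two_pass_commands` is hypothetical, and the option spelling follows current ffmpeg documentation.

```python
import os


def two_pass_commands(src, dst, video_bitrate="610k", size="512x384"):
    """Return (first_pass, second_pass) ffmpeg argument lists.

    The first pass only analyses the video and writes a stats log, so its
    output is discarded; the second pass uses that log to hit the target
    average bitrate more closely.
    """
    common = ["ffmpeg", "-y", "-i", src, "-s", size, "-b:v", video_bitrate]
    # -an drops audio in the analysis pass; -f is needed because the null
    # device has no file extension for ffmpeg to guess the format from.
    first_pass = common + ["-pass", "1", "-an", "-f", "flv", os.devnull]
    second_pass = common + ["-pass", "2", dst]
    return first_pass, second_pass


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here, the commands are only printed.
    for cmd in two_pass_commands("input.mp4", "output.flv"):
        print(" ".join(cmd))
```

The lists can be handed to `subprocess.run` one after the other. Even with two passes the result is an average bitrate, so short clips can still land somewhat above or below the target.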
[17:46] It's that FFMPEG is fucking retarded and WinFF, which I am using, is also retarded. [17:46] Which is a non-relevant issue [17:46] I'm just angry because I don't want to keep doing this second video job of mine [17:46] But I do need that second income stream [17:47] So the fact that I can't even say "do this" and have it DO IT even after it SHOWS I SET THE OPTION tells me there is assery ahoy [17:47] THAT is more archiveteam related: assery ahoy [17:47] #archiveteam: assery ahoy [17:48] LOL [17:50] OK, upgrading to most recent. [17:50] I apparently am slightly behind. [17:51] ffmpeg is behind only imagemagick in frequently changing what commands exist/do [17:51] (I secretly love ffmpeg. But seriously, it does some silly things.) [17:52] fuck IT, seriously :P [17:56] thanks >_< :D [17:57] OH FUCK YES [17:57] and by that I get to watch NEW WinFF be retarded as old [17:57] SketchCow: also, libav has more or less superseded ffmpeg [17:57] oh wait [17:57] never mind, ffmpeg is not exactly in use here [17:57] yipdw: No, libav is a fork that's developed by some other people. Work goes into both by different people. [17:57] mistym: I thought all the good developers on ffmpeg had left [17:58] yipdw: Far as I know, not everyone. A bunch of people left though, yeah. [17:58] I thought everyone with a brain left [17:58] so in that sense it's a bit like the XFree86/Xorg split [17:58] OK, new thing has a flv preset I am now going to assault [17:59] yipdw: XFree86 just straight-up stopped developing things. Really seems more XEmacs/Emacs, or every BSD. [18:01] SketchCow: Hopefully the preset works. [18:01] OK, time for some delicious lunch. [18:05] mistym: Maybe more like Open/LibreOffice? Irreconcilable differences, so fork. [18:06] http://webaim.org/blog/user-agent-string-history/ [18:07] OK, solution [18:07] buy Adobe Media Encoder [18:07] Preset, with slight changes to bandwidth and resolution.
[18:07] I couldn't hear the rest after you said "buy" [18:07] I almost heard you say Adobe [18:07] obtain Adobe Media Encoder [18:08] But that's like hearing "rape.. child" and you know it can't get better the longer you read [18:08] actually, AME is a pretty terrible piece of software too, so maybe it won't work so well [18:09] SketchCow: rapeseed produces oil I use to cook things for children [18:09] BAM [18:10] yipdw: Oh god no. I hate AME so very very much. [18:10] * mistym migrated from AME to scripts on top of ffmpeg in her workflow ages ago [18:10] mistym: heh [18:11] the times I've used it have been a bunch of misery [18:11] Did you know that it truncates videos at a random, not-quite-4GB marker when encoding to a network drive? [18:11] maybe it's just what I'm used to, but uncompressed export -> avisynth -> x264 tends to be a hell of a lot more understandable [18:11] mistym: nope [18:12] is it a silent failure? [18:12] Completely silent. [18:12] wonderful [18:12] It treats it as a success. [18:12] how the hell is that even professional-grade [18:12] Near as I can tell it's not a 32/64-bit thing, and it has no trouble encoding to arbitrary gargantuan sizes on a local disk. [18:13] I wonder if it's a CIFS thing, but that wouldn't really make sense [18:13] I've written massive files way above 4GB over CIFS/SMB before [18:13] ...from VirtualDub [18:15] God only knows. Not diagnosing that anymore, that's for sure. [18:16] that sounds like the hard-earned result of an absolutely horrible day [18:24] OK, problem solved [18:26] SketchCow: Awesome. The flv preset did it? [18:27] Yeah [18:27] "Make for websites" [18:27] Then I modified the bandwidth and the resolution, left the rest [18:28] I am sure some video Asperger's person could tell me another 1000 option settings, but as it is, it produces much smoother video than it did before. [18:28] And since I am in a very nice city, I am going to go fetch some groceries.
[18:28] And tell a bunch of people that no, I am not giving them a dollar [18:28] Here's a good one [18:28] See, that's a better use of your time, and I say that as one of the aforementioned video nerds. [18:29] I'd go nuts if I had to encode a lot of video [18:29] I go to a market down here, the lady was walking by a week or so ago, and someone boosted some stuff while a truck was being unloaded [18:29] I mean, grabbed a crate of food and ran top speed away [18:29] What makes it great is that it was BLUEBERRIES [18:29] lol [18:29] Two guys ran out and around the corner to chase him [18:30] "BLUEBERRY THIEF!" [18:30] But man, something about "What you in for?" "Boosted some BLUE berries, man" just makes it [18:30] Grand Theft Blueberry [18:30] Which I think would be a great hooker name [18:30] What does one even do with a crate of blueberries? Eat them for every meal for a week? [18:30] mafia (allegedly) watches the main fruits and vegetables bulk market here (one of the biggest in Europe) to avoid that [18:31] shaqfu: so many pancakes [18:31] TOTALLY closed the anderson deal. Get the escort service on the line, I want GTB to stop by [18:31] mistym: Or jelly by the gallon [18:31] NYC Milk market was totally mafia controlled [18:31] When they got it out, the price went down by, like, 25% [18:31] OK, now I want to make NYC Milk Market: The Videogame [18:32] http://www.brooklynpaper.com/stories/30/14/30_14rawmilk.html [18:32] mistym: fuck yes [18:32] "Sell 'insurance' to grocery store owners for extra points" [18:32] I'd throw a blueberry your way for it [18:33] What a weird story [18:43] what a non-story [18:46] Hot Milk [19:37] Ooh, new Adam Cadre IF [19:39] Oh, weak, it's a port (but still extremely cool) [20:24] So who was/is working on FlickrFckr? [21:02] mistym: I was [21:02] I think I might have been the only one, can't quite remember [21:03] underscor: Cool! I was curious how you're doing it. [21:05] Oh, jeez.
A bunch of gross bash scripts [21:05] I haven't touched it since like last june [21:05] haha [21:06] Were you using the Flickr API, or was it based on scraping the site? Guessing scraping if you were doing it in bash. [21:13] Both [21:13] well, I scraped an API key off the site [21:13] and then used the API for the rest of it [21:14] (so each request used a different api key, less suspicious) [21:14] An illicit key! How naughty. [21:14] hahaha [21:14] Well, I don't want to deal with them banning it or something [21:14] plus scraping of that scale violates the terms of use [21:15] Yeah, makes sense. [21:21] So any interesting gotchas working with the Flickr API? I was interested in a smaller-scale Flickr grab of some kind myself. [21:27] Haven't really found anything, actually [21:27] Their API is p sweet [21:28] Well that's awesome to know. [21:30] Will have to give it a try sometime. Wish they had an easier way to get an API key like, e.g., Twitter does. You don't have to go all cloak 'n dagger or submit an app to a public registry there. [21:30] Yeah :( [21:30] Steal the key they use on the api test page, it's what I do [21:30] ;) [21:31] If the idea migrates into a work project they would probably not be impressed with that approach. ;) [21:39] haha [21:39] true [21:43] for flickrfckr, render each flickr page as a PNG by running a browser in Xvfb, and upload it to a Flickr account [21:43] I guess that won't work too well unless you have a Flickr Pro account though [21:55] * underscor does [21:55] that'd be an interesting project [21:55] I wonder if they'd get upset [22:06] who would? [22:26] I just tried ABBYY FineReader today, and it was amazing- I had no idea OCR could be that good [22:29] it's Good. 
[22:36] I only had version 9 (the express edition)- I can't even imagine how much better the later versions are [22:41] does it cost teh money [22:45] yes [22:50] $30 for the version I got [23:11] dashcloud: that's not so bad [23:11] i have some manuals that i've not OCRed just because they're a pain in the arse to do [23:11] i'd like something that can take a PDF, OCR it, add a text layer, save PDF without me having to fart around [23:13] okay- the $30 version won't take a PDF- it just handles scanning and going from there, and converting images into text [23:13] winr4r: uploading to archive.org does all that automatically :) [23:13] if you convert your PDF into a series of images, you'll be fine [23:13] SketchCow: i got INPUT magazine [23:13] if you're interested, it's here: http://www.tigerdirect.com/applications/SearchTools/item-details.asp?EdpNo=1729182&CatId=973 [23:14] i don't know if you guys got it [23:14] it was for 1981-1985 [23:14] *from [23:15] i also have a good dark-rack of edge magazine [23:40] did I miss someone mentioning this before? http://www.theverge.com/2012/4/23/2961601/0-day-art-digital-art-torrents-piracy