[00:16] * Coderjoe binks
[00:16] BLINKS
[00:17] review of a taco bell on google's store review system: Kind of a flamer for a cashier. But whatev's it was good food.
[00:17] wtf does having a "flamer for a cashier" have to do with anything?
[00:23] haha
[00:23] obviously it's something to take into consideration when deciding to frequent said establishment
[00:23] well, this is a crazy conservative area. perhaps someone is afraid he'll put The Gay in their food
[00:23] :D
[00:24] dat gay powder, recruitin' all them y'ung folks to their craaaazy buttfuckin' ways
[01:10] Wow, Twitter loves the Montreal Mirror.
[01:10] you mean the montreal mirror mirror?
[01:11] mirror^2
[01:20] crazy germans
[01:20] http://www.liveleak.com/view?i=4d6_1341254855
[01:20] and http://www.youtube.com/watch?v=RobaJKGMMiE
[01:25] underscor: it's interesting how all those people claiming that 'the gays are trying to infect others' never think about a reasoning as to why 'the gays' would want to do that
[01:26] OH SHIT
[01:26] joepie92!
[01:26] ohai :P
[01:26] * joepie91 ninjas into discussion
[01:26] i wonder how widely the just do it summer project will cover. like i guess different kinds of media formats, but that's basically a separate category from file/archive/etc formats
[01:26] because they're EVIL and want to DESTROY HUMANITY
[01:27] oh hai joepie91
[01:27] I thought there were 2 joepie's for a minute
[01:27] I didn't know which one to shoot
[01:27] haha
[01:27] joepie91: the gay agenda is as mysterious as it is nonsensical
[01:28] lol
[01:29] hm "FileTeam", maybe. for the summer project thing
[01:38] O_O
[01:38] http://boingboing.net/2012/07/03/cisco-locks-customers-out-of-t.html
[01:47] jesus
[03:33] in case you thought it was just their consumer division that had it out for people, think again: http://arstechnica.com/tech-policy/2011/07/a-pound-of-flesh-how-ciscos-unmitigated-gall-derailed-one-mans-life/
[05:16] SketchCow, those BBS textfiles gone up yet?
[05:16] ummmm
[05:16] Google Video stopped taking uploads in May 2009. Later this summer we'll be moving the remaining hosted content to YouTube. Google Video users have until August 20 to migrate, delete or download their content. We'll then move all remaining Google Video content to YouTube as private videos that users can access in the YouTube video manager. For more details, please see our post on the YouTube blog.
[05:29] yeah I never figured out why they didn't kill google video on time
[05:29] every time I see a video on it I think "This relic is still around?"
[05:32] I am all for competition in terms of product offerings but gv was never good
[05:39] because of us
[05:40] and complaints about there not being an easy migration to youtube
[05:52] really? I wrote a Perl script to do it for a friend. It fetched the video from gv and then posted it to youtube via the api
[05:52] not even 50 lines
[05:52] mmm
[05:52] postmortem of the AWS us-east-1 outage
[05:53] http://www.theregister.co.uk/2012/07/03/amazon_outage_post_mortem/
[05:53] omf_: complaints about google not providing an easy means for non-techie people to do so
[05:53] omf_: well i'd rather have GV up than not have those videos be available. and they're going to be private on youtube i guess, so as good as gone unless AT/archive.org puts their stuff in action
[05:54] ^ what he said. if they go private, we may never see them again
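What a minimal version of that "not even 50 lines" migration script might have looked like. The original was Perl; this sketch is Python, and the GData v2 direct-upload endpoint, headers, and Atom envelope are assumptions reconstructed from the era's YouTube API, not the actual script. Credentials and URLs are placeholders.

```python
# Sketch: fetch a video off Google Video, then direct-upload it to YouTube.
# Endpoint/header details are assumptions based on the 2012-era GData v2 API.
import urllib.request

GV_VIDEO_URL = "http://video.google.com/..."  # hypothetical source URL
AUTH_TOKEN = "CLIENTLOGIN_TOKEN"              # placeholder credentials
DEV_KEY = "DEVELOPER_KEY"
BOUNDARY = "f93dcbA3"

# Minimal Atom metadata envelope the direct-upload call expected (assumption)
ATOM = """<?xml version="1.0"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:media="http://search.yahoo.com/mrss/">
  <media:group>
    <media:title type="plain">Recovered from Google Video</media:title>
    <media:category
     scheme="http://gdata.youtube.com/schemas/2007/categories.cat">People</media:category>
  </media:group>
</entry>"""

# step 1: pull the video down from Google Video
video = urllib.request.urlopen(GV_VIDEO_URL).read()

# step 2: POST metadata + bytes to YouTube as one multipart/related request
parts = [
    ("--%s\r\nContent-Type: application/atom+xml; charset=UTF-8\r\n\r\n%s\r\n"
     % (BOUNDARY, ATOM)).encode(),
    ("--%s\r\nContent-Type: video/mp4\r\n"
     "Content-Transfer-Encoding: binary\r\n\r\n" % BOUNDARY).encode(),
    video,
    ("\r\n--%s--" % BOUNDARY).encode(),
]
req = urllib.request.Request(
    "http://uploads.gdata.youtube.com/feeds/api/users/default/uploads",
    data=b"".join(parts),
    headers={
        "Authorization": "GoogleLogin auth=" + AUTH_TOKEN,
        "GData-Version": "2",
        "X-GData-Key": "key=" + DEV_KEY,
        "Slug": "recovered.mp4",
        "Content-Type": 'multipart/related; boundary="%s"' % BOUNDARY,
    },
)
print(urllib.request.urlopen(req).read())
```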
[05:56] I would rather have google give us a straight dump of all the content and then close the site down. Running "competing" services in a company can cause serious problems. Yahoo is always a great example of this
[05:56] a bunch are apparently grabbed already, so hopefully it's almost done
[05:56] I always worry about companies messing up already working products
[05:56] omf_: i think the only sites to give dumps have been url shortener places after they've shut down
[05:56] AIM / ICQ and how that worked so well
[05:57] arrith1, that is a shame
[05:57] omf_: yeah. but also probably some liability google lawyers wouldn't want to worry about
[05:57] this reminds me again of when the jQuery admins borked their whole plugin system
[05:58] lost everything on the site
[05:58] no backups
[05:58] and then tried to play it off as a good thing(tm)
[05:58] and this was last year I believe
[05:58] wow
[05:59] i'd hope using some kind of dvcs they'd have backups 'for free'
[05:59] lots of sites mirror even just site code/html/etc to like github
[05:59] they had a content management system that has an automated backup option
[06:00] it kills me
[06:00] more than half of what they had has not reappeared in the wild yet
[06:00] sites like that are probably good candidates for the Deathwatch page on the wiki
[06:01] for people to get ideas of what sites to preemptively archive
[06:01] like ff.net is being backed up currently
[06:03] yeah I have added a few things to deathwatch
[06:04] SketchCow: I understand now why you didn't upload bbs interviews
[06:04] I have also been looking back through my bookmarks for sites that are candidates as well
[06:04] omf_: nice
[06:04] the more of that the better
[06:05] yep. I have 4 more entries I am working on.
[06:05] eventually I am going to join the ranks of archiving sites that are alive
[06:05] I do some of reddit now
[06:06] yeah reddit sure could use it, it's not that big either
[06:06] hackernews is next for me
[06:06] omf_: looking into the archiveteam warrior project might be good
[06:06] reddit has a massive amount of new content per day
[06:07] the archive team warrior is making it easier for lots of people to spring into action, even preemptively
[06:07] oh yeah and that is where it would have to go
[06:07] the problem is also mapping out all the subreddits. I have been working with a stats tracker on that
[06:07] omf_: well at least compared to like video sites, reddit is only a few hundred GB iirc, unlike other places
[06:07] omf_: might want to put what you've got so far on a page on the wiki for reddit
[06:08] one thing about reddit is the posts and associated comments get moved to a cold storage after a certain date
[06:08] people rallied together to get fanfiction.net preemptive archiving that way
[06:08] ah yeah
[06:08] you can only go 1000 entries back in a subreddit
[06:08] also hard to get past 1k items
[06:08] yeah
[06:08] that i've yet to figure out how to get around
[06:08] the admins have said publicly "instead of scraping, if you want a copy of reddit just ask" though i'm not sure how they feel about AT
[06:08] so i haven't tried
[06:08] I have been talking one of the reddit devs into making me data dumps of certain sections
[06:09] I want to see how much I can get out of them
[06:09] that would be pretty good
[06:09] omf_: if you get anything, it would be good to get that backed up onto archive.org servers
[06:10] omf_: dunno if you're familiar with the AT methods but it usually goes: a bunch of users scramble to dl a site / sites, then upload to reserved space generally on archive.org servers, then it gets inserted into the archive.org system
[06:11] 570,770 is the number of unique posts I have so far
[06:12] not counting links in posts, comments, or reddit pages
[06:13] omf_: yeah i'd be curious about your methodology, for storing, verifying if you already have a post or not, etc
[06:15] well I kinda hacked it initially but left the door open
[06:16] the fields I wanted to track are mapped to columns in a table
[06:16] then there is a column that holds the entire fetched json
[06:16] so I can get any piece of data at a time
[06:16] I use a unique key against the url and subreddit
[06:17] because I want to see cross posts
[06:17] the script runs on a cron job doing different sections at different times of the day
[06:18] I also create log files on runs. The log file contains all the post urls only and a little metadata. If needed I could rebuild the database using the log file
[06:21] omf_: using files or a db like mysql?
[06:21] mariadb
[06:21] the logs are flat text files
[06:22] I have another table that tracks reddit votes over time
[06:22] and the comment count
[06:22] so I can see which kinds of stories in which subreddits get the most attention
[06:23] hm if you tweaked that into an archiveteam warrior compatible setup you might be able to get some help downloading
[06:23] oh that is easy to do
[06:23] I am going to wait to see if they will just give me the data dump
[06:23] i'm not sure what people call it but there's also this heroku tracker thing, where it has a full status page displayed. they've had that for other projects
[06:23] so far they have been friendly and helpful
[06:23] hmm yeah
[06:23] have they said 'yes'?
[06:24] They said it would depend on subreddit size and how long it takes to pull off cold storage
[06:24] if I can get them at a lull in work they said it would be more feasible.
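A minimal sketch of the storage scheme omf_ describes above: tracked fields as real columns, the full fetched JSON kept alongside them, and a unique key on (url, subreddit) so cross-posts stay visible. He runs MariaDB from cron; sqlite and all the table/field names here are stand-ins to keep the example self-contained. The pagination loop also runs into the ~1000-entry listing horizon mentioned above.

```python
# Sketch, not omf_'s actual scraper: pull a subreddit's /new listing via
# reddit's public JSON API and upsert into a table keyed on (url, subreddit).
import json, sqlite3, time, urllib.request

db = sqlite3.connect("reddit.db")
db.execute("""CREATE TABLE IF NOT EXISTS posts (
    url TEXT, subreddit TEXT, title TEXT, score INTEGER,
    num_comments INTEGER, raw_json TEXT,
    UNIQUE (url, subreddit))""")

def grab(subreddit):
    after = None
    while True:
        # reddit's listing API stops paging around 1000 items -- the
        # "can only go 1000 entries back" limit discussed above
        url = "http://www.reddit.com/r/%s/new.json?limit=100" % subreddit
        if after:
            url += "&after=" + after
        req = urllib.request.Request(
            url, headers={"User-Agent": "polite-archiver/0.1"})
        listing = json.load(urllib.request.urlopen(req))["data"]
        for child in listing["children"]:
            p = child["data"]
            # INSERT OR IGNORE: the unique key makes re-runs idempotent
            db.execute("INSERT OR IGNORE INTO posts VALUES (?,?,?,?,?,?)",
                       (p["url"], p["subreddit"], p["title"], p["score"],
                        p["num_comments"], json.dumps(p)))
        db.commit()
        after = listing["after"]
        if not after:
            break
        time.sleep(2)  # be polite; reddit bans bots that hammer the API

grab("archiveteam")
```

The unique key plus INSERT OR IGNORE is what makes repeated polling safe: the same cron job can hit a subreddit several times a day without duplicating rows.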
[06:25] if you can get enough of those, maintaining a full mirror would get a lot easier
[06:25] one would just have to maintain it by grabbing the latest stuff
[06:29] maintaining is easy so far
[06:29] some subreddits I poll 4 times a day, others 1 time a week
[06:30] figuring out the frequency is one of the things I would have to do for at warrior
[06:30] slamming everything on their site violates their TOS
[06:30] yeah, that sounds like a fun statistics problem
[06:30] nah it is just getting a list of all the subreddits
[06:30] well it'd be a bunch of different people
[06:31] I looked at the wikiteam
[06:31] they have it divided into lists
[06:32] I would do the same but based on update frequency. Some users would have to leave warrior running for 4 cron jobs or more a day. Not so much a hit and run like normal AT actions
[06:32] i think the preemptive archiving stuff will be very different from the dash and grab AT stuff
[06:33] since yeah people will have to leave it running. but people have unused server capacity and stuff
[06:33] plus they can always drop out if they need that capacity, and resume if/when they can
[06:34] true
[06:34] hm that should be more prominent on the wiki. making it clear that there are dash and grab efforts and other preemptive efforts
[06:35] i think preemptive stuff is just so new. i haven't looked recently but i think the only preemptive AT effort currently is the fanfiction.net one
[06:37] yep
[06:38] that is why I am updating deathwatch
[06:38] so people can see the types of sites we are looking for
[06:39] might be good to put together a 'candidates for preemptive archiving' page, if some sites look extra dire, or they're especially popular like reddit
[06:39] hm, more for my userpage todo list
[07:16] hmmm
[07:16] google killing more stuff but nothing of import afaik
[07:16] all presentation stuff + google video, which is moved to youtube.
[07:18] the best product google canceled was google squared
[07:18] there is nothing out there like it now
[07:18] google squared o_O?
[07:18] dude
[07:18] you would run a search
[07:19] and it would return this interactive spreadsheet
[07:19] on the left one range of the search and the right another range
[07:19] you could then add refining terms or select squares
[07:19] that square would become the focus and then the results would change
[07:20] it allowed you to visualize, sort and use disparate data
[07:20] web pages, fact data, images, video, it did it all
[07:22] there have been other preemptive panic grabs
[07:23] here is a 37 second intro https://www.youtube.com/watch?v=__INtIXNLmI
[07:24] and then this google talk that has it https://www.youtube.com/watch?feature=player_detailpage&v=5lCSDOuqv1A#t=1658s
[07:25] also each use of google squared ran around 200 searches
[07:26] i think i might see why they killed it :P
[07:26] they killed all of google labs for no good reason
[07:26] they had it open for 2 years
[07:27] and gave a tech talk about it last year and then bam, closed
[07:27] $$$$
[07:27] fucking Larry Page did that
[07:27] he cut a ton of shit and focused everyone on g+
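On the polling-frequency question from earlier in the log (some subreddits polled four times a day, others weekly): one toy heuristic, which is not omf_'s actual scheduler, is to derive the poll interval from a subreddit's posting rate so the 1000-item listing can never roll over between polls. The safety factor and clamps below are invented numbers.

```python
# Toy heuristic: poll often enough that only a fraction of the 1000-item
# listing horizon can fill up between polls.
SAFETY = 0.25  # only let the listing fill a quarter of the way between polls

def poll_interval_hours(posts_per_day):
    """Hours between polls so <= 1000*SAFETY posts arrive in between."""
    if posts_per_day <= 0:
        return 24 * 7  # effectively dormant: weekly is plenty
    hours = (1000 * SAFETY) / (posts_per_day / 24.0)
    return max(1.0, min(hours, 24 * 7))  # clamp between hourly and weekly

# e.g. a subreddit doing ~1500 posts/day needs a poll every ~4 hours,
# while one doing 5 posts/day can safely be polled weekly
for rate in (1500, 120, 5):
    print(rate, round(poll_interval_hours(rate), 1))
```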
[07:33] well, i think this kickstarter will fail. 40 hours to go and less than .6% pledged
[07:33] kickstarter for what?
[07:34] ah I do remember that
[07:34] an indie action/adventure hero-rescues-the-girl feature film
[07:34] o_O
[07:35] sounds inspiring.
[07:35] http://www.kickstarter.com/projects/1431718519/lady-in-distress-feature-film
[07:35] hmmm
[07:35] I "bookmarked" it on kickstarter, and they sent me the 48hr message today
[07:36] Anyone use a remote server to upload to IA?
[07:36] it looked mildly interesting, but i wasn't sure i would have any funds to chip in
[07:36] I was thinking of using my webserver to fetch and then upload a collection I have been working on
[07:37] or know where I could rent a seed box or the like
[07:37] omf_: I sorta do. it isn't really different than doing it locally via the commandline using the s3api interface
[07:37] my local internet is super slow, that being the difference
[07:38] I am talking about a server with speed
[07:38] and I am refactoring that s3api script
[07:38] as part of it
[07:38] the box is on my lan, but i just ssh'd in and wrote a python wrapper script around curl
[07:39] the only thing to watch for is an error telling you to slow down, but i think you can only get that if you're on the ia lan
[07:40] I am only going to upload 1 file at a time
[07:40] they are cd and dvd isos
[07:41] I have been writing scrapers for years. I am polite about file transfers because it does not draw attention
[07:41] reddit for example bans bots all the time for hitting their service too fast.
[07:41] How hard is it really to add a sleep statement to your code to fix it?
[07:42] sure. SketchCow was able to overwhelm it with one at a time uploads while uploading to the s3api from within IA's network
[07:43] for s3 insertion you don't generally need a sleep statement
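A sketch of the kind of "python wrapper script around curl" described above for pushing files at the IA s3api. The s3.us.archive.org endpoint and the LOW-key authorization header are IA's documented S3-like interface; the item identifier, keys, and metadata values are placeholders.

```python
# Sketch: upload local files one at a time to an Internet Archive item
# via the s3api, shelling out to curl as described above.
import subprocess, sys

ACCESS, SECRET = "IA_ACCESS_KEY", "IA_SECRET_KEY"  # from archive.org/account/s3.php
ITEM = "example-cd-dvd-isos"                       # hypothetical item identifier

def upload(path):
    filename = path.rsplit("/", 1)[-1]
    subprocess.check_call([
        "curl", "--location", "--fail",
        "--header", "authorization: LOW %s:%s" % (ACCESS, SECRET),
        "--header", "x-amz-auto-make-bucket:1",  # create the item if needed
        "--header", "x-archive-meta-mediatype:software",
        "--upload-file", path,
        "http://s3.us.archive.org/%s/%s" % (ITEM, filename),
    ])

# one file at a time, as discussed above; retry/slow-down handling left out
for path in sys.argv[1:]:
    upload(path)
```

Uploading sequentially like this matches the "you don't generally need a sleep statement" comment: a single PUT at a time rarely needs throttling against the s3api.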
[07:48] https://sphotos.xx.fbcdn.net/hphotos-ash3/529310_337414343002615_1271453806_n.jpg
[07:50] ha
[08:03] lol @ running across a page with anonnews links on the archiveteam wiki
[08:41] joepie91: nice haha
[08:42] it's still a bit odd to browse a site and run across a link to a site of my own, lol
[08:42] not sure if I'll ever get used to that
[10:34] So teh higgs was behind the sofa after all...
[11:09] https://www.youtube.com/watch?v=bl_1OybdteY
[13:11] uploading episode 123 of dl.tv
[13:11] :-D
[13:23] uploading episode 124 of dl.tv
[14:24] godane: :D
[15:21] http://cardiganponi.tumblr.com/post/26444462697/important-message-about-the-bronycon-orgy
[15:23] ...the orgy?
[15:27] Sounds like trolling to me
[15:39] SketchCow: thanks for the ops *bows head as sword is tapped on his shoulder*
[15:53] Well, -bs is quite the bush leagues
[15:53] good morning jason :)
[15:53] (bush leagues?)
[16:06] (never mind, looked it up)
[16:09] Oh, damn, forgot you're english
[16:10] I'm, like, the idiom-o-matic
[16:10] When Sockington used Spastic as a term, his UK fans flipped
[16:10] Oh hey I got ops too :D
[16:11] yeah, everyone gets to party in -bs
[16:11] SketchCow, did you put those BBS textfiles from 1987 up somewhere?
[16:11] I wouldn't mind having a look
[16:11] those won't be up for a tad
[16:11] I have to go on my next trip
[16:12] OK
[16:12] Let me know when you do
[16:12] If you remember
[16:12] SketchCow: haha
[16:12] Oh, wait
[16:12] http://www.textfiles.com/F/
[16:12] that's some of them
[16:12] yeah, in the 1980s "spastic" was a standard awful playground insult
[16:12] But there's over 100, they'll all be put there before splitting off into their permanent home
[16:13] I remember hearing about a Nintendo game that got recalled because the translators had used the word without knowing what it meant in the UK.
[16:14] Fair enough SketchCow, cheers
[16:15] mistym: pooper mario?
[16:17] SketchCow: so about this new project, the basic idea is to document how to get stuff out of old formats (digital or otherwise) into new ones?
[16:17] i could probably spend some time working on that, at least gathering what information is available
[16:19] winr4r, that does appear to be the point
[16:20] I'll be writing what I'm planning in more detail on the wiki page.
[16:20] http://www.archiveteam.org/index.php?title=Just_Solve_the_Problem_2012
[16:23] 'This is not a "sprung from the forehead of Zeus" attempt to completely re-boot the process of enumerating the many formats out there. Much work has been done and there is much to share.'
[16:23] hm
[16:24] a lot of work has been done there, but all the sites that are out there that give information on lots of file formats are also pretty sketchy
[16:25] like, they'll have some brief documentation on it then it's like "if you can't open .WPS files, you MAY HAVE ERRORS IN YOUR REGISTRY!!! click to download REGISTRY FUCKFACE v13.9 to correct these errors!!!"
[16:25] Well, not QUITE true.
[16:26] Realize, one of the sad things to happen here with archive team is I'm constantly exposed to all sorts of insider crapola in the archiving biz.
[16:26] And there's actually a pile of initiatives
[16:26] i can imagine
[16:26] Some are very good, in fact.
[16:30] so, what is the goal behind the new project that these other projects aren't doing?
[16:30] Be unencumbered by funding, politics and justification.
[16:31] so it's still about documenting various formats and how to get stuff out of them? :)
[16:31] Yes
[16:32] but unencumbered by anyone saying "well, do we *really* need to know how to read raw files from a canon EOS D30?"
[16:32] Not quite the problem.
[16:32] The problems are usually
[16:32] - "Look, we only have these interns until Sept. 13, do NOT send them off to document non-critical formats"
[16:33] - "Why the fuck are we including BBS texts as canonical documents"
[16:33] - "Oh, we're just doing DIGITAL formats. Punch cards are self-evident"
[16:33] ah!
[16:33] I want a boundary-less hothouse
[16:34] or even, "books are self-evident so we don't need to digit-LOLJK pulping the stuff that isn't important to us now"?
[16:35] either way, i'm all for throwing some time into it
[16:35] I'd help if I had a way to
[16:35] I think we can stand on the picplz thing.
[16:35] 36 hours, we duped that bitch
[16:36] i was more than impressed by you guys doing that
[16:37] The picplz team have bigger e-peens than most people on the net right now
[16:38] <- just received his Internet Archive sweater. e-peen++
[16:38] Schbirid: more than jealous
[16:38] kinda hoped for a shirt tbh but still nice
[16:38] pft :P
[16:38] I will only wear a shirt that advertises IA if it has SketchCow in a tutu on the back saying "There are things we need to save. This is not one of them."
[16:39] i don't think underscor would lend SketchCow his tutu
[16:41] that reminds me I have a few 90s text files I was gonna put together and send in
[16:42] Just, if we could all chip in $50 and get Jason to do that for an IA t-shirt... I would be in heaven :P
[16:49] oh man
[16:49] that definitely needs to be a reward
[16:50] !
[16:52] they could be part of reg for archiveteamcon 2013
[16:52] too
[16:52] :D
[17:02] I'd come to that conference
[17:25] pft
[17:25] jason would be like the only extrovert there
[17:55] the dangerous trend we've seen in afghanistan and iraq, btw, is that our militaries are getting very good at subduing insurrections
[17:55] (i have a brother in afghanistan, btw)
[17:56] wrong window
[17:56] sorry, i should sleep more than two hours a night :/
[18:57] comodo <3 http://isc.sans.edu/diary.html?storyid=13606
[20:38] ugh. this ocsp thing seems like an information leak, on the privacy front
[20:41] ugh where?
[21:08] eh?
[21:19] what irc clients are people using? I am thinking about trying out a new one
[21:20] I still use xchat
[21:20] you should be able to do /VERSION #archiveteam-bs
[21:20] god xchat is 14 years old
[21:21] heck I'm still on mIRC
[21:22] ugh, that should have shown up in the server tab
[21:22] it is a few little things that are starting to annoy me
[21:23] omf_: I've been enjoying Limechat, but I've only tried it on OS X.
[21:24] try Textual (unfortunately it's free only if you compile it yourself)
[21:26] I am a Linux user
[21:28] Ah, okay. Not sure I can help then :( (I see there's a Limechat Windows, but I don't know about Linux...)
[21:28] I see irssi representing
[21:28] why do Apple users assume everyone uses a Mac?
[21:28] omf_: I don't, I just mentioned what IRC client I was using.
[21:28] I said I
[21:29] sorry. I was just reading about more fighting between apple and everyone else
[21:29] 'd only tried the OS X version because I don't know what the other platforms are like.
[21:29] I was saying that to mistym
[21:29] this patent shit really sucks
[21:29] the cell phone market is going to implode
[21:30] It's like MAD where someone pressed the button anyway.
[21:32] It's crazy :/
[21:33] balrog: Yeah, I've been meaning to give it a try.
[21:36] DFJustin: I'm on mIRC, too. I occasionally think about switching to something newer, but I'm used to it and it works. And, unlike the web, I don't imagine IRC has changed much since 1999.
[21:39] omf_: I'm using nettalk.
[21:39] it's <3.
[21:39] runs under WINE quite nicely
[21:57] I remember the first time I built ytalk
[21:57] I dialed into a friend's computer and we chatted for hours
[21:58] ytalk, to mirc when on windows, to xchat