[00:00] If you prefer, I can tell you a set of redis commands you can use to clear the todo list
[00:00] that sounds cleaner, though I have the nginx stanza written in front of me
[00:01] location /posterous/request { try_files =404; }
[00:01] Both will work just as well
[00:01] The warrior treats a 404 as "no more items"
[00:01] ok
[00:01] why do we need to do this again?
[00:02] Because posterous is down
[00:02] And 0 MB warcs are not really useful
[00:02] it's down?
[00:02] laaaame
[00:02] Did you reload nginx?
[00:02] ah crap this is my normal user account
[00:03] 1sec
[00:03] Haha ok
[00:03] So the good news is: Posterous came back this very second
[00:04] ummmm
[00:04] I just reloaded nginx
[00:04] The bad news: I don't know if it is better to start handing out items again, or to wait for alard and take a look at the damage all those almost empty warc files did
[00:05] wep that broke it
[00:06] hm.
[00:06] I am still receiving items from the tracker?
[00:07] yeah I spun it back up
[00:07] or did I
[00:07] [ pid=17642 thr=3064179568 file=ext/nginx/HelperAgent.cpp:933 time=2013-03-02 16:07:16.571 ]: Uncaught exception in PassengerServer client thread:
[00:07] exception: Cannot read response from backend process: Connection reset by peer (104)
[00:07] backtrace:
[00:07] in 'void Client::forwardResponse(Passenger::SessionPtr&, Passenger::FileDescriptor&, const Passenger::AnalyticsLogPtr&)' (HelperAgent.cpp:698)
[00:07] in 'void Client::handleRequest(Passenger::FileDescriptor&)' (HelperAgent.cpp:859)
[00:07] in 'void Client::threadMain()' (HelperAgent.cpp:952)
[00:07] getting these in error.log
[00:08] alard: think I dun fucked it up
[00:08] this is why I usually defer to someone who actually knows what's going on
[00:08] soultcer: sounds like you understand the tracker better than I
[00:09] Most of my knowledge comes from the readme of the universal-tracker repo
[00:09] probably better than me still
[00:11] I wonder what your nginx config looks like
[00:12] why don't I give you an account on this box too
[00:12] send me an ssh key
[00:12] and a preferred username
[00:36] wooooops http://www.youtube.com/watch?v=OvyIrsZ7Zhs
[00:44] are there any other bigger wayback machines out there?
[00:44] i want to find more asf files of the old techtv site
[00:52] godane: No, there aren't. As far as I know, there's archive.is and WebCite besides IA Wayback. Unless you find something in the search engines' caches
[00:53] ok
[00:53] it just sucks cause i think i found an episode of the screen savers asf full episode
[00:54] it's one of the 'better' dial up episodes in that it's more full motion instead of stop-motion
[01:06] Vanilla Sky is one fucked up movie
[01:07] Yes it is
[01:07] did you just watch it?
[01:07] what the fuck is happening
[01:09] this is one bad trip, hehe
[01:10] https://moviesayings.files.wordpress.com/2012/11/penc3a9lope-cruz-vanilla-sky.png
[01:10] hehe
[01:19] Wow, this is so fucking trippy
[01:19] It just gets.. more
[01:20] TEEEEEEEEECH SUUUUUUUPPOOOOOOOOOOOOOOOOOORTT
[01:52] uploaded: https://archive.org/details/images.g4tv.com-20001-to-30000_l-images
[02:56] uploaded: https://archive.org/details/images.g4tv.com-30001-to-40000_l-images
[07:43] http://www.weirdstuff.com/cgi-bin/item/11508 this is neat
[07:43] Also tempted to pick http://www.weirdstuff.com/cgi-bin/item/386355-2 up
[08:16] weird person, weird stuff. what can go wrong?
[08:24] hey everyone
[08:24] i may be able to get all episodes of the daily nut podcast
[08:25] BlueMax: <3
[08:26] so close to having sync internet
[08:26] can't wait to brutally murder posterous
[08:32] I wish I could help you guys murder posterous
[08:33] I want to be weird, just like underscor
[08:33] BlueMax, why not help?
[08:33] because I'll get banned...?
[08:34] and I don't have a second connection or anything...
[08:34] BlueMax: :D
[08:34] Still, a single thread helps
[08:34] also bans are pretty short now
[08:34] you know underscor, I look up to you
[08:35] you've done so much in the realm of computing and you're younger than me.
[08:35] * underscor hugs
[08:35] Thanks :3
[08:35] * BlueMax jealous.
[08:35] It means a lot to hear
[08:35] TEACH ME YOUR WAYS.
[08:36] I mean...all I could do was think about what was possible with the ArchiveTeam Warrior. You fucking built the thing.
[08:37] i think i just saved 6 daily nut videos
[08:37] x3
[08:37] BlueMax: /j #preposterus :3
[08:37] i'm doing a hit or miss based on the missing ids
[08:52] * joepie91 spots an underscor
[08:52] ohai!
[08:54] * BlueMax throws joepie91 at the underscor
[08:55] D:
[08:55] :(
[08:55] also, argh, frustrated
[08:55] BlueMax caught a wild underscor using the joepie91 ball
[08:55] waiting for tahoe lafs treasurer guy to get back to me
[08:55] * BlueMax releases the underscor
[08:55] new project, can't launch until I has tahoe-lafs address
[08:55] D:
[08:56] wrong nature, bad EVs
[08:56] :3
[08:56] D:
[08:56] lol
[08:56] Fine
[08:56] * underscor huffs
[08:56] I'm a ball now?
[08:56] also, underscor
[08:56] http://redonate.net/
[08:56] it's not officially launched yet
[08:56] underscor puffs, blows my house down?
[08:56] because tahoe-lafs address missing
[08:56] :P
[08:56] but it works already
[08:56] sorta
[08:57] :o neat
[08:59] well, just set up the cronjob
[08:59] let's see if it works
[08:59] *theoretically* the output should land in my inbox
[09:00] IT WORKED! :D
[09:00] http://owely.com/11cHKt1
[09:48] so i think i'm starting to pull a jason scott
[09:48] only cause i'm downloading like everything that i can from g4tv
[09:49] underscor: http://joepie91.wordpress.com/2013/03/03/announcing-redonate-recurring-contributions-done-right/
[10:07] so redonate is just a mailing list
[10:07] got it
[10:18] * joepie91 sighs
[10:49] http://www.cnbc.com/id/100514618
[10:49] Yahoo to Shut Down 7 Products, Including Blackberry App
[10:49] Yahoo said its app for Blackberry smartphones would no longer be available for download, or supported by Yahoo, as of April 1. Yahoo also said that on April 1 it will stop supporting Yahoo Avatars
[10:49] The other Yahoo products set to be terminated include Yahoo App Search, Yahoo Sports IQ, Yahoo Clues, the Yahoo Message Boards website and the Yahoo Updates API.
[11:13] april fools?
[11:13] or srs?
[11:14] It is for real
[11:20] It doesn't seem like a huge surprise to me. Yahoo seems to be a sinking ship in general.
[11:22] they are gaining market share back in flickr
[11:23] because of a better mobile app and because of instagram's bs
[11:23] I have more hope for them under the new CEO than before
[11:23] Well, that's good, I guess?
[11:23] flickr does not have a fuck you TOS like instagram and facebook
[11:23] that is the key difference
[11:24] yahoo does not take control or ownership of your pics
[11:24] There needs to be competition in this space because google is not the end all be all
[11:26] and Microsoft is basically irrelevant
[14:28] Here is CloudFlare's post regarding their downtime -- hosted on posterous! http://blog.cloudflare.com/todays-outage-post-mortem-82515
[14:37] Starting WgetDownload for Item user-MsMims - Downloaded: 76310 URLs.
[14:37] Another huge one
[14:43] chazchaz: You may say so, but they earn a lot of cash.
[14:56] S[h]O[r]T, joepie91, chazchaz, omf_: Regarding Yahoo! Messages (Seems to be the only one with any data to save): #BurnTheMessenger
[14:59] The more I think about the Yahoo cuts the better I like it. You see Yahoo was crazy buying up and killing companies
[15:00] Awesome, got 48GB of punchfork downloaded~
[15:00] Now that they are not that cash rich, they are looking to kill services that do not get much use.
[15:00] Now think to google
[15:00] Fuck 'em all I say
[15:00] for years google ran hundreds of projects that did shit for them
[15:01] Once they started cutting that back there is a direct increase in the quality of the products google kept
[15:01] we started seeing new site designs and features
[15:01] not really imo
[15:01] In one of the articles it said yahoo had 72 mobile apps
[15:01] I think the quality has gone downhill
[15:01] ersi, are you talking search quality
[15:02] yahoo doesn't even have 72 products like that
[15:02] All but 14 are getting cut
[15:02] No, generally
[15:03] Like the blackberry apps. For a 3-7% market share they had full scale dev teams
[15:03] Google's quality has gone downhill and their new designs are usually terrible
[15:04] compared to what exactly?
[15:04] most of their competition is a joke
[15:04] to themselves
[15:04] which is sad for us the consumers
[15:05] yeah
[15:16] soooo...... cloudflare don't push out a test of their rules, for even like 10 minutes before going live with them.... that's nice to know
[15:16] when response time is more important than making sure something is valid (ok, it's a weird bug/issue but still if they'd pushed it to one router and waited 5 min, they'd have seen it crashing?)
[15:17] Also, for all their "off network monitoring" they don't have off network controllable power?
[15:17] That's the whole point of CloudFlare - quick response to distributed attacks
[15:17] silly
[15:18] too quick, that's my point
[15:18] Honestly, they should've at least just propagated the rule to a single router first ;o
[15:18] yeah.
[15:18] I mean, it was a weird rule to begin with
[15:18] They point out it's traffic that shouldn't ever have existed.
[15:18] so.... that raises questions about wtf is _actually_ happening
[15:19] Why specify the packet length at all? Just block the IPs
[15:19] some kind of weird underflow/overflow situation where the bytecounter is breaking.
[15:19] Reacting fast to _one_ customer, killed their entire network
[15:19] I mean, if their whole network was being screwed up by these packets, sure, block em right away with no testing, it can't break more than it already is.
[15:20] dunno, it's easy to have hindsight
[15:20] yah. I guess.
[15:20] especially as an outsider
[15:20] without accurate insight/data on the situation
[15:20] But testing stuff on live :S
[15:20] Everyone does it
[15:20] Yup, I'd like more info on what the cause was.
[15:20] Even places that test a lot
[15:20] * Smiley ponders packets that announce one size and never reach it.
[15:20] such as the slow loris attack, but for routers o_O
[15:21] Maybe I should just be grateful I've not done it yet? :D
[15:21] btw, I've gone from 0 GB to 49 GB punchfork ;D
[15:21] nice :D
[15:21] Almost broke into the top 10 :P
[15:22] Since ZipExport breaks often, I moved it to before WgetDownload - so it'll break faster (meaning I get more successful downloads)
[15:22] hahah nice.
[15:22] Is it known why it breaks yet?
[15:23] I haven't had time to fix it, I reported it on github
[15:23] I'd say no
[15:23] seems to be encoding issues - either with python zipfile or BeautifulSoup4 maybe
[15:24] One more question ;)
[15:24] What's happening to the broken users on the tracker now?
[15:24] doesn't matter
[15:24] or rather, it's fixable
[15:24] ok :)
[15:24] It's better to grab the ones that we can grab now and focus on the last ones later
[15:25] yah of course
[15:26] Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
[15:26] neat
[15:26] yup, wondering if it might break 100k :D
[15:26] had one over 100k yesterday, sadly it broke on ZipExport
[15:27] D:
[15:47] Anyone else write articles? Not posts to a blog but articles for magazines, news sites, etc...
[15:47] Smiley: http://news.ycombinator.com/item?id=5313716
[15:48] It seems like there are fewer writers in that space now but more bloggers in general
[15:48] News sites are practically glorified blogs
[15:48] Just reprints of TT, Reuters etc
[15:49] journalism will eat itself.
[15:52] it already is
[15:52] how many crappy papers have fallen off? I am not sure it is even possible to keep count
[15:54] The only three newspapers I know of that are actually doing good now are NYT, Wall Street Journal and The Washington Post
[15:54] I miss Newsweek
[15:56] yeah, cept I said that about 10 years ago :/
[15:57] slowly it dissolves itself and reforms as people go "hey, there's no good physical writers out there writing proper stuff, everything is just republished rubbish..."
[15:58] until you hit that balance of sustainable work.... and something else :D
[16:00] Another factor that hardly gets mentioned is business writing.
[16:00] Why go be a journalist when big companies will hire you to write at much higher salaries
[16:01] Writing used to be "simple". You had books, newspapers and magazines
[16:01] businesses were small and it was easy to keep track of
[16:01] think pre-1900
[16:01] now we have businesses with 100k+ employees
[16:02] So they need manuals, internal memos, wikis, internal/external docs, PR/marketing, etc...
[16:03] Business is sucking up the writers and it's not like more people are just hanging out waiting to write
[16:03] Every advanced writing course I took in college, all the other students were going to become teachers, pre-education
[16:03] There's a lot of twerps that want to be journalists
[16:03] It's not exactly a shortage of those morons
[16:03] yeah glory hounds
[16:04] Quality journalists, sure
[16:04] I mean in general quality writers
[16:04] ersi: hai
[16:04] http://joepie91.wordpress.com/2013/03/03/announcing-redonate-recurring-contributions-done-right/ :D
[16:05] doesn't seem interesting at all
[16:06] sorry :P
[16:06] :(
[16:08] https://secure.flickr.com/photos/djsmiley2k/8524485692/in/photostream
[16:08] more interesting :D
[16:13] uhoh
[16:13] https://secure.flickr.com/photos/djsmiley2k/8524485692/in/photostream
[16:13] Oh, it just started zipping
[16:13] Smiley: Can you do `python --version` inside your Warrior btw?
[16:13] nope, no ssh to the warrior atm :(
[16:13] 'k
[16:14] not sure how to add port forwarding via VBoxManage
[17:16] * ersi shrugs at punchfork
[17:20] alard: Seems like some recipes have multiple encodings or something else funny - which will make everything blow up and the user fail. Should we try to convert it somehow, or just catch the exception and skip the recipe and go on?
[17:21] Also, nice. I'm in the top10 of punchfork now :D with only 198 items
[17:33] hahaha
[17:35] I'm closing in on you Smiley
[17:35] :D
[17:35] [15:27:30] < Smiley> Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
[17:35] looks like that one failed then
[17:35] <>shrug> :D
[17:36] - Saved 2303 recipes. <<-- or not, D:
[17:56] ersi: do we know which?
[17:56] there's probably a way to handle it
[17:57] perhaps by using a non-wget downloader
[17:58] balrog: It's not using wget
[17:58] what is it using?
[17:58] python-requests + beautifulsoup4
[17:58] aaah
[17:59] It's a nasty problem irregardless, when idiots mix encodings in a document
[17:59] yep
[18:07] ok, it looks like my entire vm just blew up o_O
[18:08] Ooop, it's coming back, just loading really slowly..... HUGE user/zip issue maybe?
[18:09] 20k and 40k url users at the mo...
[18:11] shouldn't be too bad, those only take about 50-100MB RAM for me
[18:11] if it's wget
[18:11] yeah, the other one before has disappeared tho and I don't see it on the tracker.
[18:11] I noticed that when I set the warrior to "archiveteam's choice", it was picking urlteam instead of Punchfork which is closing soon. Does it do any prioritization?
[18:11] [15:27:30] < Smiley> Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
[18:12] mistym: It's manual
[18:12] mistym: we are cleaning up punchfork atm - there's only some users which are causing issues to download. Other than that it's finished.
[18:12] Smiley: Ah
[18:12] * Smiley goes away now
[18:38] ersi: I think skipping the recipe is good enough, unless it's very easy to fix or you want to spend a lot of time on it.
[18:39] which one is messed up?
[18:40] and how many are messed up?
[18:41] (And the recipes are still saved in the warc, they're just not in the zip file.)
[18:43] what happens when you extract the warc with The Unarchiver / unar?
[18:51] alard: yeah, seems to be very few in total anyway - for the users I've tried manually with echoing an error and passing, seems to be about 1-2 recipes only
[18:52] so looks like archive.org is down
[18:52] scheduled maintenance it says
[18:55] i thought it was cause of my constant uploading
[18:57] godane: heh, naw
[19:09] Smiley: Slowly getting nearer you on top10
[19:10] i got 5gb of images between 60001 to 80000 image ids
[19:13] most of the 5gb is in the 70001 to 80000 area
[19:15] ersi: hahah nice.
[19:15] godane: :) nice.
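The "catch the exception and skip the recipe" approach discussed above (17:20, 18:38, 18:51) could look roughly like the sketch below. It is only an illustration: the actual punchfork export script is not shown in this log, so the function name `export_recipes`, the recipe names, and the assumption that the failure surfaces as a `UnicodeDecodeError` are all hypothetical. Skipped recipes would still be in the warc, as noted at 18:41; they just would not land in the zip.

```python
import io
import zipfile


def export_recipes(recipes, zip_file):
    """Write each recipe into the zip, skipping any that fail to decode.

    `recipes` is a hypothetical mapping of recipe name -> raw page bytes.
    Returns the names of the recipes that were skipped.
    """
    skipped = []
    for name, raw in recipes.items():
        try:
            # Strict decoding raises when a page mixes encodings.
            text = raw.decode("utf-8")
        except UnicodeDecodeError as err:
            # Echo the error and move on instead of failing the whole user.
            print(f"skipping {name}: {err}")
            skipped.append(name)
            continue
        zip_file.writestr(f"{name}.txt", text)
    return skipped


# Made-up sample data: one clean page, one mixing UTF-8 and Latin-1 bytes.
recipes = {
    "good-soup": "Crème fraîche soup".encode("utf-8"),
    "mixed-stew": "Crème".encode("utf-8") + " fraîche".encode("latin-1"),
}

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    skipped = export_recipes(recipes, zf)
    saved = zf.namelist()
```

With this shape, one bad recipe costs only that recipe rather than the whole user item, which matches the "about 1-2 recipes only" observation from the manual runs.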
[19:32] so i'm trying to grab all the daily nut podcast from 2006
[20:10] Smiley: Getting closer to you now :D
[20:10] D:
[20:11] Was at #11 now I'm #8
[20:12] nice :)
[20:58] joepie91: around?
[22:21] nytimes green blog is axed: http://green.blogs.nytimes.com/2013/03/01/a-blogs-adieu/
[22:24] green goes black