#archiveteam-bs 2013-03-03,Sun


Time Nickname Message
00:00 🔗 soultcer If you prefer, I can tell you a set of redis commands you can use to clear the todo list
00:00 🔗 chronomex that sounds cleaner, though I have the nginx stanza written in front of me
00:01 🔗 chronomex location /posterous/request { try_files =404; }
00:01 🔗 soultcer Both will work just as well
00:01 🔗 soultcer The warrior treats a 404 as "no more items"
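A minimal sketch of the behaviour soultcer describes, not the actual warrior code: the function name, the `TrackerReply` type, and the handling of every status other than 404 are hypothetical and only illustrate why an nginx stanza that always answers 404 stops clients from picking up new work.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackerReply:
    """Hypothetical result of asking the tracker for work."""
    item: Optional[str]   # item name, if one was handed out
    keep_asking: bool     # whether the client should request again later


def interpret_request_response(status_code: int, body: str) -> TrackerReply:
    """Map an HTTP response from the item-request endpoint to a client action.

    Assumption: only the 404 rule comes from the conversation; the other
    branches are illustrative.
    """
    if status_code == 404:
        # A 404 is read as "no more items": stop, there is nothing to fetch.
        return TrackerReply(item=None, keep_asking=False)
    if status_code == 200 and body.strip():
        # Assumed success case: the body carries the item name.
        return TrackerReply(item=body.strip(), keep_asking=True)
    # Anything else (server error, empty body): no item now, retry later.
    return TrackerReply(item=None, keep_asking=True)
```

With this reading, forcing the request location to return 404 has the same effect on clients as emptying the todo list, which is why both approaches were described as working equally well.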
00:01 🔗 chronomex ok
00:01 🔗 chronomex why do we need to do this again?
00:02 🔗 soultcer Because posterous is down
00:02 🔗 soultcer And 0 MB warcs are not really useful
00:02 🔗 chronomex it's down?
00:02 🔗 chronomex laaaame
00:02 🔗 soultcer Did you reload nginx?
00:02 🔗 chronomex ah crap this is my normal user account
00:03 🔗 chronomex 1sec
00:03 🔗 soultcer Haha ok
00:03 🔗 soultcer So the good news is: Posterous came back this very second
00:04 🔗 chronomex ummmm
00:04 🔗 chronomex I just reloaded nginx
00:04 🔗 soultcer The bad news: I don't know if it is better to start handing out items again, or to wait for alard and take a look at the damage all those almost empty warc files did
00:05 🔗 chronomex wep that broke it
00:06 🔗 chronomex hm.
00:06 🔗 soultcer I am still receiving items from the tracker?
00:07 🔗 chronomex yeah I spun it back up
00:07 🔗 chronomex or did I
00:07 🔗 chronomex [ pid=17642 thr=3064179568 file=ext/nginx/HelperAgent.cpp:933 time=2013-03-02 16:07:16.571 ]: Uncaught exception in PassengerServer client thread:
00:07 🔗 chronomex exception: Cannot read response from backend process: Connection reset by peer (104)
00:07 🔗 chronomex backtrace:
00:07 🔗 chronomex in 'void Client::forwardResponse(Passenger::SessionPtr&, Passenger::FileDescriptor&, const Passenger::AnalyticsLogPtr&)' (HelperAgent.cpp:698)
00:07 🔗 chronomex in 'void Client::handleRequest(Passenger::FileDescriptor&)' (HelperAgent.cpp:859)
00:07 🔗 chronomex in 'void Client::threadMain()' (HelperAgent.cpp:952)
00:07 🔗 chronomex getting these in error.log
00:08 🔗 chronomex alard: think I dun fucked it up
00:08 🔗 chronomex this is why I usually defer to someone who actually knows what's going on
00:08 🔗 chronomex soultcer: sounds like you understand the tracker better than I
00:09 🔗 soultcer Most of my knowledge comes from the readme of the universal-tracker repo
00:09 🔗 chronomex probably better than me still
00:11 🔗 soultcer I wonder what your nginx config looks like
00:12 🔗 chronomex why don't I give you an account on this box too
00:12 🔗 chronomex send me an ssh key
00:12 🔗 chronomex and a preferred username
00:36 🔗 chronomex wooooops http://www.youtube.com/watch?v=OvyIrsZ7Zhs
00:44 🔗 godane are there any other bigger wayback machines out there?
00:44 🔗 godane i want to find more asf files of the old techtv site
00:52 🔗 ersi godane: No, there isn't. As far as I know, there's archive.is and WebCite besides IA Wayback. Unless you find something in the search engines' caches
00:53 🔗 godane ok
00:53 🔗 godane it just sucks cause i think i found a episode of the screen savers asf full episode
00:54 🔗 godane it's one of the 'better' dial-up episodes in that it is more full motion instead of stop-motion
01:06 🔗 ersi Vanilla Sky is one fucked up movie
01:07 🔗 omf_ Yes it is
01:07 🔗 omf_ did you just watch it?
01:07 🔗 ersi what the fuck is happening
01:09 🔗 ersi this is one bad trip, hehe
01:10 🔗 omf_ https://moviesayings.files.wordpress.com/2012/11/penc3a9lope-cruz-vanilla-sky.png
01:10 🔗 ersi hehe
01:19 🔗 ersi Wow, this is so fucking trippy
01:19 🔗 ersi It just gets.. more
01:20 🔗 ersi TEEEEEEEEECH SUUUUUUUPPOOOOOOOOOOOOOOOOOORTT
01:52 🔗 godane uploaded: https://archive.org/details/images.g4tv.com-20001-to-30000_l-images
02:56 🔗 godane uploaded: https://archive.org/details/images.g4tv.com-30001-to-40000_l-images
07:43 🔗 underscor http://www.weirdstuff.com/cgi-bin/item/11508 this is neat
07:43 🔗 underscor Also tempted to pick http://www.weirdstuff.com/cgi-bin/item/386355-2 up
08:16 🔗 BlueMax weird person, weird stuff. what can go wrong?
08:24 🔗 godane hey everyone
08:24 🔗 godane i may be able to get all episodes of the daily nut podcast
08:25 🔗 underscor BlueMax: <3
08:26 🔗 Sue so close to having sync internet
08:26 🔗 Sue can't wait to brutally murder posterous
08:32 🔗 BlueMax I wish I could help you guys murder posterous
08:33 🔗 BlueMax I want to be weird, just like underscor
08:33 🔗 omf_ BlueMax, why not help?
08:33 🔗 BlueMax because I'll get banned...?
08:34 🔗 BlueMax and I don't have a second connection or anything...
08:34 🔗 underscor BlueMax: :D
08:34 🔗 underscor Still, a single thread helps
08:34 🔗 underscor also bans are pretty short now
08:34 🔗 BlueMax you know underscor, I look up to you
08:35 🔗 BlueMax you've done so much in the realm of computing and you're younger than me.
08:35 🔗 * underscor hugs
08:35 🔗 underscor Thanks :3
08:35 🔗 * BlueMax jealous.
08:35 🔗 underscor It means a lot to hear
08:35 🔗 BlueMax TEACH ME YOUR WAYS.
08:36 🔗 BlueMax I mean...all I could do was think about what was possible with the ArchiveTeam Warrior. You fucking built the thing.
08:37 🔗 godane i think i just saved 6 daily nut videos
08:37 🔗 underscor x3
08:37 🔗 underscor BlueMax: /j #preposterus :3
08:37 🔗 godane i'm doing a hit or miss based on the missing ids
08:52 🔗 * joepie91 spots an underscor
08:52 🔗 joepie91 ohai!
08:54 🔗 * BlueMax throws joepie91 at the underscor
08:55 🔗 joepie91 D:
08:55 🔗 joepie91 :(
08:55 🔗 joepie91 also, argh, frustrated
08:55 🔗 underscor BlueMax caught a wild underscor using the joepie91 ball
08:55 🔗 joepie91 waiting for tahoe lafs treasurer guy to get back to me
08:55 🔗 * BlueMax releases the underscor
08:55 🔗 joepie91 new project, can't launch until I has tahoe-lafs address
08:55 🔗 joepie91 D:
08:56 🔗 BlueMax wrong nature, bad EVs
08:56 🔗 underscor :3
08:56 🔗 underscor D:
08:56 🔗 joepie91 lol
08:56 🔗 underscor Fine
08:56 🔗 * underscor huffs
08:56 🔗 joepie91 I'm a ball now?
08:56 🔗 joepie91 also, underscor
08:56 🔗 joepie91 http://redonate.net/
08:56 🔗 joepie91 it's not officially launched yet
08:56 🔗 BlueMax underscor puffs, blows my house down?
08:56 🔗 joepie91 because tahoe-lafs address missing
08:56 🔗 joepie91 :P
08:56 🔗 joepie91 but it works already
08:56 🔗 joepie91 sorta
08:57 🔗 underscor :o neat
08:59 🔗 joepie91 well, just set up the cronjob
08:59 🔗 joepie91 let's see if it works
08:59 🔗 joepie91 *theoretically* the output should land in my inbox
09:00 🔗 joepie91 IT WORKED! :D
09:00 🔗 joepie91 http://owely.com/11cHKt1
09:48 🔗 godane so i think i'm starting to pull a jason scott
09:48 🔗 godane only cause i'm downloading like everything that i can from g4tv
09:49 🔗 joepie91 underscor: http://joepie91.wordpress.com/2013/03/03/announcing-redonate-recurring-contributions-done-right/
10:07 🔗 chronomex so redonate is just a mailing list
10:07 🔗 chronomex got it
10:18 🔗 * joepie91 sighs
10:49 🔗 S[h]O[r]T http://www.cnbc.com/id/100514618
10:49 🔗 S[h]O[r]T Yahoo to Shut Down 7 Products, Including Blackberry App
10:49 🔗 S[h]O[r]T Yahoo said its app for Blackberry smartphones would no longer be available for download, or supported by Yahoo, as of April 1 Yahoo also said that on April 1 it will stop supporting Yahoo Avatars
10:49 🔗 S[h]O[r]T The other Yahoo products set to be terminated include Yahoo App Search, Yahoo Sports IQ, Yahoo Clues, the Yahoo Message Boards website and the Yahoo Updates API.
11:13 🔗 joepie91 april fools?
11:13 🔗 joepie91 or srs?
11:14 🔗 omf_ It is for real
11:20 🔗 chazchaz It doesn't seem like a huge surprise to me. Yahoo seems to be a sinking ship in general.
11:22 🔗 omf_ they are gaining market share back in flickr
11:23 🔗 omf_ because of a better mobile app and because of instagram's bs
11:23 🔗 omf_ I have more hope for them under the new CEO than before
11:23 🔗 chazchaz Well, that's good, I guess?
11:23 🔗 omf_ flickr does not have a fuck you TOS like instagram and facebook
11:23 🔗 omf_ that is the key difference
11:24 🔗 omf_ yahoo does not take control or ownership of your pics
11:24 🔗 omf_ There needs to be competition in this space because google is not the end all be all
11:26 🔗 omf_ and Microsoft is basically irrelevant
14:28 🔗 Cameron_D Here is CloudFlare's post regarding their downtime -- hosted on posterous! http://blog.cloudflare.com/todays-outage-post-mortem-82515
14:37 🔗 Smiley Starting WgetDownload for Item user-MsMims - Downloaded: 76310 URLs.
14:37 🔗 Smiley Another huge one
14:43 🔗 ersi chazchaz: You may say so, but they earn a lot of cash.
14:56 🔗 ersi S[h]O[r]T, joepie91, chazchaz, omf_: Regarding Yahoo! Messages (Seems to be the only one with any data to save): #BurnTheMessenger
14:59 🔗 omf_ The more I think about the Yahoo cuts the better I like it. You see Yahoo was crazy buying up and killing companies
15:00 🔗 ersi Awesome, got 48GB of punchfork downloaded~
15:00 🔗 omf_ Now that they are not that cash rich, they are looking to kill services that do not get much use.
15:00 🔗 omf_ Now think to google
15:00 🔗 ersi Fuck 'em all I say
15:00 🔗 omf_ for years google ran hundreds of projects that did shit for them
15:01 🔗 omf_ Once they started cutting that back there is a direct increase in the quality of the products google kept
15:01 🔗 omf_ we started seeing new site designs and features
15:01 🔗 ersi not really imo
15:01 🔗 omf_ In one of the articles it said yahoo had 72 mobile apps
15:01 🔗 ersi I think the quality has gone down hill
15:01 🔗 omf_ ersi, are you talking search quality
15:02 🔗 omf_ yahoo doesn't even have 72 products like that
15:02 🔗 omf_ All but 14 are getting cut
15:02 🔗 ersi No, generally
15:03 🔗 omf_ Like the blackberry apps. For a 3-7% market share they had full scale dev teams
15:03 🔗 ersi Google's quality has gone down hill and their new designs are usually terrible
15:04 🔗 omf_ compared to what exactly?
15:04 🔗 omf_ most of their competition is a joke
15:04 🔗 ersi to themselves
15:04 🔗 omf_ which is sad for us the consumers
15:05 🔗 ersi yeah
15:16 🔗 Smiley soooo...... cloudflare don't push out a test of their rules, for even like 10 minutes before going live with them.... that's nice to know
15:16 🔗 Smiley when response time is more important than making sure something is valid (ok, it's a weird bug/issue, but still, if they'd pushed it to one router and waited 5 min, they'd have seen it crashing?)
15:17 🔗 Smiley Also, for all their "off network monitoring" they don't have off-network controllable power?
15:17 🔗 ersi That's the whole point of CloudFlare - quick response to distributed attacks
15:17 🔗 ersi silly
15:18 🔗 Smiley too quick, thats my point
15:18 🔗 ersi Honestly, they should've at least just propagated the rule to a single router first ;o
15:18 🔗 Smiley yeah.
15:18 🔗 ersi I mean, it was a weird rule to begin with
15:18 🔗 Smiley They point out it's traffic that shouldn't ever have existed.
15:18 🔗 Smiley so.... that raises questions about wtf is _actually_ happening
15:19 🔗 ersi Why specify the packet length at all? Just block the IPs
15:19 🔗 Smiley some kind of weird underflow/overflow situation where the bytecounter is breaking.
15:19 🔗 Smiley Reacting fast to _one_ customer, killed their entire network
15:19 🔗 Smiley I mean, if their whole network was being screwed up by these packets, sure, block em right away with no testing, it can't break more than it already is.
15:20 🔗 ersi dunno, it's easy to have hindsight
15:20 🔗 Smiley yah. I guess.
15:20 🔗 ersi especially as an outsider
15:20 🔗 ersi without accurate insight/data on the situation
15:20 🔗 Smiley But testing stuff on live :S
15:20 🔗 ersi Everyone does it
15:20 🔗 Smiley Yup, I'd like more info on what the cause was.
15:20 🔗 ersi Even places that test a lot
15:20 🔗 * Smiley ponders packets that announce one size and never reach it.
15:20 🔗 Smiley such as the slow loris attack, but for routers o_O
15:21 🔗 Smiley Maybe I should just be grateful I've not done it yet? :D
15:21 🔗 ersi btw, I've gone from 0 GB to 49 GB punchfork ;D
15:21 🔗 Smiley nice :D
15:21 🔗 Smiley Almost broke into the top 10 :P
15:22 🔗 ersi Since ZipExport breaks often, I moved it to before WgetDownload - so it'll break faster (meaning I get more successful downloads)
15:22 🔗 Smiley hahah nice.
15:22 🔗 Smiley Is it known why it breaks yet?
15:23 🔗 ersi I haven't had time to fix it, I reported it on github
15:23 🔗 ersi I'd say no
15:23 🔗 ersi seems to be encoding issues - either with python zipfile or BeautifulSoup4 maybe
15:24 🔗 Smiley One more question ;)
15:24 🔗 Smiley What's happening to the broken users on the tracker now?
15:24 🔗 ersi doesn't matter
15:24 🔗 ersi or rather, it's fixable
15:24 🔗 Smiley ok :)
15:24 🔗 ersi It's better to grab the ones that we can grab now and focus on the last ones later
15:25 🔗 Smiley yah of course
15:26 🔗 Smiley Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
15:26 🔗 ersi neat
15:26 🔗 Smiley yup, wondering if it might break 100k :D
15:26 🔗 ersi had one over 100k yesterday, sadly it broke on ZipExport
15:27 🔗 Smiley D:
15:47 🔗 omf_ Anyone else write articles? Not posts to a blog but articles for magazines, news sites, etc...
15:47 🔗 ersi Smiley: http://news.ycombinator.com/item?id=5313716
15:48 🔗 omf_ It seems like there are less writers in that space now but more bloggers in general
15:48 🔗 ersi News sites are practically glorified blogs
15:48 🔗 ersi Just reprints of TT, Reuters etc
15:49 🔗 Smiley journalism will eat itself.
15:52 🔗 omf_ it already is
15:52 🔗 omf_ how many crappy papers have fallen off? I am not sure it is even possible to keep count
15:54 🔗 omf_ The only three newspapers I know of that are actually doing well now are NYT, Wall Street Journal and The Washington Post
15:54 🔗 omf_ I miss Newsweek
15:56 🔗 Smiley yeah, cept I said that about 10 years ago :/
15:57 🔗 Smiley slowly it dissolves itself and reforms as people go "hey, there's no good physical writers out there writing proper stuff, everything is just republished rubbish..."
15:58 🔗 Smiley until you hit that balance of sustainable work.... and something else :D
16:00 🔗 omf_ Another factor that hardly gets mentioned is business writing.
16:00 🔗 omf_ Why go be a journalist when big companies will hire you to write at much higher salaries
16:01 🔗 omf_ Writing used to be "simple" You had books, newspapers and magazines
16:01 🔗 omf_ businesses were small and it was easy to keep track of
16:01 🔗 omf_ think pre-1900
16:01 🔗 omf_ now we have businesses with 100k+ employees
16:02 🔗 omf_ So they need manuals, internal memos, wikis, internal/external docs, PR/marketing, etc...
16:03 🔗 omf_ Business is sucking up the writers and its not like more people are just hanging out waiting to write
16:03 🔗 omf_ Every advanced writing course I took in college, all the other students were going to become teachers, pre-education
16:03 🔗 ersi There's a lot of twerps that wants to be journalists
16:03 🔗 ersi It's not exactly a shortage of those morons
16:03 🔗 omf_ yeah glory hounds
16:04 🔗 ersi Quality journalists, sure
16:04 🔗 omf_ I mean in general quality writers
16:04 🔗 joepie91 ersi: hai
16:04 🔗 joepie91 http://joepie91.wordpress.com/2013/03/03/announcing-redonate-recurring-contributions-done-right/ :D
16:05 🔗 ersi doesn't seem interesting at all
16:06 🔗 ersi sorry :P
16:06 🔗 joepie91 :(
16:08 🔗 Smiley https://secure.flickr.com/photos/djsmiley2k/8524485692/in/photostream
16:08 🔗 Smiley more interesting :D
16:13 🔗 Smiley uhoh
16:13 🔗 Smiley https://secure.flickr.com/photos/djsmiley2k/8524485692/in/photostream
16:13 🔗 Smiley Oh, it just started zipping
16:13 🔗 ersi Smiley: Can you do `python --version` inside your Warrior btw?
16:13 🔗 Smiley nope, no ssh to the warrior atm :(
16:13 🔗 ersi 'k
16:14 🔗 Smiley not sure how to add port forwarding via VBoxManage
17:16 🔗 * ersi shrugs at punchfork
17:20 🔗 ersi alard: Seems like some recipes have multiple encodings or something else funny - which will make everything blow up and the user fail. Should we try to convert it somehow, or just catch the exception and skip the recipe and go on?
17:21 🔗 ersi Also, nice. I'm in the top10 of punchfork now :D with only 198 items
17:33 🔗 Smiley hahaha
17:35 🔗 ersi I'm closing in on you Smiley
17:35 🔗 ersi :D
17:35 🔗 Smiley [15:27:30] < Smiley> Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
17:35 🔗 Smiley looks like that one failed then
17:35 🔗 Smiley <shrug> :D
17:36 🔗 Smiley - Saved 2303 recipes. <<-- or not, D:
17:56 🔗 balrog ersi: do we know which?
17:56 🔗 balrog there's probably a way to handle it
17:57 🔗 balrog perhaps by using a non-wget downloader
17:58 🔗 ersi balrog: It's not using wget
17:58 🔗 balrog what is it using?
17:58 🔗 ersi python-requests + beautifulsoup4
17:58 🔗 balrog aaah
17:59 🔗 ersi It's a nasty problem irregardless, when idiots mix encodings in a document
17:59 🔗 balrog yep
18:07 🔗 Smiley ok, it looks like my entire vm just blew up o_O
18:08 🔗 Smiley Ooop, it's coming back, just loading really slowly..... HUGE user/zip issue maybe?
18:09 🔗 Smiley 20k and 40k url users at the mo...
18:11 🔗 ersi shouldn't be too bad, those only take about 50-100MB RAM for me
18:11 🔗 ersi if it's wget
18:11 🔗 Smiley yeah, the other one before has disappeared tho and I don't see it on the tracker.
18:11 🔗 mistym I noticed that when I set the warrior to "archiveteam's choice", it was picking urlteam instead of Punchfork which is closing soon. Does it do any prioritization?
18:11 🔗 Smiley [15:27:30] < Smiley> Starting WgetDownload for Item user-MsMims - Downloaded: 84050 URLs.
18:12 🔗 ersi mistym: It's manual
18:12 🔗 Smiley mistym: we are cleaning up punchfork atm - there's only some users which are causing issues to download. Other than that it's finished.
18:12 🔗 mistym Smiley: Ah
18:12 🔗 * Smiley goes away now
18:38 🔗 alard ersi: I think skipping the recipe is good enough, unless it's very easy to fix or you want to spend a lot of time on it.
18:39 🔗 balrog which one is messed up?
18:40 🔗 balrog and how many are messed up?
18:41 🔗 alard (And the recipes are still saved in the warc, they're just not in the zip file.)
18:43 🔗 balrog what happens when you extract the warc with The Unarchiver / unar?
18:51 🔗 ersi alard: yeah, seems to be very few in total anyway - for the users I've tried manually with echoing an error and passing, seems to be about 1-2 recipes only
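The "echo an error and pass" approach could look roughly like the sketch below. It is an illustration under assumptions, not the punchfork script itself: `export_recipes`, the `(name, raw_bytes)` input shape, and the `.html` member names are all hypothetical. Each recipe is decoded on its own, so one badly encoded recipe is skipped instead of failing the whole user.

```python
import io
import zipfile


def export_recipes(recipes, encoding="utf-8"):
    """Zip the recipes that decode cleanly; skip and report the rest.

    `recipes` is a list of (name, raw_bytes) pairs (a hypothetical shape).
    Returns the zip archive as bytes plus the list of skipped names.
    """
    skipped = []
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, raw in recipes:
            try:
                text = raw.decode(encoding)
            except UnicodeDecodeError as error:
                # Echo the error and move on instead of failing the user.
                print(f"skipping {name}: {error}")
                skipped.append(name)
                continue
            archive.writestr(f"{name}.html", text)
    return buffer.getvalue(), skipped
```

Returning the skipped names keeps a record of what is missing from the zip; as alard notes, those recipes are still saved in the warc.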
18:52 🔗 godane so look like archive.org is down
18:52 🔗 balrog scheduled maintenance it says
18:55 🔗 godane i thought it was cause of my constant uploading
18:57 🔗 ersi godane: heh, naw
19:09 🔗 ersi Smiley: Slowly getting nearer you on top10
19:10 🔗 godane i got 5gb of images between 60001 to 80000 image ids
19:13 🔗 godane most of the 5gb is in the 70001 to 80000 area
19:15 🔗 Smiley ersi: hahah nice.
19:15 🔗 Smiley godane: :) nice.
19:32 🔗 godane so i'm trying to grab all the daily nut podcast from 2006
20:10 🔗 ersi Smiley: Getting closer to you now :D
20:10 🔗 Smiley D:
20:11 🔗 ersi Was at #11 now I'm #8
20:12 🔗 Smiley nice :)
20:58 🔗 norbert79 joepie91: around?
22:21 🔗 godane nytimes green blog is axed: http://green.blogs.nytimes.com/2013/03/01/a-blogs-adieu/
22:24 🔗 ersi green goes black
