[00:43] <omf_> Do we, as heavy computer users, really work out enough?
[00:43] <omf_> I wouldn't be surprised if everyone in this irc channel is unfit, myself included.
[00:44] <omf_> Do things like standing desks help out?
[00:49] <SketchCow> ha ha.
[00:49] <SketchCow> Well, I have a fitbit scale and fitbit wearable.
[00:49] <SketchCow> I have been low carb for over a week
[00:50] <SketchCow> CPAP machine and blood pressure meds to keep that under control
[00:50] <omf_> Oh I didn't realize you dropped carbs AND caffeine. You must enjoy sleeping at least for a week or two
[00:51] <SketchCow> No sense in going low carb and drinking diet soda
[00:52] <omf_> I gave up caffeine over a year ago and it was really fucking hard. I was at 8+ cups a day for 11 years
[00:52] <omf_> Low carb worked for me as well but both at once, I give you credit man, that is hard
[01:03] <SketchCow> archiveteam-health
[01:03] <SketchCow> (Don't go there)
[01:03] <SketchCow> (It's full of fried recipes)
[01:03] <godane> SketchCow: i have mirrored cultofmac articles
[01:04] <godane> its about 469.3mb
[01:18] <SketchCow> Bravo
[01:19] <godane> still have to go after images
[01:21] <omf_> So you backed up the pages but not the images... why not get both at once?
[01:21] <godane> i do more than one archive
[01:21] <godane> first the index so i can get the links
[01:21] <godane> then the articles
[01:21] <joepie91> speaking of which
[01:22] <joepie91> I had a bit of an issue wget-warc'ing engadget
[01:22] <joepie91> it... ran out of RAM
[01:22] <joepie91> how fix?
[01:22] <godane> engadget is done
[01:22] <joepie91> well yes, but assuming a theoretical future crawl
[01:22] <joepie91> that might have the same issue
[01:22] <joepie91> (did you get joystiq and their subsites, btw?)
[01:22] <joepie91> (massively and wow in particular)
[01:22] <godane> i did based on years
[01:23] <godane> after i grabed the pages
[01:23] <joepie91> I see
[01:23] <joepie91> is there any way to do a 'regular' warc but have it store whatever it's throwing into RAM, in some other place?
[01:23] <joepie91> (I assume the URL list?)
[01:24] <omf_> godane, How is that more efficient than using the spidering features built into most website mirror software
[01:24] <omf_> joepie91, have you looked at the wget code
[01:24] <godane> it was mostly so i didn't get hit by the 4gb wget warc limit
[01:25] <godane> and my wifi drops sometimes
[01:25] <omf_> I have 20gb warcs I made with wget. That sounds more like a 32bit problem
[01:25] <omf_> or you're running on Windows
[01:26] <joepie91> omf_: I have not
[01:26] <godane> also this gives order to things
[01:26] <joepie91> I was more looking for a command line switch kind of thing
[01:26] <omf_> joepie91, that does not exist
[01:27] <DFJustin> irony http://www.nature.com/nature/journal/v497/n7448/full/497183a.html
[01:27] <omf_> Paywalls all the way down
[01:28] <omf_> godane, how does having things in "order" help at all? When this gets shoved into the wayback machine it doesn't matter
[01:28] <godane> it helps me do it in warc-proxy
[01:29] <godane> it takes a very long time to make idx files on the fly locally
[01:29] <godane> and again my wifi sucks sometimes
[01:29] <godane> or my internet sucks sometimes
[01:29] <omf_> Is wifi all they offer in your area?
[01:29] <godane> we have cable
[01:30] <godane> but my room is not where the cable is
[01:31] <godane> also i don't like upload 20gb files
[01:31] <godane> even 5gb to 10gb files i try my best to not grab cause it takes too long
[01:32] <omf_> So you put up with and change your workflow to work around wifi drops when a few dollars in cable and some time would fix the problem permanently.
[01:32] <DFJustin> even on a wired connection I've had hiccups uploading 20gb stuff to IA with consumer-level upstream
[01:33] <godane> my dad will not like 60ft of cable running across the living room
[01:33] <omf_> So have I, but that happens way less than bad wifi signal
[01:34] <godane> again this is my way of doing things
[01:34] <godane> i don't need the #comments pages
[01:34] <omf_> They sell these little plastic hooks with a sticky side at Lowes or Home Depot. They are paint safe and easy to remove. I have used them to run line on a ceiling so it is out of the way
[01:34] <godane> or #top
[01:34] <godane> cause they're the same page
[01:34] <godane> again i will not do that
[01:35] <godane> wifi is my only option without me going with a netbook into that room
[01:36] <chronomex> omf_: paint safe? that's a blatant lie.
[01:36] <omf_> chronomex, I have removed them after 3 years use and no paint damage
[01:36] <chronomex> depends on the paint I guess
[01:39] <godane> also based on the link list i cut my mirror in half cause there are no bad or double links to the same story
[01:39] <godane> all cause of a #comments like link
[01:43] <omf_> godane, URL fragments (aka #comment) can be handled by modern spiders automagically.
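The fragment handling discussed above can be sketched in a few lines of Python. This is only an illustration, not godane's actual workflow: the function name `dedupe_links` and the sample URLs are hypothetical. It strips `#comments`-style fragments from a link list and keeps each page URL once, in first-seen order.

```python
from urllib.parse import urldefrag


def dedupe_links(links):
    """Drop URL fragments and return each page URL once, in first-seen order."""
    seen = set()
    unique = []
    for link in links:
        # urldefrag splits "page#fragment" into (page, fragment).
        page, _fragment = urldefrag(link.strip())
        if page and page not in seen:
            seen.add(page)
            unique.append(page)
    return unique


if __name__ == "__main__":
    # Hypothetical sample links; a real list would come from the index pages.
    sample = [
        "http://example.com/story-1#comments",
        "http://example.com/story-1#top",
        "http://example.com/story-1",
        "http://example.com/story-2",
    ]
    for url in dedupe_links(sample):
        print(url)
```

The resulting list could then be fed to a downloader as a plain URL list, one URL per line.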
[01:44] <godane> again i want to upload small sites
[01:44] <godane> this made cultofmac small
[01:44] <godane> we have the articles
[01:45] <godane> also just spidering will get crap like this
[01:45] <godane> http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/gu
[01:46] <godane> its a broken link but cultofmac.com redirects
[01:46] <godane> more data
[01:46] <godane> which makes it harder for me to upload
[01:46] <godane> should have been this link: http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/guest_post.jpg
[01:46] <omf_> But that redirection information can be useful. What if those links are used on another site? Leaving them out means your grabs are incomplete
[01:47] <omf_> BlueMax, they just make printers
[01:48] <omf_> a few people got in trouble for hosting the files for 3d guns
[01:48] <BlueMax> ah right OK
[01:48] <omf_> there are other 3d printer makers as well, all trying to make the best printer
[01:48] <godane> that redirect is to something i grabed in my article dump
[01:48] <omf_> If I had the money spare I would own one
[01:48] <godane> no need to do a 2nd grab in my image dump
[01:49] <omf_> It doesn't matter. If the warc does not have information that the url you left out is a redirect then there is no record of that path to that content.
[01:50] <omf_> plus if you deduplicate before uploading then content fingerprinting cleans things up
[01:50] <godane> i talked about dedup with the same warc
[01:50] <godane> wget warc doesn't work that way
[01:51] <godane> it can't figure it out unless it's in an older dump
[01:51] <godane> that you point to cdx file to
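The "content fingerprinting" idea mentioned above can be sketched as follows. This is a minimal illustration, not how wget or any of the warc tools do it: it hashes each payload with SHA-1, formats the digest as base32 (the form commonly seen in WARC payload digest headers), and reports later records whose payload matches an earlier one. The names `payload_digest` and `find_duplicates`, and the in-memory `(url, payload)` pairs, are assumptions made for the example.

```python
import base64
import hashlib


def payload_digest(payload: bytes) -> str:
    """Return a 'sha1:<base32>' fingerprint for a response payload."""
    raw = hashlib.sha1(payload).digest()
    return "sha1:" + base64.b32encode(raw).decode("ascii")


def find_duplicates(records):
    """Given (url, payload) pairs, return (duplicate_url, first_url) pairs.

    A record counts as a duplicate when its payload fingerprint matches
    that of a record seen earlier in the same run.
    """
    first_seen = {}
    duplicates = []
    for url, payload in records:
        digest = payload_digest(payload)
        if digest in first_seen:
            duplicates.append((url, first_seen[digest]))
        else:
            first_seen[digest] = url
    return duplicates


if __name__ == "__main__":
    # Hypothetical records; real payloads would be read from a WARC file.
    records = [
        ("http://example.com/a.jpg", b"image-bytes"),
        ("http://example.com/broken-link", b"image-bytes"),
        ("http://example.com/b.jpg", b"other-bytes"),
    ]
    for dup, original in find_duplicates(records):
        print(dup, "duplicates", original)
```

Note that this only detects duplicates; the redirect record itself would still need to be kept if the path to the content matters, as omf_ argues.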
[02:03] <omf_> Why not use warcdump to map and then warcfilter to clean it up. I assume that is why those warc tools were written
[02:06] <godane> i never use those tools
[02:06] <godane> will try at some point
[02:06] <godane> but right now i'm doing it my way
[02:15] <dashcloud> so, does anyone know of stories, videos, etc on the history of the slow-cooker (I guess Crock-Pot is the most well-known of the bunch)? It must have been a fascinating invention at the time- a cooking appliance you can safely leave on for 8 hours and not worry about something catching fire
[02:15] <omf_> I like how rice cookers can be used like crockpots now
[02:17] <dashcloud> apparently some of the Japanese models play a short bit of music once you successfully cook the rice
[02:35] <SketchCow> I heard some use a form of fuzzy logic to determine cooking time.
[02:53] <omf_> I know there are a fuck ton of bloggers out there now but how many are about software programming, do you think? A few thousand that are actively updated
[02:54] <omf_> I was looking at the government numbers for programmers and it is tiny. Then think that most of them do not blog and what exactly is left
[03:01] <godane> so i got at least 40 videos today
[03:01] <godane> most are tss previews for shows
[03:01] <godane> but weeds out the index so i can search for other stuff
[08:03] <SmileyG> So here I am at work
[08:04] <SmileyG> Nothing ever changes, this makes me lololol.
[08:26] <SmileyG> /dev/sdc1                                 1.4T  685G  642G  52% /home/tim.bowers/bin/ign/storage
[08:26] <SmileyG> yes, warc's
[08:29] <norbert79> "Ain't nobody got time for that"... lol... Huge data
[08:30] <SmileyG> norbert79: :D
[08:30] <SmileyG> I have about 50+ more metadata files to write :<
[12:51] <godane> so i have started to upload the cultofmac.com dumps i have
[12:51] <godane> uploaded: https://archive.org/details/www.cultofmac.com-index-20130513
[12:52] <godane> also note that the pages may also be in my articles dump too
[13:54] <SmileyG> damn, it's quiet in here today?
[14:03] <godane> uploaded: https://archive.org/details/www.cultofmac.com-articles-20130513
[14:08] <godane> i may be able to get gphoria 2004
[14:08] <SmileyG> what is/was that?
[14:09] <godane> it was a award show on g4
[14:10] <godane> i got the 2006 one on IA
[14:12] <godane> funny that mpg can be played right away when its been downloaded but not mp4
[14:12] <SmileyG> right
[14:12] <SmileyG> im so bored out of my skull at work and have no clue wtf is going on atm (my brain has fried and I'm honestly lost) I'm just gonna work on metadata
[14:13] <SmileyG> damn, newamerica is still fetching.
[14:13] <SmileyG> as is pouet.
[14:13] <godane> how big is it?
[14:13] <SmileyG> waiting for it to tell me :DX
[14:13] <joepie91> pouet!
[14:14] <SmileyG> 141x1.1Gb
[14:14] <SmileyG> for new america, so far.
[14:14] <SmileyG> -rw-r--r-- 1 tim.bowers games 34G May 14 15:14 ./rotavault.ign.com-2013-04-19.warc
[14:15] <SmileyG> -rw-r--r-- 1 tim.bowers games 10G May 14 15:15 ./pouet/pouet.net_06052013.warc
[14:25] <godane> i'm just glad i didn't have to do that now
[14:25] <godane> newamerica that is
[14:31] <SmileyG> aye
[15:16] <DFJustin> https://twitter.com/mikko/status/334286983193047040
[15:17] <SmileyG> http://www.h-online.com/security/news/item/Skype-with-care-Microsoft-is-reading-everything-you-write-1862870.html english version
[18:13] <omf_> SmileyG, hey yo ;)
[18:14] <SmileyG> hey
[18:14] <SmileyG> su
[18:14] <SmileyG> sup?
[18:14] <omf_> need some jobs to run?
[18:14] <SmileyG> Just currently spazzing out atm so no thanks D:
[18:15] <omf_> Everything all right?
[18:19] <omf_> http://yro.slashdot.org/story/13/05/14/0134224/new-prenda-law-shell-corp-threatening-to-tell-your-neighbors-you-pirated-porn
[18:19] <omf_> I cannot wait for the new round of kick Prenda in court
[18:19] <sep332> what's left of them? thought they got dismantled by a judge.
[18:20] <sep332> or was that just their law firm maybe
[18:20] <omf_> They formed a new shell company and started right back up
[18:20] <omf_> same people
[18:20] <sep332> they've done that before, you'd think the judge woulda seen it coming
[18:22] <omf_> The judge cannot stop them from breaking the law again, only punish them
[18:22] <omf_> So today it is finally back into the 70s outside, we had a couple days back in the 40s and that sucked
[18:22] <sep332> looks like the previous punishment was just $80k fine (and the breakup)
[18:23] <sep332> yeah it's nice now! actually below freezing last night, ridiculous
[18:27] <joepie91> omf_: hahahaha
[18:27] <joepie91> these guys are just an infinite source of entertainment, aren't they
[18:27] <omf_> The cool thing is now that the judge has slapped them and they did it again, now RICO can be brought in
[18:27] <joepie91> also, completely unrelated
[18:27] <joepie91> linux kernel sploit
[18:27] <joepie91> root priv escalation
[18:27] <joepie91> affects .32 (vswap) openvz kernels
[18:27] <joepie91> if you have an openvz VPS with vswap, I recommend backing up your shit and informing your provider
[18:27] <joepie91> http://www.lowendtalk.com/discussion/10514/linux-kernel-2.6.37-3.8.8-0day
[18:28] <joepie91> supposedly it allows you to gain system root from a container
[18:38] <omf_> Oh look Ubuntu canceled brainstorm, thats a shock. A distro runs a site for community input and then ignores all that input
[18:40] <chronomex> seems typical
[18:44] <ersi> omf_: Is setting up your jobs hard? Is it suitable to do in the warrior?
[18:45] <omf_> If I overhauled the warrior
[18:47] <ersi> So, that's a no?
[18:50] <SmileyG> omf_: btw what are the jobs? :D
[18:50] <omf_> Running the jobs is easy, the initial setup on Debian and CentOS is pretty involved because of how out of date the software is on those distros
[18:51] <ersi> SmileyG: Posterous-screenshotting
[18:51] <SmileyG> gentoo \o/
[18:51] <omf_> and just bumping to unstable or testing causes conflicts in the dependencies
[18:51] <SmileyG> I can ssh into work and give it a poke
[18:51] <SmileyG> though I just signed my self back off work D:
[18:51] <omf_> SmileyG, You running gentoo at work?
[18:51] <SmileyG> yes, I'm fucking epic
[18:51] <omf_> haven't tried it on there, in theory it should be the easiest
[18:52] <SmileyG> indeed
[18:52] <SmileyG> give me commands
[18:52] <SmileyG> FEED ME MOAR.
[18:52] <SmileyG> sorry
[18:52] <SmileyG> hyper/headfucked right now
[18:52] <omf_> you will have to figure out some of the package names since they are changed with every distro
[18:52] <SmileyG> nod
[18:52] <SmileyG> znurt to teh rescue.
[18:53] <SmileyG> imagemagick by any chance?
[18:53] <omf_> nope
[18:53] <SmileyG> bit of python I guess...
[18:53] <omf_> not even a chance
[18:53] <SmileyG> :D
[18:53] <SmileyG> Ok cool, just give me commands and set me going with a small set
[18:54] <omf_> Also job tuning is based on cpu cores, and RAM
[18:56] <SmileyG> :O
[18:56] <SmileyG> well ram is at a premium atm due to some huge wgets
[18:57] <omf_> It is more CPU bound
[18:57] <omf_> much more
[18:57] <SmileyG> Ok good xD
[18:57] <SmileyG> I can free up a lot of CPU quite easily.
[18:57] <SmileyG> Just sorting font packages atm.. I suspect most are already installed.
[18:57] <joepie91> blah
[18:58] <joepie91> my internet is now so fast that my bottleneck is my disk I/O...
[18:58] <SmileyG> xD
[18:58] <SmileyG> ssd + ramdisks ftw.
[18:58] <joepie91> I'm doing an emergency backup of all my VPSes running on openvz .32 atm
[18:58] <joepie91> just in case
[18:58] <omf_> I doubt it. The bulk of them are Asian languages. So unless you frequently use Mandarin or Japanese I would think no.
[18:58] <SmileyG> And then realise if you can download faster than you can save, you can stream stuff faster than you can consume it.
[18:58] <SmileyG> joepie91: learn to QoS also.
[18:58] <joepie91> SmileyG: ?
[18:59] <omf_> gentoo-portage.com/media-fonts/corefonts isn't loading for me
[18:59] <joepie91> (also, I'm having a race with my internet right now - trying to clean out my disk in time before it fills up)
[19:00] <SmileyG> :O
[19:00] <SmileyG> pah, let me dig it out
[19:01] <SmileyG> http://corefonts.sourceforge.net/
[19:35] <godane> i just found Jurassic Park the ride E! Live Premiere Special
[20:00] <omf_> So bash fails with 100,000 items to glob
[20:00] <omf_> something new I learned today
[20:02] <balrog> use xargs
[20:06] <sep332> usual bash line length is limited to 256 kB
[20:12] <omf_> I used to compile the kernel so this problem wouldn't happen. I got around it by doing: find . -mindepth 1 -maxdepth 1 -iname "*.png" | zip -0 -@ images.zip
[20:17] <omf_> Yeah sep332 I had forgotten there was a size limit
[20:17] <sep332> yeah it happens :)
[20:17] <omf_> I thought for a minute it was # of items and not size of buffer
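The workaround above avoids expanding a huge glob on a command line by streaming file names instead. The same idea can be sketched in Python, which never builds an argument list at all. This is an illustrative equivalent rather than what omf_ ran: the function name `zip_pngs` and the paths are hypothetical. It stores the files uncompressed (like `zip -0`) and matches the `.png` suffix case-insensitively (like `-iname`).

```python
import os
import zipfile


def zip_pngs(directory, archive_path):
    """Store every *.png file directly inside `directory` in a zip, uncompressed.

    Returns the number of files added. Only regular files at the top level
    of `directory` are considered; subdirectories are not descended into.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        # os.scandir yields entries one at a time, so the number of files
        # is not bounded by any command-line length limit.
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file() and entry.name.lower().endswith(".png"):
                    archive.write(entry.path, arcname=entry.name)
                    count += 1
    return count


if __name__ == "__main__":
    # Hypothetical usage: archive the PNGs in the current directory.
    added = zip_pngs(".", "images.zip")
    print("added", added, "files")
```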
[20:28] <omf_> Some days it is hard to keep track of all the moving parts
[21:33] <omf_> Fight the power - https://9gag.com/gag/aejoYKv
[21:36] <godane> so looks like usatoday.com is not in wayback
[21:49] <godane> SketchCow: cultofmac.com is backed up now
[21:49] <godane> uploaded: https://archive.org/details/cdn.cultofmac.com-images-20130513
[22:41] <godane> i had to fix a desc and link for july 2 2008 episode of buzz out loud in the wiki i'm getting links and descs from
[22:52] <omf_> Load average: 27.43
[22:52] <omf_> Not sure if I am doing enough work yet
[22:56] <joepie91> omf_: pft, real men have at least 41