[00:43] <omf_> Do we as heavy users of computers really work out enough?
[00:43] <omf_> I wouldn't be surprised if everyone in this irc channel is unfit, myself included.
[00:44] <omf_> Do things like standing desks help out?
[00:49] <SketchCow> ha ha.
[00:49] <SketchCow> Well, I have a fitbit scale and fitbit wearable.
[00:49] <SketchCow> I have been low carb for over a week
[00:50] <SketchCow> CPAP machine and blood pressure meds to keep that under control
[00:50] <omf_> Oh I didn't realize you dropped carbs AND caffeine. You must enjoy sleeping at least for a week or two
[00:51] <SketchCow> No sense in going low carb and drinking diet soda
[00:52] <omf_> I gave up caffeine over a year ago and it was really fucking hard. I was at 8+ cups a day for 11 years
[00:52] <omf_> Low carb worked for me as well but both at once, I give you credit man, that is hard
[01:03] <SketchCow> archiveteam-health
[01:03] <SketchCow> (Don't go there)
[01:03] <SketchCow> (It's full of fried recipes)
[01:03] <godane> SketchCow: i have mirrored cultofmac articles
[01:04] <godane> it's about 469.3mb
[01:18] <SketchCow> Bravo
[01:19] <godane> still have to go after images
[01:21] <omf_> So you backed up the pages but not the images... why not get both at once?
[01:21] <godane> i do more than one archive
[01:21] <godane> first the index so i can get the links
[01:21] <godane> then the articles
[01:21] <joepie91> speaking of which
[01:22] <joepie91> I had a bit of an issue wget-warc'ing engadget
[01:22] <joepie91> it... ran out of RAM
[01:22] <joepie91> how fix?
[01:22] <godane> engadget is done
[01:22] <joepie91> well yes, but assuming a theoretical future crawl
[01:22] <joepie91> that might have the same issue
[01:22] <joepie91> (did you get joystiq and their subsites, btw?)
[01:22] <joepie91> (massively and wow in particular)
[01:22] <godane> i did based on years
[01:23] <godane> after i grabbed the pages
[01:23] <joepie91> I see
[01:23] <joepie91> is there any way to do a 'regular' warc but have it store whatever it's throwing into RAM, in some other place?
[01:23] <joepie91> (I assume the URL list?)
[01:24] <omf_> godane, How is that more efficient than using the spidering features built into most website mirror software?
[01:24] <omf_> joepie91, have you looked at the wget code?
[01:24] <godane> it was mostly so i didn't get hit by the 4gb wget warc limit
[01:25] <godane> and my wifi drops sometimes
[01:25] <omf_> I have 20gb warcs I made with wget. That sounds more like a 32bit problem
[01:25] <omf_> or you're running on Windows
[01:26] <joepie91> omf_: I have not
[01:26] <godane> also this gives order to things
[01:26] <joepie91> I was more looking for a command line switch kind of thing
[01:26] <omf_> joepie91, that does not exist
[01:27] <DFJustin> irony http://www.nature.com/nature/journal/v497/n7448/full/497183a.html
[01:27] <omf_> Paywalls all the way down
[01:28] <omf_> godane, how does having things in "order" help at all? When this gets shoved into the wayback machine it doesn't matter
[01:28] <godane> it helps me do it in warc-proxy
[01:29] <godane> it takes a very long time to make idx files on the fly locally
[01:29] <godane> and again my wifi sucks sometimes
[01:29] <godane> or my internet sucks sometimes
[01:29] <omf_> Is wifi all they offer in your area?
[01:29] <godane> we have cable
[01:30] <godane> but my room is not where the cable is
[01:31] <godane> also i don't like uploading 20gb files
[01:31] <godane> even 5gb to 10gb files i try my best to not grab cause it takes too long
[01:32] <omf_> So you put up with and change your workflow to work around wifi drops when a few dollars in cable and some time would fix the problem permanently.
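godane's two-pass approach (grab the index first to get the links, then fetch the articles) can be sketched roughly as below. This is only an illustration, not godane's actual script: the `LinkCollector` class, the `links_from_index` helper, the sample markup and the example.com URLs are all made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href targets from <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def links_from_index(html, base_url):
    """Return the links found in one saved index page, in order, without repeats."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))


# Hypothetical index page; the real site markup is not shown in the log.
sample = (
    '<a href="/1/first-story/">one</a> '
    '<a href="/2/second-story/">two</a> '
    '<a href="/1/first-story/">again</a>'
)
article_urls = links_from_index(sample, "http://www.example.com/")
```

The resulting list could then be written to a text file, one URL per line, and handed to a second wget run with `-i`, which keeps each grab small enough to survive a flaky connection.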
[01:32] <DFJustin> even on a wired connection I've had hiccups uploading 20gb stuff to IA with consumer-level upstream
[01:33] <godane> my dad will not like 60ft of cable running across the living room
[01:33] <omf_> So have I, but that happens way less than bad wifi signal
[01:34] <godane> again this is my way of doing things
[01:34] <godane> i don't need the #comments pages
[01:34] <omf_> They sell these little plastic hooks with a sticky side at Lowes or Home Depot. They are paint safe and easy to remove. I have used them to run line on a ceiling so it is out of the way
[01:34] <godane> or #top
[01:34] <godane> cause they're the same page
[01:34] <godane> again i will not do that
[01:35] <godane> wifi is my only option without me going with a netbook into that room
[01:36] <chronomex> omf_: paint safe? that's a blatant lie.
[01:36] <omf_> chronomex, I have removed them after 3 years use and no paint damage
[01:36] <chronomex> depends on the paint I guess
[01:39] <godane> also based on the link list i cut my mirror to half cause there are no bad or double links to the same story
[01:39] <godane> all cause of a #comments like link
[01:43] <omf_> godane, URL fragments (aka #comment) can be handled by modern spiders automagically.
[01:44] <godane> again i want to upload small sites
[01:44] <godane> this made cultofmac small
[01:44] <godane> we have the articles
[01:45] <godane> also just spidering will get crap like this
[01:45] <godane> http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/gu
[01:46] <godane> it's a broken link but cultofmac.com redirects
[01:46] <godane> more data
[01:46] <godane> which makes it harder for me to upload
[01:46] <godane> should have been this link: http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/guest_post.jpg
[01:46] <omf_> But that redirection information can be useful. What if those links are used on another site? Leaving them out means your grabs are incomplete
[01:47] <omf_> BlueMax, they just make printers
[01:48] <omf_> a few people got in trouble for hosting the files for 3d guns
[01:48] <BlueMax> ah right OK
[01:48] <omf_> there are other 3d printer makers as well, all trying to make the best printer
[01:48] <godane> that redirect is to something i grabbed in my article dump
[01:48] <omf_> If I had the money spare I would own one
[01:48] <godane> no need to do a 2nd grab in my image dump
[01:49] <omf_> It doesn't matter. If the warc does not have information that the url you left out is a redirect then there is no record of that path to that content.
[01:50] <omf_> plus if you deduplicate before uploading then content fingerprinting cleans things up
[01:50] <godane> i talked about dedup with the same warc
[01:50] <godane> wget warc doesn't work that way
[01:51] <godane> it can't figure it out unless it's in an older dump
[01:51] <godane> that you point to with its cdx file
[02:03] <omf_> Why not use warcdump to map and then warcfilter to clean it up? I assume that is why those warc tools were written
[02:06] <godane> i never use those tools
[02:06] <godane> will try at some point
[02:06] <godane> but right now i'm doing it my way
[02:15] <dashcloud> so, does anyone know of stories, videos, etc on the history of the slow-cooker (I guess Crock-Pot is the most well-known of the bunch)? It must have been a fascinating invention at the time- a cooking appliance you can safely leave on for 8 hours and not worry about something catching fire
[02:15] <omf_> I like how rice cookers can be used like crockpots now
[02:17] <dashcloud> apparently some of the Japanese models play a short bit of music once you successfully cook the rice
[02:35] <SketchCow> I heard some use a form of fuzzy logic to determine cooking time.
[02:53] <omf_> I know there are a fuck ton of bloggers out there now but how many software programming blogs do you think? A few thousand that are actively updated
[02:54] <omf_> I was looking at the government numbers for programmers and it is tiny. Then think that most of them do not blog and what exactly is left
[03:01] <godane> so i got at least 40 videos today
[03:01] <godane> most are tss previews for shows
[03:01] <godane> but weeds out the index so i can search for other stuff
[08:03] <SmileyG> So here I am at work
[08:04] <SmileyG> Nothing ever changes, this makes me lololol.
[08:26] <SmileyG> /dev/sdc1 1.4T 685G 642G 52% /home/tim.bowers/bin/ign/storage
[08:26] <SmileyG> yes, warc's
[08:29] <norbert79> "Ain't nobody got time for that"... lol... Huge data
[08:30] <SmileyG> norbert79: :D
[08:30] <SmileyG> I have about 50+ more metadata files to write :<
[12:51] <godane> so i have started to upload the cultofmac.com dumps i have
[12:51] <godane> uploaded: https://archive.org/details/www.cultofmac.com-index-20130513
[12:52] <godane> also note that the pages may also be in my articles dump too
[13:54] <SmileyG> damn, it's quiet in here today?
[14:03] <godane> uploaded: https://archive.org/details/www.cultofmac.com-articles-20130513
[14:08] <godane> i may be able to get gphoria 2004
[14:08] <SmileyG> what is/was that?
[14:09] <godane> it was an award show on g4
[14:10] <godane> i got the 2006 one on IA
[14:12] <godane> funny that mpg can be played right away when it's been downloaded but not mp4
[14:12] <SmileyG> right
[14:12] <SmileyG> im so bored out of my skull at work and have no clue wtf is going on atm (my brain has fried and I'm honestly lost) I'm just gonna work on metadata
[14:13] <SmileyG> damn, newamerica is still fetching.
[14:13] <SmileyG> as is pouet.
[14:13] <godane> how big is it?
[14:13] <SmileyG> waiting for it to tell me :DX
[14:13] <joepie91> pouet!
[14:14] <SmileyG> 141x1.1Gb
[14:14] <SmileyG> for new america, so far.
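The #comments / #top point from around 01:34 to 01:43 (several links that differ only by fragment lead to the same page) can be illustrated with a short sketch. It assumes the duplicates really do differ only in the fragment; the `dedupe_without_fragments` name and the example.com URLs are hypothetical and not taken from godane's link list.

```python
from urllib.parse import urldefrag


def dedupe_without_fragments(urls):
    """Drop the #fragment from each URL and keep the first copy of each page."""
    seen = set()
    unique = []
    for url in urls:
        bare, _fragment = urldefrag(url)
        if bare not in seen:
            seen.add(bare)
            unique.append(bare)
    return unique


# Made-up link list in the shape described in the log.
urls = [
    "http://www.example.com/story/",
    "http://www.example.com/story/#comments",
    "http://www.example.com/story/#top",
    "http://www.example.com/other/",
]
unique_urls = dedupe_without_fragments(urls)
```

This only covers the fragment case. It does nothing about omf_'s separate objection that redirects are worth recording in the warc.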
[14:14] <SmileyG> -rw-r--r-- 1 tim.bowers games 34G May 14 15:14 ./rotavault.ign.com-2013-04-19.warc
[14:15] <SmileyG> -rw-r--r-- 1 tim.bowers games 10G May 14 15:15 ./pouet/pouet.net_06052013.warc
[14:25] <godane> i'm just glad i didn't have to do that now
[14:25] <godane> newamerica that is
[14:31] <SmileyG> aye
[15:16] <DFJustin> https://twitter.com/mikko/status/334286983193047040
[15:17] <SmileyG> http://www.h-online.com/security/news/item/Skype-with-care-Microsoft-is-reading-everything-you-write-1862870.html english version
[18:13] <omf_> SmileyG, hey yo ;)
[18:14] <SmileyG> hey
[18:14] <SmileyG> su
[18:14] <SmileyG> sup?
[18:14] <omf_> need some jobs to run?
[18:14] <SmileyG> Just currently spazzing out atm so no thanks D:
[18:15] <omf_> Everything all right?
[18:19] <omf_> http://yro.slashdot.org/story/13/05/14/0134224/new-prenda-law-shell-corp-threatening-to-tell-your-neighbors-you-pirated-porn
[18:19] <omf_> I cannot wait for the new round of kick Prenda in court
[18:19] <sep332> what's left of them? thought they got dismantled by a judge.
[18:20] <sep332> or was that just their law firm maybe
[18:20] <omf_> They formed a new shell company and started right back up
[18:20] <omf_> same people
[18:20] <sep332> they've done that before, you'd think the judge woulda seen it coming
[18:22] <omf_> The judge cannot stop them from breaking the law again, only punish them
[18:22] <omf_> So today it is finally back into the 70s outside, we had a couple days back in the 40s and that sucked
[18:22] <sep332> looks like the previous punishment was just $80k fine (and the breakup)
[18:23] <sep332> yeah it's nice now! actually below freezing last night, ridiculous
[18:27] <joepie91> omf_: hahahaha
[18:27] <joepie91> these guys are just an infinite source of entertainment, aren't they
[18:27] <omf_> The cool thing is now that the judge has slapped them and they did it again, now RICO can be brought in
[18:27] <joepie91> also, completely unrelated
[18:27] <joepie91> linux kernel sploit
[18:27] <joepie91> root priv escalation
[18:27] <joepie91> affects .32 (vswap) openvz kernels
[18:27] <joepie91> if you have an openvz VPS with vswap, I recommend backing up your shit and informing your provider
[18:27] <joepie91> http://www.lowendtalk.com/discussion/10514/linux-kernel-2.6.37-3.8.8-0day
[18:28] <joepie91> supposedly it allows you to gain system root from a container
[18:38] <omf_> Oh look Ubuntu canceled brainstorm, that's a shock. A distro runs a site for community input and then ignores all that input
[18:40] <chronomex> seems typical
[18:44] <ersi> omf_: Is setting up your jobs hard? Is it suitable to do in the warrior?
[18:45] <omf_> If I overhauled the warrior
[18:47] <ersi> So, that's a no?
[18:50] <SmileyG> omf_: btw what are the jobs? :D
[18:50] <omf_> Running the jobs is easy, the initial setup on Debian and CentOS is pretty involved because of how out of date the software is on those distros
[18:51] <ersi> SmileyG: Posterous-screenshotting
[18:51] <SmileyG> gentoo \o/
[18:51] <omf_> and just bumping to unstable or testing causes conflicts in the dependencies
[18:51] <SmileyG> I can ssh into work and give it a poke
[18:51] <SmileyG> though I just signed myself back off work D:
[18:51] <omf_> SmileyG, You running gentoo at work?
[18:51] <SmileyG> yes, I'm fucking epic
[18:51] <omf_> haven't tried it on there, in theory it should be the easiest
[18:52] <SmileyG> indeed
[18:52] <SmileyG> give me commands
[18:52] <SmileyG> FEED ME MOAR.
[18:52] <SmileyG> sorry
[18:52] <SmileyG> hyper/headfucked right now
[18:52] <omf_> you will have to figure out some of the package names since they are changed with every distro
[18:52] <SmileyG> nod
[18:52] <SmileyG> znurt to teh rescue.
[18:53] <SmileyG> imagemagick by any chance?
[18:53] <omf_> nope
[18:53] <SmileyG> bit of python I guess...
[18:53] <omf_> not even a chance
[18:53] <SmileyG> :D
[18:53] <SmileyG> Ok cool, just give me commands and set me going with a small set
[18:54] <omf_> Also job tuning is based on cpu cores, and RAM
[18:56] <SmileyG> :O
[18:56] <SmileyG> well ram is at a premium atm due to some huge wgets
[18:57] <omf_> It is more CPU bound
[18:57] <omf_> much more
[18:57] <SmileyG> Ok good xD
[18:57] <SmileyG> I can free up a lot of CPU quite easily.
[18:57] <SmileyG> Just sorting font packages atm.. I suspect most are already installed.
[18:57] <joepie91> blah
[18:58] <joepie91> my internet is now so fast that my bottleneck is my disk I/O...
[18:58] <SmileyG> xD
[18:58] <SmileyG> ssd + ramdisks ftw.
[18:58] <joepie91> I'm doing an emergency backup of all my VPSes running on openvz .32 atm
[18:58] <joepie91> just in case
[18:58] <omf_> I doubt it. The bulk of them are Asian languages. So unless you frequently use Mandarin or Japanese I would think no.
[18:58] <SmileyG> And then realise if you can download faster than you can save, you can stream stuff faster than you can consume it.
[18:58] <SmileyG> joepie91: learn to QoS also.
[18:58] <joepie91> SmileyG: ?
[18:59] <omf_> gentoo-portage.com/media-fonts/corefonts isn't loading for me
[18:59] <joepie91> (also, I'm having a race with my internet right now - trying to clean out my disk in time before it fills up)
[19:00] <SmileyG> :O
[19:00] <SmileyG> pah, let me dig it out
[19:01] <SmileyG> http://corefonts.sourceforge.net/
[19:35] <godane> i just found Jurassic Park the ride E! Live Premiere Special
[20:00] <omf_> So bash fails with 100,000 items to glob
[20:00] <omf_> something new I learned today
[20:02] <balrog> use xargs
[20:06] <sep332> usual bash line length is limited to 256 kB
[20:12] <omf_> I used to compile the kernel so this problem wouldn't happen. I got around it by doing: find . -mindepth 1 -maxdepth 1 -iname "*.png" | zip -0 -@ images.zip
[20:17] <omf_> Yeah sep332 I had forgotten there was a size limit
[20:17] <sep332> yeah it happens :)
[20:17] <omf_> I thought for a minute it was # of items and not size of buffer
[20:28] <omf_> Some days it is hard to keep track of all the moving parts
[21:33] <omf_> Fight the power - https://9gag.com/gag/aejoYKv
[21:36] <godane> so looks like usatoday.com is not in wayback
[21:49] <godane> SketchCow: cultofmac.com is backed up now
[21:49] <godane> uploaded: https://archive.org/details/cdn.cultofmac.com-images-20130513
[22:41] <godane> i had to fix a desc and link for july 2 2008 episode of buzz out loud in the wiki i'm getting links and descs from
[22:52] <omf_> Load average: 27.43
[22:52] <omf_> Not sure if I am doing enough work yet
[22:56] <joepie91> omf_: pft, real men have at least 41
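On the glob problem from 20:00 to 20:17: omf_'s find | zip pipeline works because the file names travel over stdin instead of being expanded onto one command line, so the size limit on arguments never comes into play. The same idea can be sketched in Python, which walks the directory one entry at a time. This is a rough equivalent under stated assumptions, not what omf_ ran: `zip_pngs` is a made-up helper name, and the temporary directory and file names in the demo are invented.

```python
import os
import tempfile
import zipfile


def zip_pngs(directory, archive_path):
    """Store every *.png directly inside `directory` in an uncompressed zip.

    Mirrors `find . -mindepth 1 -maxdepth 1 -iname "*.png" | zip -0 -@ ...`:
    no recursion, case-insensitive match, no compression.
    """
    count = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        # scandir yields entries lazily, so no giant argument list is built.
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_file() and entry.name.lower().endswith(".png"):
                    archive.write(entry.path, arcname=entry.name)
                    count += 1
    return count


# Small demo in a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.png", "B.PNG", "notes.txt"):
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(b"x")
    archive_path = os.path.join(tmp, "images.zip")
    stored = zip_pngs(tmp, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        names = sorted(archive.namelist())
```

balrog's xargs suggestion addresses the same limit from the shell side by splitting the names across several command invocations.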