#archiveteam-bs 2013-05-14,Tue


Time Nickname Message
00:43 🔗 omf_ Do we as heavy users of computers really work out enough?
00:43 🔗 omf_ I wouldn't be surprised if everyone in this irc channel is unfit, myself included.
00:44 🔗 omf_ Do things like standing desks help out?
00:49 🔗 SketchCow ha ha.
00:49 🔗 SketchCow Well, I have a fitbit scale and fitbit wearable.
00:49 🔗 SketchCow I have been low carb for over a week
00:50 🔗 SketchCow CPAP machine and blood pressure meds to keep that under control
00:50 🔗 omf_ Oh I didn't realize you dropped carbs AND caffeine. You must enjoy sleeping at least for a week or two
00:51 🔗 SketchCow No sense in going low carb and drinking diet soda
00:52 🔗 omf_ I gave up caffeine over a year ago and it was really fucking hard. I was at 8+ cups a day for 11 years
00:52 🔗 omf_ Low carb worked for me as well but both at once, I give you credit man, that is hard
01:03 🔗 SketchCow archiveteam-health
01:03 🔗 SketchCow (Don't go there)
01:03 🔗 SketchCow (It's full of fried recipes)
01:03 🔗 godane SketchCow: i have mirrored cultofmac articles
01:04 🔗 godane its about 469.3mb
01:18 🔗 SketchCow Bravo
01:19 🔗 godane still have to go after images
01:21 🔗 omf_ So you backed up the pages but not the images... why not get both at once?
01:21 🔗 godane i do more than one archive
01:21 🔗 godane first the index so i can get the links
01:21 🔗 godane then the articles
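The index-first workflow godane describes here (grab the index pages, pull the article links out of them, then fetch the articles) could be sketched in shell roughly as follows. This is only a minimal sketch under stated assumptions, not godane's actual script: the helper name extract_links and the file names are hypothetical, and the pattern only catches absolute links in double-quoted href attributes.

```shell
#!/bin/sh
# extract_links: read HTML on stdin and print each absolute http(s) URL found
# in a double-quoted href attribute, one per line. (Hypothetical helper name.)
extract_links() {
    grep -o 'href="https\{0,1\}://[^"]*"' | sed -e 's/^href="//' -e 's/"$//'
}

# Example use (directory and file names are assumptions):
#   cat index-pages/*.html | extract_links | sort -u > article-urls.txt
#   wget --warc-file=articles -i article-urls.txt
```

Splitting the crawl this way keeps each WARC small and lets an interrupted run resume from the URL list rather than from scratch.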
01:21 🔗 joepie91 speaking of which
01:22 🔗 joepie91 I had a bit of an issue wget-warc'ing engadget
01:22 🔗 joepie91 it... ran out of RAM
01:22 🔗 joepie91 how fix?
01:22 🔗 godane engadget is done
01:22 🔗 joepie91 well yes, but assuming a theoretical future crawl
01:22 🔗 joepie91 that might have the same issue
01:22 🔗 joepie91 (did you get joystiq and their subsites, btw?)
01:22 🔗 joepie91 (massively and wow in particular)
01:22 🔗 godane i did based on years
01:23 🔗 godane after i grabbed the pages
01:23 🔗 joepie91 I see
01:23 🔗 joepie91 is there any way to do a 'regular' warc but have it store whatever it's throwing into RAM, in some other place?
01:23 🔗 joepie91 (I assume the URL list?)
01:24 🔗 omf_ godane, How is that more efficient than using the spidering features built into most website mirror software
01:24 🔗 omf_ joepie91, have you looked at the wget code
01:24 🔗 godane it was mostly so i didn't get hit by the 4gb wget warc limit
01:25 🔗 godane and my wifi drops sometimes
01:25 🔗 omf_ I have 20gb warcs I made with wget. That sounds more like a 32bit problem
01:25 🔗 omf_ or you're running on Windows
01:26 🔗 joepie91 omf_: I have not
01:26 🔗 godane also this gives order to things
01:26 🔗 joepie91 I was more looking for a command line switch kind of thing
01:26 🔗 omf_ joepie91, that does not exist
01:27 🔗 DFJustin irony http://www.nature.com/nature/journal/v497/n7448/full/497183a.html
01:27 🔗 omf_ Paywalls all the way down
01:28 🔗 omf_ godane, how does having things in "order" help at all? When this gets shoved into the wayback machine it doesn't matter
01:28 🔗 godane it helps me do it in warc-proxy
01:29 🔗 godane it takes a very long time to make idx files on the fly locally
01:29 🔗 godane and again my wifi sucks sometimes
01:29 🔗 godane or my internet sucks sometimes
01:29 🔗 omf_ Is wifi all they offer in your area?
01:29 🔗 godane we have cable
01:30 🔗 godane but my room is not where the cable is
01:31 🔗 godane also i don't like uploading 20gb files
01:31 🔗 godane even 5gb to 10gb files i try my best to not grab cause it takes too long
01:32 🔗 omf_ So you put up with and change your workflow to work around wifi drops when a few dollars in cable and some time would fix the problem permanently.
01:32 🔗 DFJustin even on a wired connection I've had hiccups uploading 20gb stuff to IA with consumer-level upstream
01:33 🔗 godane my dad will not like 60ft of cable running across the living room
01:33 🔗 omf_ So have I, but that happens way less than with a bad wifi signal
01:34 🔗 godane again this is my way of doing things
01:34 🔗 godane i don't need the #comments pages
01:34 🔗 omf_ They sell these little plastic hooks with a sticky side at Lowes or Home Depot. They are paint safe and easy to remove. I have used them to run line on a ceiling so it is out of the way
01:34 🔗 godane or #top
01:34 🔗 godane cause they're the same page
01:34 🔗 godane again i will not do that
01:35 🔗 godane wifi is my only option without me going with a netbook into that room
01:36 🔗 chronomex omf_: paint safe? that's a blatant lie.
01:36 🔗 omf_ chronomex, I have removed them after 3 years use and no paint damage
01:36 🔗 chronomex depends on the paint I guess
01:39 🔗 godane also based on the link list i cut my mirror in half cause there are no bad or double links to the same story
01:39 🔗 godane all cause of a #comments like link
01:43 🔗 omf_ godane, URL fragments (aka #comment) can be handled by modern spiders automagically.
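For a link list like the one godane describes, collapsing fragment-only variants (#comments, #top) of the same page can be sketched in shell as below. This is a minimal sketch, not what either speaker actually ran; the helper name strip_fragments is hypothetical, and it assumes one URL per line on standard input.

```shell
#!/bin/sh
# strip_fragments: read URLs on stdin, drop everything from the first '#'
# onward, then sort and remove duplicates. (Hypothetical helper name.)
# The fragment is never sent to the server, so these URLs fetch the same page.
strip_fragments() {
    sed 's/#.*$//' | LC_ALL=C sort -u
}

# Example use (file names are assumptions):
#   strip_fragments < all-links.txt > unique-links.txt
```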
01:44 🔗 godane again i want to upload small sites
01:44 🔗 godane this made cultofmac small
01:44 🔗 godane we have the articles
01:45 🔗 godane also just spidering will get crap like this
01:45 🔗 godane http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/gu
01:46 🔗 godane its a broken link but cultofmac.com redirects
01:46 🔗 godane more data
01:46 🔗 godane which makes it harder for me to upload
01:46 🔗 godane should have been this link: http://cdn.cultofmac.com/wp-content/themes/com2012/i/a/guest_post.jpg
01:46 🔗 omf_ But that redirection information can be useful. What if those links are used on another site? Leaving them out means your grabs are incomplete
01:47 🔗 omf_ BlueMax, they just make printers
01:48 🔗 omf_ a few people got in trouble for hosting the files for 3d guns
01:48 🔗 BlueMax ah right OK
01:48 🔗 omf_ there are other 3d printer makers as well, all trying to make the best printer
01:48 🔗 godane that redirect is to something i grabbed in my article dump
01:48 🔗 omf_ If I had the money spare I would own one
01:48 🔗 godane no need to do a 2nd grab in my image dump
01:49 🔗 omf_ It doesn't matter. If the warc does not have information that the url you left out is a redirect then there is no record of that path to that content.
01:50 🔗 omf_ plus if you deduplicate before uploading then content fingerprinting cleans things up
01:50 🔗 godane i talked about dedup with the same warc
01:50 🔗 godane wget warc doesn't work that way
01:51 🔗 godane it can't figure it out unless it's in an older dump
01:51 🔗 godane that you point to cdx file to
02:03 🔗 omf_ Why not use warcdump to map and then warcfilter to clean it up. I assume that is why those warc tools were written
02:06 🔗 godane i never use those tools
02:06 🔗 godane will try at some point
02:06 🔗 godane but right now i'm doing it my way
02:15 🔗 dashcloud so, does anyone know of stories, videos, etc on the history of the slow-cooker (I guess Crock-Pot is the most well-known of the bunch)? It must have been a fascinating invention at the time- a cooking appliance you can safely leave on for 8 hours and not worry about something catching fire
02:15 🔗 omf_ I like how rice cookers can be used like crockpots now
02:17 🔗 dashcloud apparently some of the Japanese models play a short bit of music once you successfully cook the rice
02:35 🔗 SketchCow I heard some use a form of fuzzy logic to determine cooking time.
02:53 🔗 omf_ I know there are a fuck ton of bloggers out there now but how many software programming blogs do you think? A few thousand that are actively updated?
02:54 🔗 omf_ I was looking at the government numbers for programmers and it is tiny. Then think that most of them do not blog and what exactly is left
03:01 🔗 godane so i got at least 40 videos today
03:01 🔗 godane most are tss previews for shows
03:01 🔗 godane but weeds out the index so i can search for other stuff
08:03 🔗 SmileyG So here I am at work
08:04 🔗 SmileyG Nothing ever changes, this makes me lololol.
08:26 🔗 SmileyG /dev/sdc1 1.4T 685G 642G 52% /home/tim.bowers/bin/ign/storage
08:26 🔗 SmileyG yes, warc's
08:29 🔗 norbert79 "Ain't nobody got time for that"... lol... Huge data
08:30 🔗 SmileyG norbert79: :D
08:30 🔗 SmileyG I have about 50+ more metadata files to write :<
12:51 🔗 godane so i have started to upload the cultofmac.com dumps i have
12:51 🔗 godane uploaded: https://archive.org/details/www.cultofmac.com-index-20130513
12:52 🔗 godane also note that the pages may also be in my articles dump too
13:54 🔗 SmileyG damn, it's quiet in here today?
14:03 🔗 godane uploaded: https://archive.org/details/www.cultofmac.com-articles-20130513
14:08 🔗 godane i may be able to get gphoria 2004
14:08 🔗 SmileyG what is/was that?
14:09 🔗 godane it was an award show on g4
14:10 🔗 godane i got the 2006 one on IA
14:12 🔗 godane funny that mpg can be played right away when its been downloaded but not mp4
14:12 🔗 SmileyG right
14:12 🔗 SmileyG im so bored out of my skull at work and have no clue wtf is going on atm (my brain has fried and I'm honestly lost) I'm just gonna work on metadata
14:13 🔗 SmileyG damn, newamerica is still fetching.
14:13 🔗 SmileyG as is pouet.
14:13 🔗 godane how big is it?
14:13 🔗 SmileyG waiting for it to tell me :DX
14:13 🔗 joepie91 pouet!
14:14 🔗 SmileyG 141x1.1Gb
14:14 🔗 SmileyG for new america, so far.
14:14 🔗 SmileyG -rw-r--r-- 1 tim.bowers games 34G May 14 15:14 ./rotavault.ign.com-2013-04-19.warc
14:15 🔗 SmileyG -rw-r--r-- 1 tim.bowers games 10G May 14 15:15 ./pouet/pouet.net_06052013.warc
14:25 🔗 godane i'm just glad i didn't have to do that now
14:25 🔗 godane newamerica that is
14:31 🔗 SmileyG aye
15:16 🔗 DFJustin https://twitter.com/mikko/status/334286983193047040
15:17 🔗 SmileyG http://www.h-online.com/security/news/item/Skype-with-care-Microsoft-is-reading-everything-you-write-1862870.html english version
18:13 🔗 omf_ SmileyG, hey yo ;)
18:14 🔗 SmileyG hey
18:14 🔗 SmileyG su
18:14 🔗 SmileyG sup?
18:14 🔗 omf_ need some jobs to run?
18:14 🔗 SmileyG Just currently spazzing out atm so no thanks D:
18:15 🔗 omf_ Everything all right?
18:19 🔗 omf_ http://yro.slashdot.org/story/13/05/14/0134224/new-prenda-law-shell-corp-threatening-to-tell-your-neighbors-you-pirated-porn
18:19 🔗 omf_ I cannot wait for the new round of kick Prenda in court
18:19 🔗 sep332 what's left of them? thought they got dismantled by a judge.
18:20 🔗 sep332 or was that just their law firm maybe
18:20 🔗 omf_ They formed a new shell company and started right back up
18:20 🔗 omf_ same people
18:20 🔗 sep332 they've done that before, you'd think the judge woulda seen it coming
18:22 🔗 omf_ The judge cannot stop them from breaking the law again, only punish them
18:22 🔗 omf_ So today it is finally back into the 70s outside, we had a couple days back in the 40s and that sucked
18:22 🔗 sep332 looks like the previous punishment was just $80k fine (and the breakup)
18:23 🔗 sep332 yeah it's nice now! actually below freezing last night, ridiculous
18:27 🔗 joepie91 omf_: hahahaha
18:27 🔗 joepie91 these guys are just an infinite source of entertainment, aren't they
18:27 🔗 omf_ The cool thing is that now that the judge has slapped them and they did it again, RICO can be brought in
18:27 🔗 joepie91 also, completely unrelated
18:27 🔗 joepie91 linux kernel sploit
18:27 🔗 joepie91 root priv escalation
18:27 🔗 joepie91 affects .32 (vswap) openvz kernels
18:27 🔗 joepie91 if you have an openvz VPS with vswap, I recommend backing up your shit and informing your provider
18:27 🔗 joepie91 http://www.lowendtalk.com/discussion/10514/linux-kernel-2.6.37-3.8.8-0day
18:28 🔗 joepie91 supposedly it allows you to gain system root from a container
18:38 🔗 omf_ Oh look Ubuntu canceled brainstorm, that's a shock. A distro runs a site for community input and then ignores all that input
18:40 🔗 chronomex seems typical
18:44 🔗 ersi omf_: Is setting up your jobs hard? Is it suitable to do in the warrior?
18:45 🔗 omf_ If I overhauled the warrior
18:47 🔗 ersi So, that's a no?
18:50 🔗 SmileyG omf_: btw what are the jobs? :D
18:50 🔗 omf_ Running the jobs is easy, the initial setup on Debian and CentOS is pretty involved because of how out of date the software is on those distros
18:51 🔗 ersi SmileyG: Posterous-screenshotting
18:51 🔗 SmileyG gentoo \o/
18:51 🔗 omf_ and just bumping to unstable or testing causes conflicts in the dependencies
18:51 🔗 SmileyG I can ssh into work and give it a poke
18:51 🔗 SmileyG though I just signed my self back off work D:
18:51 🔗 omf_ SmileyG, You running gentoo at work?
18:51 🔗 SmileyG yes, I'm fucking epic
18:51 🔗 omf_ haven't tried it on there, in theory it should be the easiest
18:52 🔗 SmileyG indeed
18:52 🔗 SmileyG give me commands
18:52 🔗 SmileyG FEED ME MOAR.
18:52 🔗 SmileyG sorry
18:52 🔗 SmileyG hyper/headfucked right now
18:52 🔗 omf_ you will have to figure out some of the package names since they are changed with every distro
18:52 🔗 SmileyG nod
18:52 🔗 SmileyG znurt to teh rescue.
18:53 🔗 SmileyG imagemagick by any chance?
18:53 🔗 omf_ nope
18:53 🔗 SmileyG bit of python I guess...
18:53 🔗 omf_ not even a chance
18:53 🔗 SmileyG :D
18:53 🔗 SmileyG Ok cool, just give me commands and set me going with a small set
18:54 🔗 omf_ Also job tuning is based on cpu cores, and RAM
18:56 🔗 SmileyG :O
18:56 🔗 SmileyG well ram is at a premium atm due to some huge wgets
18:57 🔗 omf_ It is more CPU bound
18:57 🔗 omf_ much more
18:57 🔗 SmileyG Ok good xD
18:57 🔗 SmileyG I can free up a lot of CPU quite easily.
18:57 🔗 SmileyG Just sorting font packages atm.. I suspect most are already installed.
18:57 🔗 joepie91 blah
18:58 🔗 joepie91 my internet is now so fast that my bottleneck is my disk I/O...
18:58 🔗 SmileyG xD
18:58 🔗 SmileyG ssd + ramdisks ftw.
18:58 🔗 joepie91 I'm doing an emergency backup of all my VPSes running on openvz .32 atm
18:58 🔗 joepie91 just in case
18:58 🔗 omf_ I doubt it. The bulk of them are Asian languages. So unless you frequently use Mandarin or Japanese I would think no.
18:58 🔗 SmileyG And then realise if you can download faster than you can save, you can stream stuff faster than you can consume it.
18:58 🔗 SmileyG joepie91: learn to QoS also.
18:58 🔗 joepie91 SmileyG: ?
18:59 🔗 omf_ gentoo-portage.com/media-fonts/corefonts isn't loading for me
18:59 🔗 joepie91 (also, I'm having a race with my internet right now - trying to clean out my disk in time before it fills up)
19:00 🔗 SmileyG :O
19:00 🔗 SmileyG pah, let me dig it out
19:01 🔗 SmileyG http://corefonts.sourceforge.net/
19:35 🔗 godane i just found Jurassic Park the ride E! Live Premiere Special
20:00 🔗 omf_ So bash fails with 100,000 items to glob
20:00 🔗 omf_ something new I learned today
20:02 🔗 balrog use xargs
20:06 🔗 sep332 usual bash line length is limited to 256 kB
20:12 🔗 omf_ I used to compile the kernel so this problem wouldn't happen. I got around it by doing: find . -mindepth 1 -maxdepth 1 -iname "*.png" | zip -0 -@ images.zip
20:17 🔗 omf_ Yeah sep332 I had forgot there was a size limit
20:17 🔗 sep332 yeah it happens :)
20:17 🔗 omf_ I thought for a minute it was # of items and not size of buffer
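The limit discussed here applies to the total size of the arguments handed to a new process, not to the number of files, and the exact value varies by system, so the figure quoted above may not match a given machine. A minimal shell sketch of checking the limit and of the find-based workaround omf_ used follows; the helper name list_pngs is hypothetical, and -mindepth/-maxdepth/-iname are GNU/BSD find extensions rather than POSIX.

```shell
#!/bin/sh
# Print this system's limit, in bytes, on arguments plus environment for exec().
getconf ARG_MAX

# list_pngs DIR: print the PNG files directly inside DIR, one per line.
# find walks the directory itself, so no glob is ever expanded onto a
# command line and the argument-size limit is not hit.
# (Hypothetical helper name.)
list_pngs() {
    find "$1" -mindepth 1 -maxdepth 1 -iname '*.png'
}

# Example use, mirroring the command from the log (assumes zip is installed):
#   list_pngs . | zip -0 -@ images.zip
# The xargs route balrog suggests batches names into several invocations:
#   find . -maxdepth 1 -iname '*.png' -print0 | xargs -0 ls -l
```

Feeding names over a pipe works for zip because -@ reads the file list from standard input; for tools without such an option, xargs splits the list into chunks that each fit under the limit.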
20:28 🔗 omf_ Some days it is hard to keep track of all the moving parts
21:33 🔗 omf_ Fight the power - https://9gag.com/gag/aejoYKv
21:36 🔗 godane so looks like usatoday.com is not in wayback
21:49 🔗 godane SketchCow: cultofmac.com is backed up now
21:49 🔗 godane uploaded: https://archive.org/details/cdn.cultofmac.com-images-20130513
22:41 🔗 godane i had to fix a desc and link for july 2 2008 episode of buzz out loud in the wiki i'm getting links and descs from
22:52 🔗 omf_ Load average: 27.43
22:52 🔗 omf_ Not sure if I am doing enough work yet
22:56 🔗 joepie91 omf_: pft, real men have at least 41
