#archiveteam 2011-12-01,Thu


Time Nickname Message
02:19 🔗 DFJustin magazines http://electrickery.xs4all.nl/comp/dai/doc/
02:19 🔗 db48x hrm
02:19 🔗 db48x either my scanner is messed up, or my linux scanner driver doesn't know what it's doing
02:26 🔗 db48x heh
02:26 🔗 db48x now a webm decoder in javascript
02:50 🔗 SketchCow http://www.manosinhd.com/?page_id=277 fairly big news
02:56 🔗 db48x SketchCow: awesome
03:03 🔗 db48x http://db48x.net/temp/my%20teeth%202011%20horizontal.png
03:16 🔗 underscor db48x: ouch!
03:17 🔗 db48x yea, you'd think so
03:18 🔗 db48x I've been putting off getting it taken out for a few months
03:18 🔗 db48x finally decided to do something about it when my gums became swollen
03:18 🔗 db48x now that is painful
03:19 🔗 db48x now that it is painful
03:19 🔗 db48x ooh, hard drive was delivered today
03:20 🔗 db48x too bad the office is closed
03:34 🔗 db48x I like this staircase: http://www.flickr.com/photos/hmvgetcloser/4950566117/
03:34 🔗 underscor That is a rad staircase
04:04 🔗 kennethre 50s design was so awesome
04:06 🔗 dashcloud wow- I pressed the sleep button on the sleep, and I had 1 battery- woke it back up to find I now had 2 batteries!
04:07 🔗 yipdw oh, goddamnit
04:07 🔗 yipdw [200975.810003] Killed process 13137 (wget-warc) vsz:429300kB, anon-rss:304128kB, file-rss:0kB
04:07 🔗 yipdw [200975.810003] Out of memory: kill process 12997 (dld-profile.sh) score 26728 or a child
04:07 🔗 yipdw I keep forgetting to turn on the damn swap
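A minimal sketch of how one might check for the missing swap that yipdw mentions before starting a large download, assuming a Linux host whose /proc/meminfo contains a SwapTotal line. The function name and the sample text are illustrative and do not come from the log.

```python
def swap_total_kb(meminfo_text):
    """Return the SwapTotal value in kB from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            # A typical line looks like "SwapTotal:       2097148 kB".
            return int(line.split()[1])
    return None


# Hypothetical sample; on a real Linux box one would read /proc/meminfo instead.
sample = "MemTotal:  1020000 kB\nSwapTotal:       0 kB\nSwapFree:        0 kB\n"
if swap_total_kb(sample) == 0:
    print("no swap configured; large jobs may be OOM-killed")
```

Running a check like this at the top of a download script is one way to avoid forgetting, though enabling swap itself is still a separate administrative step.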
04:11 🔗 db48x heh
04:19 🔗 NotGLaDOS I had an idea some time back
04:19 🔗 NotGLaDOS Buy a RAMdisk, and just allocate the entirety of it to SWAP
04:19 🔗 yipdw if you're going to do that, why not just get more RAM
04:19 🔗 chronomex ^
04:35 🔗 dnova Why do hosted dedicated servers generally come with tiny disks?
04:36 🔗 dnova where are they even getting 80gb sata disks
04:36 🔗 dnova and do you think they'd be willing to pop in some disks I provide?
04:36 🔗 underscor probably not
04:36 🔗 underscor they can make more money if you rent them
04:37 🔗 dnova some don't even have an option to upgrade the disk
04:45 🔗 Coderjoe woo
04:46 🔗 Coderjoe i was just given 3 used (but supposedly still good) barracuda 750gb es drives
04:46 🔗 dnova score
05:19 🔗 NotGLaDOS Nice
05:19 🔗 NotGLaDOS Send some to the Australians!
06:43 🔗 arrith did something bad happen recently with a reporter?
06:45 🔗 dnova arrith: I was wondering too
06:46 🔗 dnova not sure what the story behind tha tis.
06:46 🔗 dnova that is.
06:47 🔗 arrith i've seen a few times people be fairly okay with reporters, then some story gets wildly twisted and the people won't talk to reporters anymore. it happened with the pirate bay guys for example.
07:08 🔗 Paradoks arrith / dnova : It's not that Archive Team is against having some sort of story done about it; it's that the people who are evidently trying to do a story about it do not tend to put their subjects in a good light.
07:08 🔗 Paradoks From my skimming of the website that SketchCow mentioned, the person seemed to do stories where the subjects are crazy people. So we're unlikely to come off looking decent.
07:09 🔗 arrith Paradoks: what was the website?
07:10 🔗 Paradoks http://www.mattathiasschwartz.com/
07:10 🔗 Paradoks See http://badcheese.com/~steve/atlogs/?chan=archiveteam&day=2011-11-28 around 3:08 for the rest of the discussion.
07:12 🔗 arrith ah, thanks
07:19 🔗 dnova yes thanks
07:33 🔗 NotGLaDOS arrith: there's 2 that are trying to compile a story on Archive Team
07:51 🔗 yipdw also, I just can't trust someone with such a crazy spelling of "Matthias"
07:51 🔗 chronomex mattiathas
07:52 🔗 yipdw mattiattathias
07:52 🔗 dnova MORE SYLLABLES
07:52 🔗 yipdw we can turn it into a set of strings: ma{(ttia)^n}athas
08:02 🔗 arrith where n > 3
08:04 🔗 dnova anyone going to CES?
08:07 🔗 SketchCow Nope
08:07 🔗 SketchCow I midwifed a project that will debut at CES
08:07 🔗 chronomex why would we?
08:08 🔗 dnova seems like a lot of fun
08:08 🔗 dnova I'm hoping to go sometime.
08:08 🔗 chronomex hm.
08:09 🔗 SketchCow Go to Penny Arcade instead.
08:09 🔗 dnova pax?
08:09 🔗 SketchCow Yes.
08:10 🔗 dnova Are there age restrictions to PAX East?
08:10 🔗 dnova Nope. But if you're under 13, please make sure your parents know where you are.
08:10 🔗 dnova haha
09:43 🔗 kin37ik *sighs* ... *watches tumbleweed*
09:48 🔗 * BlueMax throws a knife in kin37ik's direction
09:50 🔗 * kin37ik dodges
09:51 🔗 kin37ik errr, knew i shouldnt have upgraded my linux box
09:58 🔗 kin37ik sooo, whats peoples opinion on this, blacklist thing they are trying to pass in the US?
10:00 🔗 dnova nobody wants it except for scared old people who don't understand the internet or technology in general
10:00 🔗 dnova many of which are unfortunately politicians
10:01 🔗 * kin37ik nods head and agrees
10:01 🔗 kin37ik being here in aus, i told a bunch of my mates, and they went up in arms in frustration about it
10:01 🔗 kin37ik much the same way i did
10:02 🔗 dnova maybe I am confused but I thought australia already had something like this
10:02 🔗 kin37ik no, they were trying to pass an ISP blacklist filter, but i dont think it made it, due to the fact that
10:02 🔗 kin37ik we can just use a proxy, and bypass the stupid blacklist anyway
10:03 🔗 kin37ik i mean, they can try and blacklist all they want, doesnt stop proxies in the least
10:04 🔗 dnova until they ban vpns and ssh
10:04 🔗 kin37ik lol, if they did that, watch riots happen lol
10:04 🔗 dnova it's happening in pakistan right now.
10:04 🔗 dnova haven't heard of any riots.
10:04 🔗 kin37ik S:
10:05 🔗 kin37ik not good
10:05 🔗 kin37ik apparently in china they have blocked google
10:06 🔗 kin37ik maybe not in pakistan but, people here love their internet so
11:28 🔗 alard Hi Splinder downloaders: I added an upload status table on the wiki. Please put 'Uploaded' or 'Uploading' next to your name, so we can get an idea of where everything is. http://www.archiveteam.org/index.php?title=Splinder#Upload_status
11:29 🔗 chronomex I started uploading and then I had to move house
11:29 🔗 chronomex still not unpacked enough to boot my box
11:30 🔗 db48x the computers are the second thing you are supposed to unpack
11:30 🔗 db48x right after the table/desk they sit on
11:30 🔗 chronomex no
11:30 🔗 chronomex toothbrush, bed, food.
11:30 🔗 * chronomex &
11:31 🔗 db48x hrm
11:31 🔗 * db48x is skeptical
11:35 🔗 * kin37ik sees a tumblrweed roll across
11:35 🔗 kin37ik tumble*
11:39 🔗 ersi flickrweed
11:51 🔗 Nemo_bis alard, what should I do with those directories with escaped characters?
11:54 🔗 Nemo_bis I mean http://p.defau.lt/?NITL0SVf4K4QFRgCKmlWIg etc.
12:00 🔗 ersi http://arstechnica.com/gaming/news/2011/11/gamepro-magazine-and-website-to-shutter-next-month-1.ars
12:01 🔗 alard Nemo_bis: Do you have many of those? Are they large?
12:01 🔗 alard If not, it's probably easiest to ignore the errors and we'll do them again later.
12:01 🔗 kin37ik woah
12:01 🔗 kin37ik better get Wget cracking
12:02 🔗 Nemo_bis alard, they should be about 20, I don't know how big they are
12:02 🔗 kin37ik sheete sheet
12:02 🔗 kin37ik where did i put it
12:03 🔗 alard Nemo_bis: Then I'd suggest to ignore them. (Even if you do manage to upload them, they'll continue to cause problems later.)
12:03 🔗 Nemo_bis but there shouldn't be any problem in it, except that the script looks for unescaped characters and wget or whatever created a directory with escaped chars
12:03 🔗 ersi Gamepro magazine is shutting down their publication and website
12:03 🔗 Nemo_bis alard, what about renaming the directories?
12:04 🔗 kin37ik ersi: do you know if any of the dirs on that website are case sensitive, my linux box is down right now so i gotta use windows
12:04 🔗 alard Nemo_bis: You could do that, but that's probably something that needs to be done in a similar way by other people. I have a few of those directories too, for example.
12:05 🔗 Nemo_bis alard, ok
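The renaming idea discussed above could be sketched roughly as follows, assuming the problem directories are percent-escaped (URL-encoded) names and the upload script expects the decoded form. The function name and the sample directory names are hypothetical; the real Splinder scripts may need something different.

```python
from urllib.parse import unquote


def plan_renames(dir_names):
    """Map percent-escaped directory names to their unescaped form."""
    plan = {}
    for name in dir_names:
        decoded = unquote(name)
        # Skip names that are unchanged or that would decode into a path separator.
        if decoded != name and "/" not in decoded:
            plan[name] = decoded
    return plan


# Hypothetical names for illustration only.
print(plan_renames(["caf%C3%A9", "plainname", "a%2Fb"]))
```

The sketch only builds a plan and renames nothing, which fits alard's caution that any fix would have to be applied the same way by everyone holding such directories.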
12:09 🔗 ersi kin37ik: No idea, but case sensitivity is often very super bad at windows
12:10 🔗 kin37ik ersi: yeah i know, hence why i asked, but my linux box is well, to put it simply, very very unhappy with 11.10 atm S:
12:10 🔗 kin37ik tried rolling it back to 10.04 but it was very unhappy with installing the desktop on that
12:13 🔗 ersi I seldom upgrade with ubuntu, i usually do a clean reinstall :)
12:14 🔗 kin37ik yeah thats what i did (:
12:14 🔗 kin37ik it was fine for a few days then
12:15 🔗 kin37ik it started having fits S: locking up and crashing applications
12:20 🔗 underscor telnet miku.acm.uiuc.edu
12:25 🔗 kin37ik right, now i just gotta remember what my command line was last time
12:48 🔗 SketchCow Can I get ops, please
12:54 🔗 ersi Bam.
12:55 🔗 SketchCow Thank youuuuuuuuuuuuuuuuuu
12:55 🔗 ersi So how's life? Taken out the cash yet? Spread it around the bed, rolled around etc?
12:55 🔗 ersi I would :D
12:55 🔗 db48x lol
12:55 🔗 SketchCow I have spent, like, $13k of it
12:55 🔗 SketchCow 4 different shopping trips to B&H Photo Video in NYC
12:55 🔗 ersi meh, not as good
12:55 🔗 SketchCow I can take the most amazing shots now
12:55 🔗 ersi but still nice :)
12:56 🔗 SketchCow Holding $1700 lenses? It's nice. Trust me.
12:56 🔗 SketchCow Walking home with them? Nicer.
12:56 🔗 ersi not as nice as rolling around in $118k bills
12:56 🔗 ersi :D
13:04 🔗 SketchCow http://www.gamepro.com/forums/section/100304/general/1949/
13:04 🔗 SketchCow ------------------------------------------------
13:04 🔗 SketchCow COULD SOMEONE DOWNLOAD AT LEAST THE FORUMS OF GAMEPRO.COM?
13:04 🔗 SketchCow http://www.gamepro.com/forums/section/100304/general/1949/
13:05 🔗 SketchCow Dating back to 2001 - at least save those, and any user account names
13:05 🔗 SketchCow ------------------------------------------------
13:05 🔗 PatC I'm not sure how to do so, but I could do it if someone helps me
13:06 🔗 SketchCow They shut it off in 5 days.
13:06 🔗 PatC Ya, I saw that
13:08 🔗 SketchCow Well, make sure people talk first before diving into the forums.
13:09 🔗 SketchCow It shouldn't be THAT hard, but I'd like to eventually see a way to recreate the forums or allow later people to do so.
13:09 🔗 PatC Agreed
13:10 🔗 alard SketchCow: I've got Heritrix downloading the forums, though it's not going very fast (5000 out of 75000 urls in an hour).
13:10 🔗 ersi It's goin' at least :)
13:11 🔗 alard They provide sitemaps, by the way, which may be useful: http://www.gamepro.com/sitemap_index.xml (I added the forum urls to my crawl).
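A rough sketch of how sitemap URLs like the ones alard mentions could be turned into a crawl seed list, assuming the file follows the standard sitemaps.org schema. The sample XML and its example.com URLs are made up for illustration and are not GamePro's actual sitemap contents.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for both indexes and URL sets.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(xml_text):
    """Return every <loc> URL found in a sitemap index or URL-set document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iterfind(".//sm:loc", SITEMAP_NS) if loc.text]


# Hypothetical sitemap index for illustration.
sample_index = """
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>http://www.example.com/sitemap_forums_1.xml</loc></sitemap>
  <sitemap><loc>http://www.example.com/sitemap_forums_2.xml</loc></sitemap>
</sitemapindex>
"""
for url in sitemap_locs(sample_index):
    print(url)
```

Each URL from an index points at another sitemap file, so the same function would be applied again to those files to reach the actual page URLs.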
13:11 🔗 ersi so 14h and it should've grabbed all 75000 URLs, if holding same speed
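The estimate above is a simple rate calculation; a small sketch, assuming the crawl keeps the same speed (which alard's single-queue problem may or may not allow). The function name is illustrative.

```python
def hours_remaining(done, total, hours_elapsed):
    """Estimate the hours left for a crawl that keeps its average rate so far."""
    if done <= 0 or hours_elapsed <= 0:
        raise ValueError("need some completed work to estimate a rate")
    rate = done / hours_elapsed  # URLs per hour so far
    return (total - done) / rate


# Figures from the log: 5000 of 75000 URLs fetched in the first hour.
print(hours_remaining(5000, 75000, 1))  # 14.0
```

That gives 14 more hours on top of the hour already spent, so about 15 hours in total at a constant rate.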
13:12 🔗 alard For some reason heritrix doesn't really listen to my parallelQueues = 15 setting. It's just running one queue.
13:12 🔗 kin37ik ill see if i can grab anything on the forums, gonna suck using a windows box but it's all i got atm
13:12 🔗 kin37ik though someone should try with a linux box, since mine is well, stuffed atm
13:21 🔗 kin37ik ah O.o my bad, half asleep
13:22 🔗 PatC if someone wants to give me some commands to run, I got a 30mbit connection that's going to be idling all day anyway
13:23 🔗 PatC with linux*
13:24 🔗 kin37ik PatC: try heritrix? or you wanna go Wget?
13:24 🔗 PatC I don't currently have heritrix, but I could get it if it makes a difference
13:25 🔗 kin37ik whatevers easier for you
13:25 🔗 PatC wget would be easier
13:26 🔗 kin37ik alright
13:27 🔗 kin37ik i do my wget kinda weirdly i guess but, it works for me just fine
13:27 🔗 PatC ok
13:29 🔗 ersi preferably with WARC support.. so it can be ingested into IA
13:29 🔗 ersi But I guess alard's got that covered, since he's using Heritrix
13:30 🔗 kin37ik wget -nc -c -x --directory-prefix=prefixname -r -k -p http://www.urlgoeshere.com/
13:30 🔗 kin37ik thats what i use, which is rather bad of me imo but, it works
13:30 🔗 kin37ik till i fix my linux box which shall be tomorrow
13:30 🔗 PatC So, I run that and it'll grab the entire site?
13:31 🔗 kin37ik should do, as far as my memory serves me
13:31 🔗 kin37ik if it doubt, go heritrix
13:31 🔗 kin37ik in*
13:31 🔗 PatC "Converted 3 files in 0.001 seconds."
13:31 🔗 PatC and its done
13:32 🔗 PatC I don't think it is lol
13:32 🔗 kin37ik i went from the forums and then mine worked backwards
13:32 🔗 kin37ik mines chugging away like a boss lol
13:32 🔗 kin37ik infact, im downloading everything on the forums it seems O.o
13:33 🔗 ersi enjoy downloading the whole internet
13:33 🔗 ersi for what I see you don't limit your level of recursion
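One way to make the recursion limit explicit, sketched as a helper that builds a wget argument list without running it. It uses wget's documented `--level` and `--no-parent` options; the function name, the depth value, the prefix, and the example.com URL are placeholders. GNU wget's documented default depth for `-r` is 5 and it stays on the starting host unless told otherwise, so this mainly pins the behaviour down and keeps the crawl below the starting directory.

```python
import shlex


def bounded_wget_command(url, depth=5, prefix="mirror"):
    """Build a recursive wget command with an explicit depth limit."""
    return [
        "wget",
        "--recursive",
        f"--level={depth}",            # stop following links beyond this depth
        "--no-parent",                 # never climb above the starting directory
        "--page-requisites",           # also fetch images/CSS needed to render pages
        f"--directory-prefix={prefix}",
        url,
    ]


# Placeholder URL; print the command so it can be inspected before running it.
cmd = bounded_wget_command("http://www.example.com/forums/", depth=3)
print(shlex.join(cmd))
```

Printing the command first makes it easy to check what will be fetched before committing bandwidth to it.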
13:33 🔗 kin37ik lol
13:34 🔗 kin37ik hmm
13:34 🔗 PatC Well, I got it going now..
13:34 🔗 kin37ik was wondering why the line was a bit shorter than normal
13:34 🔗 PatC looks like it's getting all the articles
13:36 🔗 kin37ik aaand i hate windows.....
13:36 🔗 kin37ik >.> wheres that linux disc......
13:37 🔗 kin37ik PatC: id probably go with heritrix atm, cos im getting spat dud pages with wget S:
13:37 🔗 kin37ik not sure if yours is the same
13:37 🔗 PatC mine's getting the articles
13:37 🔗 PatC all the html files
13:37 🔗 kin37ik open one of the downloaded HTML files
13:37 🔗 kin37ik in firefox
13:38 🔗 kin37ik if it's like blank then ive screwed up and cant concentrate, need a redbull
13:39 🔗 PatC shit, i'm late
13:39 🔗 PatC i'll be back on in about 6 hours
13:44 🔗 tef hooray I am on a preservation working group meeting for the international internet preservation consortium
13:45 🔗 tef /riveting/
13:45 🔗 ersi tef: You that guy who was working with some WARC lib?
13:46 🔗 tef yes
13:46 🔗 tef I am mostly doing stuff for arc2warc conversion at the moment
13:47 🔗 tef I plan to put things like zip/unzip style manipulation of archives
13:49 🔗 kin37ik bugger it, going to bed, cbb dealing with wget atm, laters
13:49 🔗 tef I did add a http parser to it and I plan to add a proxy server for serving content from warcs soon too
14:12 🔗 db48x tef: sweet
14:44 🔗 Coderjoe geh. no -k without -K or warc
14:47 🔗 Coderjoe and warc is greatly preferred
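Coderjoe's rule of thumb can be expressed as a small check over a wget argument list: link conversion (`-k`) rewrites the saved HTML, so the original bytes should also be kept, either with `-K` (`--backup-converted`) or in a WARC. This is only a sketch; it assumes a wget build with WARC support that spells the option `--warc-file`, and it does not handle bundled short flags such as `-rkp`.

```python
def link_conversion_is_safe(args):
    """Return True unless -k is used without -K or WARC output."""
    converts = any(a in ("-k", "--convert-links") for a in args)
    keeps_original = any(a in ("-K", "--backup-converted") for a in args)
    writes_warc = any(a.startswith("--warc-file") for a in args)
    return (not converts) or keeps_original or writes_warc


# The command quoted earlier in the log, split into arguments.
logged = "wget -nc -c -x --directory-prefix=prefixname -r -k -p http://www.urlgoeshere.com/".split()
print(link_conversion_is_safe(logged))
print(link_conversion_is_safe(logged + ["--warc-file=gamepro-forums"]))
```

The WARC route is the preferred one here because it records the responses as received, independent of any rewriting done to the files on disk.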
17:05 🔗 Schbirid i'd like to offload my tiny part of splinder
17:05 🔗 Schbirid iirc some users were not finished :\
17:08 🔗 Schbirid 9.2G
17:08 🔗 Schbirid 10.4 actually
22:20 🔗 rude___ SketchCow: you now have a lot of newspapers
22:42 🔗 Coderjoe have we found a different copy of friendster.000230000-000240000.tar.gz without the corruption? or have we determined that the corrupt copy is the only one?
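When comparing candidate copies of a .tar.gz like the one Coderjoe asks about, a quick integrity pass over the gzip layer can rule out truncated or damaged files. A minimal sketch using only the standard library; the function name is illustrative, and it checks only the gzip stream (CRC and length), not the tar contents inside it.

```python
import gzip
import io
import zlib


def gzip_stream_is_intact(fileobj, chunk_size=1 << 20):
    """Decompress a gzip stream to the end; False if it is truncated or corrupt."""
    try:
        with gzip.GzipFile(fileobj=fileobj, mode="rb") as gz:
            while gz.read(chunk_size):
                pass  # discard the data; only the errors matter here
    except (OSError, EOFError, zlib.error):
        return False
    return True


# Self-contained demo on in-memory data; a real check would pass open(path, "rb").
payload = gzip.compress(b"profile data " * 1000)
print(gzip_stream_is_intact(io.BytesIO(payload)))
print(gzip_stream_is_intact(io.BytesIO(payload[:-10])))
```

Reading in chunks keeps memory use flat, which matters for multi-gigabyte archive pieces.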
23:00 🔗 Coderjoe oh boy. youtube just switched me to a new site design
23:01 🔗 Coderjoe likely breaking half a dozen youtube archiving tools in the process
23:08 🔗 yipdw they did?
23:08 🔗 yipdw looks the same to me
23:08 🔗 Coderjoe possibly a selective thing. they have a thing to allow me to go back currently
23:08 🔗 Coderjoe http://www.youtube.com/t/new
23:08 🔗 yipdw "We're sorry, the page you requested cannot be found."
23:09 🔗 Coderjoe er, no. "about the new look" and "send feedback"
23:09 🔗 yipdw google likes this eventual consistency thing
23:09 🔗 Coderjoe are you logged into yt?
23:09 🔗 yipdw under an account, yes
23:09 🔗 Coderjoe ok. so that isn't it
23:10 🔗 yipdw http://depot.ninjawedding.org/footube.png
23:16 🔗 Coderjoe that /t/new page: http://i.imgur.com/DgHLu.jpg
23:16 🔗 Coderjoe video page: http://i.imgur.com/lGhZJ.png
23:23 🔗 instence fuck.
23:24 🔗 yipdw YouTube goes through new looks like Gackt
23:24 🔗 instence yep all of my scrapers are now going to be broken
23:25 🔗 yipdw I think youtube-dl still works
23:25 🔗 instence and i bet they are preloading tons of stuff using jquery
23:25 🔗 yipdw oh lol
23:25 🔗 yipdw if I use Chrome
23:25 🔗 instence well thats weeks worth of work i have to redo * sigh *
23:25 🔗 yipdw I get The GRay
23:27 🔗 yipdw so, when you load one of them newfangled video pages, your browser does something like 60 HTTP requests
23:27 🔗 yipdw 18 of those are for ads
23:28 🔗 Coderjoe grr
23:28 🔗 Coderjoe I need to see if I can find any of my old BBS lists. (I was assembling a list in high school to share with other computer geek friends)
23:29 🔗 Coderjoe I notice bbslist.textfiles.com doesn't have one BBS I remember using (I actually got my first copy of linux via this bbs)
23:45 🔗 bsmith093 do bbs clients still exist?
23:46 🔗 bsmith093 and can i get one for ubuntu lucid?
23:46 🔗 Coderjoe got a modem? minicom
