#archiveteam 2013-08-20,Tue

Time Nickname Message
07:07 🔗 SketchCow I'm back!
07:07 🔗 SketchCow (My machine had a hard drive going slightly bad, enough that once every 5-7 days it would crash the machine but SMART didn't pick it up.)
07:08 🔗 omf_ SketchCow, FOS is full and causing us problems
07:08 🔗 SketchCow So I finally bit the bullet, bought a SSD drive, did the fragginatin' and the cloninatin' and here I am with a machine that boots in, like, 12 seconds.
07:09 🔗 SketchCow It is not full.
07:09 🔗 SketchCow It's bloated to be sure but not full.
07:10 🔗 omf_ xmc, was the disk full error returned to the tracker?
07:11 🔗 SketchCow Which tracker.
07:11 🔗 omf_ He didn't specify
07:26 🔗 xmc I just looked at the graph and saw that the disk had filled
07:26 🔗 xmc well strictly speaking we're about 350M out from disk-full
07:27 🔗 xmc I've seen creeping disk fill on the tracker several times, so it's probably something gone slightly off the rails
07:37 🔗 SmileyG underscor: ping when alive plz :)
07:40 🔗 xmc http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/df.html
08:46 🔗 SketchCow Yeah, see, none of these are my machine.
08:48 🔗 omf_ Is that the url tracker then?
08:49 🔗 GLaDOS That's the tracker machine, yes.
08:50 🔗 bsmith094 GLaDOS: what link
09:01 🔗 bsmith094 SketchCow: in january of 2012 you scraped some fanfiction.net stories using the fanfictiondownloader project from googlecode. for the last few months ive been doing the same thing. I have from id 1 to 3 million and have most of the intervening numbers between 3-5 and 5-7 million, running 2 parallel downloads. its still going and currently its at about 80gb of text files, is there somewhere i could rsync this? im almost out of sp
09:02 🔗 bsmith094 incidentally im also the guy who grabbed all of ao3
09:02 🔗 SmileyG bsmith094: you can upload it to IA....
09:02 🔗 SmileyG not via rsync, but.... ia3uploader script?
09:02 🔗 ersi Or just the form on the website, just log in first
09:05 🔗 bsmith094 what does 80gb of text compress to cause ive only got 24gb of space left
09:08 🔗 xmc SketchCow: was not saying your box is full. said it elsewhere and omf_ seems to have gone off on his own tangent
09:15 🔗 bsmith094 SmileyG: whats the link to the ia3uploader script
09:17 🔗 bsmith094 off 2 bed back in ~10 hrs
09:17 🔗 SmileyG https://github.com/kngenie/ias3upload
09:17 🔗 SmileyG bsmith094: https://github.com/kngenie/ias3upload
09:17 🔗 bsmith094 thanks
10:49 🔗 omf_ GLaDOS, and I have been working on the docs for our servers. We have 6 servers, 5 of which are up and running different services for us
13:00 🔗 balrog I guess everyone here has seen http://arstechnica.com/tech-policy/2013/08/changing-ip-address-to-access-public-website-ruled-violation-of-us-law/
13:24 🔗 GLaDOS oh dear, im going to gitmo.
13:25 🔗 Smiley we all are :D
13:30 🔗 Jonimus Groklaw is stopping posting new things, no sign of an actual shutdown or not http://www.groklaw.net/article.php?story=20130818120421175
14:01 🔗 godane does anyone have any ideas on how to get all comments from groklaw.net?
14:13 🔗 godane i have another problem
16:21 🔗 SketchCow We should grab groklaw
16:27 🔗 Deewiant godane: Set the view mode to "nested" instead of "threaded" (or "flat" or "printable" would work too, I guess); then all comments show on the article page directly (it seems to set a cookie)
16:38 🔗 anon42 Just this morning PJ, the maintainer of Groklaw.net, basically announced the end of the site. Groklaw has covered many important cases and events in software patent/copyright law for many years and is a really unique source of that history. While it wouldn't be in her nature to just shut down the site without explicit warning or making a backup available, due to the nature of her last message I don't know if we can be sure. There are alre
16:38 🔗 anon42 all recent articles and other sections of the site are not backed up as far as I can tell. I am worried that if Groklaw is lost, a lot of important history will be lost along with it. I remembered hearing about archive team so I came here. Apologies for the essay.
16:48 🔗 balrog anon42: we're well aware...
16:54 🔗 anon42 balrog: oops. no reason to get dramatic then. what's your guys' take on it then?
17:11 🔗 Smiley anon42: our take doesn't matter
17:12 🔗 Smiley our take is
17:12 🔗 Smiley lets back that shit up
17:13 🔗 omf_ If you want to hear different opinions about how we feel ask in #archiveteam-bs. That is the channel where we have those discussions.
17:17 🔗 anon42 Good to know. I just realized the first message I saw in here actually refers to this and I missed it. So, how can I help?
17:19 🔗 Smiley Firstly, does anyone want to call themselves out as doing it?
17:20 🔗 Smiley SketchCow: you're normally on stuff like this damn fast?
17:20 🔗 Smiley anon42: we have a wiki which documents how you can take your own archive of the site at: http://www.archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget
17:24 🔗 omf_ I have one grab going for the pages and plan a follow up for the pdfs
17:24 🔗 balrog https://law.resource.org/pub/us/code/ga/ needs backup
17:46 🔗 SketchCow it does.
17:57 🔗 omf_ I created a simple network diagram of the warriors. http://picpaste.com/6uO20RMg.png Is anything missing? Is there a better format to use?
17:59 🔗 omf_ I am going to lay it out different in the next version so none of the labels are obscured by connection lines
17:59 🔗 Smiley looks right
18:16 🔗 godane Deewiant: my problem is i don't know how to do it in the url
18:16 🔗 godane give me the url that works then i can do it
18:17 🔗 Deewiant godane: Look into the cookie that it sets and just send that as part of the request, I don't think you can set it in the URL
18:21 🔗 omf_ Here is version 2 of the warrior network - http://picpaste.com/pics/Pz81z7Mx.1377022875.png
19:20 🔗 godane Deewiant: its not working
19:21 🔗 godane i only got 2 lines in cookies with groklaw.net
19:21 🔗 godane .groklaw.net TRUE / FALSE 1408547640 LastVisit 1377026061
19:21 🔗 godane .groklaw.net TRUE / FALSE 1377012240 LastVisitTemp 1377022575
19:22 🔗 omf_ I am already 500mb deep into my groklaw grab
19:22 🔗 godane are you grabbing all comments?
19:22 🔗 omf_ everything
19:23 🔗 godane what are you using for commands?
19:23 🔗 Deewiant godane: Ah sorry it's evidently just a post request, use &mode=nested in the url
19:23 🔗 Deewiant Didn't really look into it properly earlier, sorry again for the trouble
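[Editor's sketch, not part of the log] Deewiant's tip of appending mode=nested to the URL could be applied to a list of article URLs roughly as follows. This is a minimal sketch: the helper name with_comment_mode is hypothetical, and the only things taken from the log are the mode=nested parameter and the Groklaw article URL posted earlier in the channel.

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse


def with_comment_mode(url, mode="nested"):
    """Return url with its query string set to mode=<mode>.

    Any existing "mode" parameter is dropped first so the result
    carries exactly one. The helper name is hypothetical.
    """
    parts = urlparse(url)
    # Keep every existing parameter except a previous "mode".
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "mode"]
    query.append(("mode", mode))
    return urlunparse(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    article = "http://www.groklaw.net/article.php?story=20130818120421175"
    print(with_comment_mode(article))
```

Rewriting the query string rather than concatenating "&mode=nested" avoids a doubled parameter when a URL in the list already carries a mode.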
19:24 🔗 godane fuck yes
19:24 🔗 godane this will work
19:37 🔗 godane also good news is that if i get an error on a byte with a list of urls it only stops downloading that one page and goes to the next url in the list
19:37 🔗 godane so it doesn't just fail and call it quits on me
19:38 🔗 Smiley :/
19:38 🔗 Smiley good, can you log the error too?
19:41 🔗 godane i log everything these days
19:41 🔗 godane to make sure we know what is missing or just 500 error on me
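[Editor's sketch, not part of the log] The behaviour godane describes (a failed URL is logged and skipped, and the run moves on to the next one) might look like this in Python. godane's actual commands are not shown in the log, so this is an illustration under assumptions: fetch_all and its fetch parameter are hypothetical names, and the standard-library urllib stands in for whatever tool was really used.

```python
import logging
import urllib.error
import urllib.request

logger = logging.getLogger("grab")


def fetch_all(urls, fetch=None):
    """Fetch each URL in order; log failures and keep going.

    Returns (results, failed): a dict of url -> body bytes, and a
    list of the URLs that raised an error. Hypothetical helper.
    """
    if fetch is None:
        # Default fetcher: a plain GET with a timeout.
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as resp:
                return resp.read()

    results, failed = {}, []
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as exc:
            # URLError and HTTPError (e.g. a 500) are OSError subclasses.
            logger.error("failed %s: %s", url, exc)
            failed.append(url)
    return results, failed
```

Returning the failed list alongside the log gives a record of what is missing, so those URLs can be retried later.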
20:54 🔗 SketchCow https://vine.co/v/hMLVA1emhej
20:54 🔗 SketchCow Someone save that
20:54 🔗 SketchCow That's going to disappear.
20:56 🔗 omf_ got it
20:56 🔗 balrog o.o
20:56 🔗 omf_ óò
20:58 🔗 SketchCow http://www.buzzfeed.com/mikehayes/this-terrifying-vine-shows-the-exact-moment-a-truck-flies-ov
20:58 🔗 SketchCow (He survived)
22:18 🔗 Asparagir One of the newly uploaded Prelinger films is basically a mid-century modern animated version of the IPV4 --> IPV6 upgrades: https://archive.org/details/6317_Mr_Digit_and_the_Battle_of_Bubbling_Brook_01_15_17_02
22:18 🔗 Asparagir Really cute, and definitely quotable. Letters! They're hemming us in!