Time |
Nickname |
Message |
07:07
|
SketchCow |
I'm back! |
07:07
|
SketchCow |
(My machine had a hard drive going slightly bad, enough that once every 5-7 days it would crash the machine but SMART didn't pick it up.) |
07:08
|
omf_ |
SketchCow, FOS is full and causing us problems |
07:08
|
SketchCow |
So I finally bit the bullet, bought an SSD, did the fragginatin' and the cloninatin' and here I am with a machine that boots in, like, 12 seconds. |
07:09
|
SketchCow |
It is not full. |
07:09
|
SketchCow |
It's bloated to be sure but not full. |
07:10
|
omf_ |
xmc, was the disk full error returned to the tracker? |
07:11
|
SketchCow |
Which tracker. |
07:11
|
omf_ |
He didn't specify |
07:26
|
xmc |
I just looked at the graph and saw that the disk had filled |
07:26
|
xmc |
well strictly speaking we're about 350M out from disk-full |
07:27
|
xmc |
I've seen creeping disk fill on the tracker several times, so it's probably something gone slightly off the rails |
07:37
|
SmileyG |
underscor: ping when alive plz :) |
07:40
|
xmc |
http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/df.html |
08:46
|
SketchCow |
Yeah, see, none of these are my machine. |
08:48
|
omf_ |
Is that the url tracker then? |
08:49
|
GLaDOS |
That's the tracker machine, yes. |
08:50
|
bsmith094 |
GLaDOS: what link |
09:01
|
bsmith094 |
SketchCow: in january of 2012 you scraped some fanfiction.net stories using the fanfictiondownloader project from googlecode. for the last few months ive been doing the same thing. I have from id 1 to 3 million and have most of the intervening numbers between 3-5 and 5-7 million, running 2 parallel downloads. its still going and currently its at about 80gb of text files, is there somewhere i could rsync this? im almost out of sp |
09:02
|
bsmith094 |
incidentally im also the guy who grabbed all of ao3 |
09:02
|
SmileyG |
bsmith094: you can upload it to IA.... |
09:02
|
SmileyG |
not via rsync, but.... ia3uploader script? |
09:02
|
ersi |
Or just the form on the website, just log in first |
09:05
|
bsmith094 |
what does 80gb of text compress to cause ive only got 24gb of space left |
09:08
|
xmc |
SketchCow: was not saying your box is full. said it elsewhere and omf_ seems to have gone off on his own tangent |
09:15
|
bsmith094 |
SmileyG: whats the link to the ia3uploader script |
09:17
|
bsmith094 |
off 2 bed back in ~10 hrs |
09:17
|
SmileyG |
https://github.com/kngenie/ias3upload |
09:17
|
SmileyG |
bsmith094: https://github.com/kngenie/ias3upload |
09:17
|
bsmith094 |
thanks |
10:49
|
omf_ |
GLaDOS and I have been working on the docs for our servers. We have 6 servers, 5 of which are up and running different services for us |
13:00
|
balrog |
I guess everyone here has seen http://arstechnica.com/tech-policy/2013/08/changing-ip-address-to-access-public-website-ruled-violation-of-us-law/ |
13:24
|
GLaDOS |
oh dear, im going to gitmo. |
13:25
|
Smiley |
we all are :D |
13:30
|
Jonimus |
Groklaw is stopping posting new things, no sign of an actual shutdown or not http://www.groklaw.net/article.php?story=20130818120421175 |
14:01
|
godane |
does anyone have any ideas on how to get all comments from groklaw.net? |
14:13
|
godane |
i have another problem |
16:21
|
SketchCow |
We should grab groklaw |
16:27
|
Deewiant |
godane: Set the view mode to "nested" instead of "threaded" (or "flat" or "printable" would work too, I guess); then all comments show on the article page directly (it seems to set a cookie) |
16:38
|
anon42 |
Just this morning PJ, the maintainer of Groklaw.net, basically announced the end of the site. Groklaw has covered many important cases and events in software patent/copyright law for many years and is a really unique source of that history. While it wouldn't be in her nature to just shut down the site without explicit warning or making a backup available, due to the nature of her last message I don't know if we can be sure. There are alre |
16:38
|
anon42 |
all recent articles and other sections of the site are not backed up as far as I can tell. I am worried that if Groklaw is lost, a lot of important history will be lost along with it. I remembered hearing about archive team so I came here. Apologies for the essay. |
16:48
|
balrog |
anon42: we're well aware... |
16:54
|
anon42 |
balrog: oops. no reason to get dramatic then. what's your guys' take on it then? |
17:11
|
Smiley |
anon42: our take doesn't matter |
17:12
|
Smiley |
our take is |
17:12
|
Smiley |
lets back that shit up |
17:13
|
omf_ |
If you want to hear different opinions about how we feel ask in #archiveteam-bs. That is the channel where we have those discussions. |
17:17
|
anon42 |
Good to know. I just realized the first message I saw in here actually refers to this and I missed it. So, how can I help? |
17:19
|
Smiley |
Firstly, does anyone want to call themselves out as doing it? |
17:20
|
Smiley |
SketchCow: you're normally on stuff like this damn fast? |
17:20
|
Smiley |
anon42: we have a wiki which documents how you can take your own archive of the site at: http://www.archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget |
17:24
|
omf_ |
I have one grab going for the pages and plan a follow up for the pdfs |
17:24
|
balrog |
https://law.resource.org/pub/us/code/ga/ needs backup |
17:46
|
SketchCow |
it does. |
17:57
|
omf_ |
I created a simple network diagram of the warriors. http://picpaste.com/6uO20RMg.png Is anything missing? Is there a better format to use? |
17:59
|
omf_ |
I am going to lay it out differently in the next version so none of the labels are obscured by connection lines |
17:59
|
Smiley |
looks right |
18:16
|
godane |
Deewiant: my problem is i don't know how to do it in the url |
18:16
|
godane |
give me the url that works then i can do it |
18:17
|
Deewiant |
godane: Look into the cookie that it sets and just send that as part of the request, I don't think you can set it in the URL |
18:21
|
omf_ |
Here is version 2 of the warrior network - http://picpaste.com/pics/Pz81z7Mx.1377022875.png |
19:20
|
godane |
Deewiant: its not working |
19:21
|
godane |
i only got 2 lines in cookies with groklaw.net |
19:21
|
godane |
.groklaw.net TRUE / FALSE 1408547640 LastVisit 1377026061 |
19:21
|
godane |
.groklaw.net TRUE / FALSE 1377012240 LastVisitTemp 1377022575 |
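The two lines godane pasted look like entries from a Netscape-format cookies.txt file (the format wget's --load-cookies reads), whose seven fields are domain, subdomain flag, path, secure flag, expiry, name and value. A minimal Python sketch, assuming the fields are whitespace-separated as pasted here (real files use tabs), that lists the cookie names so you can check whether either one holds the comment view mode. `Cookie` and `parse_cookie_line` are hypothetical names, not part of any tool mentioned in the log:

```python
from typing import NamedTuple


class Cookie(NamedTuple):
    domain: str
    include_subdomains: bool
    path: str
    secure: bool
    expires: int  # Unix timestamp
    name: str
    value: str


def parse_cookie_line(line: str) -> Cookie:
    """Parse one Netscape-style cookies.txt entry (hypothetical helper)."""
    # Assumption: fields are separated by whitespace, as in the pasted lines.
    # maxsplit=6 keeps a value that contains spaces in one piece.
    domain, subdomains, path, secure, expires, name, value = line.split(None, 6)
    return Cookie(domain, subdomains == "TRUE", path, secure == "TRUE",
                  int(expires), name, value)


pasted = [
    ".groklaw.net TRUE / FALSE 1408547640 LastVisit 1377026061",
    ".groklaw.net TRUE / FALSE 1377012240 LastVisitTemp 1377022575",
]
cookies = [parse_cookie_line(line) for line in pasted]
for cookie in cookies:
    print(cookie.name, cookie.value)
```

Both names are visit timestamps; neither looks like a view-mode setting.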
19:22
|
omf_ |
I am already 500mb deep into my groklaw grab |
19:22
|
godane |
are you grabbing all comments? |
19:22
|
omf_ |
everything |
19:23
|
godane |
what are you using for commands? |
19:23
|
Deewiant |
godane: Ah sorry it's evidently just a post request, use &mode=nested in the url |
19:23
|
Deewiant |
Didn't really look into it properly earlier, sorry again for the trouble |
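Deewiant's suggestion of putting mode=nested in the URL can be sketched in Python with the standard library. `with_nested_mode` is a hypothetical helper, and the example URL is the article Jonimus linked earlier in this log; whether the site honours the parameter on every page type is not confirmed here:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_nested_mode(url: str) -> str:
    """Return url with mode=nested set in its query string."""
    parts = urlsplit(url)
    # Drop any existing mode parameter so it is not duplicated.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "mode"]
    query.append(("mode", "nested"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Article URL taken from the Groklaw link posted earlier in this log.
print(with_nested_mode("http://www.groklaw.net/article.php?story=20130818120421175"))
```

Applying this to every article URL before fetching gives a list that a downloader can consume directly.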
19:24
|
godane |
fuck yes |
19:24
|
godane |
this will work |
19:37
|
godane |
also good news is that if i get an error on byte with a list of urls it only stops downloading that one page and goes to the next url in the list |
19:37
|
godane |
so it doesn't just fail and call it quits on me |
19:38
|
Smiley |
:/ |
19:38
|
Smiley |
good, can you log the error too? |
19:41
|
godane |
i log everything these days |
19:41
|
godane |
to make sure we know what is missing or just 500 error on me |
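The behaviour godane describes (skip a failing page, move on to the next URL, log what went wrong) can be sketched as below. This is not godane's actual command, which the log does not show; it is a plain-Python stand-in using only the standard library, and `fetch` and `grab_all` are hypothetical names:

```python
import logging
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, List, Tuple


def fetch(url: str) -> bytes:
    """Download one URL; raises on HTTP or network errors."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def grab_all(urls: Iterable[str],
             fetcher: Callable[[str], bytes] = fetch
             ) -> Tuple[Dict[str, bytes], List[str]]:
    """Fetch every URL, logging failures instead of aborting the run."""
    pages: Dict[str, bytes] = {}
    failed: List[str] = []
    for url in urls:
        try:
            pages[url] = fetcher(url)
        except urllib.error.HTTPError as err:
            # e.g. a 500 from the server: record it and carry on.
            logging.error("HTTP %s for %s", err.code, url)
            failed.append(url)
        except OSError as err:  # URLError, timeouts, connection resets
            logging.error("failed %s: %s", url, err)
            failed.append(url)
    return pages, failed
```

The returned `failed` list is the record of what is missing, so a later pass can retry only those URLs.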
20:54
|
SketchCow |
https://vine.co/v/hMLVA1emhej |
20:54
|
SketchCow |
Someone save that |
20:54
|
SketchCow |
That's going to disappear. |
20:56
|
omf_ |
got it |
20:56
|
balrog |
o.o |
20:56
|
omf_ |
óò |
20:58
|
SketchCow |
http://www.buzzfeed.com/mikehayes/this-terrifying-vine-shows-the-exact-moment-a-truck-flies-ov |
20:58
|
SketchCow |
(He survived) |
22:18
|
Asparagir |
One of the newly uploaded Prelinger films is basically a mid-century modern animated version of the IPV4 --> IPV6 upgrades: https://archive.org/details/6317_Mr_Digit_and_the_Battle_of_Bubbling_Brook_01_15_17_02 |
22:18
|
Asparagir |
Really cute, and definitely quotable. Letters! They're hemming us in! |