Time |
Nickname |
Message |
00:12
🔗
|
chronomex |
strange surge in posterous grabbing rate this hour: http://zeppelin.xrtc.net/corp.xrtc.net/shilling.corp.xrtc.net/project_items.html |
00:19
🔗
|
chronomex |
alard: if you'd like to delegate a second dba, I'd love to have two people in charge :) |
00:19
🔗
|
chronomex |
alternately, document it on the wiki I guess |
00:34
🔗
|
chronomex |
http://archiveteam.org/index.php?title=Tracker |
01:35
🔗
|
SketchCow |
CHAT DONE |
01:35
🔗
|
SketchCow |
Gave the talk |
01:35
🔗
|
SketchCow |
Spoiler alert, I yelled |
01:49
🔗
|
omf_ |
Did they video it |
02:02
🔗
|
SketchCow |
Yepo |
02:02
🔗
|
SketchCow |
Up next week |
02:21
🔗
|
chronomex |
I see some spammers are coming in to the wiki again |
02:21
🔗
|
chronomex |
e.g. http://archiveteam.org/index.php?title=User:JPSMavis |
02:30
🔗
|
omf_ |
SketchCow, did webstock do full video recordings? Looking through the past talks coverage looks spotty |
02:30
🔗
|
SketchCow |
Yes |
03:24
🔗
|
balrog |
can someone explain this? http://web.archive.org/web/*/http://www.2ndwave.com/details.asp?ProductID=55 |
03:24
🔗
|
balrog |
I click on it |
03:24
🔗
|
balrog |
says "not in IA" :( |
03:49
🔗
|
balrog |
it seems all the captures from http://web.archive.org/web/20070624184153/http://www.2ndwave.com/ are gone :/ |
03:49
🔗
|
balrog |
was there some sort of data loss? |
14:44
🔗
|
ats |
more scanned magazines: http://www.digitpress.com/library/magazines/ |
15:03
🔗
|
Smiley |
https://twitter.com/hukl/status/307469987826761729 .... |
15:03
🔗
|
BlueMax |
holy shit that better not be true |
15:30
🔗
|
underscor |
Smiley: Isn't that just sudo password caching? |
15:30
🔗
|
Smiley |
underscor: kind of |
15:31
🔗
|
Smiley |
sudo checks the time of the cache against epoch |
15:31
🔗
|
Smiley |
.... |
15:31
🔗
|
Smiley |
so if you're at epoch |
15:31
🔗
|
Smiley |
blam, straight into root |
15:31
🔗
|
underscor |
Have you tried that tweet? |
15:32
🔗
|
underscor |
It doesn't make sense to me that setting yourself near the epoch would give you instant root |
15:32
🔗
|
Smiley |
hmmm kind of |
15:32
🔗
|
Smiley |
but stupidly I did it on my desktop |
15:32
🔗
|
underscor |
(based on what I know of the sudo codebase) |
15:32
🔗
|
Smiley |
now I can do it clearly, let me try again without exploding all the ssl certs etc |
15:32
🔗
|
Smiley |
Oh wait |
15:32
🔗
|
Smiley |
yeah it's not a bug in sudo |
15:32
🔗
|
Smiley |
it's a bug in the fact X is running as root |
15:33
🔗
|
Smiley |
and KDM/Gnome lets you set the date... (as root). |
15:37
🔗
|
Smiley |
(as far as I understand). |
15:41
🔗
|
grawity |
hmm, the default polkit policy in GNOME requires authenticating as root, unless the user is already in the 'wheel' group |
15:41
🔗
|
grawity |
(in which case the user is already allowed to become root via pkexec) |
15:54
🔗
|
omf_ |
Has anyone tried using Heritrix instead of wget on sites that wget fails out on? |
17:18
🔗
|
sep332 |
how is the project switching IPs so often now? |
17:20
🔗
|
soultcer |
Amazon EC2 |
17:20
🔗
|
soultcer |
Check out #preposterus for more discussion on it btw |
17:24
🔗
|
sep332 |
makes sense, since they bill by the hour and only work for an hour |
17:25
🔗
|
titanous |
Hey, I just downloaded the warrior VM, and I can't get past the settings screen |
17:26
🔗
|
titanous |
the JS console says "INVALID_SETTINGS" |
17:26
🔗
|
soultcer |
titanous: The tracker server is currently overloaded, it can't send you a list of projects |
17:26
🔗
|
titanous |
ah, k, I'll wait for things to settle down then |
17:28
🔗
|
titanous |
looks like it's working now, is there anything special I should do about the Posterous ban? |
17:29
🔗
|
soultcer |
titanous: Posterous runs a cronjob that will ban you exactly 50 minutes after the full hour |
17:29
🔗
|
soultcer |
So it's best to start at XX:50 so you can do a full hour |
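The timing advice above can be turned into a small scheduling helper. This is a minimal sketch, assuming the ban job really fires at minute 50 of each hour as soultcer describes; the function name `seconds_until_start` and its `start_minute` parameter are made up for illustration and are not part of any Archive Team script.

```python
from datetime import datetime, timedelta


def seconds_until_start(now: datetime, start_minute: int = 50) -> float:
    """Return the seconds to wait until the next HH:start_minute:00.

    start_minute=50 comes from the chat above (the ban cronjob is said to
    run 50 minutes past the hour); it is an assumption, not a measured value.
    """
    target = now.replace(minute=start_minute, second=0, microsecond=0)
    if target < now:
        # This hour's slot has already passed, so aim for the next hour.
        target += timedelta(hours=1)
    return (target - now).total_seconds()
```

A wrapper script could sleep for the returned number of seconds before launching a grab, so that the run covers as much of the hour as possible before the next ban sweep.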
17:29
🔗
|
titanous |
k, got it |
17:29
🔗
|
titanous |
how long does the ban last? |
17:29
🔗
|
soultcer |
About a day or two |
18:05
🔗
|
ersi |
= Please join #preposterus - if you're here regarding Posterous closing down. = |
18:58
🔗
|
alard |
SketchCo2: Uploading of what? |
18:59
🔗
|
alard |
chronomex: I'm all for sharing the management access. |
19:44
🔗
|
SketchCo2 |
Posterous |
19:46
🔗
|
SketchCow |
I'm in conversation with the lead engineer of Twitter. |
19:46
🔗
|
SketchCow |
He may ask for IP ranges for ban amnesty. |
19:46
🔗
|
SketchCow |
P.S. Posterous is hosted on Rackspace |
19:46
🔗
|
SketchCow |
Twitter knew they were going to shut it down the day they bought it, so they didn't bother porting it to Amazon. |
19:50
🔗
|
eadler |
SketchCow: can they just provide a data-dump? |
20:09
🔗
|
sep332 |
that means, if you host your server at rackspace, and get Posterous's local-network IP addresses, they wouldn't have to pay bandwidth costs for the transfer |
20:11
🔗
|
alard |
Is there someone who wants to finish Punchfork? The users that are left on the tracker have been tried a few times before, so they're probably large or have some other problem. |
20:11
🔗
|
Smiley |
can i switch my warrior remotely |
20:11
🔗
|
Smiley |
i'll happily leave it running all weekend |
20:11
🔗
|
alard |
If you can access it, you can. |
20:12
🔗
|
Smiley |
yah, ssh |
20:12
🔗
|
sep332 |
i have one warrior running on punchfork. |
20:12
🔗
|
Smiley |
I just don't know the command. |
20:12
🔗
|
sep332 |
it has two jobs over 1,000 items and they look stuck |
20:13
🔗
|
alard |
Smiley: Tunnel the http connection (to port 8001), or use some curl thing (let me look that up). |
20:14
🔗
|
Smiley |
i need the curl really, i'm not on my linux system atm to tunnel easily |
20:14
🔗
|
alard |
curl -d "project_name=punchfork" http://localhost:8001/api/select-project |
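For anyone without curl to hand, the same request can be built with Python's standard library. This is only a sketch: the endpoint path and the `project_name` field are taken from alard's command above, while the helper name `build_select_project_request` and the `base_url` default are illustrative assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_project_request(project_name: str,
                                 base_url: str = "http://localhost:8001") -> Request:
    """Build the same form-encoded POST that the curl command sends."""
    body = urlencode({"project_name": project_name}).encode("ascii")
    # Supplying a body makes urllib issue a POST, matching `curl -d`.
    return Request(base_url + "/api/select-project", data=body)
```

Passing the result to `urllib.request.urlopen` on the machine running the warrior (or through a tunnel to its port 8001) should send the request; what the warrior replies with is not shown in the chat.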
20:15
🔗
|
Smiley |
crap, what's the IP of my warrior :S |
20:16
🔗
|
Smiley |
oh yeah, don't need it |
20:16
🔗
|
Smiley |
urgh idiot moments for me. Thanks alard |
20:16
🔗
|
titanous |
punchfork is showing "No item received" for me |
20:17
🔗
|
Smiley |
I can't see it here :D |
20:17
🔗
|
Smiley |
I will be able to later... |
20:22
🔗
|
sep332 |
i have a stuck punchfork job :( |
20:22
🔗
|
sep332 |
urbanglobetrotterblog.com stopped after 1420 items |
20:22
🔗
|
sep332 |
*"URLs" |
20:22
🔗
|
Smiley |
some huge users |
20:22
🔗
|
Smiley |
2gb. |
20:23
🔗
|
alard |
sep332: Yes. That's why I don't really want to put these items back into the queue. If they're still not done they're likely to be difficult. |
20:24
🔗
|
Smiley |
:/ |
20:24
🔗
|
ersi |
I'll start up one of my larger work horses |
20:27
🔗
|
sep332 |
makes sense. difficult just meaning big, or requiring extra fiddling? |
20:28
🔗
|
ersi |
Both, probably - and unfortunately |
20:28
🔗
|
sep332 |
oh hey it went to 1440, guess it's not totally stuck |
20:30
🔗
|
ersi |
alard: I'm up for some punchfork. could you release just a few? like 5-10? |
20:37
🔗
|
soultcer |
sep332: Re Setting up a server at rackspace: Incoming traffic is free on Amazon EC2. The expensive part is outgoing traffic caused by uploading the warc.gz file to the internet archive |
20:37
🔗
|
soultcer |
I actually ran posterous-grab at rackspace - The 0.5 ms ping to posterous was nice, but outgoing traffic is actually more expensive than when using Amazon |
20:37
🔗
|
soultcer |
And instances are more expensive as well |
20:38
🔗
|
sep332 |
true about outgoing traffic. but if twitter's not paying for bandwidth, maybe they'd cooperate more... ok maybe not :) |
20:39
🔗
|
sep332 |
also if it's much faster, you might need the servers for less time so it could be cheaper. i have no idea about the math for that though. |
20:41
🔗
|
Smiley |
if time @ amazon + amazon b/w < time @rackspace + rackspace bw.... |
20:41
🔗
|
Smiley |
So if money saved from less time < money saved due to price diff @ amazon |
20:41
🔗
|
soultcer |
Amazon Spot instances are 10 times cheaper than rackspace regular instances |
20:42
🔗
|
soultcer |
Amazon traffic out is $0.12/gb, rackspace I think $0.16/gb |
20:42
🔗
|
Smiley |
so it'd need to be like 30x faster or something |
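The back-of-the-envelope comparison above can be written down as a tiny cost model. This is a rough sketch, not anyone's actual billing math: the per-GB prices default to the figures soultcer quotes ($0.12 for Amazon, and $0.16 for Rackspace, which he was unsure of), while the hourly instance prices, run length, and upload volume are inputs you have to supply yourself. The function name `breakeven_speedup` is made up for illustration.

```python
from typing import Optional


def breakeven_speedup(aws_hourly: float, rs_hourly: float,
                      aws_hours: float, gb_out: float,
                      aws_gb_price: float = 0.12,
                      rs_gb_price: float = 0.16) -> Optional[float]:
    """Return how many times faster Rackspace must be to match the Amazon bill.

    Returns None when Rackspace traffic alone costs at least as much as the
    whole Amazon run, so no speedup could make it cheaper.
    """
    aws_total = aws_hours * aws_hourly + gb_out * aws_gb_price
    rs_traffic = gb_out * rs_gb_price
    # Money left for Rackspace instance time once its traffic is paid for.
    rs_instance_budget = aws_total - rs_traffic
    if rs_instance_budget <= 0:
        return None
    rs_hours_allowed = rs_instance_budget / rs_hourly
    return aws_hours / rs_hours_allowed
```

With a 10x instance price gap and no outgoing traffic at all, the break-even speedup is exactly 10x; every uploaded gigabyte pushes it higher because of the per-GB price difference, which is the effect Smiley is gesturing at.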
20:42
🔗
|
soultcer |
While the scripts and dynamic pages are on rackspace, static assets like images are actually on Amazon S3 |
20:56
🔗
|
ersi |
It'd be great to know people inside Amazon and Rackspace |
21:14
🔗
|
alard |
ersi, Smiley: I've added the remaining punchfork tasks to your queues. |
21:16
🔗
|
Smiley |
thanks |
21:16
🔗
|
Smiley |
it should keep crunching away quite happily, I gave my warrior extra ram at some point in the past, not sure if it still has it tho |
21:16
🔗
|
Smiley |
alard: got the command to do the port forward handy? |
21:17
🔗
|
alard |
Smiley: Isn't that something with -L ? ssh -L LOCAL_PORT:127.0.0.1:8001 |
21:18
🔗
|
Smiley |
yeah, sounds about right |
21:18
🔗
|
* |
Smiley checks his logs |
21:18
🔗
|
Smiley |
478 ssh -i ./.ssh/amazonkey.pem -L 8002:localhost:8001 admin@ec2-184-72-85-21.compute-1.amazonaws.com |
21:18
🔗
|
Smiley |
there we go, thats basically it |
21:20
🔗
|
Smiley |
ssh -L 8001:localhost:8001 tim.bowers@10.2.1.134 -f -N |
21:20
🔗
|
Smiley |
that's backgrounded |
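The backgrounded tunnel above can also be assembled from a script. A small sketch, assuming OpenSSH's `ssh` client is on the PATH; the helper name `tunnel_command` and the `user@warrior-host` placeholder in the usage note are hypothetical, and the default ports mirror the 8001 used in the chat.

```python
from typing import List


def tunnel_command(user_host: str, local_port: int = 8001,
                   remote_port: int = 8001) -> List[str]:
    """Build an ssh command forwarding local_port to remote_port on the remote host."""
    return [
        "ssh",
        "-f",  # go to the background after authentication
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local_port}:localhost:{remote_port}",
        user_host,
    ]
```

Running `subprocess.run(tunnel_command("user@warrior-host"), check=True)` would then make the warrior's web interface reachable at `http://localhost:8001/` on the local machine, provided the login succeeds.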
21:20
🔗
|
Smiley |
- Downloaded: 13440 URLs. |
21:20
🔗
|
Smiley |
Starting WgetDownload for Item user-Taylor_Lynn - Downloaded: 11040 URLs. |
21:20
🔗
|
Smiley |
yup, all big users |
21:20
🔗
|
Smiley |
grabbing at ~400Kbs+ |
22:47
🔗
|
ersi |
alard: Doesn't seem to pick 'em up |
22:49
🔗
|
ersi |
alard: I'm running punchfork-grab stand-alone, if that matters |
23:04
🔗
|
Smiley |
Guys.... |
23:04
🔗
|
Smiley |
Who is alive who understands punchfork? |
23:04
🔗
|
Smiley |
http://pastebin.com/rXvRgixt - it blew up when zipping. |
23:05
🔗
|
Smiley |
I have one currently at - Downloaded: 34050 URLs. too |
23:07
🔗
|
S[h]O[r]T |
ersi i think he returned them to just Smiley |
23:07
🔗
|
Smiley |
he gave them to us both |
23:07
🔗
|
S[h]O[r]T |
oh |
23:07
🔗
|
Smiley |
not sure if we got both each or what |
23:09
🔗
|
ersi |
media-cdn1.pinterest.com doesn't exist |
23:10
🔗
|
S[h]O[r]T |
github is being super dumb.. i'm looking at your pastebin smiley.. |
23:10
🔗
|
ersi |
guess we should try: wrap that bitch for socket.gaierrors |
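What ersi suggests amounts to catching the DNS failure instead of letting it abort the job. The chat does not show the relevant lines of pipeline.py, so the following is only a generic sketch of the pattern, not the project's fix; `resolve_or_none` and its injectable `resolver` parameter are hypothetical names introduced here.

```python
import socket
from typing import Callable, Optional


def resolve_or_none(host: str, port: int = 80,
                    resolver: Callable = socket.getaddrinfo) -> Optional[list]:
    """Resolve host, returning None instead of raising when the DNS lookup fails."""
    try:
        return resolver(host, port)
    except socket.gaierror:
        # Raised by getaddrinfo when the name does not resolve.
        return None
```

A caller could skip or log hosts that come back as `None` (such as a CDN hostname that no longer exists) and carry on with the rest of the item.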
23:10
🔗
|
Smiley |
S[h]O[r]T: o_O |
23:10
🔗
|
S[h]O[r]T |
and ^^. i see the connection failure at the end but also question if it even downloaded any data for that user? |
23:10
🔗
|
S[h]O[r]T |
just a guess from looking at those lines in pipeline.py, probably wrong |
23:11
🔗
|
Smiley |
dunno :/ |
23:12
🔗
|
Smiley |
i don't have ssh access directly to the warrior atm |
23:12
🔗
|
Smiley |
only the web interface |
23:12
🔗
|
Smiley |
it's been doing up to 800Kb/s |
23:13
🔗
|
Smiley |
so it's doing "something" :/ |