Time |
Nickname |
Message |
02:35
|
Somebody2 |
Fixing tracker again; minu-me got stuck |
02:44
|
Somebody2 |
Turning off the queue for minu-me for a bit; let's see if it drains or not |
03:09
|
Somebody2 |
down to 40 rather than 70... we'll see how it goes |
03:55
|
Somebody2 |
ok, the queue finally drained; now re-doing the 7 I removed to clear the blockage |
03:55
|
Somebody2 |
we'll see if any of them consistently fail |
03:56
|
Somebody2 |
nope, they all went through very nicely |
03:56
|
Somebody2 |
strange |
03:57
|
Somebody2 |
ok, turning the queue back on at 60 this time |
04:17
|
|
Sk1d has joined #urlteam |
04:18
|
|
jornane has quit IRC (Ping timeout: 268 seconds) |
04:22
|
|
jornane has joined #urlteam |
04:26
|
Somebody2 |
lower queue to 40 |
05:48
|
Somebody2 |
reducing queue even further |
05:49
|
Somebody2 |
increase time between requests to 0.8 |
05:59
|
Somebody2 |
Frogging: very good point |
05:59
|
Somebody2 |
What do we have that handles resuming better? Warrior jobs? |
05:59
|
Frogging |
yes, something more custom |
06:00
|
Frogging |
warrior jobs don't have the resuming problem because they're split up into units anyway |
06:00
|
Frogging |
but throwing recursive crawlers at something like metafilter is asking for trouble, especially if it's archivebot |
06:01
|
Somebody2 |
If we're doing this on request of the site owner, I'm not sure why they can't do it themselves... |
06:01
|
Somebody2 |
Or mail IA a hard drive of the data, and we can convert it into web pages locally at our leisure. |
06:02
|
Frogging |
I don't think they'd get into wayback that way |
06:09
|
Somebody2 |
They could get into wayback eventually... |
06:10
|
Somebody2 |
If the site owner sent us the full data, we could spin up a virtual machine that thought it was the original, talk to it with a browser, and thereby get the same as the real site, but without the network delay (and cost) |
06:10
|
Somebody2 |
But I'm pretty sure IA isn't (yet) set up to do that. |
06:11
|
Somebody2 |
Anyway, this isn't actually #urlteam -- we both just mistakenly switched channels halfway through this conversation! |
06:23
|
Somebody2 |
turning off the queue on minu-me for the evening, to avoid it getting the tracker stuck |
06:38
|
Somebody2 |
trying the queue at 5 items, with 10 urls in each |
06:41
|
Somebody2 |
OK, that seems to be working; I'll leave that for the night |
09:42
|
|
Jonison has joined #urlteam |
10:34
|
|
JAA has joined #urlteam |
13:09
|
|
JAA has quit IRC (Quit: Page closed) |
15:50
|
chfoo |
i'll merge that pull request later today |
21:28
|
|
Jonison has quit IRC (Read error: Connection reset by peer) |