* runsv and EAGAIN
@ 2008-07-11 20:46 Charlie Brady
2008-07-12 1:27 ` Charlie Brady
0 siblings, 1 reply; 7+ messages in thread
From: Charlie Brady @ 2008-07-11 20:46 UTC (permalink / raw)
To: supervision
I'm just putting together a new sytem using CentOS 5.2 (i686 -
2.6.18-92.el5PAE), and have runsv chewing up lots of CPU:
...
1202 root 25 0 108 28 20 R 10.2 0.0 3:13.97 runsv
1300 root 25 0 112 36 20 R 10.2 0.0 3:17.01 runsv
1306 root 25 0 108 28 20 R 10.2 0.0 3:18.52 runsv
1322 root 25 0 112 36 20 R 10.2 0.0 3:16.97 runsv
1203 root 25 0 108 32 20 R 9.9 0.0 3:19.21 runsv
1228 root 25 0 112 36 20 R 9.9 0.0 3:21.39 runsv
1271 root 25 0 112 32 20 R 9.9 0.0 3:24.37 runsv
1200 root 25 0 112 32 20 R 9.6 0.0 3:23.17 runsv
...
Here's a sample strace:
rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN},
{fd=11, events=POLLIN}], 3, 1000020) = 1
rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
unavailable)
waitpid(-1, 0xbfa256bc, WNOHANG) = 0
read(9, "", 1) = 0
read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
unavailable)
gettimeofday({1215792640, 591768}, NULL) = 0
rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN},
{fd=11, events=POLLIN}], 3, 1000020) = 1
rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
unavailable)
waitpid(-1, 0xbfa256bc, WNOHANG) = 0
read(9, "", 1) = 0
read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
unavailable)
gettimeofday({1215792640, 593313}, NULL) = 0
...
I think this might be a case of a kernel being "this dumb" - iopause.c:
...
/* XXX: some kernels apparently need x[0] even if len is 0 */
/* XXX: how to handle EAGAIN? are kernels really this dumb? */
/* XXX: how to handle EINVAL? when exactly can this happen? */
...
...
ERRORS
The poll() function shall fail if:
EAGAIN The allocation of internal data structures failed but a
subsequent request may succeed.
...
The system isn't obviously out of memory:
[root@amc-upgrade:]# free
total used free shared buffers cached
Mem: 255608 213396 42212 0 29240 109132
-/+ buffers/cache: 75024 180584
Swap: 265064 0 265064
~
[root@amc-upgrade:]#
[root@amc-upgrade:]# cat /proc/meminfo
MemTotal: 255608 kB
MemFree: 47580 kB
Buffers: 29276 kB
Cached: 109176 kB
SwapCached: 0 kB
Active: 112132 kB
Inactive: 75084 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 255608 kB
LowFree: 47580 kB
SwapTotal: 265064 kB
SwapFree: 265064 kB
Dirty: 2176 kB
Writeback: 0 kB
AnonPages: 48780 kB
Mapped: 9236 kB
Slab: 15600 kB
PageTables: 2372 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 392868 kB
Committed_AS: 206484 kB
VmallocTotal: 772088 kB
VmallocUsed: 2284 kB
VmallocChunk: 769652 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
~
[root@amc-upgrade:]#
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: runsv and EAGAIN
2008-07-11 20:46 runsv and EAGAIN Charlie Brady
@ 2008-07-12 1:27 ` Charlie Brady
2008-07-12 16:07 ` Dražen Kačar
0 siblings, 1 reply; 7+ messages in thread
From: Charlie Brady @ 2008-07-12 1:27 UTC (permalink / raw)
To: supervision
On Fri, 11 Jul 2008, Charlie Brady wrote:
> Here's a sample strace:
>
> rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN}, {fd=11,
> events=POLLIN}], 3, 1000020) = 1
> rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
> unavailable)
> waitpid(-1, 0xbfa256bc, WNOHANG) = 0
> read(9, "", 1) = 0
> read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
> unavailable)
> gettimeofday({1215792640, 591768}, NULL) = 0
...
> ...
> ERRORS
> The poll() function shall fail if:
>
> EAGAIN The allocation of internal data structures failed but a
> subsequent request may succeed.
>
> ...
Sorry - this, from "man 3 read" is the relevant manpage section:
...
ERRORS
EAGAIN Non-blocking I/O has been selected using O_NONBLOCK and no
data was immediately available for reading.
...
So poll i saying that data is available, and read is saying that it isn't.
Is anyone else confused?
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: runsv and EAGAIN
2008-07-12 1:27 ` Charlie Brady
@ 2008-07-12 16:07 ` Dražen Kačar
2008-07-13 2:46 ` Charlie Brady
0 siblings, 1 reply; 7+ messages in thread
From: Dražen Kačar @ 2008-07-12 16:07 UTC (permalink / raw)
To: supervision
Charlie Brady wrote:
> >Here's a sample strace:
> >
> >rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
> >rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
> >poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN},
> >{fd=11, events=POLLIN}], 3, 1000020) = 1
You only have POLLIN in revents for fd 9. Also, the return value is 1,
meaning that only one file descriptor has readable data.
> >rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
> >rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
> >read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
> >unavailable)
> >waitpid(-1, 0xbfa256bc, WNOHANG) = 0
> >read(9, "", 1) = 0
> >read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
> >unavailable)
> >gettimeofday({1215792640, 591768}, NULL) = 0
> ...
> So poll i saying that data is available, and read is saying that it isn't.
> Is anyone else confused?
poll() is saying that the data is available on fd 9 and read() on fd 9
does not return an error. The other two file descriptors should have not
been read.
--
.-. .-. Yes, I am an agent of Satan, but my duties are largely
(_ \ / _) ceremonial.
|
| dave@fly.srk.fer.hr
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: runsv and EAGAIN
2008-07-12 16:07 ` Dražen Kačar
@ 2008-07-13 2:46 ` Charlie Brady
2009-01-20 23:21 ` runsv spinning 100% CPU (was Re: runsv and EAGAIN) Charlie Brady
0 siblings, 1 reply; 7+ messages in thread
From: Charlie Brady @ 2008-07-13 2:46 UTC (permalink / raw)
To: Dražen Kačar; +Cc: supervision
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed, Size: 3209 bytes --]
On Sat, 12 Jul 2008, Dra¾en Kaèar wrote:
> Charlie Brady wrote:
>>> Here's a sample strace:
>>>
>>> rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
>>> rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
>>> poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN},
>>> {fd=11, events=POLLIN}], 3, 1000020) = 1
>
> You only have POLLIN in revents for fd 9. Also, the return value is 1,
> meaning that only one file descriptor has readable data.
>
>>> rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
>>> rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
>>> read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
>>> unavailable)
>>> waitpid(-1, 0xbfa256bc, WNOHANG) = 0
>>> read(9, "", 1) = 0
>>> read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource temporarily
>>> unavailable)
>>> gettimeofday({1215792640, 591768}, NULL) = 0
>> ...
>
>> So poll i saying that data is available, and read is saying that it isn't.
>> Is anyone else confused?
>
> poll() is saying that the data is available on fd 9 and read() on fd 9
> does not return an error. The other two file descriptors should have not
> been read.
Thanks. I didn't read the strace at all carefully, did I?
I see in iopause() that poll() is called without checking the return
status.
fd 3 is runsv's selfpipe. fd 9 is the control fifo and fd 11 is the log
control fifo. Poll tells us that fd 9 had data to read, but then read
returned 0.
I'd better check (on Monday) to see whether I somehow have multiple runsv
processes in the same directory...
Here's the relevant part of runsv.c:
...
sig_unblock(sig_term);
sig_unblock(sig_child);
iopause(x, 2 +haslog, &deadline, &now);
sig_block(sig_term);
sig_block(sig_child);
while (read(selfpipe[0], &ch, 1) == 1)
;
for (;;) {
int child;
int wstat;
child =wait_nohang(&wstat);
if (!child) break;
if ((child == -1) && (errno != error_intr)) break;
if (child == svd[0].pid) {
svd[0].pid =0;
pidchanged =1;
svd[0].wstat =wstat;
svd[0].ctrl &=~C_TERM;
if (svd[0].state != S_FINISH)
if ((fd =open_read("finish")) != -1) {
close(fd);
svd[0].state =S_FINISH;
update_status(&svd[0]);
continue;
}
svd[0].state =S_DOWN;
taia_uint(&deadline, 1);
taia_add(&deadline, &svd[0].start, &deadline);
taia_now(&svd[0].start);
update_status(&svd[0]);
if (taia_less(&svd[0].start, &deadline)) sleep(1);
}
if (haslog) {
if (child == svd[1].pid) {
svd[1].pid =0;
pidchanged =1;
svd[1].state =S_DOWN;
svd[1].ctrl &=~C_TERM;
taia_uint(&deadline, 1);
taia_add(&deadline, &svd[1].start, &deadline);
taia_now(&svd[1].start);
update_status(&svd[1]);
if (taia_less(&svd[1].start, &deadline)) sleep(1);
}
}
}
if (read(svd[0].fdcontrol, &ch, 1) == 1) ctrl(&svd[0], ch);
if (haslog)
if (read(svd[1].fdcontrol, &ch, 1) == 1) ctrl(&svd[1], ch);
...
^ permalink raw reply [flat|nested] 7+ messages in thread
* runsv spinning 100% CPU (was Re: runsv and EAGAIN)
2008-07-13 2:46 ` Charlie Brady
@ 2009-01-20 23:21 ` Charlie Brady
2009-02-10 12:41 ` Gerrit Pape
0 siblings, 1 reply; 7+ messages in thread
From: Charlie Brady @ 2009-01-20 23:21 UTC (permalink / raw)
To: supervision
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: TEXT/PLAIN; charset=X-UNKNOWN; format=flowed, Size: 4264 bytes --]
I've finally found time to look at this issue again, and have found a way
to replicate the issue at will.
To replicate, create a service directory containing an empty file
supervise/control in place of the fifo. Create a symlink so that runsvdir
spawns a new runsv process. runsv will then behave as shown in the strace
plus commentary shown below.
The unexpected behaviour comes from a combination of two factors -
supervise/control is a file and not a fifo, and linux poll() sets POLLIN
for a regular file at EOF:
http://www.greenend.org.uk/rjk/2001/06/poll.html
IMO this is an error condition which runsv should handle more gracefully.
The question is - how should the error be handled or corrected? One way
would be for runsv to replace any existing supervise/control at startup
unless it is a fifo.
On Sat, 12 Jul 2008, Charlie Brady wrote:
> On Sat, 12 Jul 2008, Dra¾en Kaèar wrote:
>
>> Charlie Brady wrote:
>> > > Here's a sample strace:
>> > >
>> > > rt_sigprocmask(SIG_UNBLOCK, [TERM], NULL, 8) = 0
>> > > rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
>> > > poll([{fd=3, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN},
>> > > {fd=11, events=POLLIN}], 3, 1000020) = 1
>>
>> You only have POLLIN in revents for fd 9. Also, the return value is 1,
>> meaning that only one file descriptor has readable data.
>>
>> > > rt_sigprocmask(SIG_BLOCK, [TERM], NULL, 8) = 0
>> > > rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
>> > > read(3, 0xbfa256c3, 1) = -1 EAGAIN (Resource
>> > > temporarily
>> > > unavailable)
>> > > waitpid(-1, 0xbfa256bc, WNOHANG) = 0
>> > > read(9, "", 1) = 0
>> > > read(11, 0xbfa256c3, 1) = -1 EAGAIN (Resource
>> > > temporarily
>> > > unavailable)
>> > > gettimeofday({1215792640, 591768}, NULL) = 0
>> > ...
>>
>> > So poll i saying that data is available, and read is saying that it
>> > isn't.
>> > Is anyone else confused?
>>
>> poll() is saying that the data is available on fd 9 and read() on fd 9
>> does not return an error. The other two file descriptors should have not
>> been read.
>
> Thanks. I didn't read the strace at all carefully, did I?
>
> I see in iopause() that poll() is called without checking the return status.
>
> fd 3 is runsv's selfpipe. fd 9 is the control fifo and fd 11 is the log
> control fifo. Poll tells us that fd 9 had data to read, but then read
> returned 0.
>
> I'd better check (on Monday) to see whether I somehow have multiple runsv
> processes in the same directory...
>
> Here's the relevant part of runsv.c:
>
> ...
> sig_unblock(sig_term);
> sig_unblock(sig_child);
> iopause(x, 2 +haslog, &deadline, &now);
> sig_block(sig_term);
> sig_block(sig_child);
>
> while (read(selfpipe[0], &ch, 1) == 1)
> ;
> for (;;) {
> int child;
> int wstat;
>
> child =wait_nohang(&wstat);
> if (!child) break;
> if ((child == -1) && (errno != error_intr)) break;
> if (child == svd[0].pid) {
> svd[0].pid =0;
> pidchanged =1;
> svd[0].wstat =wstat;
> svd[0].ctrl &=~C_TERM;
> if (svd[0].state != S_FINISH)
> if ((fd =open_read("finish")) != -1) {
> close(fd);
> svd[0].state =S_FINISH;
> update_status(&svd[0]);
> continue;
> }
> svd[0].state =S_DOWN;
> taia_uint(&deadline, 1);
> taia_add(&deadline, &svd[0].start, &deadline);
> taia_now(&svd[0].start);
> update_status(&svd[0]);
> if (taia_less(&svd[0].start, &deadline)) sleep(1);
> }
> if (haslog) {
> if (child == svd[1].pid) {
> svd[1].pid =0;
> pidchanged =1;
> svd[1].state =S_DOWN;
> svd[1].ctrl &=~C_TERM;
> taia_uint(&deadline, 1);
> taia_add(&deadline, &svd[1].start, &deadline);
> taia_now(&svd[1].start);
> update_status(&svd[1]);
> if (taia_less(&svd[1].start, &deadline)) sleep(1);
> }
> }
> }
> if (read(svd[0].fdcontrol, &ch, 1) == 1) ctrl(&svd[0], ch);
> if (haslog)
> if (read(svd[1].fdcontrol, &ch, 1) == 1) ctrl(&svd[1], ch);
>
> ...
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: runsv spinning 100% CPU (was Re: runsv and EAGAIN)
2009-01-20 23:21 ` runsv spinning 100% CPU (was Re: runsv and EAGAIN) Charlie Brady
@ 2009-02-10 12:41 ` Gerrit Pape
2009-02-18 0:09 ` Laurent Bercot
0 siblings, 1 reply; 7+ messages in thread
From: Gerrit Pape @ 2009-02-10 12:41 UTC (permalink / raw)
To: supervision
Charlie Brady <charlieb-supervision <at> budge.apana.org.au> writes:
> I've finally found time to look at this issue again, and have found a way
> to replicate the issue at will.
>
> To replicate, create a service directory containing an empty file
> supervise/control in place of the fifo. Create a symlink so that runsvdir
> spawns a new runsv process. runsv will then behave as shown in the strace
> plus commentary shown below.
>
> The unexpected behaviour comes from a combination of two factors -
> supervise/control is a file and not a fifo, and linux poll() sets POLLIN
> for a regular file at EOF:
>
> http://www.greenend.org.uk/rjk/2001/06/poll.html
Yes, I see.
> IMO this is an error condition which runsv should handle more gracefully.
> The question is - how should the error be handled or corrected? One way
> would be for runsv to replace any existing supervise/control at startup
> unless it is a fifo.
Another option would be to have runsv print out an error message like
'supervise/control exists but is not a fifo.' and exit with 111. This
may be better than forcefully removing a file.
Regards, Gerrit.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: runsv spinning 100% CPU (was Re: runsv and EAGAIN)
2009-02-10 12:41 ` Gerrit Pape
@ 2009-02-18 0:09 ` Laurent Bercot
0 siblings, 0 replies; 7+ messages in thread
From: Laurent Bercot @ 2009-02-18 0:09 UTC (permalink / raw)
To: supervision
> Another option would be to have runsv print out an error message like
> 'supervise/control exists but is not a fifo.' and exit with 111. This
> may be better than forcefully removing a file.
I think this is the wiser choice. When in doubt, exit with an error
message and let the administrator handle it.
However, I'd exit 100, not 111, because the problem is structural,
not temporary: it's not that some resource is unavailable, it's that
runsv is used incorrectly (i.e. supervise/ has been tampered with).
--
Laurent
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-02-18 0:09 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-11 20:46 runsv and EAGAIN Charlie Brady
2008-07-12 1:27 ` Charlie Brady
2008-07-12 16:07 ` Dražen Kačar
2008-07-13 2:46 ` Charlie Brady
2009-01-20 23:21 ` runsv spinning 100% CPU (was Re: runsv and EAGAIN) Charlie Brady
2009-02-10 12:41 ` Gerrit Pape
2009-02-18 0:09 ` Laurent Bercot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).