* misleading message for SIGFPE
From: Vincent Lefevre @ 2024-09-24 12:36 UTC
To: zsh-workers
When a command is terminated by SIGFPE, I get a message saying
"floating point exception" (this comes from Src/signames.c):
qaa% sh -c 'kill -FPE $$'
zsh: floating point exception (core dumped) sh -c 'kill -FPE $$'
However, a SIGFPE may also be generated by integer operations
(such as 1 / 0).
ISO C and POSIX use the term "erroneous arithmetic operation".
The GNU C Library manual says "fatal arithmetic error".
BTW, in addition to the signal description, I would suggest also
printing the signal name, e.g.
SIGFPE - erroneous arithmetic operation (core dumped)
SIGSEGV - segmentation fault (core dumped)
SIGKILL - killed
Otherwise it is not clear that the command termination is due to
a signal (even though the exit status can be checked / reported).
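As a rough sketch of the suggested format (not how zsh implements its job messages), a wrapper function can print the signal name next to a description when a command dies from a signal. The function name report_signal and the small description table are assumptions made for illustration; the lookup relies on the POSIX `kill -l exit_status` form. Whether a core was dumped cannot be seen from $?, so that part is left out.

```shell
# Hypothetical helper, not part of zsh: run a command and, if it was killed
# by a signal, print "SIGNAME - description" on stderr.
report_signal() {
    rc=0
    "$@" || rc=$?                 # capture the status before anything overwrites $?
    # Statuses above 128 conventionally mean "killed by signal rc - 128".
    if [ "$rc" -gt 128 ] && name=$(kill -l "$rc" 2>/dev/null); then
        case $name in
            FPE)  desc='erroneous arithmetic operation' ;;
            SEGV) desc='segmentation fault' ;;
            KILL) desc='killed' ;;
            *)    desc='terminated by signal' ;;   # illustrative fallback text
        esac
        printf 'SIG%s - %s\n' "$name" "$desc" >&2
    fi
    return "$rc"
}

# Example: the child shell sends SIGFPE to itself.
report_signal sh -c 'kill -s FPE $$' || echo "exit status: $?"
```

On a system where SIGFPE is signal 8, this should report "SIGFPE - erroneous arithmetic operation" on stderr, followed by "exit status: 136". A wrapper like this only sees the numeric status, so a command that calls exit 136 by itself would be reported the same way.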
--
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
* Re: misleading message for SIGFPE
From: Andreas Kähäri @ 2024-09-24 13:09 UTC
To: zsh-workers
On Tue, Sep 24, 2024 at 02:36:52PM +0200, Vincent Lefevre wrote:
> When a command is terminated by SIGFPE, I get a message saying
> "floating point exception" (this comes from Src/signames.c):
>
> qaa% sh -c 'kill -FPE $$'
> zsh: floating point exception (core dumped) sh -c 'kill -FPE $$'
>
> However, a SIGFPE may also be generated by integer operations
> (such as 1 / 0).
>
> ISO C and POSIX use the term "erroneous arithmetic operation".
> The GNU C Library manual says "fatal arithmetic error".
>
> BTW, in addition to the signal description, I would suggest also
> printing the signal name, e.g.
>
> SIGFPE - erroneous arithmetic operation (core dumped)
> SIGSEGV - segmentation fault (core dumped)
> SIGKILL - killed
>
> Otherwise it is not clear that the command termination is due to
> a signal (even though the exit status can be checked / reported).
Isn't it already clear from the value of $? what the signal was?
$ sh -c 'kill -FPE $$'
zsh: floating point exception sh -c 'kill -FPE $$'
$ echo $?
136
$ kill -l 136
FPE
--
Andreas (Kusalananda) Kähäri
Uppsala, Sweden
* Re: misleading message for SIGFPE
From: Bart Schaefer @ 2024-09-24 18:05 UTC
To: zsh-workers
On Tue, Sep 24, 2024 at 5:37 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> When a command is terminated by SIGFPE, I get a message saying
> "floating point exception" (this comes from Src/signames.c):
The "right way" to handle this would be to use the SIGFPE si_codes
from /usr/include/siginfo.h to break down the base signal into the
more-specific cases that cause it, but I'm not very familiar with
usage of siginfo or whether the parent process is able to obtain it
about the exiting child. Anyone?
There are several other signals (e.g., SEGV, POLL) that have added
details available via siginfo.
* Re: misleading message for SIGFPE
From: Bart Schaefer @ 2024-09-24 20:03 UTC
To: zsh-workers
On Tue, Sep 24, 2024 at 11:05 AM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> I'm not very familiar with
> usage of siginfo or whether the parent process is able to obtain it
Looks as if this is possible (man 5 siginfo). It appears we'd have to
switch to using waitid(2) for child reaping. However, there doesn't
appear to be a strsignal(3) equivalent of psiginfo(3) to grab the
error message rather than spew it on stderr.
OTOH if there's a reason we're not using strsignal() when it's
available, I've forgotten it.
* Re: misleading message for SIGFPE
From: Vincent Lefevre @ 2024-09-25 12:26 UTC
To: zsh-workers
On 2024-09-24 15:09:37 +0200, Andreas Kähäri wrote:
> On Tue, Sep 24, 2024 at 02:36:52PM +0200, Vincent Lefevre wrote:
> > When a command is terminated by SIGFPE, I get a message saying
> > "floating point exception" (this comes from Src/signames.c):
> >
> > qaa% sh -c 'kill -FPE $$'
> > zsh: floating point exception (core dumped) sh -c 'kill -FPE $$'
> >
> > However, a SIGFPE may also be generated by integer operations
> > (such as 1 / 0).
> >
> > ISO C and POSIX use the term "erroneous arithmetic operation".
> > The GNU C Library manual says "fatal arithmetic error".
> >
> > BTW, in addition to the signal description, I would suggest also
> > printing the signal name, e.g.
> >
> > SIGFPE - erroneous arithmetic operation (core dumped)
> > SIGSEGV - segmentation fault (core dumped)
> > SIGKILL - killed
> >
> > Otherwise it is not clear that the command termination is due to
> > a signal (even though the exit status can be checked / reported).
>
> Isn't it already clear from the value of $? what the signal was?
>
> $ sh -c 'kill -FPE $$'
> zsh: floating point exception sh -c 'kill -FPE $$'
> $ echo $?
> 136
>
> $ kill -l 136
> FPE
This requires another step. The idea would be to have it in the
error message. This could be useful in bug reports, like here:
https://github.com/pytorch/pytorch/issues/89817
You cannot go back in time to get the $? value.
* Re: misleading message for SIGFPE
From: Vincent Lefevre @ 2024-09-25 12:33 UTC
To: zsh-workers
On 2024-09-24 13:03:27 -0700, Bart Schaefer wrote:
> On Tue, Sep 24, 2024 at 11:05 AM Bart Schaefer
> <schaefer@brasslantern.com> wrote:
> >
> > I'm not very familiar with
> > usage of siginfo or whether the parent process is able to obtain it
>
> Looks as if this is possible (man 5 siginfo). It appears we'd have to
> switch to using waitid(2) for child reaping.
I don't think that would bring anything useful. You would get the
si_code describing the child's state change (i.e. the one that goes
with SIGCHLD)[*], not the one associated with the SIGFPE delivered
to the child.
[*] So, as the waitid(2) man page says:
si_code
Set to one of: CLD_EXITED (child called _exit(2)); CLD_KILLED
(child killed by signal); CLD_DUMPED (child killed by signal, and
dumped core); CLD_STOPPED (child stopped by signal); CLD_TRAPPED
(traced child has trapped); or CLD_CONTINUED (child continued by
SIGCONT).
* Re: misleading message for SIGFPE
From: Bart Schaefer @ 2024-09-25 18:35 UTC
To: zsh-workers
On Tue, Sep 24, 2024 at 1:03 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> OTOH if there's a reason we're not using strsignal() when it's
> available, I've forgotten it.
I compared the output of zsh using sig_msg[] from signames.c to the
output of calling strsignal(), on Ubuntu 20.04. For most signals
(including FPE) the difference is only whether the first character of
the error message is capitalized. The messages that differed are
below (zsh first, strsignal following). There is a USE_SUSPENDED
macro that determines the difference in the SIGT* signals. Perhaps we
avoid strsignal() just for output compatibility with old zsh from
before that was available?
zsh (sig_msg[])                             strsignal()
illegal hardware instruction (core dumped)  Illegal instruction (core dumped)
trace trap (core dumped)                    Trace/breakpoint trap (core dumped)
abort (core dumped)                         Aborted (core dumped)
alarm                                       Alarm clock
SIGSTKFLT                                   Stack fault
suspended (signal)                          Stopped (signal)
suspended                                   Stopped
suspended (tty input)                       Stopped (tty input)
suspended (tty output)                      Stopped (tty output)
cpu limit exceeded (core dumped)            CPU time limit exceeded (core dumped)
virtual time alarm                          Virtual timer expired
profile signal                              Profiling timer expired
pollable event occurred                     I/O possible
power fail                                  Power failure
invalid system call (core dumped)           Bad system call (core dumped)
* Re: misleading message for SIGFPE
From: Stephane Chazelas @ 2024-09-26 7:58 UTC
To: Bart Schaefer; +Cc: zsh-workers
FYI, bosh, a POSIXified fork of the Bourne shell by the late
Jörg Schilling, has $/, ${.sh.code}, ${.sh.codename} and a few
more special parameters to complement $?. See also the $status
of rc, which holds a textual representation of the exit status.
$ bosh -o fullexitcode -c 'sh -c "kill -s SEGV \$\$"; printf "%s\n" "?=$?" "/=$/" "code=${.sh.code}" "codename=${.sh.codename}" "status=${.sh.status}" "termsig=${.sh.termsig}" "signo=${.sh.signo}" "signame=${.sh.signame}"'
Segmentation fault - core dumped
?=139
/=SEGV
code=3
codename=DUMPED
status=11
termsig=SEGV
signo=17
signame=CHLD
$ rc -c 'sh -c '\''kill -s SEGV $$'\''; echo $status'
segmentation violation--core dumped
sigsegv+core
See the bosh man page at https://codeberg.org/schilytools/schilytools/src/commit/e835e64f0d84a614b3c8d619ac646060ea6922a5/sh/sh.1#L1809
> ? The decimal value returned by the last synchronously executed
> command or a decimal number derived from the signal number that
> killed the process.
>
> Only the low 8 bits of the exit code from the command are
> visible unless exit code masking is switched off by
> ``set -o fullexitcode''. The ability to see all 32 bits from
> the exit code requires a modern UNIX compliant operating system
> with working support for waitid(2).
>
> If the executable file could not be found, the returned value
> is 127. If the file exists but could not be executed, the
> returned value is 126.
>
> If bosh has been compiled with DO_EXIT_MODFIX (which is not the
> default and not recommended by POSIX) and if a command's exit
> code modulo 256 is zero and ``set -o fullexitcode'' is not in
> effect, the returned value is 128, except when the operating
> system does not support waitid(2), as the exit code then is
> masked by the kernel.
>
> If the command was killed by a signal, the returned value is
> 128 + the signal number. As a result, apparent exit code
> values in the range 129..200 may also have been caused by a
> signal.
>
> If the shell itself or a sub shell catches a signal while
> preparing a job, the exit code is 2000, or (when exit codes are
> masked to only the low 8 bits) 208.
>
> / A decimal number or text indicating the exit status returned by
> the last synchronously executed command.
>
> If $/ returns a decimal number, this is (on a POSIX system) the
> 32 bit exit code from the last command that did normally exit.
> Older non-POSIX systems like Linux or UNIX systems from before
> SVr4 return only the low 8 bits from the exit code. In any
> case, the number was a result from a normal program exit.
>
> If $/ returns text, this is either a signal name with the
> leading ``SIG'' stripped off, like ``INT'' (see kill -l) for the
> signal that terminated the program or one of the strings
> ``NOEXEC'' or ``NOTFOUND'', in case the program could not be
> run at all. The strings ``NOEXEC'' and ``NOTFOUND'' are
> returned reliably from vfork(2) childs or when the related state
> is already known by the cache. This is true for all simple
> commands.
>
> Note that unless ``set -o fullexitcode'' is in effect, $/ may
> have a non-zero value where value mod 256 == 0 and the shell in
> such a case evaluates conditional execution as if the exit code
> was zero. This is the default behavior required by POSIX for
> compatibility with historic shells.
[...]
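A small sketch of the ambiguity the quoted text describes for $?: in shells that use the 128 + signal convention, a command killed by SIGFPE and a command that simply calls exit 136 leave the same value in $?. The helper name status_of is made up for this example.

```shell
# Hypothetical helper: run a command and print its exit status as the shell sees it.
status_of() {
    rc=0
    "$@" 2>/dev/null || rc=$?
    printf '%s\n' "$rc"
}

killed=$(status_of sh -c 'kill -s FPE $$')   # child dies from SIGFPE
exited=$(status_of sh -c 'exit 136')         # child exits normally with 136

printf 'killed by SIGFPE: %s\nplain exit:       %s\n' "$killed" "$exited"

# kill -l maps both values to the same signal name.
kill -l "$killed"
kill -l "$exited"
```

Both statuses come out identical, which is why a parameter such as bosh's $/ or ${.sh.codename}, or a signal name in the job message, carries information that $? alone does not.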
> .sh.code
> The numerical reason waitid(2) returned for the child status
> change. It matches the CLD_* definitions from signal.h. Note
> that the numbers are usually in the range 1..6 but this is not
> guaranteed. Use ${.sh.codename} for portability.
>
> .sh.codename
> The reason waitid(2) returned for the child status change as
> text that is generated by stripping off CLD_ from the related
> definitions from signal.h. Possible values are:
>
> EXITED The program had a normal termination and the
> exit(2) code is in ${.sh.status}.
>
> KILLED The program was killed by a signal, the signal
> number is in ${.sh.status} the signal name is in
> ${.sh.termsig}.
>
> DUMPED The program was killed by a signal, similar to
> KILLED above, but the program in addition created a
> core dump.
>
> TRAPPED A traced child has trapped.
>
> STOPPED The program was stopped by a signal, the signal
> number is in ${.sh.status} the signal name is in
> ${.sh.termsig}.
>
> CONTINUED A stopped child was continued.
>
> NOEXEC An existing file could not be executed. This can
> happen when e.g. either the type of the file is not
> plain file or when the file does not have execute
> permission, or when the argument list is too long.
>
> This is not a result from waitid(2) but from
> execve(2).
>
> NOTFOUND A file was not found and thus could not be
> executed.
>
> This is not a result from waitid(2) but from
> execve(2).
>
> The child codes NOEXEC and NOTFOUND in ${.sh.codename} need
> shared memory (e.g. from vfork(2)) to allow a reliable
> reporting.
[...]
> .sh.status
> The decimal value returned by the last synchronously executed
> command. The value is unaltered and contains the full int from
> the exit(2) call in the child in case the shell is run on a
> modern os.
"modern os" here means one where waitid() returns the full
value (not truncated to 8 bits), which is not the case on Linux (Jörg
had a bit of a grudge against GNU/Linux).
>
> .sh.termsig
> The signal name related to the numerical ${.sh.status} value.
> The translation to signal names takes place regardless of
> whether the child was terminated by a signal or terminated
> normally.
[...]
And also, only remotely related:
> .sh.signame
> The name of the causing signal. If the status is related to a
> set of waitid(2) return values, this is CHLD or CLD, depending
> on the os. When a trap(1) command is executed, ${.sh.signame}
> holds the signal that caused the trap.
>
> .sh.signo
> The signal number related to ${.sh.signame}.
--
Stephane
* RE: Re: misleading message for SIGFPE
From: zeurkous @ 2024-10-02 18:59 UTC
To: Bart Schaefer, zsh-workers
On Tue, 24 Sep 2024 13:03:27 -0700, Bart Schaefer <schaefer@brasslantern.com> wrote:
> OTOH if there's a reason we're not using strsignal() when it's
> available, I've forgotten it.
AOL!
--zeurkous.
--
Friggin' Machines!