* [PATCH] Document imperfections in POSIX/sh compatibility @ 2021-04-10 23:31 dana 2021-04-10 23:50 ` Bart Schaefer 2021-04-13 16:01 ` Daniel Shahaf 0 siblings, 2 replies; 60+ messages in thread From: dana @ 2021-04-10 23:31 UTC (permalink / raw) To: Zsh hackers list Lawrence bumping 47794 reminded me of this. Someone on IRC was trying to use zsh as sh and they were very annoyed to learn that the sh emulation has imperfections that aren't really documented anywhere. I said i would add a mention. Let me know if this is editorialising it too much dana diff --git a/Doc/Zsh/compat.yo b/Doc/Zsh/compat.yo index f1be15fee..a09187918 100644 --- a/Doc/Zsh/compat.yo +++ b/Doc/Zsh/compat.yo @@ -74,3 +74,8 @@ tt(PROMPT_SUBST) and tt(SINGLE_LINE_ZLE) options are set if zsh is invoked as tt(ksh). + +Please note that zsh's emulation of other shells, as well as the degree +of its POSIX compliance, is provided on a `best effort' basis. Full +compatibility is not guaranteed, and is not necessarily a goal of the +project. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-10 23:31 [PATCH] Document imperfections in POSIX/sh compatibility dana @ 2021-04-10 23:50 ` Bart Schaefer 2021-04-11 0:19 ` dana 2021-04-13 16:01 ` Daniel Shahaf 1 sibling, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-10 23:50 UTC (permalink / raw) To: dana; +Cc: Zsh hackers list On Sat, Apr 10, 2021 at 4:32 PM dana <dana@dana.is> wrote: > > Someone on IRC was trying to use > zsh as sh and they were very annoyed to learn that the sh emulation has > imperfections Any particular ones? You're right about > not necessarily a goal but I'm curious. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-10 23:50 ` Bart Schaefer @ 2021-04-11 0:19 ` dana 2021-04-11 16:54 ` Bart Schaefer 0 siblings, 1 reply; 60+ messages in thread From: dana @ 2021-04-11 0:19 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list On 10 Apr 2021, at 18:50, Bart Schaefer <schaefer@brasslantern.com> wrote: > Any particular ones? I had someone pull the logs. It looks like the person was just warned by someone else that the sh mode's accuracy is 'iffy' and that zsh doesn't have POSIX compliance as an actual project goal. There was no specific functionality mentioned Since you asked, off the top of my head, zsh's getopts isn't POSIX-compliant due to its handling of '+'-prefixed options. I think i had a patch somewhere to make that respect POSIX_BUILTINS dana ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-11 0:19 ` dana @ 2021-04-11 16:54 ` Bart Schaefer 2021-04-11 17:57 ` sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) Stephane Chazelas 2021-04-11 23:04 ` [PATCH] Document imperfections in POSIX/sh compatibility dana 0 siblings, 2 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-11 16:54 UTC (permalink / raw) To: dana; +Cc: Zsh hackers list On Sat, Apr 10, 2021 at 5:19 PM dana <dana@dana.is> wrote: > > Since you asked, off the top of my head, zsh's getopts isn't POSIX-compliant > due to its handling of '+'-prefixed options. I think i had a patch somewhere > to make that respect POSIX_BUILTINS You're talking about the thread from workers/42248, which itself mentions a thread from workers/35317. Did none of those patches get applied? ^ permalink raw reply [flat|nested] 60+ messages in thread
* sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility)
From: Stephane Chazelas @ 2021-04-11 17:57 UTC
To: Zsh hackers list

Some non-POSIX conformances I can think of ATM:

* "echo -" does not output -\n

  last (or one of the last) time it was brought up, I pointed out
  that anyway few shells were POSIX compliant in that regard in
  that for instance POSIX requires "echo -e" to output "-e\n".

  I've since asked POSIX to allow echo -e, echo -E (and combinations
  of those and -n), and zsh's echo -. They've rejected the latter
  part. See https://www.austingroupbugs.net/view.php?id=1222

  So I think it would make sense now to stop accepting "-" as
  option delimiter in sh emulation.

* a few of zsh's reserved words are still available in POSIX mode.

  $ zsh --emulate sh -c 'foreach() { true; }'
  zsh:1: parse error near `()'
  $ zsh --emulate sh -c 'end() { true; }'
  zsh:1: parse error near `end'

* IFS treated as separator and not delimiter:

  $ a='a:b:' zsh --emulate sh -c 'IFS=:; printf "<%s>\n" $a'
  <a>
  <b>
  <>

  (POSIX requires <a> and <b> only).

  Many shells used to behave like zsh, but switched for POSIX
  compliance even though the zsh behaviour is more useful for
  instance to break down variables like $PATH (like /bin:/usr/bin:
  which should be split into "/bin", "/usr/bin" and "").

* whitespace characters other than SPC/TAB/NL not treated as IFS
  whitespace.

  $ a=$'\ra\r\rb' zsh --emulate sh -c $'IFS=\r; printf "<%s>\n" $a'
  <>
  <a>
  <>
  <b>

  POSIX requires only <a> and <b> above as isspace('\r') is true
  so \r should receive the same treatment as space, \t, \n.

  Few shells do it. bash has only started recently doing it and
  only for single byte characters (so is not POSIX either).

-- 
Stephane
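The trailing-delimiter item above can be checked with a small POSIX sh sketch. `count_fields` is a hypothetical helper name introduced only for illustration. The counts in the comments assume a shell that applies the POSIX rule (bash and dash are believed to); per the message above, zsh's sh emulation would report one more field when the input ends with a colon.

```shell
#!/bin/sh
# count_fields STRING
# Hypothetical helper: print how many fields STRING yields when it is
# field-split on ":". Runs in a subshell so IFS and -f do not leak out.
count_fields() {
    (
        IFS=:
        set -f          # disable globbing so only field splitting applies
        # Unquoted on purpose: field splitting happens on this expansion.
        set -- $1
        echo "$#"
    )
}

count_fields 'a:b'              # 2
count_fields 'a:b:'             # 2 under the POSIX rule (trailing ":" ends "b")
count_fields 'a::b'             # 3 (an empty field in the middle is kept)
count_fields '/bin:/usr/bin:'   # 2 under the POSIX rule
```

A script that needs the trailing empty element (as with $PATH) cannot rely on plain field splitting in a conforming sh, which is the usability trade-off the message describes.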
* Re: sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) 2021-04-11 17:57 ` sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) Stephane Chazelas @ 2021-04-11 18:13 ` Bart Schaefer 2021-04-11 19:18 ` sh emulation POSIX non-conformances (no word splitting upon arithmetic expansion) Stephane Chazelas ` (3 subsequent siblings) 4 siblings, 0 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-11 18:13 UTC (permalink / raw) To: Zsh hackers list On Sun, Apr 11, 2021 at 10:58 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > Some non-POSIX conformances I can think of ATM: Thanks for this list ... but POSIX conformance and /bin/sh "replaceability" are not necessarily the same thing. If I have time, I'll add failure tests for these to E03posix.ztst (when that finally gets merged from declarednull). ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (no word splitting upon arithmetic expansion)
From: Stephane Chazelas @ 2021-04-11 19:18 UTC
To: Zsh hackers list

2021-04-11 18:57:26 +0100, Stephane Chazelas:
> Some non-POSIX conformances I can think of ATM:
[...]

Another one:

$ zsh --emulate sh -c 'IFS=2; printf "<%s>\n" $((11*11))'
<121>

While POSIX (believe it or not) requires:

<1>
<1>

Again, that's one of the cases where many shells (most ash-based
ones, pdksh, yash at least) behaved like zsh but switched for POSIX
compliance (even though that hardly makes sense).

-- 
Stephane
* Re: sh emulation POSIX non-conformances (no word splitting upon arithmetic expansion)
From: Vincent Lefevre @ 2021-04-22 15:03 UTC
To: zsh-workers

On 2021-04-11 20:18:05 +0100, Stephane Chazelas wrote:
> 2021-04-11 18:57:26 +0100, Stephane Chazelas:
> > Some non-POSIX conformances I can think of ATM:
> [...]
>
> Another one:
>
> $ zsh --emulate sh -c 'IFS=2; printf "<%s>\n" $((11*11))'
> <121>
>
> While POSIX (believe it or not) requires:
>
> <1>
> <1>
>
> Again, that's one of the cases where many shells (most ash-based ones,
> pdksh, yash at least) behaved like zsh but switched for POSIX
> compliance (even though that hardly makes sense).

I disagree. I think that the fact that $((11*11)) behaves in
a way similar to parameter expansion makes more sense and is
less surprising:

$ sh -c 'foo=121; IFS=2; echo $foo $((11*11))'
1 1 1 1
$ zsh --emulate sh -c 'foo=121; IFS=2; echo $foo $((11*11))'
1 1 121
$

If something hardly makes sense, this is "IFS=2".

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
* Re: sh emulation POSIX non-conformances (no word splitting upon arithmetic expansion)
From: Bart Schaefer @ 2021-04-22 18:27 UTC
To: Zsh hackers list

On Thu, Apr 22, 2021 at 8:04 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> I disagree. I think that the fact that $((11*11)) behaves in
> a way similar to parameter expansion makes more sense

I think it would make the most sense if number results were never
field-split, but given this ...

% integer x=121
% IFS=2 emulate sh -c 'print $[11*11]'
121
% IFS=2 emulate sh -c 'print $x'
1 1

... I think applying splitting to math expansions is probably the
right thing (to do at some point).
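The behaviour discussed in this subthread can be reproduced with a short POSIX sh sketch. The function names are hypothetical and exist only for this example. The counts in the comments assume a shell that field-splits the result of arithmetic expansion, as the `sh` output quoted above shows; zsh's sh emulation, per the thread, would yield a single field in the unquoted case.

```shell
#!/bin/sh
# Hypothetical helpers: count the fields produced by $((11*11)) with IFS=2.
# Each runs in a subshell so the unusual IFS value does not leak out.

arith_fields_unquoted() {
    (
        IFS=2
        set -f
        # Unquoted: the result "121" is subject to field splitting on "2".
        set -- $((11 * 11))
        echo "$#"
    )
}

arith_fields_quoted() {
    (
        IFS=2
        # Quoted: no field splitting, so "121" stays one field.
        set -- "$((11 * 11))"
        echo "$#"
    )
}

arith_fields_unquoted   # 2 where arithmetic results are split ("1" and "1")
arith_fields_quoted     # 1
```

Quoting the expansion gives the same result in both kinds of shell, so portable scripts can sidestep the difference entirely.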
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Stephane Chazelas @ 2021-04-11 19:31 UTC
To: Zsh hackers list

2021-04-11 18:57:26 +0100, Stephane Chazelas:
> Some non-POSIX conformances I can think of ATM:
[...]

Also:

$ zsh --emulate sh -c 'inf=42; echo $((inf))'
Inf

(POSIX requires 42 there).
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Bart Schaefer @ 2021-04-12 20:41 UTC
To: Zsh hackers list

On Sun, Apr 11, 2021 at 12:32 PM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> $ zsh --emulate sh -c 'inf=42; echo $((inf))'
> Inf
>
> (POSIX requires 42 there).

Is that because "Inf" is case-sensitive, or because POSIX requires
evaluating the variable? E.g. what does

  sh -c 'Inf=42; echo $((Inf))'

yield in POSIX? What about

  sh -c 'unset Inf; echo $((Inf))'
  sh -c 'unset inf; echo $((inf))'

?? I don't have a POSIX shell to test with, it seems. Ksh "Version A
2020.0.0" responds the same as zsh, and bash "5.0.17(1)-release"
doesn't seem to have Inf at all (and gives a syntax error on
floating-point arithmetic?).
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Stephane Chazelas @ 2021-04-13 7:17 UTC
To: Bart Schaefer
Cc: Zsh hackers list

2021-04-12 13:41:58 -0700, Bart Schaefer:
> On Sun, Apr 11, 2021 at 12:32 PM Stephane Chazelas
> <stephane@chazelas.org> wrote:
> >
> > $ zsh --emulate sh -c 'inf=42; echo $((inf))'
> > Inf
> >
> > (POSIX requires 42 there).
>
> Is that because "Inf" is case-sensitive, or because POSIX requires
> evaluating the variable? E.g. what does

That was because "inf" in an arithmetic expression, where inf is
the name of a variable whose contents is an integer constant
(decimal, octal or hex) is meant to represent the corresponding
integer number (and an empty or unset variable is meant to yield
0).

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap08.html#tag_08
warns about some of the special variables used by some shells
(such as RANDOM and SECONDS), but in those cases, $((RANDOM))
still expands to the value of the variable.

Here, a user must make sure they don't use "inf" (or
INF/InF/Inf, nan/NAN...) as the name of a variable if they want
to use it in an arithmetic expression. Same family of issues as
those linked to zsh special parameters or keywords/builtins.

Since the POSIX shell language doesn't support floating point
arithmetic, zsh could disable it in POSIX mode, but it may not
be worth the bother. Since floating point arithmetic is
supported by a few shells (ksh93, yash, zsh), maybe a better
approach would be for text to be added to the POSIX standard to
warn against using those as variable names.

I think it would be worth documenting that nan and inf are
recognised in arithmetic expressions (and warn against using
variables with the same name).

Maybe something like:

diff --git a/Doc/Zsh/arith.yo b/Doc/Zsh/arith.yo
index bc3e35ad5..44c35edab 100644
--- a/Doc/Zsh/arith.yo
+++ b/Doc/Zsh/arith.yo
@@ -120,6 +120,11 @@
 taken for a parameter name. All numeric parts (before and after the
 decimal point and in the exponent) may contain underscores after the
 leading digit for visual guidance; these are ignored in computation.
 
+tt(Inf) and tt(NaN) and all their variations of case (tt(inf), tt(NAN), etc.)
+are also recognised as meaning "infinity" and "not-a-number" floating point
+constants respectively. One should avoid using variables with such names when
+they are to be used in arithmetic expressions.
+
 cindex(arithmetic operators)
 cindex(operators, arithmetic)
 An arithmetic expression uses nearly the same syntax and

>
>   sh -c 'Inf=42; echo $((Inf))'
>
> yield in POSIX? What about

42

>   sh -c 'unset Inf; echo $((Inf))'

0

>   sh -c 'unset inf; echo $((inf))'

0

> ?? I don't have a POSIX shell to test with, it seems. Ksh "Version A
> 2020.0.0" responds the same as zsh, and bash "5.0.17(1)-release"
> doesn't seem to have Inf at all (and gives a syntax error on
> floating-point arithmetic?).

There's not really such a thing as a POSIX shell. There's a
standard specification of the POSIX sh *language*, and a number
of shell implementations that try to provide a compliant
interpreter for that language.

That language specification was initially based on a subset of
ksh88's, but with some deviations.

The only Unix shells that I know that have been /certified/ as
compliant are some ksh88 derivatives (like on Solaris, AIX,
HP-UX) and some versions of bash (in posix mode and with xpg_echo
enabled, like on macOS, Inspur K/UX).

They both have non-conformances, especially ksh88-based ones
(which have much more serious ones than the ones I've listed in
this thread). I'd say zsh's sh emulation is probably more
compliant than those ksh88-based ones.

The Open Group does have a certification test suite, but I don't
think it's publicly available.

Note that ksh2020 development has been abandoned. It was based
on a beta version of ksh93 released when AT&T Research was shut
down, but eventually deemed too buggy to fix. There is still
some community effort to maintain a ksh93u+ based shell.

Having said that, ksh93/ksh2020 is one of the few Bourne-like
shells that support floating point arithmetic expressions (and
the first one that did). $((inf)) expands to "inf" for me with
those (which would also make it non-compliant).

yash (the other shell with floating point arithmetic expressions
and which was written to the POSIX specification) doesn't
support nan/inf in arithmetic expressions unless you do inf=inf
nan=nan (and Inf=inf, NaN=nan... if you want to use those) and
disables floating point arithmetic when in posix mode (yash -o
posix).

All three handle the locale decimal radix character differently,
which makes me think there's little hope floating point
arithmetic ever makes it to the POSIX spec.

-- 
Stephane
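The variable-lookup rule described in this message can be sketched in POSIX sh as follows. `arith_value_of_inf` is a hypothetical name used only here. The values in the comments are the ones the message says POSIX requires (bash and dash are believed to agree); per the thread, zsh and ksh93 would print Inf or inf instead because they read the bare word as a floating point literal.

```shell
#!/bin/sh
# Hypothetical helper: evaluate the bare name "inf" in arithmetic context.
arith_value_of_inf() {
    echo "$((inf))"
}

inf=42
arith_value_of_inf      # 42: the name is looked up as a variable

unset inf
arith_value_of_inf      # 0: an unset variable evaluates to zero

# Defensive spelling: expand the variable explicitly, so the arithmetic
# parser only ever sees digits and never the bare word "inf".
inf=42
echo "$(( ${inf:-0} + 1 ))"   # 43
```

The explicit `${inf:-0}` form relies on the observation later in the thread that writing `$inf` rather than `inf` avoids the conflict; choosing a different variable name remains the simpler fix.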
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-13 7:17 ` Stephane Chazelas @ 2021-04-22 15:31 ` Vincent Lefevre 2021-04-22 18:55 ` Bart Schaefer 0 siblings, 1 reply; 60+ messages in thread From: Vincent Lefevre @ 2021-04-22 15:31 UTC (permalink / raw) To: zsh-workers; +Cc: Bart Schaefer On 2021-04-13 08:17:42 +0100, Stephane Chazelas wrote: > 2021-04-12 13:41:58 -0700, Bart Schaefer: > > On Sun, Apr 11, 2021 at 12:32 PM Stephane Chazelas > > <stephane@chazelas.org> wrote: > > > > > > $ zsh --emulate sh -c 'inf=42; echo $((inf))' > > > Inf > > > > > > (POSIX requires 42 there). > > > > Is that because "Inf" is case-sensitive, or because POSIX requires > > evaluating the variable? E.g. what does > > That was because "inf" in an arithmetic expression, where inf is > the name of a variable whose contents is an integer constant > (decimal, octal or hex) is meant to represent the corresponding > integer number (and an empty or unset variable is meant to yield > 0).. I think that it would have been better if zsh chose something that does not correspond to the name of a variable, e.g. @Inf@ and @NaN@ (this is what MPFR does, so that it cannot be confused with numbers in large bases, where letters are used as digits). > I think it would be worth documenting that nan and inf are > recognised in arithmetic expressions (and warn against using > variables with the same name). IMHO, zsh should also output a warning when such variables are used. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-22 15:31 ` Vincent Lefevre @ 2021-04-22 18:55 ` Bart Schaefer 2021-04-22 20:45 ` Daniel Shahaf 2021-04-23 16:45 ` Vincent Lefevre 0 siblings, 2 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-22 18:55 UTC (permalink / raw) To: Zsh hackers list On Thu, Apr 22, 2021 at 8:31 AM Vincent Lefevre <vincent@vinc17.net> wrote: > > On 2021-04-13 08:17:42 +0100, Stephane Chazelas wrote: > > I think it would be worth documenting that nan and inf are > > recognised in arithmetic expressions (and warn against using > > variables with the same name). > > IMHO, zsh should also output a warning when such variables are used. Exactly how would that work? Warning anytime the names "inf" or "nan" (and case variants) get values assigned to them seems like overkill. If they appear as $inf or $nan then there's no conflict. Still warn? If "inf" and "nan" appear in math context as the bare strings they're currently taken as constants. Always make a (usually wasted/fruitless) check to see whether there happens to be a variable of the same name and emit a "watch out, not used" message? ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-22 18:55 ` Bart Schaefer @ 2021-04-22 20:45 ` Daniel Shahaf 2021-04-22 21:25 ` Bart Schaefer 2021-04-23 16:45 ` Vincent Lefevre 1 sibling, 1 reply; 60+ messages in thread From: Daniel Shahaf @ 2021-04-22 20:45 UTC (permalink / raw) To: zsh-workers Bart Schaefer wrote on Thu, Apr 22, 2021 at 11:55:25 -0700: > On Thu, Apr 22, 2021 at 8:31 AM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > On 2021-04-13 08:17:42 +0100, Stephane Chazelas wrote: > > > I think it would be worth documenting that nan and inf are > > > recognised in arithmetic expressions (and warn against using > > > variables with the same name). > > > > IMHO, zsh should also output a warning when such variables are used. > > Exactly how would that work? > > Warning anytime the names "inf" or "nan" (and case variants) get > values assigned to them seems like overkill. > Warn only when the variable is created, e.g., upon «typeset -F inf» or «(( nan = 3.14 ))», but not subsequent assignments? > If they appear as $inf or $nan then there's no conflict. Still warn? No, I guess? No one expects «$42» and «42» to mean the same thing, nor «$0.0» and «0.0». > If "inf" and "nan" appear in math context as the bare strings they're > currently taken as constants. Always make a (usually > wasted/fruitless) check to see whether there happens to be a variable > of the same name and emit a "watch out, not used" message? No; optimize for the common case that $inf and $nan don't exist. Warning upon creation of those suffices. > Predefine $inf and $nan as readonly float variables initialized to the respective values? (Not saying this is a good idea; just mentioning it for completeness) Cheers, Daniel ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-22 20:45 ` Daniel Shahaf @ 2021-04-22 21:25 ` Bart Schaefer 0 siblings, 0 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-22 21:25 UTC (permalink / raw) To: Zsh hackers list On Thu, Apr 22, 2021 at 1:46 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > Warn only when the variable is created, e.g., upon «typeset -F inf» or > «(( nan = 3.14 ))», but not subsequent assignments? The latter is already an error. The former warns spuriously if the user has no intention of ever mentioning the variable in math context (or of ever using math at all). And the problem exists in math context whether or not the variable is declared to be of a numeric type. > Predefine $inf and $nan as readonly float variables initialized to the > respective values? (Not saying this is a good idea; just mentioning it > for completeness) The constants are case-insensitive, so we'd have to predefine twelve names. And we couldn't do this in POSIX (sh) context, which sort of defeats the purpose. I'm waiting for Vincent to tell us when/where he thinks the warning should come from, but I don't think there's any sensible place for it. The only good solution would have been to pick otherwise-syntactically-impossible tokens in the first place (as Vincent also mentioned), but it may be too late. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Vincent Lefevre @ 2021-04-23 16:45 UTC
To: zsh-workers

On 2021-04-22 11:55:25 -0700, Bart Schaefer wrote:
> On Thu, Apr 22, 2021 at 8:31 AM Vincent Lefevre <vincent@vinc17.net> wrote:
> >
> > On 2021-04-13 08:17:42 +0100, Stephane Chazelas wrote:
> > > I think it would be worth documenting that nan and inf are
> > > recognised in arithmetic expressions (and warn against using
> > > variables with the same name).
> >
> > IMHO, zsh should also output a warning when such variables are used.
>
> Exactly how would that work?

When inf or nan is used in a math context and the corresponding
variable exists. For instance:

zira% echo $((inf))
Inf
zira% inf=17
zira% echo $((inf))
Inf

There should be a warning for the 3rd command.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Bart Schaefer @ 2021-04-23 20:31 UTC
To: Zsh hackers list

On Fri, Apr 23, 2021 at 9:46 AM Vincent Lefevre <vincent@vinc17.net> wrote:
>
> On 2021-04-22 11:55:25 -0700, Bart Schaefer wrote:
> > On Thu, Apr 22, 2021 at 8:31 AM Vincent Lefevre <vincent@vinc17.net> wrote:
> > >
> > > IMHO, zsh should also output a warning when such variables are used.
> >
> > Exactly how would that work?
>
> When inf or nan is used in a math context and the corresponding
> variable exists.

So, it's the "expend effort on a check that is nearly always going to
fail" option.

Not advocating for the below patch, just providing for reference.
gmail may have munged tabs causing indentation to look funny.

diff --git a/Src/math.c b/Src/math.c
index 1d0d86639..50c34416d 100644
--- a/Src/math.c
+++ b/Src/math.c
@@ -865,12 +865,22 @@ zzlex(void)
                 ptr = ie;
                 if (ie - p == 3) {
                     if (strncasecmp(p, "NaN", 3) == 0) {
+                        char iec = *ie; *ie = 0;
+                        if (issetvar(p)) {
+                            zwarn("%s: using constant NaN", p);
+                        }
+                        *ie = iec;
                         yyval.type = MN_FLOAT;
                         yyval.u.d = 0.0;
                         yyval.u.d /= yyval.u.d;
                         return NUM;
                     } else if (strncasecmp(p, "Inf", 3) == 0) {
+                        char iec = *ie; *ie = 0;
+                        if (issetvar(p)) {
+                            zwarn("%s: using constant Inf", p);
+                        }
+                        *ie = iec;
                         yyval.type = MN_FLOAT;
                         yyval.u.d = 0.0;
                         yyval.u.d = 1.0 / yyval.u.d;
diff --git a/Test/C01arith.ztst b/Test/C01arith.ztst
index d0092fefa..e6333890c 100644
--- a/Test/C01arith.ztst
+++ b/Test/C01arith.ztst
@@ -306,17 +306,22 @@
   in=1 info=2 Infinity=3 Inf=4
   print $(( in )) $(( info )) $(( Infinity )) $(( $Inf )) $(( inf )) $(( INF )) $(( Inf )) $(( iNF ))
 0:Infinity parsing
+?(eval):2: Inf: using constant Inf
 >1 2 3 4 Inf Inf Inf Inf
 
   integer Inf
   print $(( Inf[0] ))
 1:Refer to Inf with an array subscript
+?(eval):2: Inf: using constant Inf
 ?(eval):2: bad base syntax
 
+  integer NaN
   (( NaN = 1 ))
 2:Assign to NaN
-?(eval):1: bad math expression: lvalue required
+?(eval):2: NaN: using constant NaN
+?(eval):2: bad math expression: lvalue required
 
+  unset Inf
   a='Inf'
   (( b = 1e500 ))
   print $((1e500)) $(($((1e500)))) $(( a )) $b $(( b )) $(( 3.0 / 0 ))
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Oliver Kiddle @ 2021-04-23 22:46 UTC
To: Zsh hackers list

Bart Schaefer wrote:
> On Fri, Apr 23, 2021 at 9:46 AM Vincent Lefevre <vincent@vinc17.net> wrote:
> > > > IMHO, zsh should also output a warning when such variables are used.

I disagree. We also have variables named 0, 1, 2, 3 and so on - the
positional parameters. But nobody would suggest warning about literal
value 7 in maths context.

> So, it's the "expend effort on a check that is nearly always going to
> fail" option.

I can just about see the case for warning on typeset -i/-F inf or nan
but not on use. The traditional Unix approach is not to stop people who
choose to shoot themselves in the foot. In a maths expression, you're
going to end up with inf, -inf or nan as your result anyway so it
quickly becomes fairly obvious what is going on.

If we're worrying about POSIX compliance, it'd be easy enough to disable
Inf and NaN in POSIX mode but there are dozens of special variables
defined that aren't in the POSIX spec either and which could clash.

While @inf would have avoided clashing with the variable, I've not seen
any other language which does that and consistency makes it easier. The
shell is one language where special characters should be especially
avoided because it is used interactively. Not everyone has a US-layout
keyboard and @ is often in weird places. It'd also raise the question of
what we should output. printf should follow the standard. Note that in
bash (and quite a few other implementations):

  printf '%f\n' inf
  inf

It is generally helpful to be able to re-input the output as input in a
later line of code.

> + if (issetvar(p)) {
> + zwarn("%s: using constant NaN", p);

I'm not sure that "constant" is even the correct term for what NaN
or even Inf is? Note that NaN == NaN is, by definition, false. For an
accurate description perhaps check IEEE754 but it is more along the
lines of being an intrinsic literal value.

> integer Inf
> print $(( Inf[0] ))
> 1:Refer to Inf with an array subscript

That could potentially be made to work as an array lookup because
subscripts have no other meaning that would clash.

Oliver
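The re-input point above can be illustrated with a short sh sketch. `roundtrip_inf` is a hypothetical name for this example only. It assumes a printf that accepts `%f` with the argument `inf`, which the message reports for bash; POSIX does not require floating point conversions in the printf utility, so other shells may differ.

```shell
#!/bin/sh
# Hypothetical helper: format "inf" with %f, then feed the printed text
# back in as the argument of a second printf call.
roundtrip_inf() {
    first=$(printf '%f\n' inf)       # text produced by the first call
    printf '%f\n' "$first"           # reuse that text as input
}

roundtrip_inf    # prints the same text twice over: the output is valid input
```

If the output used a spelling such as @inf that printf itself does not accept, the second call would fail, which is the consistency concern raised in the message.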
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-23 22:46 ` Oliver Kiddle @ 2021-04-23 23:34 ` Bart Schaefer 2021-04-24 2:10 ` Daniel Shahaf 2021-04-24 23:02 ` Vincent Lefevre 1 sibling, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-23 23:34 UTC (permalink / raw) To: Zsh hackers list On Fri, Apr 23, 2021 at 3:47 PM Oliver Kiddle <opk@zsh.org> wrote: > > I can just about see the case for warning on typeset -i/-F inf or nan > but not on use. Warning on -i/-F but not on simple scalar nor on array/hash is not very useful because math context doesn't limit expansions to just numerics. > If we're worrying about POSIX compliance, it'd be easy enough to disable > Inf and NaN in POSIX mode but there are dozens of special variables Inf and NaN are actually lexical tokens in context, so it's not quite the same situation as special variables. > Bart Schaefer wrote: > > + if (issetvar(p)) { > > + zwarn("%s: using constant NaN", p); > > I'm not sure that "constant" is even the correct term for what NaN > or even Inf is? Perhaps not. The patch was more to show the cost of implementing the warning than to attempt to get the warning text right. > > integer Inf > > print $(( Inf[0] )) > > 1:Refer to Inf with an array subscript > > That could potentially be made to work as an array lookup because > subscripts have no other meaning that would clash. Here though you mean array lookup in math context generically, not specifically for variables with these names? ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
From: Daniel Shahaf @ 2021-04-24 2:10 UTC
To: zsh-workers

Bart Schaefer wrote on Fri, 23 Apr 2021 23:34 +00:00:
> On Fri, Apr 23, 2021 at 3:47 PM Oliver Kiddle <opk@zsh.org> wrote:
> >
> > I can just about see the case for warning on typeset -i/-F inf or nan
> > but not on use.
>
> Warning on -i/-F but not on simple scalar nor on array/hash is not
> very useful because math context doesn't limit expansions to just
> numerics.

Disagree. The fact that «zsh -fc 'inf=42; : $((inf))'» has a shadowing
issue doesn't make warning on «zsh -fc 'typeset -F inf=42; : $((inf))'»
"not useful". The latter code doesn't do what its author likely
expected it to, so warning about it would be useful.

In general, a patch needn't fix every variant of a problem in order to
be committable.

Bart Schaefer wrote on Thu, 22 Apr 2021 21:25 +00:00:
> The former warns spuriously if the user has no intention of ever
> mentioning the variable in math context (or of ever using math at all).

What's a use-case for declaring an *integer or float* variable that
doesn't ever get mentioned in math context?

Cheers,

Daniel
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
  2021-04-24  2:10 ` Daniel Shahaf
@ 2021-04-24  3:42 ` Bart Schaefer
  2021-04-24  7:33 ` Stephane Chazelas
  0 siblings, 1 reply; 60+ messages in thread
From: Bart Schaefer @ 2021-04-24 3:42 UTC (permalink / raw)
To: Daniel Shahaf; +Cc: Zsh hackers list

On Fri, Apr 23, 2021 at 7:13 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Bart Schaefer wrote on Fri, 23 Apr 2021 23:34 +00:00:
> > Warning on -i/-F but not on simple scalar nor on array/hash is not
> > very useful
>
> Disagree.

There's room for that.

> Bart Schaefer wrote on Thu, 22 Apr 2021 21:25 +00:00:
> > The former warns spuriously if the user has no intention of ever
> > mentioning the variable in math context (or of ever using math at all).
>
> What's a use-case for declaring an *integer or float* variable that
> doesn't ever get mentioned in math context?

I was (not clearly) using "spuriously" with an assumption that the
warning would apply to non-numeric type[def]s as well.

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
  2021-04-24  3:42 ` Bart Schaefer
@ 2021-04-24  7:33 ` Stephane Chazelas
  2021-04-24 16:04 ` Bart Schaefer
  0 siblings, 1 reply; 60+ messages in thread
From: Stephane Chazelas @ 2021-04-24 7:33 UTC (permalink / raw)
To: Bart Schaefer; +Cc: Daniel Shahaf, Zsh hackers list

BTW, there's also:

$ var=42 zsh -c 'printf "%g\n" var'
42
$ Infinity=42 zsh -c 'printf "%g\n" Infinity'
inf

In ksh93:

$ var=42 ksh -c 'printf "%g\n" var'
42
$ Infinity=42 ksh -c 'printf "%g\n" Infinity'
42

POSIX leaves it implementation-defined whether %g/%f... are
supported, but I'd expect it requires that "inf" output in the
second case where it is (as that's what strtod() returns), so
zsh would be more compliant than ksh in that regard.

There's still possibly scope for improving documentation. IMO,
the code doesn't need to be changed to add warnings.

--
Stephane

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
  2021-04-24  7:33 ` Stephane Chazelas
@ 2021-04-24 16:04 ` Bart Schaefer
  0 siblings, 0 replies; 60+ messages in thread
From: Bart Schaefer @ 2021-04-24 16:04 UTC (permalink / raw)
To: Zsh hackers list

On Sat, Apr 24, 2021 at 12:33 AM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> $ Infinity=42 zsh -c 'printf "%g\n" Infinity'
> inf

That's ... curious ...

Infinity=42 zsh -c 'printf "%g\n" $(( Infinity ))'
42

> (as that's what strtod() returns)

Er, hm. Does that mean we should be using strtod() in math context
instead of hard-coding "Inf" into the lexer?

^ permalink raw reply [flat|nested] 60+ messages in thread
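The strtod() behaviour referred to in this exchange can be illustrated with a small C sketch. The helper name parses_as_infinity is hypothetical and is not taken from the zsh sources; it only relies on the C99 rule that strtod() accepts "inf" and "infinity" in any letter case, which is presumably why printf "%g" prints inf for the literal argument Infinity.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if strtod() consumes all of s and the
 * result is an infinity, 0 otherwise. */
static int parses_as_infinity(const char *s)
{
    char *end;
    double d = strtod(s, &end);

    /* end == s means nothing was parsed; *end != '\0' means trailing junk. */
    return end != s && *end == '\0' && isinf(d);
}
```

With such a check, a name like var is rejected by strtod() (so a shell is free to fall back to a variable lookup), while Infinity is consumed as a number before any lookup can happen.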
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-23 22:46 ` Oliver Kiddle 2021-04-23 23:34 ` Bart Schaefer @ 2021-04-24 23:02 ` Vincent Lefevre 2021-04-25 2:18 ` Bart Schaefer 1 sibling, 1 reply; 60+ messages in thread From: Vincent Lefevre @ 2021-04-24 23:02 UTC (permalink / raw) To: zsh-workers On 2021-04-24 00:46:44 +0200, Oliver Kiddle wrote: > Bart Schaefer wrote: > > On Fri, Apr 23, 2021 at 9:46 AM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > > IMHO, zsh should also output a warning when such variables are used. > > I disagree. We also have variables named 0, 1, 2, 3 and so on - the > positional parameters. But nobody would suggest warning about literal > value 7 in maths context. This is different: the digits are already used in integers, and the behavior is the same in all shells. > > So, it's the "expend effort on a check that is nearly always going to > > fail" option. > > I can just about see the case for warning on typeset -i/-F inf or nan > but not on use. The traditional Unix approach is not to stop people who > choose to shoot themselves in the foot. In a maths expression, you're > going to end up with inf, -inf or nan as your result anyway so it > quickly becomes fairly obvious what is going on. People could waste a lot of time finding what is going on, in particular those who don't use floating point at all. > If we're worrying about POSIX compliance, it'd be easy enough to disable > Inf and NaN in POSIX mode but there are dozens of special variables > defined that aren't in the POSIX spec either and which could clash. However, it is a bit easier to see what is going on with special variables. And they have been documented for a long time. On the opposite, the zsh manual (for 5.8) is silent on Inf and NaN. And both for the behavior and documentation, be careful with the locales, in particular in Turkish ones. I don't know whether this is expected in zsh, but... 
zira% zsh -fc 'export LC_ALL=fr_FR.utf8; echo $((Inf)) $((inf))'
Inf Inf
zira% zsh -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))'
Inf 0

With ksh93:

zira% ksh93 -fc 'export LC_ALL=fr_FR.utf8; echo $((Inf)) $((inf))'
inf inf
zira% ksh93 -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))'
inf inf

> It is generally helpful to be able to re-input the output as input
> in a later line of code.

Even with the POSIX behavior, a problem is unlikely to occur, because
zsh uses a mix of uppercase and lowercase letters, and this is
uncommon in variable names. Moreover, if people get Inf or NaN in
their computation, I doubt that they would use such variable names.

--
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-24 23:02 ` Vincent Lefevre @ 2021-04-25 2:18 ` Bart Schaefer 2021-04-25 20:17 ` Vincent Lefevre 0 siblings, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-25 2:18 UTC (permalink / raw) To: Zsh hackers list On Sat, Apr 24, 2021 at 4:03 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > both for the behavior and documentation, be careful with the > locales, in particular in Turkish ones. I don't know whether > this is expected in zsh, but... > > zira% zsh -fc 'export LC_ALL=fr_FR.utf8; echo $((Inf)) $((inf))' > Inf Inf > zira% zsh -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))' > Inf 0 schaefer[856] ls -ld /usr/share/i18n/locales/tr_TR -rw-r--r-- 1 root root 168368 Dec 16 03:04 /usr/share/i18n/locales/tr_TR schaefer[857] Src/zsh -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))' Inf Inf ?? ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-25 2:18 ` Bart Schaefer @ 2021-04-25 20:17 ` Vincent Lefevre 2021-04-25 21:58 ` Bart Schaefer 2021-04-25 22:00 ` Bart Schaefer 0 siblings, 2 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-25 20:17 UTC (permalink / raw) To: zsh-workers On 2021-04-24 19:18:25 -0700, Bart Schaefer wrote: > On Sat, Apr 24, 2021 at 4:03 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > both for the behavior and documentation, be careful with the > > locales, in particular in Turkish ones. I don't know whether > > this is expected in zsh, but... > > > > zira% zsh -fc 'export LC_ALL=fr_FR.utf8; echo $((Inf)) $((inf))' > > Inf Inf > > zira% zsh -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))' > > Inf 0 > > schaefer[856] ls -ld /usr/share/i18n/locales/tr_TR > -rw-r--r-- 1 root root 168368 Dec 16 03:04 /usr/share/i18n/locales/tr_TR > schaefer[857] Src/zsh -fc 'export LC_ALL=tr_TR.utf8; echo $((Inf)) $((inf))' > Inf Inf > > ?? You may have non-standard Turkish locales, because the following line in math.c is obviously incorrect: else if (strncasecmp(p, "Inf", 3) == 0) { -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-25 20:17 ` Vincent Lefevre @ 2021-04-25 21:58 ` Bart Schaefer 2021-04-26 10:28 ` Vincent Lefevre 2021-04-25 22:00 ` Bart Schaefer 1 sibling, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-25 21:58 UTC (permalink / raw) To: Zsh hackers list On Sun, Apr 25, 2021 at 1:18 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > You may have non-standard Turkish locales I have whatever is standard for Ubuntu 20.04.2 LTS. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions)
  2021-04-25 21:58 ` Bart Schaefer
@ 2021-04-26 10:28 ` Vincent Lefevre
  0 siblings, 0 replies; 60+ messages in thread
From: Vincent Lefevre @ 2021-04-26 10:28 UTC (permalink / raw)
To: zsh-workers

On 2021-04-25 14:58:33 -0700, Bart Schaefer wrote:
> On Sun, Apr 25, 2021 at 1:18 PM Vincent Lefevre <vincent@vinc17.net> wrote:
> >
> > You may have non-standard Turkish locales
>
> I have whatever is standard for Ubuntu 20.04.2 LTS.

Indeed, this depends on the system. Under Debian:

zira% export LC_ALL=tr_TR.utf8
zira% echo ${${:-Inf}:l} ${${:-inf}:u}
ınf İNF

Ditto under CentOS Linux 7 (AltArch):

gcc1-power7% export LC_ALL=tr_TR.utf8
gcc1-power7% echo ${${:-Inf}:l} ${${:-inf}:u}
ınf İNF

But under macOS:

vinc17@minimac ~ % export LC_ALL=tr_TR.UTF-8
vinc17@minimac ~ % echo ${${:-Inf}:l} ${${:-inf}:u}
inf INF

--
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-25 20:17 ` Vincent Lefevre 2021-04-25 21:58 ` Bart Schaefer @ 2021-04-25 22:00 ` Bart Schaefer 2021-04-26 10:34 ` Vincent Lefevre 1 sibling, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-25 22:00 UTC (permalink / raw) To: Zsh hackers list On Sun, Apr 25, 2021 at 1:18 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > math.c is obviously incorrect: > > else if (strncasecmp(p, "Inf", 3) == 0) { What should it (and the corresponding NaN line) be, then? ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-25 22:00 ` Bart Schaefer @ 2021-04-26 10:34 ` Vincent Lefevre 2021-04-26 23:25 ` Vincent Lefevre 0 siblings, 1 reply; 60+ messages in thread From: Vincent Lefevre @ 2021-04-26 10:34 UTC (permalink / raw) To: zsh-workers On 2021-04-25 15:00:03 -0700, Bart Schaefer wrote: > On Sun, Apr 25, 2021 at 1:18 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > math.c is obviously incorrect: > > > > else if (strncasecmp(p, "Inf", 3) == 0) { > > What should it (and the corresponding NaN line) be, then? This depends on the expected behavior: Should it match only [Ii][Nn][Ff], or also the lowercase and uppercase versions? ... so that in Turkish locales under Debian: [Iıİi][Nn][Ff], i.e. also with U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE U+0131 LATIN SMALL LETTER DOTLESS I -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) 2021-04-26 10:34 ` Vincent Lefevre @ 2021-04-26 23:25 ` Vincent Lefevre 0 siblings, 0 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-26 23:25 UTC (permalink / raw) To: zsh-workers [-- Attachment #1: Type: text/plain, Size: 1269 bytes --] On 2021-04-26 12:34:36 +0200, Vincent Lefevre wrote: > On 2021-04-25 15:00:03 -0700, Bart Schaefer wrote: > > On Sun, Apr 25, 2021 at 1:18 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > > > math.c is obviously incorrect: > > > > > > else if (strncasecmp(p, "Inf", 3) == 0) { > > > > What should it (and the corresponding NaN line) be, then? > > This depends on the expected behavior: Should it match only > [Ii][Nn][Ff], or also the lowercase and uppercase versions? > ... so that in Turkish locales under Debian: [Iıİi][Nn][Ff], > i.e. also with > U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE > U+0131 LATIN SMALL LETTER DOTLESS I I think that matching ASCII only would be the expected behavior, as this would not depend on the locales and this would be like ksh93, strtod(), and so on. See attached patch. Alternatively, you could keep the strncasecmp for NaN and possibly for the "nf" part of "Inf" since AFAIK, 'i' is the only letter to have such an issue, but for 3 characters, individual comparisons seem OK. 
--
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

[-- Attachment #2: inf-nan.patch --]
[-- Type: text/plain, Size: 706 bytes --]

diff --git a/Src/math.c b/Src/math.c
index 1d0d86639..ade02d80c 100644
--- a/Src/math.c
+++ b/Src/math.c
@@ -864,13 +864,17 @@ zzlex(void)
                 p = ptr;
                 ptr = ie;
                 if (ie - p == 3) {
-                    if (strncasecmp(p, "NaN", 3) == 0) {
+                    if ((p[0] == 'N' || p[0] == 'n') &&
+                        (p[1] == 'A' || p[1] == 'a') &&
+                        (p[2] == 'N' || p[2] == 'n')) {
                         yyval.type = MN_FLOAT;
                         yyval.u.d = 0.0;
                         yyval.u.d /= yyval.u.d;
                         return NUM;
                     }
-                    else if (strncasecmp(p, "Inf", 3) == 0) {
+                    else if ((p[0] == 'I' || p[0] == 'i') &&
+                        (p[1] == 'N' || p[1] == 'n') &&
+                        (p[2] == 'F' || p[2] == 'f')) {
                         yyval.type = MN_FLOAT;
                         yyval.u.d = 0.0;
                         yyval.u.d = 1.0 / yyval.u.d;

^ permalink raw reply [flat|nested] 60+ messages in thread
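The idea behind this patch, matching "Inf" and "NaN" with ASCII-only case folding so that the result cannot depend on the locale, can also be sketched as a standalone helper. This is only an illustration under stated assumptions: the names ascii_lower and ascii_word_eq are hypothetical and do not appear in math.c, and the patch itself uses explicit per-character comparisons instead.

```c
#include <stddef.h>

/* Fold an ASCII uppercase letter to lowercase and leave every other byte
 * alone. Unlike tolower() or strncasecmp(), this never consults the locale,
 * so Turkish dotted/dotless i rules cannot change the result. */
static int ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Return 1 if the len bytes at p spell word (given in lowercase ASCII),
 * ignoring ASCII case only; return 0 otherwise. */
static int ascii_word_eq(const char *p, size_t len, const char *word)
{
    size_t i;

    for (i = 0; i < len; i++) {
        /* Stop early if word is shorter than len or a byte differs. */
        if (word[i] == '\0' || ascii_lower((unsigned char)p[i]) != word[i])
            return 0;
    }
    /* Both strings must end at the same position. */
    return word[len] == '\0';
}
```

A lexer could then test ascii_word_eq(p, ie - p, "inf") and ascii_word_eq(p, ie - p, "nan"); a UTF-8 encoded U+0131 or U+0130 never matches, because its bytes lie outside the ASCII range.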
* Re: sh emulation POSIX non-conformances (some of zsh's special variables) 2021-04-11 17:57 ` sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) Stephane Chazelas ` (2 preceding siblings ...) 2021-04-11 19:31 ` sh emulation POSIX non-conformances ("inf"/"Inf" in arithmetic expressions) Stephane Chazelas @ 2021-04-11 19:33 ` Stephane Chazelas 2021-04-11 19:42 ` sh emulation POSIX non-conformances (printf %10s and bytes vs character) Stephane Chazelas 4 siblings, 0 replies; 60+ messages in thread From: Stephane Chazelas @ 2021-04-11 19:33 UTC (permalink / raw) To: Zsh hackers list 2021-04-11 18:57:26 +0100, Stephane Chazelas: > Some non-POSIX conformances I can think of ATM: [...] $ zsh --emulate sh -c 'EUID=10; echo "$EUID"' zsh:1: failed to change effective user ID: operation not permitted But then again, many shells have similar problems with their own special variables. -- Stephane ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
  2021-04-11 17:57 ` sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) Stephane Chazelas
  ` (3 preceding siblings ...)
  2021-04-11 19:33 ` sh emulation POSIX non-conformances (some of zsh's special variables) Stephane Chazelas
@ 2021-04-11 19:42 ` Stephane Chazelas
  2021-04-13 15:57 ` Daniel Shahaf
  2021-04-22 13:59 ` Vincent Lefevre
  4 siblings, 2 replies; 60+ messages in thread
From: Stephane Chazelas @ 2021-04-11 19:42 UTC (permalink / raw)
To: Zsh hackers list

2021-04-11 18:57:26 +0100, Stephane Chazelas:
> Some non-POSIX conformances I can think of ATM:
[...]

Another POSIX bug fixed by zsh (but which makes it non-compliant):

With multibyte characters:

$ printf '|%10s|\n' Stéphane Chazelas
|  Stéphane|
|  Chazelas|

POSIX requires:

| Stéphane|
|  Chazelas|

(with a UTF-8 é encoded on 2 bytes), that is, the width to be a
number of bytes not characters.

ksh93 has printf %20Ls where width is based on the display width
of characters.

$ zsh -c "printf '|%10Ls|\n' Ste$'\u0301'phane Chazelas"
| Stéphane|
|  Chazelas|
$ ksh -c "printf '|%10Ls|\n' Ste$'\u0301'phane Chazelas"
|  Stéphane|
|  Chazelas|

(that one is not specified by POSIX)

--
Stephane

^ permalink raw reply [flat|nested] 60+ messages in thread
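The difference between a field width counted in bytes (what POSIX specifies for %s) and one counted in characters can be sketched in C as follows. All function names here are hypothetical, the input is assumed to be valid UTF-8, and this is not how zsh or ksh93 implement printf; it only shows where the two paddings diverge.

```c
#include <stddef.h>
#include <string.h>

/* Count Unicode code points in a valid UTF-8 string by counting every byte
 * that is not a continuation byte (bit pattern 10xxxxxx). */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Leading spaces needed to right-align something that already occupies
 * `used` units in a field of `width` units. */
static size_t pad_needed(size_t width, size_t used)
{
    return used < width ? width - used : 0;
}

/* Padding when the width is a number of bytes (the POSIX %10s rule). */
static size_t pad_bytes(const char *s, size_t width)
{
    return pad_needed(width, strlen(s));
}

/* Padding when the width is a number of code points. */
static size_t pad_chars(const char *s, size_t width)
{
    return pad_needed(width, utf8_codepoints(s));
}
```

For "Stéphane" with a precomposed é (9 bytes, 8 code points) and a width of 10, pad_bytes gives 1 space and pad_chars gives 2. With a decomposed é (e followed by U+0301) even the code point count is one more than the number of columns displayed, which is the case a display-width based %Ls is meant to handle.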
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
  2021-04-11 19:42 ` sh emulation POSIX non-conformances (printf %10s and bytes vs character) Stephane Chazelas
@ 2021-04-13 15:57 ` Daniel Shahaf
  2021-04-13 18:03 ` Stephane Chazelas
  2021-04-22 13:59 ` Vincent Lefevre
  1 sibling, 1 reply; 60+ messages in thread
From: Daniel Shahaf @ 2021-04-13 15:57 UTC (permalink / raw)
To: Zsh hackers list

Stephane Chazelas wrote on Sun, Apr 11, 2021 at 20:42:05 +0100:
> Another POSIX bug fixed by zsh (but which makes it non-compliant):
>
> With multibyte characters:
>
> $ printf '|%10s|\n' Stéphane Chazelas
> | Stéphane|
> | Chazelas|
>
> POSIX requires:
>
> | Stéphane|
> | Chazelas|
>
> (with a UTF-8 é encoded one 2 bytes

Note that e-with-acute has two encodings in Unicode:

é, one codepoint, two UTF-8 bytes
é, two codepoints, three UTF-8 bytes

https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-13 15:57 ` Daniel Shahaf @ 2021-04-13 18:03 ` Stephane Chazelas 2021-04-13 21:09 ` Bart Schaefer 0 siblings, 1 reply; 60+ messages in thread From: Stephane Chazelas @ 2021-04-13 18:03 UTC (permalink / raw) To: Daniel Shahaf; +Cc: Zsh hackers list 2021-04-13 15:57:44 +0000, Daniel Shahaf: > Stephane Chazelas wrote on Sun, Apr 11, 2021 at 20:42:05 +0100: > > Another POSIX bug fixed by zsh (but which makes it non-compliant): > > > > With multibyte characters: > > > > $ printf '|%10s|\n' Stéphane Chazelas > > | Stéphane| > > | Chazelas| > > > > POSIX requires: > > > > | Stéphane| > > | Chazelas| > > > > (with a UTF-8 é encoded one 2 bytes > > Note that e-with-acute has two encodings in Unicode: > > é, one codepoint, two UTF-8 bytes > é, two codepoints, three UTF-8 bytes > > https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms That was shown already in the part of my message you didn't quote, where I pointed out how ksh93 addresses it with its %Ls (zsh also has ${(ml[10])var} for that though). See also: https://unix.stackexchange.com/questions/350240/why-is-printf-shrinking-umlaut Cheers, Stephane ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-13 18:03 ` Stephane Chazelas @ 2021-04-13 21:09 ` Bart Schaefer 0 siblings, 0 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-13 21:09 UTC (permalink / raw) To: Daniel Shahaf, Zsh hackers list On Tue, Apr 13, 2021 at 11:05 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > That was shown already in the part of my message you didn't > quote, where I pointed out how ksh93 addresses it with its %Ls Zsh printf currently flat-out ignores size modifiers (%ls %Ls %hs). Just skips over them. That might leave some room for a change, if anyone cares. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-11 19:42 ` sh emulation POSIX non-conformances (printf %10s and bytes vs character) Stephane Chazelas 2021-04-13 15:57 ` Daniel Shahaf @ 2021-04-22 13:59 ` Vincent Lefevre 2021-04-22 14:28 ` Vincent Lefevre 2021-04-22 19:22 ` Bart Schaefer 1 sibling, 2 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-22 13:59 UTC (permalink / raw) To: zsh-workers On 2021-04-11 20:42:05 +0100, Stephane Chazelas wrote: > 2021-04-11 18:57:26 +0100, Stephane Chazelas: > > Some non-POSIX conformances I can think of ATM: > [...] > > Another POSIX bug fixed by zsh (but which makes it non-compliant): > > With multibyte characters: > > $ printf '|%10s|\n' Stéphane Chazelas > | Stéphane| > | Chazelas| > > POSIX requires: > > | Stéphane| > | Chazelas| I would think that's intentional, at least for the precision (e.g. %.4s) in order to prevent buffer overflow. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
  2021-04-22 13:59 ` Vincent Lefevre
@ 2021-04-22 14:28 ` Vincent Lefevre
  2021-04-22 19:22 ` Bart Schaefer
  1 sibling, 0 replies; 60+ messages in thread
From: Vincent Lefevre @ 2021-04-22 14:28 UTC (permalink / raw)
To: zsh-workers

On 2021-04-22 15:59:34 +0200, Vincent Lefevre wrote:
> I would think that's intentional, at least for the precision
> (e.g. %.4s) in order to prevent buffer overflow.

The behavior with incomplete UTF-8 sequences (the one with "\x84\x9d")
is rather ugly:

zira% printf "%3s\n" $(printf "\xe2\x84\x9d") | hd
00000000  20 20 e2 84 9d 0a                                 |  ....|
00000006
zira% printf "%3s\n" $(printf "\x84\x9d") | hd
00000000  20 84 9d 0a                                       | ...|
00000004
zira% printf "%.1s\n" $(printf "\xe2\x84\x9d") | hd
00000000  e2 84 9d 0a                                       |....|
00000004
zira% printf "%.1s\n" $(printf "\x84\x9d") | hd
00000000  84 9d 0a                                          |...|
00000003

I think that only the POSIX spec makes sense, unless you consider
that %s must handle valid characters, in which case it should fail
with an error on any invalid sequence. But I would say that a
different conversion specifier should be used, as an extension.

--
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-22 13:59 ` Vincent Lefevre 2021-04-22 14:28 ` Vincent Lefevre @ 2021-04-22 19:22 ` Bart Schaefer 2021-04-23 16:53 ` Vincent Lefevre 1 sibling, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-22 19:22 UTC (permalink / raw) To: Zsh hackers list On Thu, Apr 22, 2021 at 7:01 AM Vincent Lefevre <vincent@vinc17.net> wrote: > > > POSIX requires: > > > > | Stéphane| > > | Chazelas| > > I would think that's intentional, at least for the precision > (e.g. %.4s) in order to prevent buffer overflow. That makes sense for C-ish languages, but I would think it was a bad idea in general for the shell to allocate buffer sizes based on user input. Someone can probably point to the rationale document, but I'm guessing this is really because of equivalence with sprintf() formats. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-22 19:22 ` Bart Schaefer @ 2021-04-23 16:53 ` Vincent Lefevre 2021-04-23 23:01 ` Oliver Kiddle 2021-04-24 7:09 ` Stephane Chazelas 0 siblings, 2 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-23 16:53 UTC (permalink / raw) To: zsh-workers On 2021-04-22 12:22:12 -0700, Bart Schaefer wrote: > On Thu, Apr 22, 2021 at 7:01 AM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > > POSIX requires: > > > > > > | Stéphane| > > > | Chazelas| > > > > I would think that's intentional, at least for the precision > > (e.g. %.4s) in order to prevent buffer overflow. > > That makes sense for C-ish languages, but I would think it was a bad > idea in general for the shell to allocate buffer sizes based on user > input. > > Someone can probably point to the rationale document, but I'm guessing > this is really because of equivalence with sprintf() formats. Some file formats have fields with a byte-size limit. Providing more than this limit could have unexpected effects. One may also want to limit the size of generated filenames (see NAME_MAX). -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
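The byte-limit use case described here (a field or file name that must not exceed N bytes) does not have to split a multibyte character. A minimal C sketch, assuming valid UTF-8 input; the function name utf8_prefix_len is hypothetical and nothing like it is claimed to exist in zsh:

```c
#include <stddef.h>
#include <string.h>

/* Return the length in bytes of the longest prefix of s that fits in
 * max_bytes and does not end in the middle of a UTF-8 sequence. */
static size_t utf8_prefix_len(const char *s, size_t max_bytes)
{
    size_t len = strlen(s);

    if (len <= max_bytes)
        return len;

    len = max_bytes;
    /* s[len] is the first byte that would be cut off. If it is a
     * continuation byte (10xxxxxx), the cut falls inside a character,
     * so back up until the cut sits just before a lead byte. */
    while (len > 0 && ((unsigned char)s[len] & 0xC0) == 0x80)
        len--;
    return len;
}
```

The result can be passed as a precision, for example printf("%.*s", (int)utf8_prefix_len(s, n), s), which keeps the output within n bytes while still emitting only whole characters.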
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
  2021-04-23 16:53 ` Vincent Lefevre
@ 2021-04-23 23:01 ` Oliver Kiddle
  2021-04-24 21:41 ` Vincent Lefevre
  2021-04-24 21:46 ` Vincent Lefevre
  2021-04-24  7:09 ` Stephane Chazelas
  1 sibling, 2 replies; 60+ messages in thread
From: Oliver Kiddle @ 2021-04-23 23:01 UTC (permalink / raw)
To: zsh-workers

Vincent Lefevre wrote:
> On 2021-04-22 12:22:12 -0700, Bart Schaefer wrote:
> > Someone can probably point to the rationale document, but I'm guessing
> > this is really because of equivalence with sprintf() formats.
>
> Some file formats have fields with a byte-size limit. Providing
> more than this limit could have unexpected effects. One may also
> want to limit the size of generated filenames (see NAME_MAX).

And for every one time that someone needs something like that, there
are a zillion cases where people just want to line up output neatly in
columns and are thwarted. More likely, this was just standards people
insisting on exactly matching the C printf().

A high level language like a shell should not force users to know about
low-level character encodings. Python 3 gets this fairly badly wrong.

I'd prefer that we make it useful first and if the POSIX committee
decree some craziness, we have the emulation facility.

Oliver

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-23 23:01 ` Oliver Kiddle @ 2021-04-24 21:41 ` Vincent Lefevre 2021-04-24 21:46 ` Vincent Lefevre 1 sibling, 0 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-24 21:41 UTC (permalink / raw) To: zsh-workers On 2021-04-24 01:01:59 +0200, Oliver Kiddle wrote: > Vincent Lefevre wrote: > > On 2021-04-22 12:22:12 -0700, Bart Schaefer wrote: > > > Someone can probably point to the rationale document, but I'm guessing > > > this is really because of equivalence with sprintf() formats. > > > > Some file formats have fields with a byte-size limit. Providing > > more than this limit could have unexpected effects. One may also > > want to limit the size of generated filenames (see NAME_MAX). > > And for every one time that someone needs something like that, there > are a zillion cases where people just want to line up output neatly in > columns and are thwarted. More likely, this was just standards people > insisting on exactly matching the C printf(). Why not define an extension, then? > A high level language like a shell should not force users to know about > low-level character encodings. So the current zsh behavior is wrong: If one wants an output that doesn't take more than N bytes, one must take the low-level character encoding into account. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-23 23:01 ` Oliver Kiddle 2021-04-24 21:41 ` Vincent Lefevre @ 2021-04-24 21:46 ` Vincent Lefevre 1 sibling, 0 replies; 60+ messages in thread From: Vincent Lefevre @ 2021-04-24 21:46 UTC (permalink / raw) To: zsh-workers On 2021-04-24 01:01:59 +0200, Oliver Kiddle wrote: > Vincent Lefevre wrote: > > On 2021-04-22 12:22:12 -0700, Bart Schaefer wrote: > > > Someone can probably point to the rationale document, but I'm guessing > > > this is really because of equivalence with sprintf() formats. > > > > Some file formats have fields with a byte-size limit. Providing > > more than this limit could have unexpected effects. One may also > > want to limit the size of generated filenames (see NAME_MAX). > > And for every one time that someone needs something like that, there > are a zillion cases where people just want to line up output neatly in > columns and are thwarted. More likely, this was just standards people > insisting on exactly matching the C printf(). And even for that, this is currently buggy (at least under Debian) with double-width characters. Try: printf "|%3s|\n" a 🍸 -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-23 16:53 ` Vincent Lefevre 2021-04-23 23:01 ` Oliver Kiddle @ 2021-04-24 7:09 ` Stephane Chazelas 2021-04-24 21:52 ` Vincent Lefevre 1 sibling, 1 reply; 60+ messages in thread From: Stephane Chazelas @ 2021-04-24 7:09 UTC (permalink / raw) To: zsh-workers 2021-04-23 18:53:26 +0200, Vincent Lefevre: [...] > Some file formats have fields with a byte-size limit. Providing > more than this limit could have unexpected effects. One may also > want to limit the size of generated filenames (see NAME_MAX). [...] printf is to print formatted text. If you want to work at byte level, you need to use the C locale (or in zsh disable the multibyte option). In zsh, that applies to printf and all other text utilities and operators which is more consistent than the POSIX API. Note that the printf of perl, fish, gawk (and I'd expect most modern languages) work at character level (possibly as wrappers for C's wprintf()). $ perl -CLSA -le 'for (@ARGV) {printf "|%10s|\n", $_}' Stephane Stéphane | Stephane| | Stéphane| -- Stephane ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-24 7:09 ` Stephane Chazelas @ 2021-04-24 21:52 ` Vincent Lefevre 2021-04-24 22:28 ` Bart Schaefer 0 siblings, 1 reply; 60+ messages in thread From: Vincent Lefevre @ 2021-04-24 21:52 UTC (permalink / raw) To: zsh-workers On 2021-04-24 08:09:40 +0100, Stephane Chazelas wrote: > Note that the printf of perl, fish, gawk (and I'd expect most > modern languages) work at character level (possibly as wrappers > for C's wprintf()). > > $ perl -CLSA -le 'for (@ARGV) {printf "|%10s|\n", $_}' Stephane Stéphane > | Stephane| > | Stéphane| And perl has the same issue as zsh with double-width characters. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-24 21:52 ` Vincent Lefevre @ 2021-04-24 22:28 ` Bart Schaefer 2021-04-24 23:18 ` Vincent Lefevre 0 siblings, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-24 22:28 UTC (permalink / raw) To: Zsh hackers list On Sat, Apr 24, 2021 at 2:53 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > And perl has the same issue as zsh with double-width characters. This implies to me that it's not actually a zsh problem. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-24 22:28 ` Bart Schaefer @ 2021-04-24 23:18 ` Vincent Lefevre 2021-04-25 2:20 ` Bart Schaefer 0 siblings, 1 reply; 60+ messages in thread From: Vincent Lefevre @ 2021-04-24 23:18 UTC (permalink / raw) To: zsh-workers On 2021-04-24 15:28:21 -0700, Bart Schaefer wrote: > On Sat, Apr 24, 2021 at 2:53 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > And perl has the same issue as zsh with double-width characters. > > This implies to me that it's not actually a zsh problem. Why not? -- Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character) 2021-04-24 23:18 ` Vincent Lefevre @ 2021-04-25 2:20 ` Bart Schaefer 2021-04-25 11:07 ` Vincent Lefevre 0 siblings, 1 reply; 60+ messages in thread From: Bart Schaefer @ 2021-04-25 2:20 UTC (permalink / raw) To: Zsh hackers list On Sat, Apr 24, 2021 at 4:18 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > On 2021-04-24 15:28:21 -0700, Bart Schaefer wrote: > > On Sat, Apr 24, 2021 at 2:53 PM Vincent Lefevre <vincent@vinc17.net> wrote: > > > > > > And perl has the same issue as zsh with double-width characters. > > > > This implies to me that it's not actually a zsh problem. > > Why not? Because multiple programs exhibiting identical incorrect behavior points to a problem in the C library or a system call. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
  2021-04-25  2:20 ` Bart Schaefer
@ 2021-04-25 11:07 ` Vincent Lefevre
  0 siblings, 0 replies; 60+ messages in thread
From: Vincent Lefevre @ 2021-04-25 11:07 UTC (permalink / raw)
  To: zsh-workers

On 2021-04-24 19:20:53 -0700, Bart Schaefer wrote:
> On Sat, Apr 24, 2021 at 4:18 PM Vincent Lefevre <vincent@vinc17.net> wrote:
> >
> > On 2021-04-24 15:28:21 -0700, Bart Schaefer wrote:
> > > On Sat, Apr 24, 2021 at 2:53 PM Vincent Lefevre <vincent@vinc17.net> wrote:
> > > >
> > > > And perl has the same issue as zsh with double-width characters.
> > >
> > > This implies to me that it's not actually a zsh problem.
> >
> > Why not?
>
> Because multiple programs exhibiting identical incorrect behavior
> points to a problem in the C library or a system call.

Perl is not the same language as zsh. For instance:

$ perl -le 'for (@ARGV) {printf "|%10s|\n", $_}' Stephane Stéphane
|  Stephane|
| Stéphane|

So, now, you have a different behavior with Perl.

I assume that the intent in Perl was not column formatting, but
to be similar to C. So I suppose that the Perl printf behaves like
printf(3) for 8-bit strings, and like wprintf(3) for Unicode strings.
Note that this is independent of the locales, i.e. the Perl printf
is not sensitive to the locales for %s, this is more like a
datatype-based behavior.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply [flat|nested] 60+ messages in thread
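[Editorial sketch, not part of the original message.] The bytes-versus-characters distinction discussed in this sub-thread can be observed from any shell. The snippet below is a minimal illustration, assuming a UTF-8 encoding for 'é'; `byte_len` is a hypothetical helper introduced here. The padding printed for the accented name depends on whether the shell's printf counts bytes or characters, so no output is claimed for that line.

```shell
#!/bin/sh
# Two spellings of the same name; the second contains 'é', built from its
# two UTF-8 bytes (octal 303 251) so the script does not depend on the
# encoding of this file.
s1=Stephane
s2=$(printf 'St\303\251phane')

# byte_len (hypothetical helper): number of bytes in its argument.
# wc -c counts bytes regardless of the locale.
byte_len() {
    printf %s "$1" | wc -c | tr -d '[:space:]'
}

for s in "$s1" "$s2"; do
    printf '%s bytes: ' "$(byte_len "$s")"
    # A byte-counting printf pads the 9-byte string with one space;
    # a character-counting printf pads it with two.
    printf '|%10s|\n' "$s"
done
```

Running the same script under different shells (for example `sh`, `bash` and `zsh`) is one way to see which interpretation each printf builtin uses.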
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-11 16:54 ` Bart Schaefer 2021-04-11 17:57 ` sh emulation POSIX non-conformances (Was: [PATCH] Document imperfections in POSIX/sh compatibility) Stephane Chazelas @ 2021-04-11 23:04 ` dana 1 sibling, 0 replies; 60+ messages in thread From: dana @ 2021-04-11 23:04 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list On 11 Apr 2021, at 11:54, Bart Schaefer <schaefer@brasslantern.com> wrote: > You're talking about the thread from workers/42248, which itself > mentions a thread from workers/35317. Did none of those patches get > applied? No, this is a separate problem. POSIX says getopts shall end option processing when it encounters either an error or a non-optarg consisting of '--' or not beginning with '-'. zsh's getopts has no way of disabling '+'-prefixed option handling, so it isn't conformant with that. Regarding the other thing, Peter's patch from 35318 was applied, but mine was not. I guess Peter wanted me to change the behaviour only with POSIX_BUILTINS set. I'll look at it again later dana ^ permalink raw reply [flat|nested] 60+ messages in thread
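[Editorial sketch, not part of the original message.] The getopts rule described above can be exercised with a small loop. This is only an illustration under stated assumptions: `parse_opts` is a hypothetical function name, and the expected output in the comments applies to a getopts that follows the POSIX rule (such as the one in dash or bash). According to the message above, zsh would instead treat `+b` as an option.

```shell
#!/bin/sh
# parse_opts (hypothetical helper): collect the option letters getopts
# reports, then print them with the remaining operands.
parse_opts() {
    OPTIND=1        # restart option parsing for this call
    seen=
    while getopts ab opt; do
        seen="$seen$opt"
    done
    # Drop the arguments getopts consumed; the rest are operands.
    shift $((OPTIND - 1))
    printf 'options=%s rest=%s\n' "$seen" "$*"
}

# With a POSIX-conformant getopts, '+b' does not begin with '-',
# so option processing ends there and '+b' is left as an operand.
parse_opts -a +b file     # options=a rest=+b file

# '--' also ends option processing and is itself consumed.
parse_opts -a -- -b       # options=a rest=-b
```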
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-10 23:31 [PATCH] Document imperfections in POSIX/sh compatibility dana 2021-04-10 23:50 ` Bart Schaefer @ 2021-04-13 16:01 ` Daniel Shahaf 2021-04-13 16:12 ` Peter Stephenson 2021-04-13 20:28 ` Oliver Kiddle 1 sibling, 2 replies; 60+ messages in thread From: Daniel Shahaf @ 2021-04-13 16:01 UTC (permalink / raw) To: dana; +Cc: Zsh hackers list dana wrote on Sat, Apr 10, 2021 at 18:31:31 -0500: > Lawrence bumping 47794 reminded me of this. Someone on IRC was trying to use > zsh as sh and they were very annoyed to learn that the sh emulation has > imperfections that aren't really documented anywhere. I said i would add a > mention. Let me know if this is editorialising it too much > > dana > > > diff --git a/Doc/Zsh/compat.yo b/Doc/Zsh/compat.yo > index f1be15fee..a09187918 100644 > --- a/Doc/Zsh/compat.yo > +++ b/Doc/Zsh/compat.yo > @@ -74,3 +74,8 @@ tt(PROMPT_SUBST) > and > tt(SINGLE_LINE_ZLE) > options are set if zsh is invoked as tt(ksh). > + > +Please note that zsh's emulation of other shells, as well as the degree > +of its POSIX compliance, is provided on a `best effort' basis. Full > +compatibility is not guaranteed, and is not necessarily a goal of the > +project. I'm concerned that saying "is not necessarily a goal of the project" might discourage people from even reporting bugs in the first place. No objection to setting expectations, of course, but could we phrase it differently? E.g., by documenting a list of known incompatibilities that won't be fixed? Sorry for going off-topic ;-) Cheers, Daniel ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-13 16:01 ` Daniel Shahaf @ 2021-04-13 16:12 ` Peter Stephenson 2021-04-13 20:28 ` Oliver Kiddle 1 sibling, 0 replies; 60+ messages in thread From: Peter Stephenson @ 2021-04-13 16:12 UTC (permalink / raw) To: Zsh hackers list > On 13 April 2021 at 17:01 Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > dana wrote on Sat, Apr 10, 2021 at 18:31:31 -0500: > > +Please note that zsh's emulation of other shells, as well as the degree > > +of its POSIX compliance, is provided on a `best effort' basis. Full > > +compatibility is not guaranteed, and is not necessarily a goal of the > > +project. > > I'm concerned that saying "is not necessarily a goal of the project" > might discourage people from even reporting bugs in the first place. > No objection to setting expectations, of course, but could we phrase it > differently? E.g., by documenting a list of known incompatibilities > that won't be fixed? This is certainly a good point. The classic list of differences is in the FAQ, "how does zsh differ from...?". It refers to "sh", implying the classic Bourne shell, rather than POSIX, but this is probably the right starting point. I think referring to the FAQ here is probably the right thing to do --- it simultaneously makes the points that (i) we are in principle interested (ii) it's not, however, necessarily something that's ultimately going to be dealt with in the shell itself. pws ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-13 16:01 ` Daniel Shahaf 2021-04-13 16:12 ` Peter Stephenson @ 2021-04-13 20:28 ` Oliver Kiddle 2021-04-13 21:40 ` dana 1 sibling, 1 reply; 60+ messages in thread From: Oliver Kiddle @ 2021-04-13 20:28 UTC (permalink / raw) To: Zsh hackers list Daniel Shahaf wrote: > dana wrote on Sat, Apr 10, 2021 at 18:31:31 -0500: > > +Please note that zsh's emulation of other shells, as well as the degree > > +of its POSIX compliance, is provided on a `best effort' basis. Full > > +compatibility is not guaranteed, and is not necessarily a goal of the > > +project. > > I'm concerned that saying "is not necessarily a goal of the project" > might discourage people from even reporting bugs in the first place. Or of contributing fixes. In the past we've been open to POSIX related fixes. The entire project is done on a `best effort' basis with no guarantees. > No objection to setting expectations, of course, but could we phrase it > differently? E.g., by documenting a list of known incompatibilities > that won't be fixed? It might be sensible to have a file separate from Etc/BUGS for listing issues related to POSIX compliance. I don't imagine there is anything that "won't be fixed" in the sense that it has been outright rejected as opposed to nobody has come forward with an implementation. Mostly when I've come across people complaining about zsh's lack of compliance, they either aren't using emulation at all or are expecting bash scripts to work unchanged. And they're too lazy to understand the issues. It doesn't help when even something like the Intel C compiler comes with idiotic advice to use bash -c 'source ...;exec zsh' despite the necessary fix being no more than a one-line tweak. In the repository is the file Etc/STD-TODO that documents some incompatibilities, mainly against ksh93. In that context, "standard" is not used to mean a formal standard such as POSIX but rather features common to several shells. 
That file is excluded from a release distribution. Oliver ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility
  2021-04-13 20:28 ` Oliver Kiddle
@ 2021-04-13 21:40 ` dana
  2021-04-13 22:02 ` Bart Schaefer
  2021-04-14 12:38 ` Daniel Shahaf
  0 siblings, 2 replies; 60+ messages in thread
From: dana @ 2021-04-13 21:40 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Zsh hackers list, Daniel Shahaf

On 13 Apr 2021, at 11:01, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> I'm concerned that saying "is not necessarily a goal of the project"
> might discourage people from even reporting bugs in the first place.

On 13 Apr 2021, at 15:28, Oliver Kiddle <opk@zsh.org> wrote:
> Or of contributing fixes.

Yeah, i did worry about that myself. What about this?

dana


diff --git a/Doc/Zsh/compat.yo b/Doc/Zsh/compat.yo
index f1be15fee..ab8f4d8dc 100644
--- a/Doc/Zsh/compat.yo
+++ b/Doc/Zsh/compat.yo
@@ -74,3 +74,9 @@ tt(PROMPT_SUBST)
 and
 tt(SINGLE_LINE_ZLE)
 options are set if zsh is invoked as tt(ksh).
+
+Please note that, whilst reasonable efforts are taken to address
+incompatibilities where they arise, zsh does not guarantee complete
+emulation of other shells, nor POSIX compliance.  For more information on
+the differences between zsh and other shells, please refer to chapter 2
+of the shell FAQ, uref(http://www.zsh.org/FAQ/).

^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-13 21:40 ` dana @ 2021-04-13 22:02 ` Bart Schaefer 2021-04-14 12:38 ` Daniel Shahaf 1 sibling, 0 replies; 60+ messages in thread From: Bart Schaefer @ 2021-04-13 22:02 UTC (permalink / raw) To: dana; +Cc: Oliver Kiddle, Zsh hackers list, Daniel Shahaf On Tue, Apr 13, 2021 at 2:40 PM dana <dana@dana.is> wrote: > > Yeah, i did worry about that myself. What about this? This seems fine, although perhaps part of the problem is that the whole section begins with the words "Zsh tries to emulate ..." (where "tries" is open to interpretation). ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-13 21:40 ` dana 2021-04-13 22:02 ` Bart Schaefer @ 2021-04-14 12:38 ` Daniel Shahaf 2021-04-18 4:50 ` dana 1 sibling, 1 reply; 60+ messages in thread From: Daniel Shahaf @ 2021-04-14 12:38 UTC (permalink / raw) To: dana; +Cc: Zsh hackers list dana wrote on Tue, Apr 13, 2021 at 16:40:19 -0500: > +++ b/Doc/Zsh/compat.yo > @@ -74,3 +74,9 @@ tt(PROMPT_SUBST) > and > tt(SINGLE_LINE_ZLE) > options are set if zsh is invoked as tt(ksh). Looks good. A couple of minor points: > +Please note that, whilst reasonable efforts are taken to address > +incompatibilities where they arise, zsh does not guarantee complete s/where/when/ ? > +emulation of other shells, nor POSIX compliance. For more information on > +the differences between zsh and other shells, please refer to chapter 2 s/chapter/Chapter/. There is some relevant information in §3 as well, specifically, in 3.31 "Why does my bash script report an error when I run it under zsh?". However, that question hasn't been published yet, so perhaps we should just move it to §2. The space in "Chapter 2" should be a non-breaking one. An nbsp() macro was added to yodl in 4.02.00, which is the version I have, but when I try to use it (without worrying about compatibility to older versions, for the sake of testing), I just get «expn.yo:36: No macro: nbsp(...)». I'm not sure why. (Compatibility to older versions could probably be done with IFBUILTIN().) > +of the shell FAQ, uref(http://www.zsh.org/FAQ/). s/http/https/ Cheers, Daniel ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-14 12:38 ` Daniel Shahaf @ 2021-04-18 4:50 ` dana 2021-04-20 21:26 ` Daniel Shahaf 0 siblings, 1 reply; 60+ messages in thread From: dana @ 2021-04-18 4:50 UTC (permalink / raw) To: Daniel Shahaf; +Cc: Zsh hackers list On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > s/where/when/ ? I guess so On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > s/chapter/Chapter/. I'd specifically checked to see if there was a preference here and found that the FAQ itself uses lower-case. Should i go against that? On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > There is some relevant information in §3 as well, specifically, in 3.31 > "Why does my bash script report an error when I run it under zsh?". > However, that question hasn't been published yet, so perhaps we should > just move it to §2. Looking at the contents of the two chapters, it does seem like moving it might make sense. I can do that On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > The space in "Chapter 2" should be a non-breaking one. An nbsp() macro > was added to yodl in 4.02.00, which is the version I have, but when I > try to use it (without worrying about compatibility to older versions, > for the sake of testing), I just get «expn.yo:36: No macro: nbsp(...)». So what would you suggest? I only have yodl 3.05, and i can't find any instances of 'nbsp' or '00a0' or $'\u00a0' in Doc/ or Etc/ to use as precedent. Should i just use a literal nbsp? On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > s/http/https/ I also based that on existing precedent. Maybe i should do another patch to find-and-replace them throughout the docs dana ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-18 4:50 ` dana @ 2021-04-20 21:26 ` Daniel Shahaf 2021-05-03 23:42 ` dana 0 siblings, 1 reply; 60+ messages in thread From: Daniel Shahaf @ 2021-04-20 21:26 UTC (permalink / raw) To: dana; +Cc: Zsh hackers list dana wrote on Sat, Apr 17, 2021 at 23:50:11 -0500: > On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > s/where/when/ ? > > I guess so > > On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > s/chapter/Chapter/. > > I'd specifically checked to see if there was a preference here and found that > the FAQ itself uses lower-case. Should i go against that? No; consistency comes first. > On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > There is some relevant information in §3 as well, specifically, in 3.31 > > "Why does my bash script report an error when I run it under zsh?". > > However, that question hasn't been published yet, so perhaps we should > > just move it to §2. > > Looking at the contents of the two chapters, it does seem like moving it might > make sense. I can do that That'd be great. > On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > The space in "Chapter 2" should be a non-breaking one. An nbsp() macro > > was added to yodl in 4.02.00, which is the version I have, but when I > > try to use it (without worrying about compatibility to older versions, > > for the sake of testing), I just get «expn.yo:36: No macro: nbsp(...)». > > So what would you suggest? I only have yodl 3.05, and i can't find any > instances of 'nbsp' or '00a0' or $'\u00a0' in Doc/ or Etc/ to use as > precedent. Should i just use a literal nbsp? I suppose we could try that and see if someone complains it breaks their build. (Feel free to blame me ;-)) Or we could just put a literal space (plus or minus a «COMMENT(TODO: )»). 
> On 14 Apr 2021, at 07:38, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > > s/http/https/ > > I also based that on existing precedent. Maybe i should do another patch to > find-and-replace them throughout the docs *nod* Thanks, dana. Daniel ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH] Document imperfections in POSIX/sh compatibility 2021-04-20 21:26 ` Daniel Shahaf @ 2021-05-03 23:42 ` dana 0 siblings, 0 replies; 60+ messages in thread From: dana @ 2021-05-03 23:42 UTC (permalink / raw) To: Daniel Shahaf; +Cc: Zsh hackers list On 20 Apr 2021, at 16:26, Daniel Shahaf <d.s@daniel.shahaf.name> wrote: > That'd be great. > *nod* I've committed the disclaimer plus the http:// -> https:// update and the FAQ move. You're probably aware of this, but just to be clear, the zsh.org FAQ links redirect to zsh.sourceforge.net, which is HTTP only, so the link changes don't really have any functional effect at the moment dana ^ permalink raw reply [flat|nested] 60+ messages in thread