zsh-users
* Discrepancy in IFS handling (zsh is POSIX compliant)
@ 2023-03-30 11:11 Felipe Contreras
  2023-03-30 12:05 ` Lawrence Velázquez
  2023-03-31 16:38 ` Thomas Paulsen
  0 siblings, 2 replies; 12+ messages in thread
From: Felipe Contreras @ 2023-03-30 11:11 UTC (permalink / raw)
  To: Zsh Users

Hi,

I was going to report a bug about a discrepancy in the handling of
IFS, until I read what the POSIX standard says about it [1].

The example is this:

    IFS=,
    str='foo,bar,,roo,'
    printf '"%s"\n' $str

In bash there are four fields (the last comma is ignored); in zsh there
are five. On my system dash and ksh also output four fields, like
bash.
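
To compare shells by number rather than by eyeballing the output, a minimal sketch along these lines may help. It assumes a POSIX-style shell; count_fields is a hypothetical helper name, not something from any shell. In zsh the unquoted expansion is only split when SH_WORD_SPLIT is set (for example under `emulate sh`).

```shell
# count_fields IFS_VALUE STRING
# Prints how many fields the running shell produces when STRING is
# split on IFS_VALUE. The body runs in a subshell so that IFS, the
# noglob setting and the positional parameters do not leak out.
count_fields() (
    IFS=$1
    set -f          # no pathname expansion, so only field splitting applies
    set -- $2       # deliberately unquoted: this is the expansion under test
    echo "$#"
)

# The thread reports 4 for bash, dash and ksh with this input.
count_fields , 'foo,bar,,roo,'
```

Running the same file under each shell gives one number per shell, which is easier to diff than the printf output.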

However, this is what POSIX says:

    3.b. Each occurrence in the input of an IFS character that is not
    IFS white space, along with any adjacent IFS white space, shall
    delimit a field, as described previously.

We ignore all the white space stuff (since we are not using white
spaces), and thus:

    Each occurrence in the input of an IFS character shall delimit a field.

In zsh each occurrence of a comma does delimit a field (4 commas, 5
fields), which to me is what POSIX says should happen.

So in this particular case it seems zsh is complying with POSIX (even
in zsh mode), and all other shells are not.

So there's no bug (at least in zsh), I just wanted to let you know
what I found, and see if you agreed with my interpretation.

Cheers.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_05

-- 
Felipe Contreras


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discrepancy in IFS handling (zsh is POSIX compliant)
  2023-03-30 11:11 Discrepancy in IFS handling (zsh is POSIX compliant) Felipe Contreras
@ 2023-03-30 12:05 ` Lawrence Velázquez
  2023-03-30 12:10   ` Discrepancy in IFS handling (zsh is *not* " Felipe Contreras
  2023-03-31 20:16   ` Discrepancy in IFS handling (zsh is " Felipe Contreras
  2023-03-31 16:38 ` Thomas Paulsen
  1 sibling, 2 replies; 12+ messages in thread
From: Lawrence Velázquez @ 2023-03-30 12:05 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Zsh Users

> On Mar 30, 2023, at 7:13 AM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
> However, this is what POSIX says:
> 
>    3.b. Each occurrence in the input of an IFS character that is not
> IFS white space, along with any adjacent IFS white space, shall
> delimit a field, as described previously.
> 
> We ignore all the white space stuff (since we are not using white
> spaces), and thus:
> 
>    Each occurrence in the input of an IFS character shall delimit a field.
> 
> In zsh each occurrence of a comma does delimit a field (4 commas, 5
> fields), which to me is what POSIX says should happen.
> 
> So in this particular case it seems zsh is complying with POSIX (even
> in zsh mode), and all other shells are not.

Before the excerpt you quoted, XCU 2.6.5 says: “The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.”

The bash/dash/ksh behavior is not unreasonable if the phrase “field terminators” is interpreted strictly.

In any case, I believe the standard intends to describe the ksh behavior:
https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xcu_chap02.html#tag_23_02_06_05

-- 
vq

* Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 12:05 ` Lawrence Velázquez
@ 2023-03-30 12:10   ` Felipe Contreras
  2023-03-30 14:49     ` Ray Andrews
  2023-03-30 14:57     ` Bart Schaefer
  2023-03-31 20:16   ` Discrepancy in IFS handling (zsh is " Felipe Contreras
  1 sibling, 2 replies; 12+ messages in thread
From: Felipe Contreras @ 2023-03-30 12:10 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: Zsh Users

On Thu, Mar 30, 2023 at 6:05 AM Lawrence Velázquez <larryv@zsh.org> wrote:
>
> On Mar 30, 2023, at 7:13 AM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
>
> However, this is what POSIX says:
>
>    3.b. Each occurrence in the input of an IFS character that is not
> IFS white space, along with any adjacent IFS white space, shall
> delimit a field, as described previously.
>
> We ignore all the white space stuff (since we are not using white
> spaces), and thus:
>
>    Each occurrence in the input of an IFS character shall delimit a field.
>
> In zsh each occurrence of a comma does delimit a field (4 commas, 5
> fields), which to me is what POSIX says should happen.
>
> So in this particular case it seems zsh is complying with POSIX (even
> in zsh mode), and all other shells are not.
>
>
> Before the excerpt you quoted, XCU 2.6.5 says: “The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.”
>
> The bash/dash/ksh behavior is not unreasonable if the phrase “field terminators” is interpreted strictly.
>
> In any case, I believe the standard intends to describe the ksh behavior:

Yes, I was about to click send to point that out.

So if IFS contains terminators, and not separators, this should
generate 5 fields:

    IFS=';'
    str='foo;bar;;roo;;'
    printf '"%s"\n' $str

That is: 'foo;' 'bar;' ';' 'roo;' ';'

In which case bash is correct; zsh generates 6 fields, so it is not.

Seems weird that a variable called Internal Field Separator is not a
*separator*, but a terminator.
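
The terminator reading can be made concrete with a small sketch that cuts the string at each terminator and prints every piece with its terminator attached. It uses only POSIX parameter expansion, so it does not depend on how the running shell splits on IFS. terminated_fields is a hypothetical helper name, and it handles a single delimiter character only.

```shell
# terminated_fields DELIM STRING
# Prints each field together with the terminator that ends it, one per
# line. Leftover text with no terminator is printed on its own.
terminated_fields() (
    delim=$1
    rest=$2
    while :; do
        case $rest in
            *"$delim"*)
                field=${rest%%"$delim"*}    # text before the first terminator
                printf "'%s%s'\n" "$field" "$delim"
                rest=${rest#*"$delim"}      # drop that field and its terminator
                ;;
            *)
                break
                ;;
        esac
    done
    # Whatever is left has no terminator of its own.
    if [ -n "$rest" ]; then
        printf "'%s'\n" "$rest"
    fi
)

terminated_fields ';' 'foo;bar;;roo;;'
```

For the string above this prints five lines, one per terminator, matching the five pieces listed in the message.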

I'm changing the subject to reflect that.

Cheers.

-- 
Felipe Contreras



* Re: Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 12:10   ` Discrepancy in IFS handling (zsh is *not* " Felipe Contreras
@ 2023-03-30 14:49     ` Ray Andrews
  2023-03-30 15:09       ` Felipe Contreras
  2023-03-30 14:57     ` Bart Schaefer
  1 sibling, 1 reply; 12+ messages in thread
From: Ray Andrews @ 2023-03-30 14:49 UTC (permalink / raw)
  To: zsh-users


On 2023-03-30 05:10, Felipe Contreras wrote:
> Seems weird that a variable called Internal Field Separator is not a
> *separator*, but a terminator.
>
> I'm changing the subject to reflect that.
Just some unwanted commentary: should one need to be a technical lawyer
to decide this? If one pointedly adds another
separator/terminator/delimiter/ender or whatever one might call it, one
has probably done so for a reason, and that reason would almost
inevitably be that one intends to add another field, even if empty. Thus
any shell that ignores such a character is throwing away syntax space and
acceding to the idea that characters in code can be ignored, which
might in very limited situations be admissible but not very often. So
if zsh did other than it does and I crashed into that while writing
something, I'd foam at the mouth. So zsh is the good guy here IMHO.
Practicality should trump legality almost every time.



* Re: Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 12:10   ` Discrepancy in IFS handling (zsh is *not* " Felipe Contreras
  2023-03-30 14:49     ` Ray Andrews
@ 2023-03-30 14:57     ` Bart Schaefer
  2023-03-30 15:34       ` Felipe Contreras
  1 sibling, 1 reply; 12+ messages in thread
From: Bart Schaefer @ 2023-03-30 14:57 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Lawrence Velázquez, Zsh Users

This has been discussed before, e.g. workers/48498 about 2 years ago.
There are even xfail tests in E03posix.ztst making note of it, added
in workers/48560.



* Re: Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 14:49     ` Ray Andrews
@ 2023-03-30 15:09       ` Felipe Contreras
  2023-03-30 15:31         ` Ray Andrews
  0 siblings, 1 reply; 12+ messages in thread
From: Felipe Contreras @ 2023-03-30 15:09 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Thu, Mar 30, 2023 at 8:49 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2023-03-30 05:10, Felipe Contreras wrote:
> > Seems weird that a variable called Internal Field Separator is not a
> > *separator*, but a terminator.
> >
> > I'm changing the subject to reflect that.

> Just some unwanted commentary:  Should one need to be a technical lawyer
> to decide this?  If one pointedly adds another
> separator/terminator/delimiter/ender or whatever one might call it, one
> has probably done so for a reason and that reason would almost
> inevitably be that one intends to add another field even if empty. Thus
> any shell that ignores such a character is throwing away syntax space and
> acceding to the idea that characters in code can be ignored -- which
> might in very limited situations be admissible but not very often.  So
> if zsh did other than it does and I crashed into that while writing
> something, I'd foam at the mouth.  So zsh is the good-guy here IMHO.
> Practicality should trump legality almost every time.

Yeah, I agree zsh's behavior is much more useful, but I'm not talking
about zsh's behavior by default, but in sh mode.

If POSIX seems to specify terminators instead of separators, and
that's what most shells do, shouldn't zsh in sh mode do the same?

-- 
Felipe Contreras



* Re: Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 15:09       ` Felipe Contreras
@ 2023-03-30 15:31         ` Ray Andrews
  0 siblings, 0 replies; 12+ messages in thread
From: Ray Andrews @ 2023-03-30 15:31 UTC (permalink / raw)
  To: zsh-users


On 2023-03-30 08:09, Felipe Contreras wrote:
> Yeah, I agree zsh's behavior is much more useful, but I'm not talking
> about zsh's behavior by default, but in sh mode.
>
Ah!  Then I should be ranting ;-)   It should be that way by default.  
If in doubt, do the useful thing.





* Re: Discrepancy in IFS handling (zsh is *not* POSIX compliant)
  2023-03-30 14:57     ` Bart Schaefer
@ 2023-03-30 15:34       ` Felipe Contreras
  0 siblings, 0 replies; 12+ messages in thread
From: Felipe Contreras @ 2023-03-30 15:34 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Lawrence Velázquez, Zsh Users

On Thu, Mar 30, 2023 at 8:58 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> This has been discussed before, e.g. workers/48498 about 2 years ago.
> There are even xfail tests in E03posix.ztst making note of it, added
> in workers/48560.

OK. But I don't see much of a conclusion.

Do you believe POSIX says these should be two fields? IFS=: str="a:b:"

POSIX does say the delimiter shall be considered a field terminator,
but str="a" has no field terminator. Does that mean there's no valid
field? From the point of view of processing strings in a language
like C it makes sense to consider an unterminated field valid, but
that's an assumption; POSIX doesn't specify it. One could equally
argue that "return 0" is not a valid field if it doesn't end in a
semicolon.

Or, one could assume POSIX meant in the case of str="a" the end of
string shall be considered an implicit terminator, but in that case
"a:b:" would have three fields, therefore making the terminators
identical to separators. In which case zsh is actually compatible with
POSIX.

So the options are:

1. zsh is compatible with POSIX
2. bash and other shells are compatible with POSIX
3. All are compatible since POSIX isn't clear

If POSIX isn't clear, then there's not much reason to implement
behavior just because other shells do it. But if you believe POSIX is
clear that the behavior of other shells is the correct one, then it
might make sense to implement it in sh mode.
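
Which of these readings a given shell follows can be probed with the "a:b:" case itself. This is only a sketch: ifs_reading is a hypothetical name, the labels are the ones used in this thread, and in zsh it needs sh emulation (or SH_WORD_SPLIT) for the unquoted expansion to be split at all.

```shell
# ifs_reading
# Splits 'a:b:' on ':' in the running shell and names the reading that
# matches the resulting field count.
ifs_reading() (
    IFS=:
    set -f              # keep pathname expansion out of the picture
    str='a:b:'
    set -- $str         # unquoted on purpose: this is the split under test
    case $# in
        2) echo terminator ;;   # trailing ':' only ends the field "b"
        3) echo separator ;;    # trailing ':' also starts an empty field
        *) echo "unexpected: $# fields" ;;
    esac
)

ifs_reading
```

This reports what the shell does, not what POSIX requires; it does not settle which of the three options above is the right one.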

Cheers.

-- 
Felipe Contreras



* Re: Discrepancy in IFS handling (zsh is POSIX compliant)
  2023-03-30 11:11 Discrepancy in IFS handling (zsh is POSIX compliant) Felipe Contreras
  2023-03-30 12:05 ` Lawrence Velázquez
@ 2023-03-31 16:38 ` Thomas Paulsen
  2023-03-31 20:18   ` Felipe Contreras
  1 sibling, 1 reply; 12+ messages in thread
From: Thomas Paulsen @ 2023-03-31 16:38 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: zsh-users

Hi,

Indeed, an original Bourne shell, ksh88 and ksh94 all behave exactly like bash.
On the other hand, tcsh and the ultra-modern fish behave like zsh.

Thus, I don't see the need to change zsh. Leave it as it is. It's in proud company.

Cheers Tom


* Re: Discrepancy in IFS handling (zsh is POSIX compliant)
  2023-03-30 12:05 ` Lawrence Velázquez
  2023-03-30 12:10   ` Discrepancy in IFS handling (zsh is *not* " Felipe Contreras
@ 2023-03-31 20:16   ` Felipe Contreras
  2023-04-01 19:20     ` Lawrence Velázquez
  1 sibling, 1 reply; 12+ messages in thread
From: Felipe Contreras @ 2023-03-31 20:16 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: Zsh Users

On Thu, Mar 30, 2023 at 6:05 AM Lawrence Velázquez <larryv@zsh.org> wrote:
>
> On Mar 30, 2023, at 7:13 AM, Felipe Contreras <felipe.contreras@gmail.com> wrote:
>
> However, this is what POSIX says:
>
>    3.b. Each occurrence in the input of an IFS character that is not
> IFS white space, along with any adjacent IFS white space, shall
> delimit a field, as described previously.
>
> We ignore all the white space stuff (since we are not using white
> spaces), and thus:
>
>    Each occurrence in the input of an IFS character shall delimit a field.
>
> In zsh each occurrence of a comma does delimit a field (4 commas, 5
> fields), which to me is what POSIX says should happen.
>
> So in this particular case it seems zsh is complying with POSIX (even
> in zsh mode), and all other shells are not.
>
>
> Before the excerpt you quoted, XCU 2.6.5 says: “The shell shall treat each character of the IFS as a delimiter and use the delimiters as field terminators to split the results of parameter expansion, command substitution, and arithmetic expansion into fields.”

I was just about to mention that, and I thought I replied to you, but
apparently not.

So if IFS contains terminators, and not separators, this should
generate 5 fields:

    IFS=';'
    str='foo;bar;;roo;;'
    printf '"%s"\n' $str

For: 'foo;' 'bar;' ';' 'roo;' ';'

In which case bash is correct, and zsh is not.

But, 'foo' doesn't contain any terminators, so it does not contain any
field, and should be dropped. Unless 1) you consider the end of the
string as a terminator, or 2) consider the terminator of the last
field as optional.

If you consider the end of the string as a terminator (1), then 'foo;'
contains two fields, not one, in which case zsh is correct. This makes
the terminators behave identically as separators.

If you consider the terminator of the last field as optional (2), then
bash (and other shells) are correct, but in that case what's the point
of terminators if they aren't actually going to demarcate the
*termination* of fields?
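
The three readings give different counts for the same input, which a short sketch can tabulate. It implements each reading by hand with POSIX parameter expansion instead of relying on the shell's own IFS splitting. count_by_reading and the labels strict, eos and optional are hypothetical names for the cases described above, and only a single delimiter character is handled.

```shell
# count_by_reading READING DELIM STRING
#   strict   - only text followed by a terminator is a field
#   eos      - the end of the string counts as one more terminator
#   optional - the last field may omit its terminator
count_by_reading() (
    reading=$1
    delim=$2
    rest=$3
    n=0
    # Count every explicitly terminated field.
    while :; do
        case $rest in
            *"$delim"*) rest=${rest#*"$delim"}; n=$((n + 1)) ;;
            *) break ;;
        esac
    done
    case $reading in
        strict)   ;;                    # leftover text is dropped
        eos)      n=$((n + 1)) ;;       # one more field, even if empty
        optional) if [ -n "$rest" ]; then n=$((n + 1)); fi ;;
        *)        echo "unknown reading: $reading" >&2; exit 1 ;;
    esac
    echo "$n"
)

for reading in strict eos optional; do
    printf '%-8s foo -> %s   foo; -> %s\n' "$reading" \
        "$(count_by_reading "$reading" ';' 'foo')" \
        "$(count_by_reading "$reading" ';' 'foo;')"
done
```

Under this sketch the eos reading always yields one more field than there are delimiters, which is the separator behavior, while the optional reading reproduces the counts reported for bash.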

I think everyone can agree POSIX is not clear about this.

-- 
Felipe Contreras



* Re: Discrepancy in IFS handling (zsh is POSIX compliant)
  2023-03-31 16:38 ` Thomas Paulsen
@ 2023-03-31 20:18   ` Felipe Contreras
  0 siblings, 0 replies; 12+ messages in thread
From: Felipe Contreras @ 2023-03-31 20:18 UTC (permalink / raw)
  To: Thomas Paulsen; +Cc: zsh-users

On Fri, Mar 31, 2023 at 10:38 AM Thomas Paulsen
<thomas.paulsen@firemail.de> wrote:

> indeed an original bourne shell, and ksh88 and ksh94 behave exactly like bash
> On the other hand, tcsh and the ultra modern fish behave like zsh.
>
> Thus, I don't see the need for changing zsh. Let it as it is. It's in proud companionship.

Nobody is saying we should change zsh.

I'm saying that if POSIX says the behavior shall be like the Bourne
shell, then perhaps zsh should do that in sh mode (not in zsh mode).

Cheers.

-- 
Felipe Contreras



* Re: Discrepancy in IFS handling (zsh is POSIX compliant)
  2023-03-31 20:16   ` Discrepancy in IFS handling (zsh is " Felipe Contreras
@ 2023-04-01 19:20     ` Lawrence Velázquez
  0 siblings, 0 replies; 12+ messages in thread
From: Lawrence Velázquez @ 2023-04-01 19:20 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: zsh-users

On Fri, Mar 31, 2023, at 4:16 PM, Felipe Contreras wrote:
> I think everyone can agree POSIX is not clear about this.

I do agree.  It looks like a situation where everyone involved was
already familiar with the intended behavior (as per Chet Ramey [1])
and baked their assumptions into the drafted text, leaving it less
explicit than it should have been.

For the curious, a bug report has been opened on the Austin Group
defect tracker [2].

  [1]: https://lists.gnu.org/archive/html/bug-bash/2023-03/msg00175.html
  [2]: https://austingroupbugs.net/view.php?id=1649

-- 
vq

