zsh-workers
* '<<-' here-documents oddity with line continuation
@ 2018-02-03 17:39 Martijn Dekker
From: Martijn Dekker @ 2018-02-03 17:39 UTC
  To: Zsh hackers list

zsh has an oddity with here-documents using the '<<-' operator.

(Note: below, <tab> represents a tab character, not the literal string
'<tab>'.)

POSIX says:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04
| If the redirection operator is "<<-", all leading <tab> characters
| shall be stripped from input lines and the line containing the
| trailing delimiter.

In a construct like

	cat <<-EOF
<tab>	one \
<tab>	two
<tab>	EOF

where the newline after "one \" is backslash-escaped (line
continuation), zsh outputs

one two

whereas all other shells (bash, dash, *ksh, yash, etc.) output

one <tab>two

Superficially, it looks like zsh is the only shell that actually
complies with POSIX, as it strips the leading <tab> characters from all
lines in the here-document, including lines that follow a line ending in
a backslash.

However, line continuation in POSIXy shells is parsed at a very early
stage, even before token recognition:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02_01
| A <backslash> that is not quoted shall preserve the literal value of
| the following character, with the exception of a <newline>. If a
| <newline> follows the <backslash>, the shell shall interpret this as
| line continuation. The <backslash> and <newline> shall be removed
| before splitting the input into tokens. Since the escaped <newline>
| is removed entirely from the input and is not replaced by any white
| space, it cannot serve as a token separator.

(One funny effect of this: reserved words such as 'while' or 'select'
are not recognised if any part of them is quoted, but they can still be
split over multiple lines using line continuation!)
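
A minimal sketch of that effect, assuming a POSIX-style shell such as
bash or dash. The variable names 'split' and 'quoted' are made up for
this illustration. The first statement splits 'if' across two lines; the
second quotes part of the word inside a subshell, where it should be
rejected as a syntax error at 'then'.

```shell
# Backslash-newline is removed before tokenization, so 'i' and 'f'
# join into the reserved word 'if'.
i\
f true; then split=recognised; else split=unreachable; fi
echo "split keyword:  $split"

# Quoting part of the word makes it an ordinary command name, so the
# following 'then' is a syntax error. The subshell keeps that error
# from terminating this script.
if ( eval '"i"f true; then echo recognised; fi' ) >/dev/null 2>&1; then
    quoted=accepted
else
    quoted=rejected
fi
echo "quoted keyword: $quoted"
```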

So it would seem logical that the definition of "input line" used by
POSIX for here-documents is based on lines resulting *after* parsing
line continuation. That would then keep the <tab>s from being stripped
from "continued" lines.
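
The two readings differ only in the order of two steps, which can be
modelled outside any shell's parser. The following is a rough sketch, not
how zsh or any other shell implements here-documents; the helper names
strip_tabs, join_continued and raw are hypothetical and exist only for
this illustration.

```shell
tab=$(printf '\t')

# Remove all leading tabs from every line on stdin.
strip_tabs() { sed "s/^${tab}*//"; }

# Join each line that ends in a backslash with the line after it,
# dropping both the backslash and the newline.
join_continued() { awk '{ if (sub(/\\$/, "")) printf "%s", $0; else print }'; }

# The here-document body from the example: two tab-indented lines,
# the first ending in a backslash.
raw() { printf '\tone \\\n\ttwo\n'; }

# Strip tabs from each physical line first, then join (the zsh result).
strip_first=$(raw | strip_tabs | join_continued)

# Join continued lines first, then strip tabs (the result elsewhere).
join_first=$(raw | join_continued | strip_tabs)

printf 'strip first: %s\n' "$strip_first"
printf 'join first:  %s\n' "$join_first"
```

With stripping first, the tab before "two" is gone by the time the lines
are joined; with joining first, that tab is no longer at the start of a
line and so survives.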

Here's a quick test script (compatible with all POSIX shells). It
outputs "zsh" on zsh and "ok" on all other shells.

tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
eval "foo=\$(cat <<-EOF${lf}${tab}1\\${lf}${tab}2${lf}${tab}EOF${lf})"
case $foo in
( 1${tab}2 ) echo ok ;;
( 12 )       echo zsh ;;
( * )        echo NEWBUG ;;
esac
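
To compare several shells in one go, the test can be written to a
temporary file and run under whichever shells happen to be installed.
This is only a convenience sketch: the list of shell names is an
assumption about what may be on the system, mktemp is widely available
but not specified by POSIX, and the 'results' variable is introduced
here purely for illustration.

```shell
tmp=$(mktemp) || exit 1
trap 'rm -f "$tmp"' EXIT

# The quoted delimiter keeps the test script literal while writing it.
cat >"$tmp" <<'SCRIPT'
tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
eval "foo=\$(cat <<-EOF${lf}${tab}1\\${lf}${tab}2${lf}${tab}EOF${lf})"
case $foo in
( 1${tab}2 ) echo ok ;;
( 12 )       echo zsh ;;
( * )        echo NEWBUG ;;
esac
SCRIPT

results=
for shell in sh bash dash ksh mksh yash zsh; do
    # Skip shells that are not installed.
    command -v "$shell" >/dev/null 2>&1 || continue
    verdict=$("$shell" "$tmp" 2>/dev/null)
    printf '%-5s %s\n' "$shell" "$verdict"
    results="$results$shell=$verdict "
done
```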

Since zsh's behaviour looks sensible on the face of it, I'm reluctant to
call it a bug, but it is certainly an incompatibility and seems to be
non-compliant with POSIX. Maybe something to fix in emulation?

Thanks,

- M.

