From: Martijn Dekker <martijn@inlv.org>
To: Zsh hackers list <zsh-workers@zsh.org>
Subject: '<<-' here-documents oddity with line continuation
Date: Sat, 3 Feb 2018 18:39:13 +0100
Message-ID: <c552e444-92c4-5035-271b-4c38cc27bbeb@inlv.org>

zsh has an oddity with here-documents using the '<<-' operator.

(Note: below, <tab> represents a tab character, not the literal string
'<tab>'.)

POSIX says:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04
| If the redirection operator is "<<-", all leading <tab> characters
| shall be stripped from input lines and the line containing the
| trailing delimiter.
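
For reference, here is a minimal sketch of the uncontroversial case, with
no line continuation involved. The here-document is built through eval
only so that the tab characters are explicit rather than dependent on how
this message is rendered; the variable names (tab, nl, out) are mine.

```shell
tab=$(printf '\t')
nl='
'
# Equivalent to a '<<-' here-document whose lines begin with two tabs,
# one tab and no tab; the delimiter line itself is indented by one tab.
eval "out=\$(cat <<-EOF${nl}${tab}${tab}two tabs${nl}${tab}one tab${nl}no tab${nl}${tab}EOF${nl})"
printf '%s\n' "$out"
```

All leading tabs are gone from every line, and the tab-indented EOF is
still recognised as the delimiter.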

In a construct like

	cat <<-EOF
<tab>	one \
<tab>	two
<tab>	EOF

where the newline after "one \" is backslash-escaped (line
continuation), zsh outputs

one two

whereas all other shells (bash, dash, *ksh, yash, etc.) output

one <tab>two

Superficially, it looks like zsh is the only shell that actually
complies with POSIX, as it strips the leading <tab> characters from all
lines in the here-document, including lines that follow a line ending
in a backslash.

However, line continuation in POSIXy shells is parsed at a very early
stage, even before token recognition:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02_01
| A <backslash> that is not quoted shall preserve the literal value of
| the following character, with the exception of a <newline>. If a
| <newline> follows the <backslash>, the shell shall interpret this as
| line continuation. The <backslash> and <newline> shall be removed
| before splitting the input into tokens. Since the escaped <newline>
| is removed entirely from the input and is not replaced by any white
| space, it cannot serve as a token separator.

(One funny effect of this: reserved words such as 'while' or 'select'
are not recognised if any part of them is quoted, but they can still be
split over multiple lines using line continuation!)
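
A small sketch of that effect, assuming a shell that removes
backslash-newline before token recognition as quoted above. The second
line must start in the first column, since any indentation would become
part of the word.

```shell
i=0
# 'wh' + backslash-newline + 'ile' is joined into the reserved word
# 'while' before the shell splits the input into tokens.
wh\
ile [ "$i" -lt 3 ]; do
	i=$((i + 1))
done
echo "$i"
```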

So it would seem logical that the definition of "input line" used by
POSIX for here-documents is based on lines resulting *after* parsing
line continuation. That would then keep the <tab>s from being stripped
from "continued" lines.
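
The two readings can be simulated outside the shell parser. This is only
an illustration with sed, not a description of how any shell is
implemented; it applies the two steps (join continued lines, strip
leading tabs) to the same raw text in both orders. The variable names are
hypothetical.

```shell
tab=$(printf '\t')
# Raw here-document body: "<tab>one \" followed by "<tab>two".
raw=$(printf '\tone \\\n\ttwo\n')

# Join lines ending in a backslash with the following line.
join() { sed -e :a -e '/\\$/N; s/\\\n//; ta'; }
# Strip all leading tabs from each line.
strip() { sed "s/^${tab}*//"; }

join_then_strip=$(printf '%s\n' "$raw" | join | strip)
strip_then_join=$(printf '%s\n' "$raw" | strip | join)

# Make the tab visible when printing.
printf 'join, then strip: %s\n' "$join_then_strip" | sed "s/${tab}/<tab>/g"
printf 'strip, then join: %s\n' "$strip_then_join" | sed "s/${tab}/<tab>/g"
```

Joining first gives "one <tab>two", the result the other shells produce;
stripping first gives "one two", which is what zsh outputs.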

Here's a quick test script (compatible with all POSIX shells). It
outputs "zsh" on zsh and "ok" on all other shells.

tab=$(printf '\t')
lf=$(printf '\nX'); lf=${lf%X}
eval "foo=\$(cat <<-EOF${lf}${tab}1\\${lf}${tab}2${lf}${tab}EOF${lf})"
case $foo in
( 1${tab}2 ) echo ok ;;
( 12 )       echo zsh ;;
( * )        echo NEWBUG ;;
esac

Since zsh's behaviour looks sensible on the face of it, I'm reluctant to
call it a bug, but it is certainly an incompatibility and seems to be
non-compliant with POSIX. Maybe something to fix in emulation?

Thanks,

- M.

