From: "Nikolay Aleksandrovich Pavlov (ZyX)"
To: Bart Schaefer, Dave Yost
Cc: zsh workers
Subject: Re: indented heredocs
Message-Id: <1302831483050697@web1o.yandex.ru>
Date: Fri, 30 Dec 2016 01:31:37 +0300
List-Id: Zsh Workers List
X-Seq: 40245

22.12.2016, 01:11, "Bart Schaefer":
> On
Wed, Dec 21, 2016 at 11:29 AM, Dave Yost wrote:
>>  Surely people have thought of this (Alternative 1):
>>
>>  0 Wed 10:53:53 ~
>>  205 Z% cat <<xx
>>    foo
>>    bar
>>    xx
>>  foo
>>  bar
>>  0 Wed 10:53:53 ~
>>  206 Z%
>>
>>  but shells don’t do that.
>
> [...]
>
>>  I suggested this (Alternative 2), which [Bourne] liked:
>>
>>  0 Wed 10:53:53 ~
>>  206 Z% cat \
>>    <<xx
>>    foo
>>    bar
>>    xx
>>  foo
>>  bar
>>  0 Wed 10:54:10 ~
>>  207 Z%
>
> I'm not thrilled with this idea because it gives special semantics to
> backslash-newline (as well as to leading spaces before "<<") which do
> not currently exist. In existing syntax, backslash-newline can simply
> be discarded without changing the meaning of the command line, I think
> even before tokenization.
>
> I would propose instead something similar (read on below) to this:
>
> % cat <<-' xx'
>   foo
>   bar
>   xx
> foo
> bar
> %
>
> This explicitly quotes the leading space that is to be stripped, so
> there is no parsing ambiguity, and it piggybacks on the existing <<-
> syntax, merely changing the expected leading space from "all tabs" to
> "the leading whitespace on the end marker".

This makes changing the indent rather tricky. YAML does better here: the amount of stripped indent is either determined from the first non-blank line, e.g.

```
 cat <<| EOF
   xx
    x
 EOF
```

will produce

```
xx
 x
```

because `xx` is the first non-blank line and has 3 leading spaces here while `x` has four, meaning that the result is "\nxx\n\x20x"; or it is specified explicitly, relative to the indent of the line where the block scalar starts, e.g.

```
 cat <<|1 EOF
   xx
    x
 EOF
```

will produce

```
 xx
  x
```

because `cat` has a single space as indent, `xx` has 3, and it was requested that meaningful content start at 1 (`cat` indent) + 1 (the `1` before `EOF`) = 2 spaces, meaning that the result is "\n\x20xx\n\x20\x20x": one more level of indent than in the previous example.

>
>>  I don’t think that would help anything.
>>  If the parser doesn’t know how to do
>>  the new syntax with the existing << operator, you’ll get an error, and if the
>>  parser doesn’t know the new operator, you’ll get an error. Same difference.
>
> It is a consideration that we might prefer that older shells choke on
> the new syntax. I think having them choke by failing to find the end
> marker is rather worse than having them choke by failing to recognize
> the operator -- something that wrongly appears to be the end marker
> might appear later in the script if we go your "Alternative 2" route.
>
> Taken literally, my example above would be accepted by an older shell
> and processed without stripping the leading spaces. If that's
> unacceptable, we need a different (and currently invalid) replacement
> for "<<-" (the only thing that comes to mind is "<<|" which seems a
> bad choice).

YAML uses `|` and `>` to start block scalars, which is why I used `|` above (`<<>` seems odd and may be confused with `<>`). I am not sure why this should be a bad choice: `|` already has different meanings in different contexts, though only three so far (pipe, or, and array subtraction (`${:|}`)). The `-` used in `<<-` has many more meanings: negation/subtraction, stripping leading spaces, prepending `-` to `argv[0]` (i.e. running as a login shell in most cases), stdin, rest-arguments separator (`echo - -E` outputs just `-E`, though I am not sure whether that is intentional; `--` in many commands definitely is), close (in `>& -`), range, default (in `${:-}`), dereference (in `*(-/)`), and flags leader (in almost any command and also in `$-`).

---

The `sed`-based alternative is not good, for the same reason I would reject any explicitly added spaces. If we bother with this at all, it should satisfy the following requirements:

- Keep extra indent (or `<<-` would be mostly fine, though better something which also removes spaces).
- Allow easy reindenting with a simple editor command that reindents (like `<{motion}` and `>{motion}` in Vim) without any additional actions (or `sed` would be mostly fine).
- Allow indenting the end marker as the user likes (or, at least, at the initial indent: one space in the examples): basically I would treat `cat <<| EOF` as something like `{` or `do` and `EOF` as `}` or `done`: semantically they are a literal block header and a literal block terminator, and thus `EOF` should have the same indent as `cat` and be *less* indented than the other text, which it is not a part of.
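
For illustration only, the first-non-blank-line rule from the YAML comparison above can be approximated today with a filter. This is a rough sketch under assumptions: `dedent` is a hypothetical helper name (not existing or proposed zsh syntax), only spaces are counted as indent, and the stripping is done by `awk` after the shell has already read the here-document.

```shell
# Hypothetical helper (not zsh syntax): strip from every line the indent
# of the first non-blank line. Only spaces are treated as indent here.
dedent() {
  awk '
    # The first line containing a non-space character fixes the indent width n.
    !found && /[^ ]/ { match($0, /^ */); n = RLENGTH; found = 1 }
    # Blank lines before that are emitted empty.
    !found { print ""; next }
    {
      # Strip at most n leading spaces, so shallower lines lose only what they have.
      match($0, /^ */)
      k = (RLENGTH < n) ? RLENGTH : n
      print substr($0, k + 1)
    }
  '
}

# The end marker of an ordinary here-document still cannot be indented.
dedent <<'EOF'
   xx
    x
EOF
```

With the input above this prints `xx` followed by ` x`, matching the first example. It also shows what a filter cannot do: the end marker has to stay at column zero, so the third requirement is not met without parser support.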