From: Peter Stephenson <Peter.Stephenson@csr.com>
To: zsh-workers@zsh.org (Zsh hackers list)
Subject: Re: PATCH: bash-style substrings & subarrays
Date: Tue, 23 Nov 2010 11:14:23 +0000
Message-ID: <20101123111423.60a04caf@pwslap01u.europe.root.pri>
In-Reply-To: <201011211702.oALH2ci6003141@pws-pc.ntlworld.com>
On Sun, 21 Nov 2010 17:02:38 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> Should ${foo:1} always start 1 character/element beyond the
> first one, regardless which subscripting rules are in use? I'm now
> inclining in that direction.
Nobody commented, but here is the change, with some more careful
documentation.
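
As a rough illustration of the intended behaviour (a sketch only, assuming
either a zsh with this patch applied or bash; the names foo, arr and
show_args are made up for the example), offsets count from 0 whichever
subscripting rules are in use:

```shell
# Illustrative sketch: foo, arr and show_args are hypothetical names.
foo=123456789

echo "${foo:3}"      # offset 3: starts at the fourth character
echo "${foo:3:1}"    # a single character, the fourth
echo "${foo: -1}"    # last character; the space avoids the ${foo:-1} form

arr=(1 2 3 4 5 6 7 8 9)
echo "${arr[@]:3:2}" # two elements, starting at the fourth

# For the positional parameters, offset 1 refers to $1.
show_args() {
  echo "${@:1:2}"
}
show_args a b c
```

The array form with an explicit [@] is used here because it reads the same
way with and without KSH_ARRAYS set.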
Index: Doc/Zsh/expn.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/expn.yo,v
retrieving revision 1.123
diff -p -u -r1.123 expn.yo
--- Doc/Zsh/expn.yo 18 Nov 2010 13:57:19 -0000 1.123
+++ Doc/Zsh/expn.yo 23 Nov 2010 11:09:33 -0000
@@ -588,23 +588,29 @@ remove the non-matched elements).
xitem(tt(${)var(name)tt(:)var(offset)tt(}))
item(tt(${)var(name)tt(:)var(offset)tt(:)var(length)tt(}))(
This syntax gives effects similar to parameter subscripting
-in the form tt($)var(name)tt({)var(offset)tt(,)var(end)tt(}) but in
-a form compatible with other shells.
+in the form tt($)var(name)tt({)var(start)tt(,)var(end)tt(}), but is
+compatible with other shells; note that both var(offset) and var(length)
+are interpreted differently from the components of a subscript.
+
+If var(offset) is non-negative, then if the variable var(name) is a
+scalar substitute the contents starting var(offset) characters from the
+first character of the string, and if var(name) is an array substitute
+elements starting var(offset) elements from the first element. If
+var(length) is given, substitute that many characters or elements,
+otherwise the entire rest of the scalar or array.
+
+A positive var(offset) is always treated as the offset of a character or
+element in var(name) from the first character or element of the array
+(this is different from native zsh subscript notation). Hence 0
+refers to the first character or element regardless of the setting of
+the option tt(KSH_ARRAYS).
-If the variable var(name) is a scalar, substitute the contents
-starting from offset var(offset); if var(name) is an array,
-substitute elements from element var(offset). If var(length) is
-given, substitute that many characters or elements, otherwise the
-entire rest of the scalar or array.
-
-var(offset) is treated similarly to a parameter subscript:
-the offset of the first character or element in var(name)
-is 0 if the option tt(KSH_ARRAYS) is set, else 1; a negative
-subscript counts backwards so that -1 corresponds to the last
-character or element.
+A negative offset counts backwards from the end of the scalar or array,
+so that -1 corresponds to the last character or element, and so on.
var(length) is always treated directly as a length and hence may not be
-negative.
+negative. The option tt(MULTIBYTE) is obeyed, i.e. the offset and length
+count multibyte characters where appropriate.
var(offset) and var(length) undergo the same set of shell substitutions
as for scalar assignment; in addition, they are then subject to arithmetic
@@ -615,19 +621,29 @@ print ${foo: 1 + 2}
print ${foo:$(( 1 + 2))}
print ${foo:$(echo 1 + 2)})
-all have the same effect.
+all have the same effect, extracting the string starting at the fourth
+character of tt($foo) if the substitution would otherwise return a scalar,
+or the array starting at the fourth element if tt($foo) would return an
+array. Note that with the option tt(KSH_ARRAYS) tt($foo) always returns
+a scalar (regardless of the use of the offset syntax) and a form
+such as tt($foo[*]:3) is required to extract elements of an array named
+tt(foo).
-Note that if var(offset) is negative, the tt(-) may not appear immediately
+If var(offset) is negative, the tt(-) may not appear immediately
after the tt(:) as this indicates the
-tt(${)var(name)tt(:-)var(word)tt(}) form of substitution; a space
+tt(${)var(name)tt(:-)var(word)tt(}) form of substitution. Instead, a space
may be inserted before the tt(-). Furthermore, neither var(offset) nor
var(length) may begin with an alphabetic character or tt(&) as these are
-used to indicate history-style modifiers.
+used to indicate history-style modifiers. To substitute a value from a
+variable, the recommended approach is to precede it with a tt($) as this
+signifies the intention (parameter substitution can easily be rendered
+unreadable); however, as arithmetic substitution is performed, the
+expression tt(${var: offs}) does work, retrieving the offset from
+tt($offs).
For further compatibility with other shells there is a special case
-when the tt(KSH_ARRAYS) option is active, as in emulation of
-Bourne-style shells. In this case array subscript 0 usually refers to the
-first element of the array. However, if the substitution refers to the
+for array offset 0. This usually accesses the
+first element of the array. However, if the substitution refers to the
positional parameter array, e.g. tt($@) or tt($*), then offset 0
instead refers to tt($0), offset 1 refers to tt($1), and so on. In
other words, the positional parameter array is effectively extended by
Index: Src/subst.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/subst.c,v
retrieving revision 1.111
diff -p -u -r1.111 subst.c
--- Src/subst.c 20 Nov 2010 23:46:26 -0000 1.111
+++ Src/subst.c 23 Nov 2010 11:09:34 -0000
@@ -1640,7 +1640,7 @@ paramsubst(LinkList l, LinkNode n, char
int subexp;
/*
* If we're referring to the positional parameters, then
- * e.g ${*:1:1} refers to $1 even if KSH_ARRAYS is in effect.
+ * e.g ${*:1:1} refers to $1.
* This is for compatibility.
*/
int horrible_offset_hack = 0;
@@ -2768,16 +2768,15 @@ paramsubst(LinkList l, LinkNode n, char
return NULL;
}
}
- if (!isset(KSHARRAYS) || horrible_offset_hack) {
+ if (horrible_offset_hack) {
/*
* As part of the 'orrible hoffset 'ack,
* (what hare you? Han 'orrible hoffset 'ack,
* sergeant major), if we are given a ksh/bash/POSIX
- * style array which includes offset 0, we use
- * $0.
+ * style positional parameter array which includes
+ * offset 0, we use $0.
*/
- if (isset(KSHARRAYS) && horrible_offset_hack &&
- offset == 0 && isarr) {
+ if (offset == 0 && isarr) {
offset_hack_argzero = 1;
} else if (offset > 0) {
offset--;
Index: Test/D04parameter.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/D04parameter.ztst,v
retrieving revision 1.46
diff -p -u -r1.46 D04parameter.ztst
--- Test/D04parameter.ztst 18 Nov 2010 13:57:19 -0000 1.46
+++ Test/D04parameter.ztst 23 Nov 2010 11:09:34 -0000
@@ -1268,15 +1268,15 @@
print ${foo:$(echo 3 + 3):`echo 4 - 3`}
print ${foo: -1}
print ${foo: -10}
-0:Bash-style subscripts, scalar
->3456789
+0:Bash-style offsets, scalar
>456789
>56789
>6789
->3
+>789
>4
>5
>6
+>7
>9
>123456789
@@ -1291,15 +1291,15 @@
print ${foo:$(echo 3 + 3):`echo 4 - 3`}
print ${foo: -1}
print ${foo: -10}
-0:Bash-style subscripts, array
->3 4 5 6 7 8 9
+0:Bash-style offsets, array
>4 5 6 7 8 9
>5 6 7 8 9
>6 7 8 9
->3
+>7 8 9
>4
>5
>6
+>7
>9
>1 2 3 4 5 6 7 8 9
@@ -1321,7 +1321,7 @@
echo ${str: -1:1}
}
testfn
-0:Bash-style subscripts, Bourne-style indexing
+0:Bash-style offsets, Bourne-style indexing
>1
>2
>3
--
Peter Stephenson <pws@csr.com> Software Engineer
Tel: +44 (0)1223 692070 Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK
Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom