zsh-users
* [Review Request] Arrays and their usage
@ 2021-05-30 23:24 René Neumann
  2021-05-31  0:28 ` Mikael Magnusson
  2021-05-31 17:36 ` Stephane Chazelas
  0 siblings, 2 replies; 10+ messages in thread
From: René Neumann @ 2021-05-30 23:24 UTC (permalink / raw)
  To: zsh-users

Hi *,

I always feel a little unsure around arrays in zsh. I've currently used 
the following:

    local pkgs=( `makepkg --printsrcinfo | \
                  sed -n -e 's/pkgname = \(.*\)$/\1/p'` ) 

    pkgs=(${pkgs/#/"$DATABASE/"})
    sudo pacman -S $pkgs

Intention: Generate a list of packages, prepend "$DATABASE/", and pass 
each modified package as a separate argument to pacman.

Question: Is this the correct/zshonic way of doing this?
I personally find the change of behavior by adding ( ) too easy to 
overlook. Is there an alternative with ${(...)}?

Thanks,
René



* Re: [Review Request] Arrays and their usage
  2021-05-30 23:24 [Review Request] Arrays and their usage René Neumann
@ 2021-05-31  0:28 ` Mikael Magnusson
  2021-05-31  4:24   ` Bart Schaefer
  2021-05-31 19:41   ` René Neumann
  2021-05-31 17:36 ` Stephane Chazelas
  1 sibling, 2 replies; 10+ messages in thread
From: Mikael Magnusson @ 2021-05-31  0:28 UTC (permalink / raw)
  To: René Neumann; +Cc: zsh-users

On 5/31/21, René Neumann <lists@necoro.eu> wrote:
> Hi *,
>
> I always feel a little unsure around arrays in zsh. I've currently used
> the following:
>
>     local pkgs=( `makepkg --printsrcinfo | \
>                   sed -n -e 's/pkgname = \(.*\)$/\1/p'` )
>
>     pkgs=(${pkgs/#/"$DATABASE/"})
>     sudo pacman -S $pkgs
>
> Intention: Generate a list of packages, prepend "$DATABASE/", and pass
> each modified package as a separate argument to pacman.
>
> Question: Is this the correct/zshonic way of doing this?
> I personally find the change of behavior by adding ( ) too easy to
> overlook. Is there an alternative with ${(...)}?

I don't know what the exact output of your makepkg pipeline can be,
but unquoted `` will split words on all whitespace, so that is a bit
of a danger. I would probably have written it like this

local pkgs=( ${(f)"$(makepkg --blabla | sed blabla)"} )
sudo pacman -S $DATABASE/$^pkgs

That's assuming that the packages from the pipeline are separated by newlines.

-- 
Mikael Magnusson



* Re: [Review Request] Arrays and their usage
  2021-05-31  0:28 ` Mikael Magnusson
@ 2021-05-31  4:24   ` Bart Schaefer
  2021-05-31 19:41   ` René Neumann
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Schaefer @ 2021-05-31  4:24 UTC (permalink / raw)
  To: René Neumann; +Cc: Zsh Users

> On 5/31/21, René Neumann <lists@necoro.eu> wrote:
> >
> > Question: Is this the correct/zshonic way of doing this?

Using arrays, rather than repeatedly building strings that must be
split on spaces, is definitely the "zshonic" idiom.

> > I personally find the change of behavior by adding ( ) too easy to
> > overlook. Is there an alternative with ${(...)}?

I'm uncertain which change of behavior you mean?

On Sun, May 30, 2021 at 5:28 PM Mikael Magnusson <mikachu@gmail.com> wrote:
>
> I would probably have written it like this
>
> local pkgs=( ${(f)"$(makepkg --blabla | sed blabla)"} )
> sudo pacman -S $DATABASE/$^pkgs

Of course it's possible to do this without the "pkgs" array:

sudo pacman -S $DATABASE/${^${(f)"$(makepkg --blabla | sed blabla)"}}

But you should do whichever you'll be able to understand when you come
back to it later.



* Re: [Review Request] Arrays and their usage
  2021-05-30 23:24 [Review Request] Arrays and their usage René Neumann
  2021-05-31  0:28 ` Mikael Magnusson
@ 2021-05-31 17:36 ` Stephane Chazelas
  2021-05-31 20:04   ` René Neumann
  1 sibling, 1 reply; 10+ messages in thread
From: Stephane Chazelas @ 2021-05-31 17:36 UTC (permalink / raw)
  To: René Neumann; +Cc: zsh-users

2021-05-31 01:24:36 +0200, René Neumann:
[...]
> I always feel a little unsure around arrays in zsh. I've currently used the
> following:
> 
>    local pkgs=( `makepkg --printsrcinfo | \
>                  sed -n -e 's/pkgname = \(.*\)$/\1/p'` )

Here's my take on answering this (repeating some of what has
already been said).

First, I'd say: `...` form of command substitution should really
be banned these days. That's really a broken heritage from the
Bourne shell. There's no good reason to keep using it these
days. The main problem with it is the handling of backslash
inside it (and the awkward nesting and the fact that it's less
legible, etc...).
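
To illustrate the backslash point, here is a small sketch (the sample string is arbitrary). Inside `...`, a backslash followed by another backslash is consumed before the inner command is parsed, whereas $(...) passes the text through unchanged:

```sh
# The same inner command, once in backticks and once in $(...).
# printf '%s\n' is used so that the shell's echo cannot alter the result.
old=`printf '%s\n' 'a\\b'`
new=$(printf '%s\n' 'a\\b')

printf 'backticks: %s\n' "$old"   # backticks: a\b
printf 'dollar:    %s\n' "$new"   # dollar:    a\\b
```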

The sed command could be written:

  sed -n 's/pkgname = //p'

Strictly speaking, it's not exactly equivalent with some sed
implementations when the input contains sequences of bytes that
don't form valid text in the current locale. For instance, in a
locale that uses UTF-8 as its charmap, on the output of:

printf 'foopkgname = bar\200baz\n'

yours would output foobar<LF> while mine would output
foobar<0x80>baz<LF> with GNU sed as the . regexp operator only
matches on *characters* so would not match on that 0x80 byte
which doesn't form a valid character in UTF-8.
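
On ordinary, valid text the two sed commands agree. A quick way to check, assuming some made-up .SRCINFO-style lines as input:

```sh
# Sample input; the package names are invented for illustration.
input='pkgbase = demo
pkgver = 1.0
pkgname = demo
pkgname = demo-docs'

# Original form: capture everything after "pkgname = " and print it.
long=$(printf '%s\n' "$input" | sed -n -e 's/pkgname = \(.*\)$/\1/p')

# Shorter form: delete "pkgname = " and print the lines where that happened.
short=$(printf '%s\n' "$input" | sed -n 's/pkgname = //p')

printf '%s\n' "$short"
```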

Then, leaving `...` (or the better $(...)) unquoted performs
IFS-splitting, so you're left with the same kind of conundrum as
you get in POSIX shells when you leave any form of expansion
($param, $((arith)) as well there!) unquoted though at least
zsh doesn't perform globbing there:

either you're happy that the default value of IFS (space, tab,
newline, nul) is good enough for all splitting, or you need to
set it every time you use it (in zsh, that's for unquoted $(...)
or the $=param operator).

Here, you can do

pkgs=(
  $(
    makepkg --printsrcinfo |
     sed -n 's/pkgname = //p'
  )
)

With the default value of $IFS if you know the values don't
contain any of the $IFS characters. If that can't be guaranteed,
you'd need:

IFS=$'\n'
pkgs=(
  $(
    makepkg --printsrcinfo |
     sed -n 's/pkgname = //p'
  )
)
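
The difference between the two IFS settings can be seen with a stand-in for the pipeline (the names function and its output are hypothetical, chosen so that one value contains a space):

```zsh
# Stand-in for the pipeline: two lines, the second containing a space.
names() {
  printf '%s\n' 'foo' 'bar baz'
}

saved_IFS=$IFS

a=( $(names) )       # default IFS: the space inside "bar baz" also splits
IFS=$'\n'
b=( $(names) )       # newline only: "bar baz" stays in one piece
IFS=$saved_IFS       # restore the global parameter afterwards

echo "${#a[@]} ${#b[@]}"   # 3 2
```

Having to save and restore IFS around the assignment is the cumbersome part.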

But in zsh, rather than using IFS-splitting which is cumbersome
to use as it relies on a global parameter, you can use explicit
splitting operators, using the "f" (short for "ps:\n:")
parameter expansion flag. Then you don't have to worry about
what $IFS may contain at the time:

pkgs=(
  ${(f)"$(
    makepkg --printsrcinfo |
     sed -n 's/pkgname = //p'
  )"}
)

Here, we're quoting the $(...) to disable IFS-splitting and use
the "f" flag to do splitting on line feeds. Note that empty
elements are discarded.

>    pkgs=(${pkgs/#/"$DATABASE/"})

The more idiomatic zsh variant to that ksh syntax would be:

pkgs=( $DATABASE/$^pkgs )

(same as rc's pkgs = ( $DATABASE/$pkgs )).

-- 
Stephane



* Re: [Review Request] Arrays and their usage
  2021-05-31  0:28 ` Mikael Magnusson
  2021-05-31  4:24   ` Bart Schaefer
@ 2021-05-31 19:41   ` René Neumann
  1 sibling, 0 replies; 10+ messages in thread
From: René Neumann @ 2021-05-31 19:41 UTC (permalink / raw)
  To: zsh-users

> I don't know what the exact output of your makepkg pipeline can be,
> but unquoted `` will split words on all whitespace, so that is a bit
> of a danger. I would probably have written it like this
> 
> local pkgs=( ${(f)"$(makepkg --blabla | sed blabla)"} )

There shouldn't be any spaces in it, but better safe than sorry. That said, 
the array + (f) + quoting combination seems a handful and will probably cause 
some head scratching later on. But well...

> sudo pacman -S $DATABASE/$^pkgs

Oooh... the ${^pkgs} is nice. I didn't know that one.
Thanks!

- René



* Re: [Review Request] Arrays and their usage
  2021-05-31 17:36 ` Stephane Chazelas
@ 2021-05-31 20:04   ` René Neumann
  2021-05-31 21:42     ` Bart Schaefer
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: René Neumann @ 2021-05-31 20:04 UTC (permalink / raw)
  To: zsh-users

> Here's my take on answering this (repeating some of what has
> already been said).

Greatly appreciated!

> First, I'd say: `...` form of command substitution should really
> be banned these days. That's really a broken heritage from the
> Bourne shell. There's no good reason to keep using it these
> days. The main problem with it is the handling of backslash
> inside it (and the awkward nesting and the fact that it's less
> legible, etc...).

Fun fact: I prefer `...`, because I find it more legible, especially in 
the x=`cmd y z` form¹. But TIL that `` and $() are not interchangeable. 
Up to now, I thought the one is syntactic sugar for the other.

> The sed command could be written:
> 
>    sed -n 's/pkgname = //p'

Good catch, thanks.

> Then, leaving `...` (or the better $(...)) unquoted performs
> IFS-splitting, so you're left with the same kind of conundrum as
> you get in POSIX shells when you leave any form of expansion
> ($param, $((arith)) as well there!) unquoted though at least
> zsh doesn't perform globbing there:
> 
> [...]
> 
> pkgs=(
>    ${(f)"$(
>      makepkg --printsrcinfo |
>       sed -n 's/pkgname = //p'
>    )"}
> )

Thank you for this detailed explanation. Not relying on IFS seems a good 
thing to do, although the rest of the script probably does here and there².
Also thanks for this example of code structuring. /me likes.
(NB though: The linebreak for the two pipe elements was inserted for 
this email only, with me hoping, that backslash newline was the correct 
thing to do ;))

> The more idiomatic zsh variant to that ksh syntax would be:
> 
> pkgs=( $DATABASE/$^pkgs )
> 
> (same as rc's pkgs = ( $DATABASE/$pkgs )).

What does 'rc' stand for?

Again, thanks for this effort!

- René

¹ Longer story: $() is easily confused with ${}. Also, `...` is more "in 
the background" and I let my highlighting make it clear to me, that I'm 
in a subcommand. I would always prefer $() in large complex expressions 
though, because.

² Although one could argue that setting IFS to something other than $'\n' 
WILL break a lot of stuff, so one can expect it to be sane.



* Re: [Review Request] Arrays and their usage
  2021-05-31 20:04   ` René Neumann
@ 2021-05-31 21:42     ` Bart Schaefer
  2021-05-31 21:43     ` Lawrence Velázquez
  2021-06-01  5:59     ` Stephane Chazelas
  2 siblings, 0 replies; 10+ messages in thread
From: Bart Schaefer @ 2021-05-31 21:42 UTC (permalink / raw)
  To: René Neumann; +Cc: Zsh Users

On Mon, May 31, 2021 at 1:04 PM René Neumann <lists@necoro.eu> wrote:
>
> What does 'rc' stand for?

It's the name of another shell.  I seem to recall it is short for "run
command" or something like that.  Similar to the "rc" on the tail of
"bashrc" or "zshrc".

Features borrowed from the rc shell are the reason zsh has options
RC_QUOTES and RC_EXPAND_PARAM ($^param is the shortcut for turning on
the latter for just one expansion).



* Re: [Review Request] Arrays and their usage
  2021-05-31 20:04   ` René Neumann
  2021-05-31 21:42     ` Bart Schaefer
@ 2021-05-31 21:43     ` Lawrence Velázquez
  2021-05-31 22:05       ` René Neumann
  2021-06-01  5:59     ` Stephane Chazelas
  2 siblings, 1 reply; 10+ messages in thread
From: Lawrence Velázquez @ 2021-05-31 21:43 UTC (permalink / raw)
  To: René Neumann; +Cc: zsh-users

On Mon, May 31, 2021, at 4:04 PM, René Neumann wrote:
> Fun fact: I prefer `...`, because I find it more legible, especially in 
> the x=`cmd y z` form¹.

If you're doing command substitution in a context where you have
to suppress word-splitting yourself, then you're choosing between

    outer_cmd foo "`inner_cmd bar`" baz

and

    outer_cmd foo "$(inner_cmd bar)" baz

Many (including me) would consider "`...`" less legible.

> (NB though: The linebreak for the two pipe elements was inserted for 
> this email only, with me hoping, that backslash newline was the correct 
> thing to do ;))

The backslash isn't necessary if the vertical bar is at the end of
the line.
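
For instance, these two pipelines parse identically (the commands are arbitrary placeholders):

```sh
# Trailing pipe followed by a redundant backslash-newline.
a=$(printf 'foo\n' | \
      tr a-z A-Z)

# Trailing pipe alone: the parser keeps reading for the next command.
b=$(printf 'foo\n' |
      tr a-z A-Z)

printf '%s %s\n' "$a" "$b"   # FOO FOO
```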

> What does 'rc' stand for?

rc is the shell for Research Unix Version 10 and Plan 9.

https://en.wikipedia.org/wiki/Rc

-- 
vq



* Re: [Review Request] Arrays and their usage
  2021-05-31 21:43     ` Lawrence Velázquez
@ 2021-05-31 22:05       ` René Neumann
  0 siblings, 0 replies; 10+ messages in thread
From: René Neumann @ 2021-05-31 22:05 UTC (permalink / raw)
  To: zsh-users

Am 31.05.21 um 23:43 schrieb Lawrence Velázquez:
> On Mon, May 31, 2021, at 4:04 PM, René Neumann wrote:
>> Fun fact: I prefer `...`, because I find it more legible, especially in
>> the x=`cmd y z` form¹.
> 
> If you're doing command substitution in a context where you have
> to suppress word-splitting yourself, then you're choosing between
> 
>      outer_cmd foo "`inner_cmd bar`" baz
> 
> and
> 
>      outer_cmd foo "$(inner_cmd bar)" baz
> 
> Many (including me) would consider "`...`" less legible.
> 

I'm with you on that. As soon as quotes (or ${}) are involved, `...` loses.

But, at least in my code, simple stuff like
    local tmpdir=`mktemp -d`
is the majority of command calls, and there it helps (me) :)

- René



* Re: [Review Request] Arrays and their usage
  2021-05-31 20:04   ` René Neumann
  2021-05-31 21:42     ` Bart Schaefer
  2021-05-31 21:43     ` Lawrence Velázquez
@ 2021-06-01  5:59     ` Stephane Chazelas
  2 siblings, 0 replies; 10+ messages in thread
From: Stephane Chazelas @ 2021-06-01  5:59 UTC (permalink / raw)
  To: René Neumann; +Cc: zsh-users

2021-05-31 22:04:44 +0200, René Neumann:
[...]
> >      makepkg --printsrcinfo |
> >       sed -n 's/pkgname = //p'
[...]
> (NB though: The linebreak for the two pipe elements was inserted for this
> email only, with me hoping, that backslash newline was the correct thing to
> do ;))
[...]

\ in front of a newline removes the newline before the parser
interprets the code but here, a newline is fine, same as after
||, &&, ;, &, if, then, else, while, do... You can even have
comments:

  set -o pipefail
  print -P - $var | # prompt
                    # expansion

    sed 's/^[[:blank:]]*//' | # trim leading blanks

    wc -c ||

    die "pipeline failed"


At the prompt of an interactive shell, you'll notice that if you
press enter after a pipe, you see:

$ echo foo |
pipe>

Which tells you the parser is still expecting a command after
that |.

Where you might argue it's inconsistent is that you don't get
something similar after redirection operators:

$ echo foo >
zsh: parse error near `\n'

Or after "for"/"select"/"repeat"

$ for
zsh: parse error near `\n'

It's not specific to zsh though.

-- 
Stephane


