more splitting travails


* more splitting travails
@ 2024-01-12 19:05 Ray Andrews
  2024-01-12 19:19 ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-12 19:05 UTC (permalink / raw)
  To: Zsh Users

There's nothing harder than really getting on top of splitting issues :(

I have a file with blank lines in it, I read it into a variable then 
pass that variable to a function " % n_list $var ".  Inside the function 
I assign to a local variable 'List'.  The file/variable has five lines 
of data and four blank lines.

     #    local List=( "$@" )      # ONE element but $List prints correctly with spaces (nine lines).
     #    local List=( "${=@}" )   # FIVE elements but blank lines are gone (prints five lines).

How can I end up with nine elements and it prints correctly too? If it 
were a 'normal' variable I know this works:

     List=$some_variable[@]

... but:

     List=$@[@]

... doesn't work and looks sick and twisted anyway.

Other various attempts give me the number of elements being the 
character count.  Weirdly there's places where I iterate over all the 
lines in 'List' and it *counts* nine, but only displays five.

I could pound away at this but I'm going to just ask the people who 
understand these things.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: more splitting travails
  2024-01-12 19:05 more splitting travails Ray Andrews
@ 2024-01-12 19:19 ` Bart Schaefer
  2024-01-12 19:56   ` Ray Andrews
       [not found]   ` <CAA=-s3zc5a+PA7draaA=FmXtwU9K8RrHbb70HbQN8MhmuXTYrQ@mail.gmail.com>
  0 siblings, 2 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-12 19:19 UTC (permalink / raw)
  To: Ray Andrews; +Cc: Zsh Users

On Fri, Jan 12, 2024 at 11:05 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> There's nothing harder than really getting on top of splitting issues :(
>
> I have a file with blank lines in it, I read it into a variable

This is probably the place you're getting messed up.  HOW do you read
it into a variable?

> Other various attempts give me the number of elements being the
> character count.  Weirdly there's places where I iterate over all the
> lines in 'List' and it *counts* nine, but only displays five.

Here you've probably done an earlier step right, but are forgetting to
quote what you're passing to print (or whatever "display" means).  The
default for arrays on the command line is to remove empty elements
unless quoted.



* Re: more splitting travails
  2024-01-12 19:19 ` Bart Schaefer
@ 2024-01-12 19:56   ` Ray Andrews
  2024-01-12 20:07     ` Mark J. Reed
       [not found]   ` <CAA=-s3zc5a+PA7draaA=FmXtwU9K8RrHbb70HbQN8MhmuXTYrQ@mail.gmail.com>
  1 sibling, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-12 19:56 UTC (permalink / raw)
  To: zsh-users



On 2024-01-12 11:19, Bart Schaefer wrote:
> On Fri, Jan 12, 2024 at 11:05 AM Ray Andrews<rayandrews@eastlink.ca>  wrote:
> This is probably the place you're getting messed up. HOW do you read
> it into a variable?
1 /aWorking/Zsh/Source/Wk 0 % vvar=( ${"$(<testfile2)"} )
1 /aWorking/Zsh/Source/Wk 0 % print -l $vvar[@]
one two
three

four

five six seven


eight
... so it looks right. Mind, I know there can be invisible problems.



* Fwd: more splitting travails
       [not found]   ` <CAA=-s3zc5a+PA7draaA=FmXtwU9K8RrHbb70HbQN8MhmuXTYrQ@mail.gmail.com>
@ 2024-01-12 20:03     ` Bart Schaefer
  2024-01-12 20:32       ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-12 20:03 UTC (permalink / raw)
  To: Zsh Users


This was sent privately to me but pretty obviously intended for the list.

---------- Forwarded message ---------
From: Mark J. Reed <markjreed@gmail.com>
Date: Fri, Jan 12, 2024, 11:54 AM
Subject: Re: more splitting travails
To: Bart Schaefer <schaefer@brasslantern.com>

We talked offlist about how zsh inherited the paradoxical-looking
"${array[@]}" from earlier shells, though it works differently, such that a
simple $array usually does what you want?

Well, here's where the exception to that "usually" comes in.

Although the elements of an array expanded as simply $array don't get
generally re-split on whitespace the way an unquoted ${array[@]} or
${array[*]} does in bash, any elements that are completely empty do get
lost:

    zsh% array=(one two '' '' five )
    zsh% echo $#array
    5
    zsh% printf '%s\n' $array # King Arthur, is that you?
    one
    two
    five
    zsh%

So that's where you need the wonky inherited syntax if you want to include
empty values:

    zsh% printf '%s\n' "${array[@]}"
    one
    two


    five
    zsh%

Bart's question is also salient, however:

    > I have a file with blank lines in it, I read it into a variable

How do you read it in, and into what kind of variable? If you want to read
a file into an array of lines, preserving empty lines, you can do this:

    lines=("${(@f)$(<"$filename")}")

 The stuff in parentheses just inside the ${ are parameter expansion flags,
which you can read about on the zshexpn man page.  In particular, the f
flag splits the value on newlines, while the @ flag does the same thing as
"${array[@]}" (which could therefore also be written as "${(@)array}"), but
also works in other kinds of expansions, such as $(<filename) used here
(which means  "get the contents of the file").

On Fri, Jan 12, 2024 at 2:19 PM Bart Schaefer <schaefer@brasslantern.com>
wrote:

> On Fri, Jan 12, 2024 at 11:05 AM Ray Andrews <rayandrews@eastlink.ca>
> wrote:
> >
> > There's nothing harder than really getting on top of splitting issues :(
> >
> > I have a file with blank lines in it, I read it into a variable
>
> This is probably the place you're getting messed up.  HOW do you read
> it into a variable?
>
> > Other various attempts give me the number of elements being the
> > character count.  Weirdly there's places where I iterate over all the
> > lines in 'List' and it *counts* nine, but only displays five.
>
> Here you've probably done an earlier step right, but are forgetting to
> quote what you're passing to print (or whatever "display" means).  The
> default for arrays on the command line is to remove empty elements
> unless quoted.
>
>

-- 
Mark J. Reed <markjreed@gmail.com>


* Re: more splitting travails
  2024-01-12 19:56   ` Ray Andrews
@ 2024-01-12 20:07     ` Mark J. Reed
  0 siblings, 0 replies; 83+ messages in thread
From: Mark J. Reed @ 2024-01-12 20:07 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users


On Fri, Jan 12, 2024 at 2:57 PM Ray Andrews <rayandrews@eastlink.ca> wrote:

> % vvar=( ${"$(<testfile2)"} )
>

OK, that's interesting, you're using array assignment syntax ( vvar=( ... )
) but you're only assigning a single value; after that assignment, $vvar[1]
contains the entire contents of the file.  The blank lines aren't getting
lost here because they're inside that one string, rather than being their
own standalone values.

If you want to get the lines of the file into individual elements of an
array, you can use the (f) expansion flag to split the file contents on
newline. However, that will remove blank lines unless you also include the
(@) flag to preserve them.  Which gives the recipe I mentioned in my last
message:

    % vvar=("${(@f)"$(<testfile2)"}")

-- 
Mark J. Reed <markjreed@gmail.com>


* Re: Fwd: more splitting travails
  2024-01-12 20:03     ` Fwd: " Bart Schaefer
@ 2024-01-12 20:32       ` Ray Andrews
  2024-01-12 20:50         ` Roman Perepelitsa
  2024-01-12 20:51         ` Bart Schaefer
  0 siblings, 2 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-12 20:32 UTC (permalink / raw)
  To: zsh-users



On 2024-01-12 12:03, Bart Schaefer wrote:
> This was sent privately to me but pretty obviously intended for the list.
I got a private copy from Mark, but with a cc to the list.
>
>
> Well, here's where the exception to that "usually" comes in.
>
cut ... I'll need time to digest all this ...


Yesss.  When it's captured as one element I see the newlines because they 
are all there in the single element! ... it looks split but it isn't !! So I 
think I'm getting closer to what I want but actually I'm further away.

     % vvar=("${(@f)"$(<testfile2)"}")


1 /aWorking/Zsh/Source/Wk 0 % vvar=("${(@f)"$(<testfile2)"}") ; print -l "$vvar[@]"; print "\n ... and the number of elements is: ..... $#vvar \n Ta-Taaaa :-)
"
one two
three

four

five six seven


eight

  ... and the number of elements is: ..... 9
  Ta-Taaaa :-)

... and that's not as Byzantine as it looks: " (@f) : read: "include 
blank lines, and split on newlines (including empty lines) "   ... Yes?

Yes, more invisible differences.  Old printout and new printout look 
identical but internally they're chalk and cheese.




* Re: Fwd: more splitting travails
  2024-01-12 20:32       ` Ray Andrews
@ 2024-01-12 20:50         ` Roman Perepelitsa
  2024-01-13  2:12           ` Ray Andrews
  2024-01-12 20:51         ` Bart Schaefer
  1 sibling, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-12 20:50 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Fri, Jan 12, 2024 at 9:33 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> Yesss.  When it's captured as one element I see the newlines because they are all there in the single element! ... it looks split but it isn't !!

A reminder: https://www.zsh.org/mla/users/2023/msg00444.html

Forgetting basic tools that you can use to understand your own code is
making your life a lot more difficult than it could otherwise be. It
might be a good idea to print it on a piece of paper and stick it to a
wall. Or find some other way to stop stepping on the same rake over
and over.

Roman.



* Re: Fwd: more splitting travails
  2024-01-12 20:32       ` Ray Andrews
  2024-01-12 20:50         ` Roman Perepelitsa
@ 2024-01-12 20:51         ` Bart Schaefer
  2024-01-12 21:57           ` Mark J. Reed
  2024-01-13  2:19           ` Fwd: more splitting travails Ray Andrews
  1 sibling, 2 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-12 20:51 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Fri, Jan 12, 2024 at 12:32 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
>     % vvar=("${(@f)"$(<testfile2)"}")
>
> ... and that's not as Byzantine as it looks: " (@f) : read: "include blank lines, and split on newlines (including empty lines) "   ... Yes?

Close.  It's the inner double quotes around $(<...) that preserve the
exact contents of the file.**  (f) means split on newlines and (@)
means "this is an array, don't join it into a string".  Then the
outermost set of double quotes means "include empty array elements"
(which, because of (f), is empty lines in this case).

** Except that $(...) trims off TRAILING newlines, so if the last
line(s) of the file are blank they'll vanish.
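
The trimming is easy to see by comparing lengths. A minimal sketch follows, with a made-up temporary file and variable name; it uses only constructs that behave the same way in zsh and other POSIX-style shells.

```shell
# Illustrative only: tmp and captured are not names from the thread.
tmp=$(mktemp)
printf 'A\n\nB\n\n\n' > "$tmp"      # 7 bytes, ending in three newlines

captured=$(cat "$tmp")              # command substitution strips ALL trailing newlines
printf '%s\n' "${#captured}"        # prints 4: A, newline, newline, B

rm -f "$tmp"
```

The blank line in the middle survives; only the run of newlines at the very end is lost.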


* Re: Fwd: more splitting travails
  2024-01-12 20:51         ` Bart Schaefer
@ 2024-01-12 21:57           ` Mark J. Reed
  2024-01-12 22:09             ` Bart Schaefer
  2024-01-13  2:19           ` Fwd: more splitting travails Ray Andrews
  1 sibling, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-12 21:57 UTC (permalink / raw)
  To: zsh-users


On Fri, Jan 12, 2024 at 15:52 Bart Schaefer <schaefer@brasslantern.com>
wrote:

> ** Except that $(...) trims off TRAILING newlines, so if the last
> line(s) of the file are blank they'll vanish.


Ooh, good catch. I didn't realize *multiple* new lines would get chomped,
either; I thought it was just the last one.


* Re: Fwd: more splitting travails
  2024-01-12 21:57           ` Mark J. Reed
@ 2024-01-12 22:09             ` Bart Schaefer
  2024-01-13  3:06               ` Ray Andrews
  2024-01-13  5:39               ` Roman Perepelitsa
  0 siblings, 2 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-12 22:09 UTC (permalink / raw)
  To: zsh-users

On Fri, Jan 12, 2024 at 1:57 PM Mark J. Reed <markjreed@gmail.com> wrote:
>
> On Fri, Jan 12, 2024 at 15:52 Bart Schaefer <schaefer@brasslantern.com> wrote:
>>
>> ** Except that $(...) trims off TRAILING newlines
>
> I didn't realize *multiple* new lines would get chomped, either; I thought it was just the last one.

Yeah, oddly, there's no straightforward way to get an unaltered file
into a shell variable.  Even
  read -rd '' < file
trims off trailing newlines.  The only somewhat-obvious way is to use
  zmodload zsh/mapfile
  var=$mapfile[file]
but up until a recent dev version, on platforms that don't implement
mmap that just sneakily reverts to $(<file).

I'm expecting Roman or someone to point out a different trick I've forgotten.


* Re: Fwd: more splitting travails
  2024-01-12 20:50         ` Roman Perepelitsa
@ 2024-01-13  2:12           ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-13  2:12 UTC (permalink / raw)
  To: zsh-users


On 2024-01-12 12:50, Roman Perepelitsa wrote:
> Forgetting basic tools that you can use to understand your own code is
> making your life a lot more difficult than it could otherwise be. It
> might be a good idea to print it on a piece of paper and stick it to a
> wall. Or find some other way to stop stepping on the same rake over
> and over.

That's deserved.  I tend to see these things in the heat of some battle 
and I remember that there IS a way, but can't remember what it is.  Also 
I play with zsh in binges -- I can go a year without touching her, and I 
forget a whole lot that was only tenuously learned.  BTW, how does one 
search the archives?  I'd look there first, especially when I clearly 
remember that I had some issue before even if I don't remember how it 
resolved.   Just googling I often come across one of my own posts but a 
direct search in the archives would be best.  Meantime I'll have 
'typeset -p' tattooed onto the back of my hand.


* Re: Fwd: more splitting travails
  2024-01-12 20:51         ` Bart Schaefer
  2024-01-12 21:57           ` Mark J. Reed
@ 2024-01-13  2:19           ` Ray Andrews
  2024-01-13  3:59             ` Bart Schaefer
  1 sibling, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13  2:19 UTC (permalink / raw)
  To: zsh-users


On 2024-01-12 12:51, Bart Schaefer wrote:
> On Fri, Jan 12, 2024 at 12:32 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>>      % vvar=("${(@f)"$(<testfile2)"}")
>>
>> ... and that's not as Byzantine as it looks: " (@f) : read: "include blank lines, and split on newlines (including empty lines) "   ... Yes?
> Close.  It's the inner double quotes around $(<...) that preserve the
> exact contents of the file.**
Yes
> (f) means split on newlines
Yup
> and (@)
> means "this is an array, don't join it into a string".
There's a mistake: I thought the @ was what meant 'all lines'.
>    Then the
> outermost set of double quotes means "include empty array elements"
> (which, because of (f), is empty lines in this case).

Nuts, if the innermost quotes say 'exact contents' then that seems 
redundant.

And I thought the outermost parenthesis said 'this is an array', not the '@'

> ** Except that $(...) trims off TRAILING newlines, so if the last
> line(s) of the file are blank they'll vanish.

Easy when you know how.





* Re: Fwd: more splitting travails
  2024-01-12 22:09             ` Bart Schaefer
@ 2024-01-13  3:06               ` Ray Andrews
  2024-01-13  3:36                 ` Ray Andrews
  2024-01-13  5:39               ` Roman Perepelitsa
  1 sibling, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13  3:06 UTC (permalink / raw)
  To: zsh-users



On 2024-01-12 14:09, Bart Schaefer wrote:
> Yeah, oddly, there's no straightforward way to get an unaltered file
> into a shell variable.  Even
>    read -rd '' < file
> trims off trailing newlines.

In my situation that would be welcome, tho on principle I do wish the 
culture was: 'If I want something done to my data I'll ask for it'.

> I'm expecting Roman or someone to point out a different trick I've forgotten.

Roman would know.  I have my utility working quite well,   where the 
input isn't an array it's very simple:

     % varis variable

Without the dollar sign.  It evals it internally *after* capturing the 
name. You put the utility inside some function to trace variable values, 
it prints out like:


    function test1 ()
    {

    local my_variable="Shall I compare thee to a summer\'s day?"
    varis my_variable "some comment"
    }

    0 /aWorking/Zsh/Source/Wk 1 % . test1; test1

    ...

    test1():4 in: test1:6 > "$my_variable" is: |Shall I compare thee to a summer's day?| some comment - 18:33:06

    ... It reports name of function, logical line, running file,
    physical line, name of variable, content, some comment if present
    and the time.  I love it.  Anyway, it was all good except that
    arrays print on multiple lines if requested and they were not
    showing any blank lines, which I object to on principle -- so I'm
    trying to fix that.  It's 99% there:

    0 /aWorking/Zsh/Source/Wk 1 % print -l "$vvar[@]"     # What it really looks like.
    one two
    three

    four

    five six seven


    eight

    0 /aWorking/Zsh/Source/Wk 1 % varis ,m vvar ! this is comment     # squashed array, usually fine but ...

    zsh():15 in: zsh:15 > "$vvar" is:
    one two
    three
    four
    five six seven
    eight
    ! this is comment - 18:41:06

    0 /aWorking/Zsh/Source/Wk 1 % varis ,m "${(@f)vvar}" ! this is comment     # ... when I'm determined to see the blanks:

    zsh():16 in: zsh:16 > RAW:
    one two
    three

    four

    five six seven


    eight




    this is comment - 18:41:13


... as discussed I have to twist the shell's arm to get my file into a 
variable as it actually is.  In practice it's not a big deal but on 
principle it would be nice to avoid "  "${(@f) var}" "

This sure looks good:

    0 /aWorking/Zsh/Source/Wk 0 % read -rd '' < testfile2 aaa

    0 /aWorking/Zsh/Source/Wk 0 % print -l $aaa     # Sheesh, it's so simple here.
    one two
    three

    four

    five six seven


    eight



* Re: Fwd: more splitting travails
  2024-01-13  3:06               ` Ray Andrews
@ 2024-01-13  3:36                 ` Ray Andrews
  2024-01-13  4:07                   ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13  3:36 UTC (permalink / raw)
  To: zsh-users

On 2024-01-12 19:06, Ray Andrews wrote:

  I'm going to read more often, it's  good for the mind and the sanity:

1 /aWorking/Zsh/Source/Wk 0 % read -rd '' < testfile2 aaa

1 /aWorking/Zsh/Source/Wk 0 % print $aaa    # No " ${aaa[@]} " required!
one two
three

four

five six seven

eight

1 /aWorking/Zsh/Source/Wk 0 % varis ,m aaa ! this is comment    # I can pass the name of the array without the dollar, as I usually do, and:

zsh():327 in: zsh:327 > "$aaa" is:
one two
three

four

five six seven

eight
! this is comment - 19:27:02

... I get my blanks.  All this is how God meant it to be.  No grief.   
It couldn't be simpler.

And if 'read' had an option to keep even trailing blanks, one would have 
squeaky-clean literal, perfect copies.  It's a matter of principle.

Thank you gentlemen.


* Re: Fwd: more splitting travails
  2024-01-13  2:19           ` Fwd: more splitting travails Ray Andrews
@ 2024-01-13  3:59             ` Bart Schaefer
  2024-01-13  4:54               ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-13  3:59 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Fri, Jan 12, 2024 at 6:19 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2024-01-12 12:51, Bart Schaefer wrote:
> > Close.  It's the inner double quotes around $(<...) that preserve the
> > exact contents of the file.**
> >    Then the
> > outermost set of double quotes means "include empty array elements"
> > (which, because of (f), is empty lines in this case).
>
> Nuts, if the innermost quotes say 'exact contents' then that seems
> redundant.

The innermost quotes are undone by (@f).  Working from the inside out,
past the (f) you no longer have the exact contents.

> And I thought the outermost parenthesis said 'this is an array', not the '@'

The outermost parens say that you are ASSIGNING TO an array, not WHAT
you are assigning to it.



* Re: Fwd: more splitting travails
  2024-01-13  3:36                 ` Ray Andrews
@ 2024-01-13  4:07                   ` Bart Schaefer
  0 siblings, 0 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-13  4:07 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Fri, Jan 12, 2024 at 7:36 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
>   I'm going to read more often, it's  good for the mind and the sanity:
>
> 1 /aWorking/Zsh/Source/Wk 0 % read -rd '' < testfile2 aaa
>
> 1 /aWorking/Zsh/Source/Wk 0 % print $aaa    # No " ${aaa[@]} " required!

OK, but now you're back to having a scalar (non-array) variable that
has the newlines embedded in it. Apparently the whole thing with (f) and
an array was a snipe hunt?

> 1 /aWorking/Zsh/Source/Wk 0 % varis ,m aaa ! this is comment    # I can
> pass the name of the array without the dollar, as I usually do, and:

But ... aaa is not the name of an array, after the foregoing "read".
Or if it somehow is, we're back to an array of one element where that
element has all the newlines within.


* Re: Fwd: more splitting travails
  2024-01-13  3:59             ` Bart Schaefer
@ 2024-01-13  4:54               ` Ray Andrews
  2024-01-13  5:51                 ` Roman Perepelitsa
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13  4:54 UTC (permalink / raw)
  To: zsh-users



On 2024-01-12 19:59, Bart Schaefer wrote:
> The outermost parens say that you are ASSIGNING TO an array, not WHAT
> you are assigning to it.

Ah!  Output, not input.  Obvious now that you mention it.


> But ... aaa is not the name of an array, after the foregoing "read".
> Or if it somehow is, we're back to an array of one element where that
> element has all the newlines within.

Well, Roman will be glad to hear that I was just about to do a typeset -p on both versions to see the difference.  I'll have to think about it, but for now what matters is just that I see it correctly, even if that's via a cheat -- it shows the array as it is even if via a somewhat strange route.


0 /aWorking/Zsh/Source/Wk 0 % typeset -p vvar
typeset -a vvar=( 'one two' three '' four '' 'five six seven' '' '' eight )

0 /aWorking/Zsh/Source/Wk 0 % typeset -p aaa
typeset aaa=$'one two\nthree\n\nfour\n\nfive six seven\n\n\neight'

I won't forget again.



* Re: Fwd: more splitting travails
  2024-01-12 22:09             ` Bart Schaefer
  2024-01-13  3:06               ` Ray Andrews
@ 2024-01-13  5:39               ` Roman Perepelitsa
  2024-01-13 20:02                 ` Slurping a file (was: more spllitting travails) Bart Schaefer
  1 sibling, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-13  5:39 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-users

On Fri, Jan 12, 2024 at 11:09 PM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> Yeah, oddly, there's no straightforward way to get an unaltered file
> into a shell variable.  Even
>   read -rd '' < file
> trims off trailing newlines.  The only somewhat-obvious way is to use
>   zmodload zsh/mapfile
>   var=$mapfile[file]
> but up until a recent dev version, on platforms that don't implement
> mmap that just sneakily reverts to $(<file).
>
> I'm expecting Roman or someone to point out a different trick I've forgotten.

The standard trick here is to print an extra character after the
content of the file and then remove it. This works when capturing
stdout of commands, too.

    printf '\n\nA\n\nB\n\n' >file

    # read the FULL file content
    file_content=${"$(<file && print -n .)"%.} || return

    # read the FULL command stdout
    cmd_stdout=${"$(cat file && print -n .)"%.} || return

    typeset -p file_content cmd_stdout

Unfortunately, these constructs require an extra fork. In
performance-sensitive code the best solution is mapfile or sysread.
They have comparable performance but sysread has the advantage of
being able to read from any file descriptor rather than just a file.

    # Reads stdin until EOF and stores all read content in REPLY.
    # On error, leaves REPLY unmodified and returns 1.
    function slurp() {
      emulate -L zsh -o no_multibyte
      zmodload zsh/system || return
      local content
      while true; do
        sysread 'content[$#content+1]' && continue
        (( $? == 5 )) || return
        break
      done
      typeset -g REPLY=$content
    }

    printf '\n\nA\n\nB\n\n' >file

    slurp <file || return
    file_content=$REPLY

    slurp < <(cat file) || return
    cmd_stdout=$REPLY

    typeset -p file_content cmd_stdout

Roman.



* Re: Fwd: more splitting travails
  2024-01-13  4:54               ` Ray Andrews
@ 2024-01-13  5:51                 ` Roman Perepelitsa
  2024-01-13 16:40                   ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-13  5:51 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Sat, Jan 13, 2024 at 5:55 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> Well, Roman will be glad to hear that I was just about to do a typeset -p on both versions to see the difference.
>
> 0 /aWorking/Zsh/Source/Wk 0 % typeset -p vvar
> typeset -a vvar=( 'one two' three '' four '' 'five six seven' '' '' eight )
>
> 0 /aWorking/Zsh/Source/Wk 0 % typeset -p aaa
> typeset aaa=$'one two\nthree\n\nfour\n\nfive six seven\n\n\neight'
>
> I won't forget again.

I *am* glad.

Whenever I start  learning a new programming language, the very first
thing I always research is how to do "printf debugging" in it. This is
what the technique of inserting manual instrumentation in your code is
called. If you know how to printf debug, you can see what your
programs do, which in turn allows you to understand the programming
language by experimentation.

So, `typeset -p` isn't just one of the multitude of tricks that zsh
practitioners have in their toolbox. It's not in the same category as
"how to read the full content of a file" or "how to generate a secure
random number". No, it's a meta tool that lets you understand your own
code. You absolutely must remember it.
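
As a small illustration of that point, here is a zsh sketch in which two variables print identically yet are different things. The names vvar and aaa are borrowed from Ray's session, but the values are shortened stand-ins, not his actual file.

```shell
# zsh sketch; values are shortened stand-ins for the thread's test file.
vvar=( 'one two' three '' four )        # a real four-element array
aaa=$'one two\nthree\n\nfour'           # one scalar with embedded newlines

# Printed, the two are indistinguishable...
[[ "$(print -l -- "$vvar[@]")" == "$(print -r -- "$aaa")" ]] && print "same output"

# ...but typeset -p and the (t) flag show what each one really is.
typeset -p vvar aaa
print -r -- "vvar is ${(t)vvar}, aaa is ${(t)aaa}"
```

The (t) expansion flag reports the parameter's type as a string, which is handy inside a debugging helper that needs to decide how to display its argument.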

Roman.


* Re: Fwd: more splitting travails
  2024-01-13  5:51                 ` Roman Perepelitsa
@ 2024-01-13 16:40                   ` Ray Andrews
  2024-01-13 18:22                     ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13 16:40 UTC (permalink / raw)
  To: zsh-users


On 2024-01-12 21:51, Roman Perepelitsa wrote:
> I *am* glad.
> ... No, it's a meta tool that lets you understand your own
> code. You absolutely must remember it.

I quite understand!  To know the trade, the first thing is to know the 
tools.  Ironically my 'varis' is precisely a souped up: "printf 
debugging" tool and I will not endure that arrays with blanks will not 
show me the bleeding blanks. If I want the blanks unshown I will *ask* 
for them to be unshown. If I copy an array I want an exact duplicate 
*unless* I specify otherwise. If I'm typing a letter don't correct my 
spelling automatically, I'll *ask* for spellcheck if I want it. 
Sometimes you make a speling mestake on purpose. [/micro-rant]

As a self-taught zsheller working by myself I haven't the advantage of 
the 'up close and personal' chance to absorb culture and craft from 
experts like yourself. It's a bit like trying to learn Finnish over the 
internet talking to people who only speak Finnish when I myself don't 
yet know one word of it. But slowly, slowly things come clear. You guys 
are wonderfully patient. A single sentence can work wonders. Things you 
know so deeply you don't think they need to be mentioned. Eg. Bart 
saying that if it's not in quotes, or an expansion, then it's active. 
And if it is an expansion then '(P)' makes it active again. It blows 
away the fog. (And naturally there will be 17.02 exceptions -- 18.1 on 
Tuesday -- but it's a head start.)

BTW, speaking of tools, wouldn't it be nice to be able to step thru the 
code, one line at a time?  Could do that with C back in the day.


* Re: Fwd: more splitting travails
  2024-01-13 16:40                   ` Ray Andrews
@ 2024-01-13 18:22                     ` Bart Schaefer
  2024-01-13 19:08                       ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-13 18:22 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Sat, Jan 13, 2024 at 8:40 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> And if it is an expansion then '(P)' makes it active again.

That would be tilde, not (P).

> BTW, speaking of tools, wouldn't it be nice to be able to step thru the code, one line at a time?

http://rocky.github.io/zshdb/

Never really tried it myself, but the author has sought advice from
the lists on occasion.



* Re: Fwd: more splitting travails
  2024-01-13 18:22                     ` Bart Schaefer
@ 2024-01-13 19:08                       ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-13 19:08 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-users


On 2024-01-13 10:22, Bart Schaefer wrote:
> That would be tilde, not (P).
I was about to explore the difference.  I have them conflated in my head 
at the moment.
> http://rocky.github.io/zshdb/
> Never really tried it myself, but the author has sought advice from
> the lists on occasion.

I'll check it out.  Tx.

So, bottom line of this thread is that things have to be sent forward 
with the right massaging to make sure blanks are not removed: > 
"${(@f)vvar}" < ... trying to retrieve the blanks after the fact is 
impossible cuz they're gone.  Only thing left to wish for is some way of 
making it the default that all assignments will include *everything*.  
Oh, and don't be fooled by a single element that happens to have 
newlines within it -- they might look the same but they ain't.  > % 
read -rd '' < testfile2 aaa < ... is a false friend because '$aaa' 
will be a single element.  Likewise > : ${vvar2::=$vvar} < is a false 
friend because it loses blanks.   Seems a verbatim exact assignment is 
not really provided, which seems very strange.  But Roman has shown a 
solution.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Slurping a file (was: more spllitting travails)
  2024-01-13  5:39               ` Roman Perepelitsa
@ 2024-01-13 20:02                 ` Bart Schaefer
  2024-01-13 20:07                   ` Slurping a file Ray Andrews
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
  0 siblings, 2 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-13 20:02 UTC (permalink / raw)
  To: Zsh Users

On Fri, Jan 12, 2024 at 9:39 PM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> The standard trick here is to print an extra character after the
> content of the file and then remove it. This works when capturing
> stdout of commands, too.

This actually led me to the best (?) solution:

  IFS= read -rd '' file_content <file

If IFS is empty, newlines are not stripped.  Of course this still
only works if the file does not contain nul bytes; the -d delimiter
has to be something that's not in the file.  Roman's sysread approach
doesn't care (and sysread is exactly the thing I forgot that I was
expecting Roman to remind me of, although we both seem to have
forgotten IFS).

For commands,

  command | IFS= read -rd '' cmd_stdout

also works, thanks to zsh's fork-to-the-left semantics.  This would
not work in other shells.

>     # Reads stdin until EOF and stores all read content in REPLY.
>     # On error, leaves REPLY unmodified and returns 1.
>     function slurp() {
>       emulate -L zsh -o no_multibyte
>       zmodload zsh/system || return
>       local content
>       while true; do
>         sysread 'content[$#content+1]' && continue

You can speed this up a little by using the -c option to sysread to
get back a count of bytes read, and accumulate that in another var to
avoid having to re-calculate $#content on every loop.

>         (( $? == 5 )) || return
>         break
>       done
>       typeset -g REPLY=$content
>     }

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-13 20:02                 ` Slurping a file (was: more spllitting travails) Bart Schaefer
@ 2024-01-13 20:07                   ` Ray Andrews
  2024-01-14  5:03                     ` zcurses mouse delay (not Re: Slurping a file) Bart Schaefer
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
  1 sibling, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-13 20:07 UTC (permalink / raw)
  To: zsh-users


On 2024-01-13 12:02, Bart Schaefer wrote:


Hey, just a quick and dirty question on another topic:

playing with zcurses:

      zcurses mouse delay 2000

... I get an error msg saying that an integer value is required. No 
matter what number I throw in there I get the same msg. Manual:

--zcurses mouse [ delay num | [+|-]motion ]
The subcommand mouse can be used to configure the use of the mouse. There is
no window argument; mouse options are global.  `zcurses mouse' with no
arguments returns status 0 if mouse handling is possible, else status 1.
Otherwise, the possible arguments (which may be combined on the same command
line) are as follows. delay num sets the maximum delay in milliseconds
between press and release events to be considered as a click; the value
0 disables click resolution,

No number works, not even '0' ... but I'm sure that worked for me in the 
past.





^ permalink raw reply	[flat|nested] 83+ messages in thread

* zcurses mouse delay (not Re: Slurping a file)
  2024-01-13 20:07                   ` Slurping a file Ray Andrews
@ 2024-01-14  5:03                     ` Bart Schaefer
  2024-01-14  5:35                       ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-14  5:03 UTC (permalink / raw)
  To: Ray Andrews; +Cc: Zsh Users

Please don't reply/re-use an unrelated subject for a new question.

On Sat, Jan 13, 2024 at 12:07 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
>       zcurses mouse delay 2000
>
> ... I get an error msg saying that an integer value is required. No
> matter what number I throw in there I get the same msg.

As far as I can tell, either this has never worked, or something
unexpected changed in the implementation of zstrtol() sometime after
2007.

The test for successful read of a number is wrong, and there's also a
typo in the call to print an error message in the event of an
unrecognized mouse subcommand.

Patch going separately to zsh-workers.  Sorry, Ray, but you're out of
luck without a rebuild.  I'll Cc you on the patch, it should apply to
any zsh/curses module since zsh 4.3.5 of 2008-02-01.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: zcurses mouse delay (not Re: Slurping a file)
  2024-01-14  5:03                     ` zcurses mouse delay (not Re: Slurping a file) Bart Schaefer
@ 2024-01-14  5:35                       ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-14  5:35 UTC (permalink / raw)
  To: zsh-users

On 2024-01-13 21:03, Bart Schaefer wrote:
> Please don't reply/re-use an unrelated subject for a new question.
Pardon, thought I'd just sneak it in there.  Lazy.
> Patch going separately to zsh-workers.  Sorry, Ray, but you're out of
> luck without a rebuild.  I'll Cc you on the patch, it should apply to
> any zsh/curses module since zsh 4.3.5 of 2008-02-01.
>
It's time I started using the latest builds.  Usta do it, then I lost 
all  my notes during switch to new computer so now I'm with whatever 
Debian is offering (5.8).  BTW speaking of things that don't work -- 
from the manual:

- Any events that occurred as separate items; usually there will be just 
one.
An event consists of PRESSED, RELEASED, CLICKED, DOUBLE_CLICKED or 
TRIPLE_CLICKED

... AFAICT 'RELEASED' isn't there: a fast click reports 'CLICKED', a slow 
press (no release) reports 'PRESSED', and on release the 'mouse' array is 
null.  I worked around it, but since the 'fast click' time is said to be 
6ms by default, and I'm often a slow clicker, zcurses takes that as two 
separate mouse events -- which is why 'zcurses mouse delay' was of 
interest.  It's not hard to cope with but 'RELEASED' would seem more 
kosher than a null array.  Oh, and if it was ever of use -- say some 
notion of dragging something with the mouse -- the array's holding of 
the cursor position could be another reason the array should not be null 
on release.  We don't think of zsh as doing that kind of thing, but 
clicking and dragging something in a zcurses window could be an 
interesting party trick.  Watch out Thunar!

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-13 20:02                 ` Slurping a file (was: more spllitting travails) Bart Schaefer
  2024-01-13 20:07                   ` Slurping a file Ray Andrews
@ 2024-01-14 10:34                   ` Roman Perepelitsa
  2024-01-14 10:57                     ` Roman Perepelitsa
                                       ` (4 more replies)
  1 sibling, 5 replies; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-14 10:34 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Sat, Jan 13, 2024 at 9:02 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Fri, Jan 12, 2024 at 9:39 PM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> >
> > The standard trick here is to print an extra character after the
> > content of the file and then remove it. This works when capturing
> > stdout of commands, too.
>
> This actually led me to the best (?) solution:
>
>   IFS= read -rd '' file_content <file
>
> If IFS is empty, newlines are not stripped.  Of course this still
> only works if the file does not contain nul bytes; the -d delimiter
> has to be something that's not in the file.

In addition to being unable to read files with nul bytes, this
solution suffers from additional drawbacks:

- It's impossible to distinguish EOF from I/O error.
- It's slow when reading from non-file file descriptors.
- It's slower than the optimized sysread-based slurp (see below) for
larger files.

Conversely, sysread-based slurp can read the full content of any file
descriptor quickly and report success if and only if it manages to
read until EOF. Its only downside is that it can be up to 2x slower
for tiny files.

> >         sysread 'content[$#content+1]' && continue
>
> You can speed this up a little by using the -c option to sysread to
> get back a count of bytes read, and accumulate that in another var to
> avoid having to re-calculate $#content on every loop.

Indeed, this would be faster but the code would still have quadratic
time complexity. Here's a version with linear time complexity:

    function slurp() {
      emulate -L zsh -o no_multibyte
      zmodload zsh/system || return
      local -a content
      local -i i
      while true; do
        sysread 'content[++i]' && continue
        (( $? == 5 )) || return
        break
      done
      typeset -g REPLY=${(j::)content}
    }

(I am not certain it's linear. I've benchmarked it for files up to
512MB in size, and it is linear in practice.)

I've benchmarked read and slurp for reading files and pipes.

    emulate -L zsh -o pipe_fail -o no_multibyte
    zmodload zsh/datetime || return

    local -i i len

    function bench() {
      local REPLY
      local -F start end
      start=EPOCHREALTIME
      eval $1
      end=EPOCHREALTIME
      (( $#REPLY == len )) || return
      printf ' %10d' '1e6 * (end - start)' || return
    }

    printf '%2s %7s %10s %10s %10s %10s\n' \
           n size read-file slurp-file read-pipe slurp-pipe || return

    for ((i = 1; i != 26; ++i)); do
      len='i == 1 ? 0 : 1 << (i - 2)'
      head -c $len </dev/urandom | tr '\0' x >$i || return
      <$i >/dev/null || return

      printf '%2d %7d' i len || return

      # read-file
      bench 'IFS= read -rd "" <$i' || return

      # slurp-file
      bench 'slurp <$i || return' || return

      # read-pipe
      bench '<$i | IFS= read -rd ""' || return

      # slurp-pipe
      bench '<$i | slurp || return' || return

      print || return
    done

Here's the output (best viewed with a fixed-width font):

     n    size  read-file slurp-file  read-pipe slurp-pipe
     1       0         74        107       1908       2068
     2       1         52        126       2182       1931
     3       2         52        111       1863       2471
     4       4         65        150       2097       2028
     5       8         58        159       1849       2073
     6      16         61        118       1934       2089
     7      32         73        123       1867       2235
     8      64         73        120       2067       2033
     9     128        102        122       1904       2172
    10     256        129        115       2025       2114
    11     512        254        123       2070       2089
    12    1024        372        137       2441       2190
    13    2048        762        156       2624       2132
    14    4096       1306        177       3488       2500
    15    8192       2486        263       4446       2540
    16   16384       4718        390       6565       3140
    17   32768      13919        953      13524       4323
    18   65536      20965       1195      21532       5195
    19  131072      41741       2124     127089      11325
    20  262144      81777       4214     461189      12515
    21  524288     161077       8342    1068388      21149
    22 1048576     312015      16330    2321501      37422
    23 2097152     606270      31752    4773261      67625
    24 4194304    1291121      61298   10253544     154340
    25 8388608    2534093     135694   19551480     264041

The second column is the file size, ranging from 0 to 8MB. After that
we have four columns listing the amount of time it takes to read the
file in various ways, in microseconds.

Observations from the data:

- All routines appear to have linear time complexity.
- For small files, read is up to twice as fast as slurp.
- For files over 256 bytes in size, slurp is faster.
- With slurp, the time it takes to read from a pipe is about 2x
  compared to reading from a file. With read, the penalty is 8x.
- For an 8MB file, slurp is 20 times faster than read when reading
  from a file, and 70 times faster when reading from a pipe.

I am tempted to declare slurp the winner here.

Roman.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
@ 2024-01-14 10:57                     ` Roman Perepelitsa
  2024-01-14 15:36                     ` Slurping a file Ray Andrews
                                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-14 10:57 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Sun, Jan 14, 2024 at 11:34 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> I've benchmarked read and slurp for reading files and pipes.
>
> [...]
>
> Observations from the data:
>
> - All routines appear to have linear time complexity.
> - For small files, read is up to twice as fast as slurp.
> - For files over 256 bytes in size, slurp is faster.
> - With slurp, the time it takes to read from a pipe is about 2x
>   compared to reading from a file. With read, the penalty is 8x.
> - For an 8MB file, slurp is 20 times faster than read when reading
>   from a file, and 70 times faster when reading from a pipe.

I've also benchmarked mapfile. As expected, it is the fastest method
of reading a file. For small files, slurp is up to 5 times slower, but
for larger files the difference is rather small: for a 64kB file slurp
is only 25% slower.

Roman.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
  2024-01-14 10:57                     ` Roman Perepelitsa
@ 2024-01-14 15:36                     ` Ray Andrews
  2024-01-14 15:41                       ` Roman Perepelitsa
  2024-01-14 20:13                       ` Lawrence Velázquez
  2024-01-14 22:09                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
                                       ` (2 subsequent siblings)
  4 siblings, 2 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-14 15:36 UTC (permalink / raw)
  To: zsh-users


On 2024-01-14 02:34, Roman Perepelitsa wrote:

That's a master class.  I'm going to save that post and absorb it even 
if it takes a month.

> Indeed, this would be faster but the code would still have quadratic time complexity. Here's a version with linear time complexity:

Quadratic? In this kind of situation I'd understand linear, geometric 
and exponential. What's quadratic? Hmmm .... to guess ... well yeah, you 
must mean some combination -- an exponential vector + a geometric vector 
+ a linear vector, yes? My math skills aren't up to it, but they say 
there's a way of throwing a progression like those into an engine that 
returns the quadratic that best fits. Great fun if one could find it. 
Actually I did that once in Geogebra.

Please let me know when yourself and Bart have a final cut of slurp -- 
for a guy like me who thinks a copy should be a copy, not an edit, slurp 
seems an essential tool on principle. I can't understand how something 
so basic could not be built into the shell. Seems to me that at a first 
estimation one might want:

1) Full exact copy -- byte identical including blanks, newlines, 
trailing stuff and naughty chars. slurp.

2) Byte identical up to the last 'real' character -- yes, strip off any 
trailing garbage. This would be:

     % copy=( "${(@f)original}" )

... yes? Good enough 99% of the time.

3) As it is now -- no blanks. I myself have arrays in which blanks must 
be preserved ...

Hey ... How does all that work with associative arrays? It's one thing 
to remove a blank/empty element in a normal array, but in an A array ... 
even when there's no value, the keyword is still there, no? I'm thinking 
that A arrays must be auto-immune to removal of blank values, yes? But ...

     main[currentE]=1   # Absolute index of the highlighted element (returned
                        # to calling fce on ENTER or mouse click).
     main[hilighted]=1  # RELATIVE index of the highlighted element relative
                        # to top of current page.

     main[hilighted]=   # Don't even think about doing this!

... I've noticed that if initializing an A array, there *must* be a 
value otherwise the pairs go out of whack. So, really, there never is an 
empty value and the issue is moot.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-14 15:36                     ` Slurping a file Ray Andrews
@ 2024-01-14 15:41                       ` Roman Perepelitsa
  2024-01-14 20:13                       ` Lawrence Velázquez
  1 sibling, 0 replies; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-14 15:41 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Sun, Jan 14, 2024 at 4:37 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2024-01-14 02:34, Roman Perepelitsa wrote:
>
> > Indeed, this would be faster but the code would still have quadratic time complexity. Here's a version with linear time complexity:
>
> Quadratic? In this kind of situation I'd understand linear, geometric and exponential. What's quadratic?

O(n^2) where n is the file size.
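A rough sketch of where the square comes from, under two assumptions that are not measured anywhere in this thread: each sysread returns a chunk of at most c bytes, and appending to a scalar that already holds L bytes costs time proportional to L. For a file of n bytes read in m = n/c chunks, the string-append loop and the array-then-join loop then compare as:

```latex
T_{\text{append}}(n) \;\propto\; \sum_{k=1}^{m} k\,c \;=\; c\,\frac{m(m+1)}{2} \;\approx\; \frac{n^{2}}{2c},
\qquad
T_{\text{array}}(n) \;\propto\; \underbrace{\sum_{k=1}^{m} c}_{\text{store chunks}} \;+\; \underbrace{n}_{\text{one join}} \;=\; 2n .
```

So doubling the file size roughly quadruples the work in the first case and only doubles it in the second.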

> main[hilighted]= # Don't even think about doing this!
>
> ... I've noticed that if initializing an A array, there *must* be a value otherwise the pairs go out of whack.

You must be using `print $main` to see what's going on. Use a better tool.

Roman.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-14 15:36                     ` Slurping a file Ray Andrews
  2024-01-14 15:41                       ` Roman Perepelitsa
@ 2024-01-14 20:13                       ` Lawrence Velázquez
  2024-01-15  0:03                         ` Ray Andrews
  1 sibling, 1 reply; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-14 20:13 UTC (permalink / raw)
  To: zsh-users

On Sun, Jan 14, 2024, at 10:36 AM, Ray Andrews wrote:
>> Indeed, this would be faster but the code would still have quadratic time complexity. Here's a version with linear time complexity:
>
> Quadratic? In this kind of situation I'd understand linear, geometric 
> and exponential. What's quadratic? Hmmm .... to guess ... well yeah, 
> you must mean some combination -- an exponential vector + a geometric 
> vector + a linear vector, yes?

No.  Vectors have nothing to do with it.

https://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities

> I can't understand how 
> something so basic could not be built into the shell.

Unix tools are traditionally designed to work on text, one
(LF-delimited) line at a time, possibly split into fields based on
whitespace.  Using them differently can be a pain, if it's even
possible.

> Seems to me that 
> at a first estimation one might want: 
>
> 1) Full exact copy -- byte identical including blanks, newlines, 
> trailing stuff and naughty chars. slurp.

Reading an entire file into a variable is often -- not always, but
often -- a red flag that suggests the entire script is poorly
designed.  It uses more memory and makes it difficult to feed the
data to external utilities (although the latter is less of a concern
with zsh in particular, which relies on external utilities less
than other shells do).

> Hey ... How does all that work with associative arrays? It's one thing 
> to remove a blank/empty element in a normal array, but in an A array 
> ... even when there's no value, the keyword is still there, no? I'm 
> thinking that A arrays must be auto-immune to removal of blank values, 
> yes?

Elision of empty values is a property of unquoted expansion.  It
has NOTHING to do with variable types.

-- 
vq

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
  2024-01-14 10:57                     ` Roman Perepelitsa
  2024-01-14 15:36                     ` Slurping a file Ray Andrews
@ 2024-01-14 22:09                     ` Bart Schaefer
  2024-01-15  8:53                       ` Roman Perepelitsa
  2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
  2024-02-10 20:48                     ` Stephane Chazelas
  4 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-14 22:09 UTC (permalink / raw)
  To: Zsh Users

On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> On Sat, Jan 13, 2024 at 9:02 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> >   IFS= read -rd '' file_content <file
>
> In addition to being unable to read files with nul bytes, this
> solution suffers from additional drawbacks:
>
> - It's impossible to distinguish EOF from I/O error.

Pretty sure you can do that by examining $ERRNO on nonzero status?

> - It's slow when reading from non-file file descriptors.
> - It's slower than the optimized sysread-based slurp (see below) for
> larger files.

I'm curious whether
  setopt nomultibyte
  read -u 0 -k 8192 ...
is actually that much slower in a slurp-like loop.

> Here's a version with linear time complexity:
>
>     function slurp() {
>       emulate -L zsh -o no_multibyte
>       zmodload zsh/system || return
>       local -a content
>       local -i i
>       while true; do
>         sysread 'content[++i]' && continue

Another thought:  Use -c count option to get number of bytes read and
-s $size option to specify buffer size.  If (( $count == $size )) then
double $size for the next read.

>         (( $? == 5 )) || return
>         break
>       done
>       typeset -g REPLY=${(j::)content}

Why the typeset here?  Just assign?

>     }
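One way the -c / -s suggestion above might be sketched. This is an untested-for-speed variant, not the version that was benchmarked in this thread; the name `slurp_grow` and the local names `count`, `size` and `rc` are made up for illustration.

```zsh
# Sketch only: like slurp, but asks sysread for a larger buffer each time
# the previous read filled the buffer completely.
function slurp_grow() {
  emulate -L zsh -o no_multibyte
  zmodload zsh/system || return
  local -a content
  local -i i count rc size=8192
  while true; do
    sysread -c count -s $size 'content[++i]'
    rc=$?
    if (( rc == 0 )); then
      # A completely filled buffer suggests more data is waiting.
      (( count == size )) && (( size *= 2 ))
    elif (( rc == 5 )); then
      break                 # status 5 from sysread means end of file
    else
      return $rc            # any other status is treated as an error
    fi
  done
  typeset -g REPLY=${(j::)content}
}

# Usage: the last stage of a pipeline runs in the current shell in zsh,
# so REPLY is still available afterwards.
print -rn -- 'hello' | slurp_grow && print -r -- "got $#REPLY chars"
```

Whether growing the buffer actually helps would need to be measured; reads from a pipe often return fewer bytes than requested, in which case the size never grows.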


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-14 20:13                       ` Lawrence Velázquez
@ 2024-01-15  0:03                         ` Ray Andrews
  2024-01-15  0:55                           ` Empty element elision and associative arrays (was Re: Slurping a file) Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-15  0:03 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1011 bytes --]


On 2024-01-14 12:13, Lawrence Velázquez wrote:
> https://en.wikipedia.org/wiki/Time_complexity#Table_of_common_time_complexities
Yikes!  I'm sorry I asked.  Thanks tho, but that's scary stuff.
>> Seems to me that
>> at a first estimation one might want:
>>
>> 1) Full exact copy -- byte identical including blanks, newlines,
>> trailing stuff and naughty chars. slurp.
> Reading an entire file into a variable is often -- not always, but
> often -- a red flag that suggests the entire script is poorly
> designed.  It uses more memory and makes it difficult to feed the
> data to external utilities (although the latter is less of a concern
> with zsh in particular, which relies on external utilities less
> than other shells do).

I appreciate the historical context.  Much is understandable when you know
the history.

> Elision of empty values is a property of unquoted expansion.  It
> has NOTHING to do with variable types.

I'll take another look, thanks.  I may have gotten my wires crossed.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  0:03                         ` Ray Andrews
@ 2024-01-15  0:55                           ` Bart Schaefer
  2024-01-15  4:09                             ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-15  0:55 UTC (permalink / raw)
  To: Ray Andrews; +Cc: Zsh Users

On Sun, Jan 14, 2024 at 4:03 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
>> Elision of empty values is a property of unquoted expansion.  It
>> has NOTHING to do with variable types.
>
> I'll take another look, thanks.  I may have gotten my wires crossed.

When assigning to an associative array ...

% typeset -A asc=( $something )

... the expansion of $something has to yield an even number of
"words", yes.  But remember the previous lesson about those outer
parens -- they only mean that the thing to the left is an array, they
don't matter to what's inside those parens.  So if $something is an
array with empty elements, those elements are going to be elided,
which might leave you with an odd number of words and break the
assignment -- or possibly worse, turn some values into keys and some
keys into values.

Another thing that may be confusing you is that associative arrays
only expand their values by default, and an empty value will be elided
just like any other empty array element.

% typeset -A asc=( one 1 two 2 three 3 nil '' )
% typeset -p asc
typeset -A asc=( [nil]='' [one]=1 [three]=3 [two]=2 )
% printf "<<%s>>\n" ${asc}
<<1>>
<<2>>
<<3>>
% printf "<<%s>>\n" "${asc[@]}"
<<1>>
<<2>>
<<3>>
<<>>

(output may vary because associative arrays are not ordered).

Also, an empty key can have a value:

% asc+=( '' empty )
% typeset -p asc
typeset -A asc=( ['']=empty [nil]='' [one]=1 [three]=3 [two]=2 )
% printf "<<%s>>\n" ${asc}
<<empty>>
<<1>>
<<2>>
<<3>>

But if you reference the keys, the empty one will be elided when not quoted:

% printf "<<%s>>\n" ${(k)asc}
<<one>>
<<two>>
<<three>>
<<nil>>
% printf "<<%s>>\n" "${(@k)asc}"
<<>>
<<one>>
<<two>>
<<three>>
<<nil>>

Returning to the original example, that means if  you're copying one
associative array to another, you need to copy both the keys and the
values, and quote it:

% typeset -A asc=( "${(@kv)otherasc}" )

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
                                       ` (2 preceding siblings ...)
  2024-01-14 22:09                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
@ 2024-01-15  2:00                     ` Bart Schaefer
  2024-01-15  4:24                       ` Slurping a file Ray Andrews
                                         ` (2 more replies)
  2024-02-10 20:48                     ` Stephane Chazelas
  4 siblings, 3 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-01-15  2:00 UTC (permalink / raw)
  To: Roman Perepelitsa; +Cc: Zsh Users

On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> I've benchmarked read and slurp for reading files and pipes.

Sadly there's another utility named "slurp":

slurp
  cli utility to select a region in a Wayland compositor

If we can find a different function name, this would be a good
candidate to add to Functions/Misc/.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  0:55                           ` Empty element elision and associative arrays (was Re: Slurping a file) Bart Schaefer
@ 2024-01-15  4:09                             ` Ray Andrews
  2024-01-15  7:01                               ` Lawrence Velázquez
  2024-01-18 22:34                               ` Bart Schaefer
  0 siblings, 2 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-15  4:09 UTC (permalink / raw)
  To: zsh-users


On 2024-01-14 16:55, Bart Schaefer wrote:
> On Sun, Jan 14, 2024 at 4:03 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
> When assigning to an associative array ...
>
> % typeset -A asc=( $something )
>
> ... the expansion of $something has to yield an even number of
> "words", yes.  But remember the previous lesson about those outer
> parens -- they only mean that the thing to the left is an array, they
> don't matter to what's inside those parens.  So if $something is an
> array with empty elements, those elements are going to be elided,
> which might leave you with an odd number of words and break the
> assignment -- or possibly worse, turn some values into keys and some
> keys into values.

Exactly as I understand it.  A arrays are 'not very clever' -- they 
don't try to protect you from yourself, there's no internal place 
holding for a missing value, and as you say things must be kept to pairs.

> (output may vary because associative arrays are not ordered).
I've noticed that.  One might think that the order of assignment would 
be 'the order' by inevitability, but that seems not to be the case.  I 
don't understand how it could be otherwise, but never mind.
> Returning to the original example, that means if  you're copying one
> associative array to another, you need to copy both the keys and the
> values, and quote it:
>
> % typeset -A asc=( "${(@kv)otherasc}" )

You said something the other day that was important: the parens do not 
say 'I am an array', they say 'turn me into an array'.  It's one of those 
things that must be clear.

Yeah, I'm getting somewhat competent with A's.  They're not very 
forgiving but the rules aren't that hard to remember.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
@ 2024-01-15  4:24                       ` Ray Andrews
  2024-01-15  6:56                         ` Lawrence Velázquez
  2024-01-15  7:26                       ` Slurping a file (was: more spllitting travails) Lawrence Velázquez
  2024-01-15 13:13                       ` Slurping a file (was: more spllitting travails) Marc Chantreux
  2 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-15  4:24 UTC (permalink / raw)
  To: zsh-users


On 2024-01-14 18:00, Bart Schaefer wrote:
> Sadly there's another utility named "slurp":

how about Xact or eggsact or egzact or dooplikate or cohpee or kopy or Rcopy or zcopy.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15  4:24                       ` Slurping a file Ray Andrews
@ 2024-01-15  6:56                         ` Lawrence Velázquez
  2024-01-15 14:37                           ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-15  6:56 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Sun, Jan 14, 2024, at 11:24 PM, Ray Andrews wrote:
> On 2024-01-14 18:00, Bart Schaefer wrote:
>> Sadly there's another utility named "slurp":
>
> how about Xact or eggsact or egzact or dooplikate or cohpee or kopy or Rcopy or zcopy.

Please stop changing the "Subject" header when you reply within
existing discussions.  It is screwing up threading in one of my
MUAs.

This is the other side of the coin Bart flipped earlier:

> Please don't reply/re-use an unrelated subject for a new question.

-- 
vq


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  4:09                             ` Ray Andrews
@ 2024-01-15  7:01                               ` Lawrence Velázquez
  2024-01-15 14:47                                 ` Ray Andrews
  2024-01-18 16:20                                 ` Mark J. Reed
  2024-01-18 22:34                               ` Bart Schaefer
  1 sibling, 2 replies; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-15  7:01 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Sun, Jan 14, 2024, at 11:09 PM, Ray Andrews wrote:
> On 2024-01-14 16:55, Bart Schaefer wrote:
>> (output may vary because associative arrays are not ordered).
>
> I've noticed that.  One  might think that the order of assignment would be 'the order' by inevitability, but that seems not to be the case.  I don't understand how it could be otherwise but nevermind.

Consider learning about how associative arrays are typically
implemented.  Then you would see that there is no "inevitable"
iteration order; one must usually be chosen and implemented
explicitly, which is extra complexity that many languages choose
not to bother with.

https://en.wikipedia.org/wiki/Associative_array#Ordered_dictionary

-- 
vq


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
  2024-01-15  4:24                       ` Slurping a file Ray Andrews
@ 2024-01-15  7:26                       ` Lawrence Velázquez
  2024-01-15 14:48                         ` Slurping a file Ray Andrews
  2024-01-15 13:13                       ` Slurping a file (was: more spllitting travails) Marc Chantreux
  2 siblings, 1 reply; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-15  7:26 UTC (permalink / raw)
  To: Bart Schaefer, Roman Perepelitsa; +Cc: zsh-users

On Sun, Jan 14, 2024, at 9:00 PM, Bart Schaefer wrote:
> Sadly there's another utility named "slurp":
>
> slurp
>   cli utility to select a region in a Wayland compositor
>
> If we can find a different function name, this would be a good
> candidate to add to Functions/Misc/.

Quick sleepy brainstorm:

SERIOUS DIVISION

readall - sensible, moderately self-descriptive
readfile - sounds like it can only be used with files
readinput - not obvious how this differs from read(1)

GOOFY DIVISION

chug
guzzle
inhale
quaff
siphon
swig - would probably be confused with SWIG
vacuum

-- 
vq


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-14 22:09                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
@ 2024-01-15  8:53                       ` Roman Perepelitsa
  2024-01-16 19:57                         ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-15  8:53 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Sun, Jan 14, 2024 at 11:10 PM Bart Schaefer
<schaefer@brasslantern.com> wrote:
>
> On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> >
> > On Sat, Jan 13, 2024 at 9:02 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> > >
> > >   IFS= read -rd '' file_content <file
> >
> > In addition to being unable to read files with nul bytes, this
> > solution suffers from additional drawbacks:
> >
> > - It's impossible to distinguish EOF from I/O error.
>
> Pretty sure you can do that by examining $ERRNO on nonzero status?

I wouldn't do that other than for debugging. In general, you can
examine errno only for functions that explicitly document how they set
it. If this part is not documented, you have to assume the function
may set errno to anything both on success and on error. Also, most
libc functions may set errno to anything on success.

In this specific case perhaps `read` calls `malloc` after an I/O
error, which may trash errno. Or perhaps at the end of `read <file`
the file descriptor is closed, which again may trash errno. I haven't
verified either of these things. I am merely suggesting why `read`
conceivably could fail to propagate errno from an I/O error in the
absence of explicit guarantees in the docs.

> I'm curious whether
>   setopt nomultibyte
>   read -u 0 -k 8192 ...
> is actually that much slower in a slurp-like loop.

It is slightly *faster*. For smaller files the difference is about
25%. From 512KB and up there is no discernible difference.

> Another thought:  Use -c count option to get number of bytes read and
> -s $size option to specify buffer size.  If (( $count == $size )) then
> double $size for the next read.

This does not seem to help, although this might be dependent on the
device and filesystem. Here's a benchmark for various file sizes
(rows) and various fixed buffer sizes (columns):

     n   fsize    1KB    2KB     4KB    8KB   16KB   32KB   64KB
     1       0     41     43      43     43     51     52     53
     2       1     47     48      49     48     57     57     59
     3       2     48     48      48     48     56     57     58
     4       4     49     49      48     49     62     61     59
     5       8     74     75      51     49     62     61     63
     6      16     47     51      49     49     57     61     63
     7      32     47     50      49     50     58     58     59
     8      64     54     53      49     50     59     58     71
     9     128     50     50      51     51     59     60     61
    10     256     49     52      51     51     60     61     63
    11     512     53     55      55     54     64     64     65
    12    1024     58     61      60     61     57     68     71
    13    2048     77     72      71     74     83     83     83
    14    4096    112    102      88     89    107    100    108
    15    8192    188    153     152    145    161    163    140
    16   16384    343    290     270    259    265    240    225
    17   32768    658    577     427    471    499    495    489
    18   65536   1281   1082     983    771    938    827    937
    19  131072   2659   2214    2046   1952   1893   1928   1506
    20  262144   4818   4608    4195   4254   3810   3955   3043
    21  524288  10174   8967    7502   6382   7632   6142   7148
    22 1048576  21591  18205   16424  15691  15243  14327  14889
    23 2097152  41156  36087   32731  31840  30104  30090  29913
    24 4194304  89814  72949   66447  62716  60998  60252  59485
    25 8388608 191579 147195  125987 116327 121544 122384 122631

4KB and 8KB buffers perform best in this benchmark across all file
sizes. Given that 8KB is the default for sysread, there is no apparent
reason to use `-s`.

> >       typeset -g REPLY=${(j::)content}
>
> Why the typeset here?  Just assign?

Just a habit from using warn_create_global in my scripts. It catches
typos and missing `local` declarations quite well.

> Sadly there's another utility named "slurp":
>
> slurp
>   cli utility to select a region in a Wayland compositor

That's too bad: "slurp" is a well-known moniker for reading the full
content of a file (https://www.google.com/search?q=file+slurp).

Perhaps zslurp?

Roman.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
  2024-01-15  4:24                       ` Slurping a file Ray Andrews
  2024-01-15  7:26                       ` Slurping a file (was: more spllitting travails) Lawrence Velázquez
@ 2024-01-15 13:13                       ` Marc Chantreux
  2 siblings, 0 replies; 83+ messages in thread
From: Marc Chantreux @ 2024-01-15 13:13 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Roman Perepelitsa, Zsh Users

On Sun, Jan 14, 2024 at 06:00:57PM -0800, Bart Schaefer wrote:
> On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> Sadly there's another utility named "slurp"

both are badly named as they are too generic. that's why they can
collide.

> slurp
>   cli utility to select a region in a Wayland compositor

which should be renamed wayland-slurp-region

> If we can find a different function name, this would be a good
> candidate to add to Functions/Misc/.

file-slurp ?

regards,
-- 
Marc Chantreux
Pôle CESAR (Calcul et services avancés à la recherche)
Université de Strasbourg
14 rue René Descartes,
BP 80010, 67084 STRASBOURG CEDEX
03.68.85.60.79



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15  6:56                         ` Lawrence Velázquez
@ 2024-01-15 14:37                           ` Ray Andrews
  2024-01-15 15:10                             ` Marc Chantreux
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-15 14:37 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 412 bytes --]


On 2024-01-14 22:56, Lawrence Velázquez wrote:
> Please stop changing the "Subject" header when you reply within
> existing discussions.  It is screwing up threading in one of my
> M
So if I want a new subject, it's not sufficient to change the Subject?  
I mean it must be a fresh post, not a reply?  But I hardly ever do that 
anyway, and not recently.  If it just happened it wasn't me.  Please 
advise.

[-- Attachment #2: Type: text/html, Size: 857 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  7:01                               ` Lawrence Velázquez
@ 2024-01-15 14:47                                 ` Ray Andrews
  2024-01-18 16:20                                 ` Mark J. Reed
  1 sibling, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-15 14:47 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 912 bytes --]

On 2024-01-14 23:01, Lawrence Velázquez wrote:
> Consider learning about how associative arrays are typically
> implemented.  Then you would see that there is no "inevitable"
> iteration order; one must usually be chosen and implemented
> explicitly, which is extra complexity that many languages choose
> not to bother with.

It would take a deep dive into the subject that I'm not competent to 
understand anyway.  I take it on faith that it's more trouble than it's 
worth.  The naive view would be that the array ends up as a block of 
data that has a beginning and an end and so when one starts reading one 
will naturally start at the beginning.  But I suspect that it's far more 
complicated and I don't need to know and wouldn't understand.  Linked 
lists or something.  Last write = first read maybe.  Doesn't matter.

>
> https://en.wikipedia.org/wiki/Associative_array#Ordered_dictionary
>

[-- Attachment #2: Type: text/html, Size: 1681 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15  7:26                       ` Slurping a file (was: more spllitting travails) Lawrence Velázquez
@ 2024-01-15 14:48                         ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-15 14:48 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 400 bytes --]


On 2024-01-14 23:26, Lawrence Velázquez wrote:
> Quick sleepy brainstorm:
>
> SERIOUS DIVISION
>
> readall - sensible, moderately self-descriptive
> readfile - sounds like it can only be used with files
> readinput - not obvious how this differs from read(1)
>
> GOOFY DIVISION
>
> chug
> guzzle
> inhale
> quaff
> siphon
> swig - would probably be confused with SWIG
> vacuum
>
verbatim

veritas


[-- Attachment #2: Type: text/html, Size: 851 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15 14:37                           ` Ray Andrews
@ 2024-01-15 15:10                             ` Marc Chantreux
  2024-01-15 15:29                               ` Mark J. Reed
  2024-01-16  7:23                               ` Slurping a file Lawrence Velázquez
  0 siblings, 2 replies; 83+ messages in thread
From: Marc Chantreux @ 2024-01-15 15:10 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

hello Ray,

On Mon, Jan 15, 2024 at 06:37:43AM -0800, Ray Andrews wrote:
>     Please stop changing the "Subject" header when you reply within
>     existing discussions.  It is screwing up threading in one of my
>     M

MUA ?

Well, on the contrary, I really appreciate when people change the subject
when a part of the thread opens a new topic that is related to the
original question.

* In-Reply-To header should be used by your MUA to build the thread
* You can rely on the subject to keep track of the open subtopic

which is really convenient but requires using a decent MUA
(mutt, thunderbird, … you name it)

> So if I want a new subject, it's not sufficient to change the
> Subject? I mean it must be a fresh post, not a reply?

If you want your MUA to lose track of the thread, just remove
the In-Reply-To header.

regards,

-- 
Marc Chantreux
Pôle CESAR (Calcul et services avancés à la recherche)
Université de Strasbourg
14 rue René Descartes,
BP 80010, 67084 STRASBOURG CEDEX
03.68.85.60.79



^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15 15:10                             ` Marc Chantreux
@ 2024-01-15 15:29                               ` Mark J. Reed
  2024-01-15 16:16                                 ` Marc Chantreux
  2024-01-16  7:23                               ` Slurping a file Lawrence Velázquez
  1 sibling, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-15 15:29 UTC (permalink / raw)
  To: Marc Chantreux; +Cc: Ray Andrews, zsh-users

[-- Attachment #1: Type: text/plain, Size: 673 bytes --]

On Mon, Jan 15, 2024 at 10:10 AM Marc Chantreux <mc@unistra.fr> wrote:

> MUA ?

MUA = Mail User Agent, a.k.a. an email client: a program you use to compose
and read email. Apple Mail, Microsoft Outlook, Mozilla Thunderbird, old
school programs like mutt, ELM, MH, /bin/mail or Mail/mailx. Of course for
many of us our MUA is not a standalone program but a web app running in a
browser (e.g. GMail), but that still counts.

The MUA is specifically distinguished from the Mail *Transfer* Agent (MTA)
which is responsible for actually getting the email where it needs to go;
Microsoft Exchange and sendmail are MTAs.

--
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1134 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15 15:29                               ` Mark J. Reed
@ 2024-01-15 16:16                                 ` Marc Chantreux
  2024-01-15 16:33                                   ` MUAs (was: Re: Slurping a file) zeurkous
  0 siblings, 1 reply; 83+ messages in thread
From: Marc Chantreux @ 2024-01-15 16:16 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: Ray Andrews, zsh-users

hello,

On Mon, Jan 15, 2024 at 10:29:16AM -0500, Mark J. Reed wrote:
>     MUA ?

I was asking if the M in the message from Ray was meant to be MUA.

> programs like mutt, ELM, MH, /bin/mail or Mail/mailx. Of course for
> many of us our MUA is not a standalone program but a web app running
> in a browser (e.g.  GMail), but that still counts.

I really don't think so:

* webmail clients are really painful to use and lack a lot of features
  available from a classic MUA. I personally still use mutt
  (with vim as editor and maildir-utils to index and search)
  just because the "modern" replacements are far from providing all
  the things I find valuable when writing mail.
* I just wrote a script to show the main MUA used by the authors
  of the messages to the list I haven't deleted. the result is:

	1 822_dng.
	1 Cyrus-JMAP
	1 Evolution
	1 ForteAgent
	1 K-9
	6 Mozilla
	2 Mutt

I don't know 822_dng so I removed it from the list. So according to
my very small panel, only 1/12th of us use webmail.

humanity isn't that screwed after all :)

regards,
marc

-- 
Marc Chantreux
Pôle CESAR (Calcul et services avancés à la recherche)
Université de Strasbourg
14 rue René Descartes,
BP 80010, 67084 STRASBOURG CEDEX
03.68.85.60.79

^ permalink raw reply	[flat|nested] 83+ messages in thread

* MUAs (was: Re: Slurping a file)
  2024-01-15 16:16                                 ` Marc Chantreux
@ 2024-01-15 16:33                                   ` zeurkous
  0 siblings, 0 replies; 83+ messages in thread
From: zeurkous @ 2024-01-15 16:33 UTC (permalink / raw)
  To: Marc Chantreux, Mark J. Reed; +Cc: Ray Andrews, zsh-users

Haai,

On Mon, 15 Jan 2024 17:16:19 +0100, Marc Chantreux <mc@unistra.fr> wrote:
> I don't know 822_dng so I removed it from the list. So according to
> my very small panel, only 1/12th of us use webmail.

It's not webmail, so make that 1/13th. 

> humanity isn't that screwed after all :)

:)

        --zeurkous.

-- 
Friggin' Machines!


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-15 15:10                             ` Marc Chantreux
  2024-01-15 15:29                               ` Mark J. Reed
@ 2024-01-16  7:23                               ` Lawrence Velázquez
  2024-01-16 14:37                                 ` Ray Andrews
  1 sibling, 1 reply; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-16  7:23 UTC (permalink / raw)
  To: zsh-users

On Mon, Jan 15, 2024, at 10:10 AM, Marc Chantreux wrote:
> Well, on the contrary, I really appreciate when people change the subject
> when a part of the thread opens a new topic that is related to the
> original question

I agree.  What I meant to refer to (but didn't make clear, my bad)
was changing the subject when the topic is NOT new.  Ray has done
that several times in this thread, but Bart is surely correct about
this being Thunderbird's fault.  (The possibility didn't occur to
me.)

-- 
vq


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16  7:23                               ` Slurping a file Lawrence Velázquez
@ 2024-01-16 14:37                                 ` Ray Andrews
  2024-01-17  3:50                                   ` Lawrence Velázquez
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-16 14:37 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1076 bytes --]


On 2024-01-15 23:23, Lawrence Velázquez wrote:
> I agree.  What I meant to refer to (but didn't make clear, my bad)
> was changing the subject when the topic is NOT new.  Ray has done
> that several times in this thread, but Bart is surely correct about
> this being Thunderbird's fault.  (The possibility didn't occur to
> me.)
Years ago I was unaware it was an issue, then I did it knowingly once or 
twice but got slapped down for it, but it's been a very long time since 
I did anything like touch the Subject line in any way.  I don't know 
anything about how threading works but if there's something I should 
keep an eye on I will.  BTW there's a perhaps related issue: I almost 
always get two  copies of every post, one will have the option of 'reply 
to list' the other: 'reply to all'. I'm in the habit of automatically 
deleting the second -- which, now that I think about it, tends to come 
first.  I'd prefer to only get one copy.  Also, if I reply to the one 
instead of the other, something changes but honestly I haven't bothered 
to pay any attention.
>

[-- Attachment #2: Type: text/html, Size: 1710 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-15  8:53                       ` Roman Perepelitsa
@ 2024-01-16 19:57                         ` Bart Schaefer
  2024-01-16 20:07                           ` Slurping a file Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-16 19:57 UTC (permalink / raw)
  To: Zsh Users

On Mon, Jan 15, 2024 at 12:53 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> Perhaps zslurp?

Sure.  I was going to suggest "zlurp" but it looks odder written down
than thought about.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 19:57                         ` Bart Schaefer
@ 2024-01-16 20:07                           ` Ray Andrews
  2024-01-16 20:14                             ` Roman Perepelitsa
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-16 20:07 UTC (permalink / raw)
  To: zsh-users


On 2024-01-16 11:57, Bart Schaefer wrote:
> On Mon, Jan 15, 2024 at 12:53 AM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
>> Perhaps zslurp?
> Sure.  I was going to suggest "zlurp" but it looks odder written down
> than thought about.
Nothing wrong with something light-hearted just so long as it's 
intuitive what it does.  'zlurp' is a bridge too far IMHO.  The thing 
makes an exact copy, what's the word for that?  Sorta why I liked 'xact' 
or 'xcopy' -- but that's taken, so 'zcopy'.
>


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 20:07                           ` Slurping a file Ray Andrews
@ 2024-01-16 20:14                             ` Roman Perepelitsa
  2024-01-16 20:38                               ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-16 20:14 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Tue, Jan 16, 2024 at 9:08 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
>
> On 2024-01-16 11:57, Bart Schaefer wrote:
> > On Mon, Jan 15, 2024 at 12:53 AM Roman Perepelitsa
> > <roman.perepelitsa@gmail.com> wrote:
> >> Perhaps zslurp?
> > Sure.  I was going to suggest "zlurp" but it looks odder written down
> > than thought about.
> Nothing wrong with something light-hearted just so long as it's
> intuitive what it does.  'zlurp' is a bridge too far IMHO.  The thing
> makes an exact copy, what's the word for that?  Sorta why I liked 'xact'
> or 'xcopy' -- but that's taken, so 'zcopy'.

When I see "slurp", I know exactly what it does: reads a full file
into a string. If you don't get the same immediate reaction, you
really should: google "file slurp" and see that it's the way this
facility is called in many programming languages.

Now that "slurp" is taken by an unrelated command, "zslurp" is an
obvious alternative to go along with zstat, zselect, zcalc, etc. Note
that it's zstat and zselect and not ztat and zelect, hence zslurp and
not zlurp.

Roman.

P.S.

When I see "xcopy", I also know what it does but it's unrelated. Also,
"copy" for a file is a very different operation.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 20:14                             ` Roman Perepelitsa
@ 2024-01-16 20:38                               ` Ray Andrews
  2024-01-16 20:43                                 ` Roman Perepelitsa
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-16 20:38 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]


On 2024-01-16 12:14, Roman Perepelitsa wrote:
> When I see "slurp", I know exactly what it does: reads a full file
> into a string. If you don't get the same immediate reaction, you
> really should: google "file slurp" and see that it's the way this
> facility is called in many programming languages.
>
> Now that "slurp" is taken by an unrelated command, "zslurp" is an
> obvious alternative

So it is.  Not 'zlurp' but 'zslurp' -- that fixes it.  Besides, it's 
your baby and you should make the call.



[-- Attachment #2: Type: text/html, Size: 975 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 20:38                               ` Ray Andrews
@ 2024-01-16 20:43                                 ` Roman Perepelitsa
  2024-01-16 22:27                                   ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Roman Perepelitsa @ 2024-01-16 20:43 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Tue, Jan 16, 2024 at 9:38 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> On 2024-01-16 12:14, Roman Perepelitsa wrote:
>
> When I see "slurp", I know exactly what it does: reads a full file
>
> into a string. If you don't get the same immediate reaction, you
> really should: google "file slurp" and see that it's the way this
> facility is called in many programming languages.
>
> Now that "slurp" is taken by an unrelated command, "zslurp" is an
> obvious alternative
>
> So it is.  Not 'zlurp' but 'zslurp' -- that fixes it.  Besides, it's your baby and you should make the call.

I cannot take credit for it. The implementation is trivial and given
my late arrival to zsh I'm sure there are many implementations that do
the same thing already, likely even by the same name. That said, even
they didn't invent anything of importance: it really is a trivial
thing.

Roman.


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 20:43                                 ` Roman Perepelitsa
@ 2024-01-16 22:27                                   ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-16 22:27 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 473 bytes --]


On 2024-01-16 12:43, Roman Perepelitsa wrote:
> I cannot take credit for it. The implementation is trivial and given
> my late arrival to zsh I'm sure there are many implementations that do
> the same thing already, likely even by the same name. That said, even
> they didn't invent anything of importance: it really is a trivial
> thing.

Trivial unless you need it, don't have it, and don't know where to get 
it.  I consider it fundamental on principle.


>
> Roman.
>

[-- Attachment #2: Type: text/html, Size: 1112 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-16 14:37                                 ` Ray Andrews
@ 2024-01-17  3:50                                   ` Lawrence Velázquez
  2024-01-17  5:10                                     ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Lawrence Velázquez @ 2024-01-17  3:50 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

On Tue, Jan 16, 2024, at 9:37 AM, Ray Andrews wrote:
> BTW there's a perhaps related issue: I almost always get two
> copies of every post, one will have the option of 'reply to list'
> the other: 'reply to all'.

This happens when someone addresses both you and the list, as I am
doing now.  One copy is delivered by the list, while the other is
delivered to you directly.  You can verify this by inspecting the
message headers.

> I'm in the habit of automatically deleting the second -- which,
> now that I think about it, tends to come first.

The list adds a noticeable delay to delivery, so the direct copy
will almost always beat out the list copy.

> I'd prefer to only get one copy.

Some list software [*] can be configured to suppress the list copy
if your address is in To or Cc, but ours (Sympa) doesn't seem to
have that feature.

If your client can deduplicate messages (based on Message-Id,
perhaps), you could take advantage of that.  Otherwise, you could
write a rule that automatically deletes messages that contain
"zsh-users@zsh.org" in To or Cc but were not sent by the list (e.g.,
they lack a List-Id header).  This would delete direct copies unless
the sender put the list on Bcc.

Finally, you could beg everyone to reply to ONLY the list and NOT
to you and hope no one ever forgets.  Maybe mention it in your
signature or something.

  [*]: https://docs.mailman3.org/en/latest/userguide.html#how-can-i-avoid-getting-duplicate-messages

-- 
vq

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file
  2024-01-17  3:50                                   ` Lawrence Velázquez
@ 2024-01-17  5:10                                     ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-17  5:10 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 868 bytes --]


On 2024-01-16 19:50, Lawrence Velázquez wrote:
> The list adds a noticeable delay to delivery, so the direct copy
> will almost always beat out the list copy.
Correct, now that I pay attention to it.
> Some list software [*] can be configured to suppress the list copy
> if your address is in To or Cc, but ours (Sympa) doesn't seem to
> have that feature.
>
> Finally, you could beg everyone to reply to ONLY the list and NOT
> to you and hope no one ever forgets.  Maybe mention it in your
> signature or something.

It's no issue for me really.  I don't do anything as complex as you seem 
to, I know what threading is but I don't use it, my email life is as 
simple as it could be.  My only concern is not to disrupt anything via 
'Subject' or anything else.  Bart says he changed something.  Meanwhile 
I got two copies of your post as usual.  I'm cool.



[-- Attachment #2: Type: text/html, Size: 1571 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  7:01                               ` Lawrence Velázquez
  2024-01-15 14:47                                 ` Ray Andrews
@ 2024-01-18 16:20                                 ` Mark J. Reed
  2024-01-18 17:22                                   ` Ray Andrews
  1 sibling, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-18 16:20 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1398 bytes --]

On Mon, Jan 15, 2024 at 2:02 AM Lawrence Velázquez <larryv@zsh.org> wrote:

> There is no "inevitable" iteration order; one must usually be chosen and
> implemented
> explicitly, which is extra complexity that many languages choose not to
> bother with.

Indeed. To be clear, several do choose to bother; examples include Python
since 3.6, Ruby since 1.9, and PHP, which all maintain their associative
data types in insertion order. (That introduces potential pitfalls in PHP,
where *all* arrays are associative; regular ones just have numeric keys,
but those can still be rearranged out of numerical order! JavaScript avoids
that pitfall by using numeric order for numeric keys while maintaining
insertion order for non-numeric ones.)

Closer to home, the KornShell keeps its associative arrays in lexical order
by key. Which is to say there is precedent for keeping things in a
specified order, but Zsh doesn't do so. Even if it were changed such that
it started to do so as of 6.0 or whatever, you'd still have to write code
to check the version and either fail ungracefully or fall back to
maintaining the order in a separate array anyway. So even in that world,
it's simpler to just do the separate ordering in the first place, yielding
code that works in any version – at least, any version with associative
arrays.

-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1866 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-18 16:20                                 ` Mark J. Reed
@ 2024-01-18 17:22                                   ` Ray Andrews
  2024-01-18 17:36                                     ` Mark J. Reed
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-18 17:22 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 647 bytes --]

On 2024-01-18 08:20, Mark J. Reed wrote:
>
> Even if it were changed such that it started to do so as of 6.0 or 
> whatever, you'd still have to write code to check the version

I'm not qualified to disagree with that, still I'm curious: if the 
arrays were sorted (alphabetical is perhaps 'obvious') what could that 
ever break?  Since the order now is indeterminate, who/what would ever 
actually rely on that?  As if the order were depended on to be random.  
Really?  So if they were sorted, who would know or care *unless* they
decided to take advantage of it? Seems to me to be the sort of thing
that can't possibly have any gotchas.

[-- Attachment #2: Type: text/html, Size: 1256 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-18 17:22                                   ` Ray Andrews
@ 2024-01-18 17:36                                     ` Mark J. Reed
  2024-01-18 17:55                                       ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-18 17:36 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]

On Thu, Jan 18, 2024 at 12:22 PM Ray Andrews <rayandrews@eastlink.ca> wrote:

> Seems to me to be the sort of thing that can't possibly have any gotchas.

I'm not saying it would introduce gotchas. I don't think anyone's writing
code that would break if assoc arrays were suddenly sorted. :)

I'm saying that if zsh changed to keep its associative arrays in sorted
order, and then you wrote a zsh program that *relied* on that behavior,
you'd have a backward compatibility problem: your new script wouldn't work
properly in older versions of zsh that didn't have the sorting. If you
wanted your script to behave consistently even when run on such versions,
you'd have to write manual code to keep the ordering, separately from the
array itself. You could pair that with a check of $ZSH_VERSION and only do
the manual bit when not running a version that does it for you
automatically, but unless you're happy to cut out backward compatibility
entirely, you'd still have to write it.

And if you'd have to write it anyway, you might as well just use it
unconditionally and avoid the complexity of switching code in and out based
on the version.

That's all I was saying.  It wasn't much of a point, and probably didn't
warrant three messages clarifying it. :)

--
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1989 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-18 17:36                                     ` Mark J. Reed
@ 2024-01-18 17:55                                       ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-18 17:55 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 887 bytes --]

On 2024-01-18 09:36, Mark J. Reed wrote:
>
> I'm saying that if zsh changed to keep its associative arrays in 
> sorted order, and then you wrote a zsh program that /relied/ on that 
> behavior, you'd have a backward compatibility problem:

Ah!  I was looking at that issue from the wrong end.  Deep subject, 
backwards compatibility and my thoughts on it are naive but it seems to 
me sorta a tautology that new features require the new version and if 
your code uses the new feature then it won't work on the old version -- 
which is so obvious that I'm probably not really getting the problem ... 
mind, I understand that in shell culture using specific versions for 
specific things is not rare.  But surely the user of a new feature must 
take care himself to make sure that his version is compatible?  How 
could it be otherwise?  Never mind, this is out of my experience.

[-- Attachment #2: Type: text/html, Size: 1469 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-15  4:09                             ` Ray Andrews
  2024-01-15  7:01                               ` Lawrence Velázquez
@ 2024-01-18 22:34                               ` Bart Schaefer
  2024-01-18 23:08                                 ` Ray Andrews
  1 sibling, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-18 22:34 UTC (permalink / raw)
  To: zsh-users

On Sun, Jan 14, 2024 at 8:09 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> Exactly as I understand it.  A arrays are 'not very clever' -- they don't try to protect you from yourself, there's no internal place holding for a missing value, and as you say things must be kept to pairs.

There is an alternate assignment syntax for associative arrays
(alternate in zsh, that is; it's the default in e.g. ksh):

  asc=( [one]=1 [two]=2 [nil]= )

The "not clever" pairwise assignment is a concession to zsh's base
practice of always passing expansions by value rather than by
reference.  That is, you can not do

  typeset -A asc=$otherasc

because the defined expansion of $otherasc is an array of values, not
an object reference that can be assigned in toto.

> One  might think that the order of assignment would be 'the order' by inevitability

Associative arrays are maintained as hash tables for fast access.
Expansion is in hash key order, and the internal hash keys are
computed directly from the array key.  Anything else requires
performing a sort at expansion time, and/or (especially if order of
assignment is to be preserved) storing extra data about the order
(such as a linked list through the hash elements).  Zsh opted to do
neither of these on the premise that access to individual elements by
key is far more frequent than access to the entire table.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-18 22:34                               ` Bart Schaefer
@ 2024-01-18 23:08                                 ` Ray Andrews
  2024-01-19  2:46                                   ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-18 23:08 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1664 bytes --]


On 2024-01-18 14:34, Bart Schaefer wrote:
>    asc=( [one]=1 [two]=2 [nil]= )
>
> The "not clever" pairwise assignment
I don't want to sound too flippant.  Anything other than pairwise would 
obviously be much more difficult to implement and I myself have never 
seen any reason why simply maintaining the pairs should be considered a 
problem.  I did experience some crash and burns on first playing with A 
arrays, but it's not hard once you know how.
> Associative arrays are maintained as hash tables for fast access.
It's something way over my head.
> ... storing extra data about the order
> (such as a linked list through the hash elements).  Zsh opted to do
> neither of these on the premise that access to individual elements by
> key is far more frequent than access to the entire table.

Sure.  Perhaps Mark has some idea of the nuts and bolts of the thing in 
ksh -- whether or not it's a labor to implement the determined order or 
not -- but that's up to you devs.  As it is, for diagnostics I do end up 
printing entire arrays, and it would be nice to have a predicted order 
but not if it's going to be any trouble.  Funny, now that I try to break 
the order I can't:

typeset -A asc=()
asc=( [one]=1 [two]=2 [three]=3 [four]=4 [five]=5 )
print -l "$asc"
asc[six]=6
print -l "$asc"
asc[two]=22
print -l "$asc"
asc[three]=
print -l "$asc"
asc[three]=33
print -l "$asc"
asc[one]=
print -l "$asc"
asc[one]=11
print -l "$asc"

1 5 4 2 3
1 5 4 2 3 6
1 5 4 22 3 6
1 5 4 22  6
1 5 4 22 33 6
  5 4 22 33 6
11 5 4 22 33 6

... once it decides on 'one five four two three' ... it keeps it.  How 
hard would it be to get the first hash 'right'?

[-- Attachment #2: Type: text/html, Size: 2979 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-18 23:08                                 ` Ray Andrews
@ 2024-01-19  2:46                                   ` Bart Schaefer
  2024-01-19  2:58                                     ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-19  2:46 UTC (permalink / raw)
  To: zsh-users

On Thu, Jan 18, 2024 at 3:08 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> How hard would it be to get the first hash 'right'?

That's not how hash tables work.  Ignoring some details, the position
of an element in the table is mathematically computed (hence "hash")
from that specific element, without reference to other elements that
may already be there or be added later.
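
A toy illustration of that point. This is emphatically not zsh's real hash function; toy_bucket and the table size of 8 are invented for the sketch. It only shows that the slot is computed from the key's own characters, so insertion order cannot influence it:

```zsh
# Toy illustration only: this is NOT the hash function zsh really uses.
toy_bucket() {
  local key=$1 ch
  local -i h=0 i
  for (( i = 1; i <= ${#key}; i++ )); do
    ch=${key[i]}
    # printf with a leading quote yields the character's numeric code.
    (( h = (h * 31 + $(printf '%d' "'$ch")) % 8 ))
  done
  print -r -- $h
}

for key in one two three; do
  print -r -- "$key -> bucket $(toy_bucket $key)"
done
for key in three two one; do    # different order, same bucket per key
  print -r -- "$key -> bucket $(toy_bucket $key)"
done
```

Both loops report the same bucket for each key, which is why there is no first hash to get "right".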


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19  2:46                                   ` Bart Schaefer
@ 2024-01-19  2:58                                     ` Ray Andrews
  2024-01-19 10:27                                       ` Stephane Chazelas
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-19  2:58 UTC (permalink / raw)
  To: zsh-users

On 2024-01-18 18:46, Bart Schaefer wrote:
> On Thu, Jan 18, 2024 at 3:08 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
>> How hard would it be to get the first hash 'right'?
> That's not how hash tables work.

I have no idea how they work which is why my comments have little 
value.  If ksh can do it then it can be done, but as I said, it simply 
might not be worth the effort and I'm in no position to judge the 
matter.  I wish I'd gotten that far in C, I'd heard of hash tables but 
never coded one.  But I did note in that code previous that the order 
seemed not to change.  Nevermind.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19  2:58                                     ` Ray Andrews
@ 2024-01-19 10:27                                       ` Stephane Chazelas
  2024-01-19 13:45                                         ` Mikael Magnusson
  0 siblings, 1 reply; 83+ messages in thread
From: Stephane Chazelas @ 2024-01-19 10:27 UTC (permalink / raw)
  To: Ray Andrews; +Cc: zsh-users

2024-01-18 18:58:41 -0800, Ray Andrews:
> 
> On 2024-01-18 18:46, Bart Schaefer wrote:
> > On Thu, Jan 18, 2024 at 3:08 PM Ray Andrews <rayandrews@eastlink.ca> wrote:
> > > How hard would it be to get the first hash 'right'?
> > That's not how hash tables work.
> 
> I have no idea how they work which is why my comments have little value.  If
> ksh can do it then it can be done, but as I said, it simply might not be
> worth the effort and I'm in no position to judge the matter.  I wish I'd
> gotten that far in C, I'd heard of hash tables but never coded one.  But I
> did note in that code previous that the order seemed not to change. 
> Nevermind.

As said before, in "${!hash[@]}", ksh93 orders the keys
lexically, not in the order they were inserted in the hash
table. It doesn't record that order either.

If you want to do the same in zsh, use the o parameter expansion
flag to order the keys. See also n to order numerically:

$ ksh -c 'typeset -A a; for i do a[$i]=$(( ++n )); done; printf "%s\n" "${!a[@]}"' ksh {1..20}
1
10
11
12
13
14
15
16
17
18
19
2
20
3
4
5
6
7
8
9

$ zsh -c 'typeset -A a; for i do a[$i]=$(( ++n )); done; printf "%s\n" "${(ko@)a}"' ksh {1..20}
1
10
11
12
13
14
15
16
17
18
19
2
20
3
4
5
6
7
8
9
$ zsh -c 'typeset -A a; for i do a[$i]=$(( ++n )); done; printf "%s\n" "${(kn@)a}"' ksh {1..20}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

-- 
Stephane


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 10:27                                       ` Stephane Chazelas
@ 2024-01-19 13:45                                         ` Mikael Magnusson
  2024-01-19 14:37                                           ` Mark J. Reed
  0 siblings, 1 reply; 83+ messages in thread
From: Mikael Magnusson @ 2024-01-19 13:45 UTC (permalink / raw)
  To: zsh-users

On 1/19/24, Stephane Chazelas <stephane@chazelas.org> wrote:
> 2024-01-18 18:58:41 -0800, Ray Andrews:
>>
>> On 2024-01-18 18:46, Bart Schaefer wrote:
>> > On Thu, Jan 18, 2024 at 3:08 PM Ray Andrews <rayandrews@eastlink.ca>
>> > wrote:
>> > > How hard would it be to get the first hash 'right'?
>> > That's not how hash tables work.
>>
>> I have no idea how they work which is why my comments have little value.
>> If
>> ksh can do it then it can be done, but as I said, it simply might not be
>> worth the effort and I'm in no position to judge the matter.  I wish I'd
>> gotten that far in C, I'd heard of hash tables but never coded one.  But
>> I
>> did note in that code previous that the order seemed not to change.
>> Nevermind.
>
> As said before, in "${!hash[@]}", ksh93 orders the keys
> lexically, not in the order they were inserted in the hash
> table. It doesn't record that order either.
>
> If you want to do the same in zsh, use the o parameter expansion
> flag to order the keys. See also n to order numerically:

Unfortunately, combining o with kv doesn't work as one might hope:
% typeset -A foo; foo=(a b c d e f g h 1 2 3 4 i 5 j 6 7 k 8 l)
% printf '%s = %s\n' ${(kv)foo}
3 = 4
g = h
i = 5
7 = k
j = 6
8 = l
a = b
c = d
1 = 2
e = f
% printf '%s = %s\n' ${(okv)foo}
1 = 2
3 = 4
5 = 6
7 = 8
a = b
c = d
e = f
g = h
i = j
k = l

I guess this is actually a bug since the manual states:

  v      Used with k, substitute (as two consecutive words) both the
         key and the value of each associative array element.  Used
         with subscripts, force values to be substituted even if the
         subscript form refers to indices or keys.

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 13:45                                         ` Mikael Magnusson
@ 2024-01-19 14:37                                           ` Mark J. Reed
  2024-01-19 14:57                                             ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-19 14:37 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: zsh-users

[-- Attachment #1: Type: text/plain, Size: 679 bytes --]

On Fri, Jan 19, 2024 at 8:46 AM Mikael Magnusson <mikachu@gmail.com> wrote:

>   v      Used with k, substitute (as two consecutive words) both the key
> and the value of each associative array element.
>

It seems that the `kv` is effectively applied before the `o`, so you get
the keys and values intermixed and then the whole list sorted together.
Hard to imagine a use case where that would be the desired outcome.

I understand that the expansion flags are generally intended to be
independent of each other, but I wonder if it's not worth special-casing a
few combinations to ensure the most useful interpretation.

-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1198 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 14:37                                           ` Mark J. Reed
@ 2024-01-19 14:57                                             ` Ray Andrews
  2024-01-19 15:46                                               ` Mark J. Reed
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-19 14:57 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 510 bytes --]


On 2024-01-19 06:37, Mark J. Reed wrote:
>
>
> It seems that the `kv` is effectively applied before the `o`, so you 
> get the keys and values intermixed and then the whole list sorted 
> together. Hard to imagine a use case where that would be the desired 
> outcome.
>
So it seems that zsh already does intend the ability to output the array 
in a declared way as Stephane showed (hash tables notwithstanding).  
Thus it's simply a matter of making that work as it proposes to do.  
Couldn't ask for more.

[-- Attachment #2: Type: text/html, Size: 1139 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 14:57                                             ` Ray Andrews
@ 2024-01-19 15:46                                               ` Mark J. Reed
  2024-01-19 16:01                                                 ` Mikael Magnusson
  0 siblings, 1 reply; 83+ messages in thread
From: Mark J. Reed @ 2024-01-19 15:46 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 3439 bytes --]

On Fri, Jan 19, 2024 at 9:58 AM Ray Andrews <rayandrews@eastlink.ca> wrote:

> So it seems that zsh already does intend the ability to output the array
> in a declared way as Stephane showed (hash tables notwithstanding).  Thus
> it's simply a matter of making that work as it proposes to do.  Couldn't
> ask for more.
>

Well, let's be clear about what's happening.

As a general rule, associative arrays, as implementations of the abstract
data type known as Map or Table, are fundamentally unordered, regardless of
the implementation details. It's true that some implementations naturally
keep the keys in a recognizable order (e.g. binary trees in lexical order
by key, association lists in insertion order), while others (such as hash
tables) do not, but such details don't necessarily dictate the features of
any given implementation.

Most systems with associative arrays don't provide a mechanism to retrieve
the elements in a specified order. Some do – Ksh93+ was brought up, and I
mentioned JavaScript, PHP, Python, and Ruby.  Sometimes this is an actual
feature and other times an accidental detail; in Clojure, maps that are
small enough (up to about eight pairs) are stored using a data
structure that keeps the keys in insertion order, so you might be lulled
into thinking that's true generally, but larger maps use a hash table and
revert to a seemingly unordered state.

Zsh has expansion flags that let you get *either* the keys ( *k* ) or the
values in lexical order, ascending ( *o* ) or descending ( *O* ).

Unfortunately it does not have a way to get the keys and values
simultaneously in any sort of meaningful order while maintaining the
pairwise association. If you use both flags *k* and *v* together with
either *o* or *O*, you will get an undistinguished muddle of keys and
values all sorted together into one big list.

Which is a consequence of the way the flags work. With no flags, an array
expands to its values. The *k* flag says "get the keys instead". The *v* flag
says "include the values, too". The resulting list alternates between keys
and values, but it is still a flat, one-dimensional array; it has no
internal structure keeping the pairs together. The sort triggered by the
ordering flags has no way to know that it's a list of pairs.

You could make the case that it should know that based on the flags; the
flag triplets *kvo* and *kvO* could order just the keys while maintaining
the pairwise association with their values, as a special case. Though that
privileges the keys over the values; might one not also conceivably want to
sort by value while maintaining the associated keys?

In fact, as far as I can tell, there's currently no good way to get from a
value to its associated key at all. If you want the key/value pairs in
order with current zsh, there is a ready solution: simply iterate through
the sorted keys returned by *ko* and use them to get the associated value:

    for key in "${(@ko)array}"; do
        ... something with "$key" and "${array[$key]}" here ...
    done

But if you instead iterate through the values (with or without *o*),
there's no analogous way to get back to the corresponding key. Maybe that's
the missing functionality we should look into instead: inverting an
associative array. Of course duplicate values are troublesome in this
regard.

-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 4436 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 15:46                                               ` Mark J. Reed
@ 2024-01-19 16:01                                                 ` Mikael Magnusson
  2024-01-19 17:15                                                   ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Mikael Magnusson @ 2024-01-19 16:01 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: zsh-users

On 1/19/24, Mark J. Reed <markjreed@gmail.com> wrote:
> On Fri, Jan 19, 2024 at 9:58 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> In fact, as far as I can tell, there's currently no good way to get from a
> value to its associated key at all. If you want the key/value pairs in
> order with current zsh, there is a ready solution: simply iterate through
> the sorted keys returned by *ko* and use them to get the associated value:
>
>     for key in "${(@ko)array}"; do
>         ... something with "$key" and "${array[$key]}" here ...
>     done
>
> But if you instead iterate through the values (with or without *o*),
> there's no analogous way to get back to the corresponding key. Maybe that's
> the missing functionality we should look into instead: inverting an
> associative array. Of course duplicate values are troublesome in this
> regard.

Keys are unique but values aren't, so it's sort of a nonsensical
request; that said, you can do it ;).
% typeset -A foo; foo=( 1 bar 2 bar 3 quux 4 baz )
% echo ${(k)foo[(R)bar]}
1 2

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 16:01                                                 ` Mikael Magnusson
@ 2024-01-19 17:15                                                   ` Ray Andrews
  2024-01-19 17:42                                                     ` Bart Schaefer
  0 siblings, 1 reply; 83+ messages in thread
From: Ray Andrews @ 2024-01-19 17:15 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]


On 2024-01-19 08:01, Mikael Magnusson wrote:
> Keys are unique but values aren't, so it's sort of a nonsensical
> request; that said, you can do it ;).

I dunno, you might have some sort of efficiency test and store the 
results in an array keyed to the names of the various tests and you want 
to see them in the order of best to worst.

I think that for the time being this works for me:


typeset -A array=( [1test_one]=123 [2test_two]=345 [3test_three]=111 
[4test_four]=5 )

echo "\nraw"
printf "\n%-20s %s" ${(kv)array}
echo "\n\nsorted on key"
printf "\n%-20s %s" ${(kv)array} | sort
echo "\nsorted on value"
printf "\n%-20s %s" ${(kv)array} | sort -k2
echo "\nsorted on value numerically"
printf "\n%-20s %s" ${(kv)array} | sort -k2g

output:


5 /aWorking/Zsh/Source/Wk 0 % . test2

raw # No recognizable order

1test_one            123
2test_two            345
4test_four           5
3test_three          111

sorted on key # If the array itself won't do it, then make 'sort' do it:

1test_one            123
2test_two            345
3test_three          111
4test_four           5

sorted on value # dictionary sort

3test_three          111
1test_one            123
2test_two            345
4test_four           5

sorted on value numerically # numeric sort

4test_four           5    # This isn't going anywhere.
3test_three          111
1test_one            123
2test_two            345  # Here's the winner, congratulations test_two 
team.

... so however much I think there should be a defined order to the way 
the arrays print, it seems easy enough to use other tools to do it.


[-- Attachment #2: Type: text/html, Size: 2891 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 17:15                                                   ` Ray Andrews
@ 2024-01-19 17:42                                                     ` Bart Schaefer
  2024-01-19 18:45                                                       ` Ray Andrews
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-01-19 17:42 UTC (permalink / raw)
  To: zsh-users

On Fri, Jan 19, 2024 at 9:15 AM Ray Andrews <rayandrews@eastlink.ca> wrote:
>
> ... so however much I think there should be a defined order to the way the arrays print, it seems easy enough to use other tools to do it.

 % typeset -A foo=( [zero]=0 [one]=1 [two]=2 [three]=3 [four]=4 )
 % foo[other]='some random $text here'
 % foo[nil]=''
 % print -raC2 -- ${(z)${(*ok)foo/(#b)(*)/${(q)match[1]} ${(q-)foo[$match[1]]}}}
 four   4
 nil    ''
 one    1
 other  'some random $text here'
 three  3
 two    2
 zero   0

(the * in (*ok) requires zsh 5.9, otherwise setopt extendedglob)


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Empty element elision and associative arrays (was Re: Slurping a file)
  2024-01-19 17:42                                                     ` Bart Schaefer
@ 2024-01-19 18:45                                                       ` Ray Andrews
  0 siblings, 0 replies; 83+ messages in thread
From: Ray Andrews @ 2024-01-19 18:45 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 154 bytes --]


On 2024-01-19 09:42, Bart Schaefer wrote:
> (the * in (*ok) requires zsh 5.9, otherwise setopt extendedglob)
Got to get cracking with the latest version.

[-- Attachment #2: Type: text/html, Size: 776 bytes --]

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
                                       ` (3 preceding siblings ...)
  2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
@ 2024-02-10 20:48                     ` Stephane Chazelas
  2024-02-11  0:59                       ` Mikael Magnusson
  2024-02-11  4:46                       ` Bart Schaefer
  4 siblings, 2 replies; 83+ messages in thread
From: Stephane Chazelas @ 2024-02-10 20:48 UTC (permalink / raw)
  To: Roman Perepelitsa; +Cc: Bart Schaefer, Zsh Users

2024-01-14 11:34:00 +0100, Roman Perepelitsa:
[...]
>     function slurp() {
>       emulate -L zsh -o no_multibyte
>       zmodload zsh/system || return
>       local -a content
>       local -i i
>       while true; do
>         sysread 'content[++i]' && continue
>         (( $? == 5 )) || return
>         break
>       done
>       typeset -g REPLY=${(j::)content}
>     }
[...]

IMO, it would be more useful if the result was returned in the
variable whose name was given as argument (defaulting to REPLY
if none was given like for read or sysread).

And would be better if upon error the returned variable
contained either what was successfully read or nothing (like
read but unlike sysread).

Maybe something like:

zslurp() {
  emulate -L zsh -o no_multibyte
  typeset -n _zslurp_var=${1-REPLY}
  _zslurp_var=
  zmodload zsh/system || return
  local -a _zslurp_content
  local -i _zslurp_i _zslurp_ret
  while true; do
    sysread '_zslurp_content[++_zslurp_i]' && continue
    _zslurp_ret=$?
    break
  done
  _zslurp_var=${(j::)_zslurp_content}
  (( _zslurp_ret == 5 ))
}

-- 
Stephane


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-02-10 20:48                     ` Stephane Chazelas
@ 2024-02-11  0:59                       ` Mikael Magnusson
  2024-02-11  4:49                         ` Bart Schaefer
  2024-02-11  4:46                       ` Bart Schaefer
  1 sibling, 1 reply; 83+ messages in thread
From: Mikael Magnusson @ 2024-02-11  0:59 UTC (permalink / raw)
  To: Roman Perepelitsa, Bart Schaefer, Zsh Users

On 2/10/24, Stephane Chazelas <stephane@chazelas.org> wrote:
> 2024-01-14 11:34:00 +0100, Roman Perepelitsa:
> [...]
>>     function slurp() {
>>       emulate -L zsh -o no_multibyte
>>       zmodload zsh/system || return
>>       local -a content
>>       local -i i
>>       while true; do
>>         sysread 'content[++i]' && continue
>>         (( $? == 5 )) || return
>>         break
>>       done
>>       typeset -g REPLY=${(j::)content}
>>     }
> [...]
>
> IMO, it would be more useful if the result was returned in the
> variable whose name was given as argument (defaulting to REPLY
> if none was given like for read or sysread).
>
> And would be better if upon error the returned variable
> contained either what was successfully read or nothing (like
> read but unlike sysread).
>
> Maybe something like:
>
> zslurp() {
>   emulate -L zsh -o no_multibyte
>   typeset -n _zslurp_var=${1-REPLY}
>   _zslurp_var=
>   zmodload zsh/system || return
>   local -a _zslurp_content
>   local -i _zslurp_i _zslurp_ret
>   while true; do
>     sysread '_zslurp_content[++_zslurp_i]' && continue
>     _zslurp_ret=$?
>     break
>   done
>   _zslurp_var=${(j::)_zslurp_content}
>   (( _zslurp_ret == 5 ))
> }

I believe one of the motivating factors for this function was speed,
and copying all the data an extra time probably doesn't help with
that.

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-02-10 20:48                     ` Stephane Chazelas
  2024-02-11  0:59                       ` Mikael Magnusson
@ 2024-02-11  4:46                       ` Bart Schaefer
  2024-02-11  5:06                         ` Mikael Magnusson
  2024-02-11  7:09                         ` Stephane Chazelas
  1 sibling, 2 replies; 83+ messages in thread
From: Bart Schaefer @ 2024-02-11  4:46 UTC (permalink / raw)
  To: Zsh Users

On Sat, Feb 10, 2024 at 2:48 PM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> IMO, it would be more useful if the result was returned in the
> variable whose name was given as argument (defaulting to REPLY
> if none was given like for read or sysread).

Could also read a file provided by name as an argument instead of only
reading stdin, but I elected to commit the most straightforward
version.

> And would be better if upon error the returned variable
> contained either what was successfully read or nothing (like
> read but unlike sysread).

I had the impression this slurp-er was intended to work like examples
from other languages, which do not have "read"-like behavior.

> zslurp() {
>   emulate -L zsh -o no_multibyte
>   typeset -n _zslurp_var=${1-REPLY}

Is there really any reason to prefix the locals with "_zslurp_" ?
That's good practice if the function might call other code that's less
careful about its names and scoping, or if you need the variable
to become global, but nothing like that occurs here.

^ permalink raw reply	[flat|nested] 83+ messages in thread

* Re: Slurping a file (was: more spllitting travails)
  2024-02-11  0:59                       ` Mikael Magnusson
@ 2024-02-11  4:49                         ` Bart Schaefer
  2024-02-11  5:04                           ` Mikael Magnusson
  0 siblings, 1 reply; 83+ messages in thread
From: Bart Schaefer @ 2024-02-11  4:49 UTC (permalink / raw)
  To: Zsh Users

On Sat, Feb 10, 2024 at 6:59 PM Mikael Magnusson <mikachu@gmail.com> wrote:
>
> I believe one of the motivating factors for this function was speed,
> and copying all the data an extra time probably doesn't help with
> that.

Where's the extra copy in Stephane's version?

Roman's optimized version uses (j::) as well. It turns out that, at a
certain file size, it's faster to build up the array and then copy it
with the join than it is to append to a string without using the
array.



* Re: Slurping a file (was: more spllitting travails)
  2024-02-11  4:49                         ` Bart Schaefer
@ 2024-02-11  5:04                           ` Mikael Magnusson
  0 siblings, 0 replies; 83+ messages in thread
From: Mikael Magnusson @ 2024-02-11  5:04 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On 2/11/24, Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Sat, Feb 10, 2024 at 6:59 PM Mikael Magnusson <mikachu@gmail.com> wrote:
>>
>> I believe one of the motivating factors for this function was speed,
>> and copying all the data an extra time probably doesn't help with
>> that.
>
> Where's the extra copy in Stephane's version?
>
> Roman's optimized version uses the (j::) as well, turns out it's
> faster at a certain file size to build up the array and then copy it
> with the join than it is to append to a string without using the
> array.

Oops, I just looked at the end of the updated version and saw it had
an assignment of the read data to the passed variable and didn't
compare it properly to the original version.

-- 
Mikael Magnusson



* Re: Slurping a file (was: more spllitting travails)
  2024-02-11  4:46                       ` Bart Schaefer
@ 2024-02-11  5:06                         ` Mikael Magnusson
  2024-02-11  7:09                         ` Stephane Chazelas
  1 sibling, 0 replies; 83+ messages in thread
From: Mikael Magnusson @ 2024-02-11  5:06 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On 2/11/24, Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Sat, Feb 10, 2024 at 2:48 PM Stephane Chazelas <stephane@chazelas.org>
> wrote:
>>
>> IMO, it would be more useful if the result was returned in the
>> variable whose name was given as argument (defaulting to REPLY
>> if none was given like for read or sysread).
>
> Could also read a file provided by name as an argument instead of only
> reading stdin, but I elected to commit the most straightforward
> version.
>
>> And would be better if upon error the returned variable
>> contained either what was successfully read or nothing (like
>> read but unlike sysread).
>
> I had the impression this slurp-er was intended to work like examples
> from other languages, which do not have "read"-like behavior.
>
>> zslurp() {
>>   emulate -L zsh -o no_multibyte
>>   typeset -n _zslurp_var=${1-REPLY}
>
> Is there really any reason to prefix the locals with "_zslurp_" ?
> That's good practice if the function might call other code that's less
> careful about its names and scoping, or if you need the variable
> to become global, but nothing like that occurs here.

I guess it makes this outcome less likely:
% zslurp _zslurp_var
zslurp:2: _zslurp_var: invalid self reference

Is there any way to avoid it completely?

-- 
Mikael Magnusson



* Re: Slurping a file (was: more spllitting travails)
  2024-02-11  4:46                       ` Bart Schaefer
  2024-02-11  5:06                         ` Mikael Magnusson
@ 2024-02-11  7:09                         ` Stephane Chazelas
  1 sibling, 0 replies; 83+ messages in thread
From: Stephane Chazelas @ 2024-02-11  7:09 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

2024-02-10 22:46:15 -0600, Bart Schaefer:
> On Sat, Feb 10, 2024 at 2:48 PM Stephane Chazelas <stephane@chazelas.org> wrote:
> >
> > IMO, it would be more useful if the result was returned in the
> > variable whose name was given as argument (defaulting to REPLY
> > if none was given like for read or sysread).
> 
> Could also read a file provided by name as an argument instead of only
> reading stdin, but I elected to commit the most straightforward
> version.

Yes, using redirection as in:

zslurp a < file1
zslurp b < file2

is easy enough

> 
> > And would be better if upon error the returned variable
> > contained either what was successfully read or nothing (like
> > read but unlike sysread).
> 
> I had the impression this slurp-er was intended to work like examples
> from other languages, which do not have "read"-like behavior.

It sounded safer to me to avoid leaving the variable unmodified
if the input could not be read in case the caller forgets to do
error handling, but then again in

zslurp a < file1

above, if file1 cannot be opened, zslurp won't be called so $a
will be left unmodified regardless (same in read var < file1),
so yes, probably pointless.
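
A small sketch of that point, using a made-up stand-in function
(demo_slurp is hypothetical, not the real zslurp, and the missing file
name is assumed not to exist). It only uses constructs that zsh shares
with other Bourne-style shells:

```zsh
# Hypothetical stand-in for zslurp: stores all of stdin in the global a.
demo_slurp() {
  called=yes
  a=$(cat)
}

a=previous
called=no
missing=${TMPDIR:-/tmp}/zslurp-demo-missing.$$   # assumed not to exist

# The shell has to open the file before it can run the function, so a
# failed open means the function body never executes.
{ demo_slurp < "$missing"; } 2>/dev/null
rc=$?

printf 'called=%s a=%s\n' "$called" "$a"   # called=no a=previous
[ "$rc" -ne 0 ] && echo 'redirection failed, function never ran'
```

So whatever the function does with its variable on a read error, the
caller still has to check the exit status to catch an unopenable file.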

> 
> > zslurp() {
> >   emulate -L zsh -o no_multibyte
> >   typeset -n _zslurp_var=${1-REPLY}
> 
> Is there really any reason to prefix the locals with "_zslurp_" ?
> That's good practice if the function might call other code that's less
> careful about its names and scoping, or if you need the variable
> to become global, but nothing like that occurs here.

See https://zsh.org/workers/52530
From my quick testing I had assumed that zsh's namerefs worked like
bash's and were just a lexical dereferencing, hence the
namespacing. But it seems the namespacing is only needed in cases
where the referenced variables were not declared in the caller. If
workers/52530 is addressed, that namespacing will no longer be
needed.

-- 
Stephane



Thread overview: 83+ messages
2024-01-12 19:05 more splitting travails Ray Andrews
2024-01-12 19:19 ` Bart Schaefer
2024-01-12 19:56   ` Ray Andrews
2024-01-12 20:07     ` Mark J. Reed
     [not found]   ` <CAA=-s3zc5a+PA7draaA=FmXtwU9K8RrHbb70HbQN8MhmuXTYrQ@mail.gmail.com>
2024-01-12 20:03     ` Fwd: " Bart Schaefer
2024-01-12 20:32       ` Ray Andrews
2024-01-12 20:50         ` Roman Perepelitsa
2024-01-13  2:12           ` Ray Andrews
2024-01-12 20:51         ` Bart Schaefer
2024-01-12 21:57           ` Mark J. Reed
2024-01-12 22:09             ` Bart Schaefer
2024-01-13  3:06               ` Ray Andrews
2024-01-13  3:36                 ` Ray Andrews
2024-01-13  4:07                   ` Bart Schaefer
2024-01-13  5:39               ` Roman Perepelitsa
2024-01-13 20:02                 ` Slurping a file (was: more spllitting travails) Bart Schaefer
2024-01-13 20:07                   ` Slurping a file Ray Andrews
2024-01-14  5:03                     ` zcurses mouse delay (not Re: Slurping a file) Bart Schaefer
2024-01-14  5:35                       ` Ray Andrews
2024-01-14 10:34                   ` Slurping a file (was: more spllitting travails) Roman Perepelitsa
2024-01-14 10:57                     ` Roman Perepelitsa
2024-01-14 15:36                     ` Slurping a file Ray Andrews
2024-01-14 15:41                       ` Roman Perepelitsa
2024-01-14 20:13                       ` Lawrence Velázquez
2024-01-15  0:03                         ` Ray Andrews
2024-01-15  0:55                           ` Empty element elision and associative arrays (was Re: Slurping a file) Bart Schaefer
2024-01-15  4:09                             ` Ray Andrews
2024-01-15  7:01                               ` Lawrence Velázquez
2024-01-15 14:47                                 ` Ray Andrews
2024-01-18 16:20                                 ` Mark J. Reed
2024-01-18 17:22                                   ` Ray Andrews
2024-01-18 17:36                                     ` Mark J. Reed
2024-01-18 17:55                                       ` Ray Andrews
2024-01-18 22:34                               ` Bart Schaefer
2024-01-18 23:08                                 ` Ray Andrews
2024-01-19  2:46                                   ` Bart Schaefer
2024-01-19  2:58                                     ` Ray Andrews
2024-01-19 10:27                                       ` Stephane Chazelas
2024-01-19 13:45                                         ` Mikael Magnusson
2024-01-19 14:37                                           ` Mark J. Reed
2024-01-19 14:57                                             ` Ray Andrews
2024-01-19 15:46                                               ` Mark J. Reed
2024-01-19 16:01                                                 ` Mikael Magnusson
2024-01-19 17:15                                                   ` Ray Andrews
2024-01-19 17:42                                                     ` Bart Schaefer
2024-01-19 18:45                                                       ` Ray Andrews
2024-01-14 22:09                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
2024-01-15  8:53                       ` Roman Perepelitsa
2024-01-16 19:57                         ` Bart Schaefer
2024-01-16 20:07                           ` Slurping a file Ray Andrews
2024-01-16 20:14                             ` Roman Perepelitsa
2024-01-16 20:38                               ` Ray Andrews
2024-01-16 20:43                                 ` Roman Perepelitsa
2024-01-16 22:27                                   ` Ray Andrews
2024-01-15  2:00                     ` Slurping a file (was: more spllitting travails) Bart Schaefer
2024-01-15  4:24                       ` Slurping a file Ray Andrews
2024-01-15  6:56                         ` Lawrence Velázquez
2024-01-15 14:37                           ` Ray Andrews
2024-01-15 15:10                             ` Marc Chantreux
2024-01-15 15:29                               ` Mark J. Reed
2024-01-15 16:16                                 ` Marc Chantreux
2024-01-15 16:33                                   ` MUAs (was: Re: Slurping a file) zeurkous
2024-01-16  7:23                               ` Slurping a file Lawrence Velázquez
2024-01-16 14:37                                 ` Ray Andrews
2024-01-17  3:50                                   ` Lawrence Velázquez
2024-01-17  5:10                                     ` Ray Andrews
2024-01-15  7:26                       ` Slurping a file (was: more spllitting travails) Lawrence Velázquez
2024-01-15 14:48                         ` Slurping a file Ray Andrews
2024-01-15 13:13                       ` Slurping a file (was: more spllitting travails) Marc Chantreux
2024-02-10 20:48                     ` Stephane Chazelas
2024-02-11  0:59                       ` Mikael Magnusson
2024-02-11  4:49                         ` Bart Schaefer
2024-02-11  5:04                           ` Mikael Magnusson
2024-02-11  4:46                       ` Bart Schaefer
2024-02-11  5:06                         ` Mikael Magnusson
2024-02-11  7:09                         ` Stephane Chazelas
2024-01-13  2:19           ` Fwd: more splitting travails Ray Andrews
2024-01-13  3:59             ` Bart Schaefer
2024-01-13  4:54               ` Ray Andrews
2024-01-13  5:51                 ` Roman Perepelitsa
2024-01-13 16:40                   ` Ray Andrews
2024-01-13 18:22                     ` Bart Schaefer
2024-01-13 19:08                       ` Ray Andrews
