zsh-users
* A comment about "slurp" and -o multibyte
@ 2024-01-17  3:45 Bart Schaefer
  2024-01-17  6:07 ` Roman Perepelitsa
From: Bart Schaefer @ 2024-01-17  3:45 UTC (permalink / raw)
  To: Zsh Users

On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
>     function slurp() {
>       emulate -L zsh -o no_multibyte
> [...]
>       typeset -g REPLY=${(j::)content}
>     }

Although the function faithfully reads the input stream into $REPLY,
later references to $REPLY with the multibyte option back in effect
will (re-)interpret the content as multibyte characters.  This may not
be what's desired.

% slurp < =zsh
% () {
    print $#REPLY
    print ${(m)#REPLY}
    print ${(mm)#REPLY}
    setopt localoptions nomultibyte
    print $#REPLY
  }
872903  <-- number of characters
873259  <-- width of printable characters
872383  <-- number of glyphs
878288  <-- actual number of bytes

(Of course those first three numbers are all garbage because it's just
interpreting an executable as wide character text.)

Unfortunately there's no parameter flag to toggle multibyte for a
single expansion.



* Re: A comment about "slurp" and -o multibyte
  2024-01-17  3:45 A comment about "slurp" and -o multibyte Bart Schaefer
@ 2024-01-17  6:07 ` Roman Perepelitsa
  2024-01-17  7:06   ` Bart Schaefer
From: Roman Perepelitsa @ 2024-01-17  6:07 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh Users

On Wed, Jan 17, 2024 at 4:46 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> >
> >     function slurp() {
> >       emulate -L zsh -o no_multibyte
> > [...]
> >       typeset -g REPLY=${(j::)content}
> >     }
>
> Although the function faithfully reads the input stream into $REPLY,
> later references to $REPLY with the multibyte option back in effect
> will (re-)interpret the content as multibyte characters.  This may not
> be what's desired.
>
> % slurp < =zsh
> % () {
>     print $#REPLY
>     print ${(m)#REPLY}
>     print ${(mm)#REPLY}
>     setopt localoptions nomultibyte
>     print $#REPLY
>   }
> 872903  <-- number of characters
> 873259  <-- width of printable characters
> 872383  <-- number of glyphs
> 878288  <-- actual number of bytes
>
> (Of course those first three numbers are all garbage because it's just
> interpreting an executable as wide character text.)

To me this behavior looks as expected. It's consistent with `read`,
`sysread` and process substitution.

    % head -c $((1 << 20)) </dev/urandom | tr '\0' x >1MB
    % slurp <1MB
    % IFS= read -rd '' read <1MB
    % sysread -s $((1 << 20)) sysread <1MB
    % procsubst=${"$(<1MB; print -n .)"%.}
    % () {
      print -r -- $#REPLY $#read $#sysread $#procsubst
      setopt local_options no_multibyte
      print -r -- $#REPLY $#read $#sysread $#procsubst
    }
    1008389 1008389 1008389 1008389
    1048576 1048576 1048576 1048576

Roman.



* Re: A comment about "slurp" and -o multibyte
  2024-01-17  6:07 ` Roman Perepelitsa
@ 2024-01-17  7:06   ` Bart Schaefer
From: Bart Schaefer @ 2024-01-17  7:06 UTC (permalink / raw)
  To: Zsh Users

On Tue, Jan 16, 2024 at 10:08 PM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> To me this behavior looks as expected. It's consistent with `read`,
> `sysread` and process substitution.

It's consistent and as expected from the implementation, yes, but it
might not be what a user expects without careful consideration.


