* A comment about "slurp" and -o multibyte
From: Bart Schaefer @ 2024-01-17 3:45 UTC (permalink / raw)
To: Zsh Users
On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> function slurp() {
>   emulate -L zsh -o no_multibyte
>   [...]
>   typeset -g REPLY=${(j::)content}
> }
Although the function faithfully reads the input stream into $REPLY,
later references to $REPLY with the multibyte option back in effect
will (re-)interpret the content as multibyte characters. This may not
be what's desired.
% slurp < =zsh
% () {
  print $#REPLY
  print ${(m)#REPLY}
  print ${(mm)#REPLY}
  setopt localoptions nomultibyte
  print $#REPLY
}
872903 <-- number of characters
873259 <-- width of printable characters
872383 <-- number of glyphs
878288 <-- actual number of bytes
(Of course those first three numbers are all garbage because it's just
interpreting an executable as wide character text.)
Unfortunately there's no parameter flag to toggle multibyte for a
single expansion.
* Re: A comment about "slurp" and -o multibyte
From: Roman Perepelitsa @ 2024-01-17 6:07 UTC (permalink / raw)
To: Bart Schaefer; +Cc: Zsh Users
On Wed, Jan 17, 2024 at 4:46 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa
> <roman.perepelitsa@gmail.com> wrote:
> >
> > function slurp() {
> >   emulate -L zsh -o no_multibyte
> >   [...]
> >   typeset -g REPLY=${(j::)content}
> > }
>
> Although the function faithfully reads the input stream into $REPLY,
> later references to $REPLY with the multibyte option back in effect
> will (re-)interpret the content as multibyte characters. This may not
> be what's desired.
>
> % slurp < =zsh
> % () {
>   print $#REPLY
>   print ${(m)#REPLY}
>   print ${(mm)#REPLY}
>   setopt localoptions nomultibyte
>   print $#REPLY
> }
> 872903 <-- number of characters
> 873259 <-- width of printable characters
> 872383 <-- number of glyphs
> 878288 <-- actual number of bytes
>
> (Of course those first three numbers are all garbage because it's just
> interpreting an executable as wide character text.)
To me this behavior looks as expected. It's consistent with `read`,
`sysread` and command substitution.
% head -c $((1 << 20)) </dev/urandom | tr '\0' x >1MB
% slurp <1MB
% IFS= read -rd '' read <1MB
% sysread -s $((1 << 20)) sysread <1MB
% procsubst=${"$(<1MB; print -n .)"%.}
% () {
  print -r -- $#REPLY $#read $#sysread $#procsubst
  setopt local_options no_multibyte
  print -r -- $#REPLY $#read $#sysread $#procsubst
}
1008389 1008389 1008389 1008389
1048576 1048576 1048576 1048576
Roman.
* Re: A comment about "slurp" and -o multibyte
From: Bart Schaefer @ 2024-01-17 7:06 UTC (permalink / raw)
To: Zsh Users
On Tue, Jan 16, 2024 at 10:08 PM Roman Perepelitsa
<roman.perepelitsa@gmail.com> wrote:
>
> To me this behavior looks as expected. It's consistent with `read`,
> `sysread` and command substitution.
It's consistent and as expected from the implementation, yes, but it
might not be what a user expects without careful consideration.