zsh-workers
* Unicode allowables in Environment variables
@ 2024-11-03 10:44 William DeShazer
  2024-11-04  2:18 ` Mikael Magnusson
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-03 10:44 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1420 bytes --]

Greetings,

I have been using non-POSIX-compliant Unicode characters in my folder names because a great deal of literature says that macOS, Terminal.app in particular, and Zsh all support them. When I used the crystal-ball character in the path of my ZDOTDIR, I found that it got metafied, and I could find no literature explaining that behavior for environment variables, save this discussion from 2014 (https://zsh.sourceforge.io/FAQ/zshfaq05.html#l56).

If this is the intended behavior, I think it should at least be documented in that FAQ, https://zsh.sourceforge.io/FAQ/zshfaq05.html#l56.

It should be clear and in big bold letters.

If that is not the case, then I would appreciate a definitive statement and some guidance on what might be metafying my ZDOTDIR. It is too closely aligned with Zsh's practice of metafying strings not to be obvious to someone.

For reference, the crystal ball is Unicode code point \U0001F52E. The COMBINING_CHARS and MULTIBYTE options were both set, and $(locale LC_CTYPE) == UTF-8.

I made the assignment in a few places.

First, in ~/.zshenv:

`export ZDOTDIR=~/App🔮Bundles/Configurations/Zsh/•Z`

and then using launchctl:

`launchctl setenv ZDOTDIR /Users/$USER/App🔮Bundles/Configurations/Zsh/•Z`

By the time my other dotfiles run, the value reads as:

/Users/$USER/App\M-p\<\M-.Bundles/Configurations/Zsh/•Z

Any insight on this would be appreciated.

Thanks,

Will DeShazer


[-- Attachment #2: Type: text/html, Size: 3158 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-03 10:44 Unicode allowables in Environment variables William DeShazer
@ 2024-11-04  2:18 ` Mikael Magnusson
  2024-11-04  4:13   ` Mark J. Reed
  0 siblings, 1 reply; 30+ messages in thread
From: Mikael Magnusson @ 2024-11-04  2:18 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Sun, Nov 3, 2024 at 10:49 PM William DeShazer
<earl.deshazer@gmail.com> wrote:
>
> Greetings,
>
> I have been using non-POSIX compliant unicode characters in my Folder names because of the tremendous amounts of literature that say both macOS and particularly the Terminal.app and Zsh support it. In using the Crystal-ball character in the path of my ZDOTDIR, I found that it got Metafied and no literature explained that behavior for environment variables, save this discussion in 2014 (https://zsh.sourceforge.io/FAQ/zshfaq05.html#l56)
>
> I think that if this is the behavior, it should at least be put on this FAQ, https://zsh.sourceforge.io/FAQ/zshfaq05.html#l56.
>
> It should be clear and in big bold letters.
>
> If that is not the case, then I would appreciate a definitive statement and then some guidance on what might be metafying my ZDOTDIR. It’s too closely aligned with the practice of Zsh to Metafy not to be obvious to someone.
>
> For reference the crystal ball unicode is \U0001F52E and I made the assignment in a few places:
>
> COMBINING_CHARS, MULTIBYTE were both set and $(locale LC_CTYPE) == UTF-8
>
> First:  ~/.zshenv
>
> `export ZDOTDIR=~/App🔮Bundles/Configurations/Zsh/•Z`
>
> and then using launchctl
>
> `launchctl setenv ZDOTDIR /Users/$USER/App🔮Bundles/Configurations/Zsh/•Z`
>
> When they get to my other dotfiles, it is read as:
>
> /Users/$USER/App\M-p\<\M-.Bundles/Configurations/Zsh/•Z

When your crystal ball is metafied, the sequence is ð\M-^C¿\M-^C´® so
whatever your problem is, it's not related to metafying.
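
As a rough illustration of what metafication does to these bytes, here is a small shell sketch. It assumes, from a reading of the zsh source rather than from anything stated in this thread, that a byte in the range 0x83 to 0xA2 is stored as the Meta byte 0x83 followed by the original byte XOR 0x20. The helper name metafy_hex is made up for the example, and NUL bytes are ignored because a shell argument cannot carry them.

```sh
# Hypothetical helper: print the metafied form of a string as hex bytes.
# Assumption: bytes 0x83..0xA2 (131..162) are the ones that get metafied.
metafy_hex() {
  local hex val out=''
  # od prints each byte of the argument as two hex digits.
  for hex in $(printf '%s' "$1" | od -An -v -t x1); do
    val=$((0x$hex))
    if [ "$val" -ge 131 ] && [ "$val" -le 162 ]; then
      # Emit the Meta byte, then the original byte with bit 0x20 flipped.
      out="$out 83 $(printf '%02x' $((val ^ 32)))"
    else
      out="$out $hex"
    fi
  done
  printf '%s\n' "${out# }"
}

# U+1F52E is F0 9F 94 AE in UTF-8 (written here as octal escapes).
metafy_hex "$(printf '\360\237\224\256')"   # f0 83 bf 83 b4 ae
```

Under that assumption the output corresponds to the sequence quoted above: ð (f0), Meta plus ¿ (83 bf), Meta plus ´ (83 b4), ® (ae).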

I don't use osx, but the following works fine here:
% ZDOTDIR=🔮 zsh -f
% echo $ZDOTDIR
🔮

Perhaps your problem is with launchctl?

-- 
Mikael Magnusson


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04  2:18 ` Mikael Magnusson
@ 2024-11-04  4:13   ` Mark J. Reed
  2024-11-04  6:27     ` William DeShazer
  0 siblings, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04  4:13 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: William DeShazer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2544 bytes --]

> I don't use osx, but the following works fine here:
> % ZDOTDIR=🔮 zsh -f
> % echo $ZDOTDIR
> 🔮

I am on macOS, and that works fine here, too. As do Unicode things that go
beyond putting a string into a parameter value; for instance, you can use
non-Latin alphanumerics in variable names:

% ℋ=$HOME
% echo $ℋ
/Users/mjreed

You can even export such a var, though for some reason trying to set such
a var in the environment of a command (e.g. ℋ=$HOME zsh -f) doesn't work.






-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 7208 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04  4:13   ` Mark J. Reed
@ 2024-11-04  6:27     ` William DeShazer
  2024-11-04 15:29       ` William DeShazer
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-04  6:27 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: Mikael Magnusson, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 3818 bytes --]

Thanks to both of you for considering this.

Correct me if I am wrong, but in both Mark's and Mikael's examples you appear to be at a prompt, that is, in an existing session, and then spawn a subshell. I concede that will work. The problem is that with ZDOTDIR defined that late, all the dotfiles are created in or read from $HOME.

I would like it to be defined when a new instance of Terminal.app launches.

I think it is important to point out that when Terminal.app launches, it does look for .zshenv in ~/App🔮Bundles/Configurations/Zsh/•Z, but if I echo $ZDOTDIR inside .zshenv or .zshrc, I get this (with my username removed):

/Users/<username>/App\M-p\<\M-.Bundles/Configurations/Zsh/•Z/.zshenv:3: bad pattern: /Users/<username>/App\M-p<\M-.Bundles/Configurations/Zsh/•Z

I have never looked this deep into shell initialization before, but somewhere between init and .zshenv, ZDOTDIR gets metafied.

Is there anything that I can share that would help you see what’s going on?

Kind regards,

Will DeShazer




[-- Attachment #2: Type: text/html, Size: 10102 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04  6:27     ` William DeShazer
@ 2024-11-04 15:29       ` William DeShazer
  2024-11-04 18:12         ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-04 15:29 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: Mikael Magnusson, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 10953 bytes --]

I have localized the issue to a place I had previously suspected but had been unable to pin down. Now I can, I think, unless this is revealing something else. What I can confirm is that the variable's value is preserved up to this point, so I think that removes my concern about some kind of interaction with the tty.

It gets mangled inside /etc/zshrc_Apple_Terminal, which is loaded by /etc/zshrc. This section of code supports the new feature that automatically updates the Terminal.app title bar. In the course of building a safe URL path from a zsh hook, this code appears to mangle ZDOTDIR.

The code:

```zsh
# Zsh support for Terminal.

echo 'In zshrc_Apple_Terminal Before Mangling: '$ZDOTDIR

# Working Directory
#
# Tell the terminal about the current working directory at each prompt.
#
# Terminal uses this to display the directory in the window title bar
# and tab bar, and for behaviors including creating a new terminal with
# the same working directory and restoring the working directory when
# restoring a terminal for Resume. See Terminal > Preferences for
# additional information.

if [ -z "$INSIDE_EMACS" ]; then
    echo "I am here: $ZDOTDIR:"

    update_terminal_cwd() {
        # Identify the directory using a "file:" scheme URL, including
        # the host name to disambiguate local vs. remote paths.

        # Percent-encode the pathname.
        local url_path=''
        {
            # Use LC_CTYPE=C to process text byte-by-byte and
            # LC_COLLATE=C to compare byte-for-byte. Ensure that
            # LC_ALL and LANG are not set so they don't interfere.
            local i ch hexch LC_CTYPE=C LC_COLLATE=C LC_ALL= LANG=
            for ((i = 1; i <= ${#PWD}; ++i)); do
                ch="$PWD[i]"
                if [[ "$ch" =~ [/._~A-Za-z0-9-] ]]; then
                    url_path+="$ch"
                else
                    printf -v hexch "%02X" "'$ch"
                    url_path+="%$hexch"
                fi
            done
            echo POST UPDATE: $ZDOTDIR
        }

        printf '\e]7;%s\a' "file://$HOST$url_path"
    }

    # Register the function so it is called at each prompt.
    autoload -Uz add-zsh-hook
    add-zsh-hook precmd update_terminal_cwd
fi

```


```
Last login: Mon Nov  4 07:12:15 on ttys003
In ~/.ZPROFILE:  /Users/$USER/App🔮Bundles/Configurations/Zsh/•Z
Confirming Chars in Cat -v:  /Users/$USER/App?M-^_M-^T?Bundles/Configurations/Zsh/?M-^@?Z
/Users/$USER/App🔮Bundles/Configurations/Zsh/•Z
/Users/$USER/App🔮Bundles/Configurations/Zsh/•Z
In zshrc_Apple_Terminal Before Mangling: /Users/$USER/App🔮Bundles/Configurations/Zsh/•Z
I am here: /Users/$USER/App🔮Bundles/Configurations/Zsh/•Z:
/Users/$USER/App\M-p\<\M-.Bundles/Configurations/Zsh/•Z/.zshrc:4: bad pattern: /Users/$USER/App\M-p<\M-.Bundles/Configurations/Zsh/•Z
update_terminal_cwd:20: bad pattern: /Users/$USER/App?<?Bundles/Configurations/Zsh/?\M-^@?Z
```

I’m not familiar enough with Unicode to know if the format spec “%02X” is presumptive. Would anyone posit a more robust solution to getting a URL? 
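
One possible alternative, offered only as a sketch: percent-encode the path by walking its bytes as hex from od, so the result does not depend on how the shell indexes characters under the current locale. The function name urlencode_path is hypothetical and this is not Apple's code; it keeps the same unreserved set as the loop above (/ . _ ~ - and ASCII letters and digits).

```sh
# Percent-encode a path byte by byte (hypothetical helper, not Apple's code).
urlencode_path() {
  local hex out=''
  # od prints each byte of the argument as two lowercase hex digits.
  for hex in $(printf '%s' "$1" | od -An -v -t x1); do
    case $hex in
      # Bytes for - . / _ ~ 0-9 A-Z a-z pass through unchanged:
      # convert the hex value to octal and let printf emit the character.
      2d|2e|2f|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
        out="$out$(printf "\\$(printf '%03o' "0x$hex")")" ;;
      # Every other byte becomes %XX with uppercase hex digits.
      *)
        out="$out%$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
    esac
  done
  printf '%s\n' "$out"
}

urlencode_path '/Users/me/My Files'   # /Users/me/My%20Files
```

Because every non-ASCII byte is encoded individually, a four-byte UTF-8 character such as the crystal ball would come out as four %XX groups.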

Kind regards,

Will



[-- Attachment #2: Type: text/html, Size: 26746 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 15:29       ` William DeShazer
@ 2024-11-04 18:12         ` Bart Schaefer
  2024-11-04 18:30           ` Mark J. Reed
  2024-11-04 19:13           ` William DeShazer
  0 siblings, 2 replies; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 18:12 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Mon, Nov 4, 2024 at 7:30 AM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> It gets mangled inside /etc/zshrc_Apple_Terminal which is kicked off by /etc/zshrc. This code section is intended to synchronize with the new feature to have the Terminal.app Title bar auto update. In an effort to come up with a safe URL path, by adding this to a zsh hook, this code is mangling ZDOTDIR.

I can confirm that executing /etc/zshrc_Apple_Terminal during startup
appears to be messing up the local copy of the ZDOTDIR variable.
Interestingly, the value in the environment is not affected, so the
issue can be worked around by:

ZDOTDIR="$(printenv ZDOTDIR)"

This doesn't have anything to do with add-zsh-hook, it happens even if
that function is not executed.  Even an empty file will reproduce
this, and will do so on Ubuntu with the same sort of ZDOTDIR path.
valgrind reports no memory sloppiness, and it doesn't happen with a
random environment variable, so I think it's because ZDOTDIR is
retrieved during startup with getsparam_u() which unmetafies-in-place
the return value of getsparam() ... but I haven't tracked it any
further than that.
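
To make the unmetafy step concrete, here is a minimal sketch of the byte-level operation, working on hex bytes rather than on zsh's real buffers. It assumes that unmetafying means dropping each Meta byte (0x83) and XORing the byte after it with 0x20; unmetafy_hex is a hypothetical name, not a zsh function.

```sh
# Hypothetical helper: undo metafication on a list of hex bytes.
unmetafy_hex() {
  local hex out='' pending=0
  for hex in "$@"; do
    if [ "$pending" -eq 1 ]; then
      # Byte following Meta: flip bit 0x20 to recover the original value.
      out="$out $(printf '%02x' $((0x$hex ^ 32)))"
      pending=0
    elif [ "$hex" = 83 ]; then
      # Meta byte: drop it and remember to fix up the next byte.
      pending=1
    else
      out="$out $hex"
    fi
  done
  printf '%s\n' "${out# }"
}

unmetafy_hex f0 83 bf 83 b4 ae   # f0 9f 94 ae
```

If the in-place hypothesis is right, the parameter's stored bytes would end up as f0 9f 94 ae while the shell still treats that buffer as metafied, so 9f and 94 would no longer be read as plain data.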


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:12         ` Bart Schaefer
@ 2024-11-04 18:30           ` Mark J. Reed
  2024-11-04 18:35             ` Bart Schaefer
  2024-11-04 19:13           ` William DeShazer
  1 sibling, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04 18:30 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: William DeShazer, zsh-workers


[-- Attachment #1.1: Type: text/plain, Size: 1494 bytes --]

What's going on that causes Zsh to try to do wildcard expansion on the
metafied value even inside quotation marks?

[image: image.png]


-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #1.2: Type: text/html, Size: 2164 bytes --]

[-- Attachment #2: image.png --]
[-- Type: image/png, Size: 15232 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:30           ` Mark J. Reed
@ 2024-11-04 18:35             ` Bart Schaefer
  2024-11-04 18:50               ` Mark J. Reed
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 18:35 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: William DeShazer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 373 bytes --]

On Mon, Nov 4, 2024 at 10:30 AM Mark J. Reed <markjreed@gmail.com> wrote:

> What's going on that causes Zsh to try to do wildcard expansion on the
> metafied value even inside quotation marks?
>

The value isn't metafied, it's unmetafied ... which happens to leave behind
some character values such that when quoted, they turn into (different)
un-quoted bytes.

[-- Attachment #2: Type: text/html, Size: 661 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:35             ` Bart Schaefer
@ 2024-11-04 18:50               ` Mark J. Reed
  2024-11-04 18:56                 ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04 18:50 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: William DeShazer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 874 bytes --]

Also, as far as I can tell, the ZDOTDIR value is being munged _before_
/etc/zshrc is executed. If I empty out /etc/zshrc except for the above echo
command, and put the same command as the only thing in /etc/zshenv, the one
in /etc/zshenv displays the correct expected value, while the one in
/etc/zshrc triggers the same "no matches found" error.



-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1631 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:50               ` Mark J. Reed
@ 2024-11-04 18:56                 ` Bart Schaefer
  2024-11-04 19:00                   ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 18:56 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: William DeShazer, zsh-workers

On Mon, Nov 4, 2024 at 10:50 AM Mark J. Reed <markjreed@gmail.com> wrote:
>
> Also, as far as I can tell, the ZDOTDIR value is being munged _before_ /etc/zshrc is executed. If I empty out /etc/zshrc except for the above echo command, and put the same command as the only thing in /etc/zshenv, the one in /etc/zshenv displays the correct expected value, while the one in /etc/zshrc triggers the same "no matches found" error.

That points to sourcehome() being called more than once (of necessity)
with the side-effect that getsparam_u() is applied to ZDOTDIR more
than once.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:56                 ` Bart Schaefer
@ 2024-11-04 19:00                   ` Bart Schaefer
  2024-11-04 19:14                     ` Mark J. Reed
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 19:00 UTC (permalink / raw)
  To: zsh-workers; +Cc: William DeShazer

On Mon, Nov 4, 2024 at 10:56 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> That points to sourcehome() being called more than once (of necessity)
> with the side-effect that getsparam_u() is applied to ZDOTDIR more
> than once.

Would be interesting to see if it gets further munged after passing
through zlogin


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 18:12         ` Bart Schaefer
  2024-11-04 18:30           ` Mark J. Reed
@ 2024-11-04 19:13           ` William DeShazer
  1 sibling, 0 replies; 30+ messages in thread
From: William DeShazer @ 2024-11-04 19:13 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

You beat me to it! I saw that part of the code this morning. I already jumped the gun though, so I wanted to hold off until I could connect the dots. It’s definitely on the program side. I will have to get back to this later, but I am glad we have a second data point.

Will




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 19:00                   ` Bart Schaefer
@ 2024-11-04 19:14                     ` Mark J. Reed
  2024-11-04 20:06                       ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04 19:14 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers, William DeShazer

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]

It does seem to get munged a second time. If I put a correcting
*ZDOTDIR=$(printenv ZDOTDIR)* at the top of /etc/zshrc, it's again unechoable
by the time I get a prompt.

On Mon, Nov 4, 2024 at 2:00 PM Bart Schaefer <schaefer@brasslantern.com>
wrote:

> On Mon, Nov 4, 2024 at 10:56 AM Bart Schaefer <schaefer@brasslantern.com>
> wrote:
> >
> > That points to sourcehome() being called more than once (of necessity)
> > with the side-effect that getsparam_u() is applied to ZDOTDIR more
> > than once.
>
> Would be interesting to see if it gets further munged after passing
> through zlogin
>
>

-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 1229 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 19:14                     ` Mark J. Reed
@ 2024-11-04 20:06                       ` Bart Schaefer
  2024-11-04 20:31                         ` Mark J. Reed
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 20:06 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: zsh-workers, William DeShazer

On Mon, Nov 4, 2024 at 11:15 AM Mark J. Reed <markjreed@gmail.com> wrote:
>
> It does seem to get munged a second time. If I put a correcting ZDOTDIR=$(printenv ZDOTDIR) at the top of /etc/zshrc, it's again unechoable by the time I get a prompt.

If you DO NOT correct it in /etc/zshrc, just echo it at every point,
is the pattern different each time? (Try "noglob echo ..." for the raw
string?)


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 20:06                       ` Bart Schaefer
@ 2024-11-04 20:31                         ` Mark J. Reed
  2024-11-04 20:53                           ` William DeShazer
  0 siblings, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04 20:31 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers, William DeShazer


[-- Attachment #1.1: Type: text/plain, Size: 989 bytes --]

Looks like once it's munged, it doesn't change any more. So the re-munging
(getsparam_u() or whatever) must be idempotent.

[image: image.png]

I also tried adding corrections immediately after the echos. It was fine at
the start of /etc/zshenv, corrupted at the start of ~/.zshenv, back to fine
at the start of /etc/zshrc, and corrupted again at the start of ~/.zshrc,
which makes sense if the problem is sourcehome().
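
The idempotence observation can be illustrated with a small C sketch. It is a simplified stand-in for zsh's unmetafy(), not the real implementation: the names sketch_unmetafy() and second_pass_is_noop() are hypothetical, and the only facts taken from this thread are that Meta is 0x83 and that the byte after Meta is XORed with 32. For the path discussed here, one pass removes every Meta byte, so a second pass has nothing left to rewrite. That probably does not hold for every possible string, since a decoded byte can itself be 0x83.

```c
#include <stddef.h>
#include <string.h>

#define META 0x83 /* zsh's Meta byte, as cited in this thread */

/* Simplified stand-in for unmetafy(): rewrites buf in place, replacing
 * each (META, x) pair with the single byte x ^ 32.
 * Returns the new length of the string. */
size_t sketch_unmetafy(unsigned char *buf)
{
    unsigned char *src = buf, *dst = buf;

    while (*src) {
        if (*src == META && src[1] != '\0') {
            *dst++ = src[1] ^ 32;
            src += 2;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return (size_t)(dst - buf);
}

/* Runs two passes over buf and returns 1 if the second pass left the
 * result of the first pass unchanged, 0 otherwise. */
int second_pass_is_noop(unsigned char *buf)
{
    unsigned char copy[256];
    size_t len = sketch_unmetafy(buf); /* first pass */

    if (len >= sizeof copy)
        return 0; /* too long for this sketch's scratch buffer */
    memcpy(copy, buf, len + 1);
    sketch_unmetafy(buf);              /* second pass */
    return memcmp(copy, buf, len + 1) == 0;
}
```

Calling second_pass_is_noop() on a writable buffer holding the metafied path segment should report that the second pass changes nothing, which would match the "once it's munged, it doesn't change any more" result.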

On Mon, Nov 4, 2024 at 3:07 PM Bart Schaefer <schaefer@brasslantern.com>
wrote:

> On Mon, Nov 4, 2024 at 11:15 AM Mark J. Reed <markjreed@gmail.com> wrote:
> >
> > It does seem to get munged a second time. If I put a correcting
> ZDOTDIR=$(printenv ZDOTDIR) at the top of /etc/zshrc, it's again unechoable
> by the time I get a prompt.
>
> If you DO NOT correct it in /etc/zshrc, just echo it at every point,
> is the pattern different each time? (Try "noglob echo ..." for the raw
> string?)
>


-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #1.2: Type: text/html, Size: 1644 bytes --]

[-- Attachment #2: image.png --]
[-- Type: image/png, Size: 14321 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 20:31                         ` Mark J. Reed
@ 2024-11-04 20:53                           ` William DeShazer
  2024-11-04 21:10                             ` Mark J. Reed
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-04 20:53 UTC (permalink / raw)
  To: Mark J. Reed; +Cc: Bart Schaefer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1609 bytes --]

Yes. I can confirm that I had to reset it at every stage. That’s not a horrible solution, but I was worried that there might be some things under the hood that would not be set correctly if the value wasn't retained between configuration files. I’ll go ahead and try a custom development build to see if I can help with the resolution. I’m just happy to be validated. I was sure it was my ignorance. I appreciate the support.

Will

> On Nov 4, 2024, at 12:31, Mark J. Reed <markjreed@gmail.com> wrote:
> 
> Looks like once it's munged, it doesn't change any more. So the re-munging (getsparam_u() or whatever) must be idempotent.
> 
> <image.png>
> 
> I also tried adding corrections immediately after the echos. It was fine at the start of /etc/zshenv, corrupted at the start of ~/.zshenv, back to fine at the start of /etc/zshrc, and corrupted again at the start of ~/.zshrc, which makes sense if the problem is sourcehome().
> 
> On Mon, Nov 4, 2024 at 3:07 PM Bart Schaefer <schaefer@brasslantern.com <mailto:schaefer@brasslantern.com>> wrote:
>> On Mon, Nov 4, 2024 at 11:15 AM Mark J. Reed <markjreed@gmail.com <mailto:markjreed@gmail.com>> wrote:
>> >
>> > It does seem to get munged a second time. If I put a correcting ZDOTDIR=$(printenv ZDOTDIR) at the top of /etc/zshrc, it's again unechoable by the time I get a prompt.
>> 
>> If you DO NOT correct it in /etc/zshrc, just echo it at every point,
>> is the pattern different each time? (Try "noglob echo ..." for the raw
>> string?)
> 
> 
> --
> Mark J. Reed <markjreed@gmail.com <mailto:markjreed@gmail.com>>


[-- Attachment #2: Type: text/html, Size: 2498 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 20:53                           ` William DeShazer
@ 2024-11-04 21:10                             ` Mark J. Reed
  2024-11-04 23:27                               ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Mark J. Reed @ 2024-11-04 21:10 UTC (permalink / raw)
  To: William DeShazer; +Cc: Bart Schaefer, zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2052 bytes --]

I'm not sure why unmetafy() is munging it at all, though, as none of the
bytes in the UTF-8 encoding of either of our emojiful pathnames is equal to
Meta (0x83).
🔮 is 0xf0 0x9f 0x94 0xae, and 🍇 is 0xf0 0x9f 0x8d 0x87.  Possibly because
it proceeds right on past the nul terminator if the string has no Meta in
it. That seems like a bug.
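
One possible explanation, sketched in C under stated assumptions: although none of the raw UTF-8 bytes is 0x83, zsh's internal (metafied) form escapes more than just the Meta byte. The range used below, 0x83 through 0xa2, is an assumption from memory of the zsh headers rather than something stated in this thread, and sketch_metafy() and needs_meta() are hypothetical names. With that range, 0x9f and 0x94 from the crystal ball are each rewritten as Meta followed by the byte XOR 32, which matches the \360\203\277\203\264\256 sequence reported from gdb elsewhere in this thread.

```c
#include <stddef.h>

#define META      0x83 /* zsh's Meta byte, as cited in this thread */
#define TOK_FIRST 0x83 /* ASSUMPTION: first byte value that gets escaped */
#define TOK_LAST  0xa2 /* ASSUMPTION: last byte value that gets escaped */

/* Returns nonzero if the byte falls in the assumed escaped range. */
static int needs_meta(unsigned char c)
{
    return c >= TOK_FIRST && c <= TOK_LAST;
}

/* Writes the metafied form of the NUL-terminated string `in` to `out`.
 * `out` must have room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator. */
size_t sketch_metafy(const unsigned char *in, unsigned char *out)
{
    size_t n = 0;

    for (; *in; in++) {
        if (needs_meta(*in)) {
            out[n++] = META;
            out[n++] = *in ^ 32;
        } else {
            out[n++] = *in;
        }
    }
    out[n] = '\0';
    return n;
}
```

Under the same assumption, the bullet in "•Z" (0xe2 0x80 0xa2) would become 0xe2 0x80 0x83 0x82, because only its last byte falls in the escaped range.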



On Mon, Nov 4, 2024 at 3:53 PM William DeShazer <earl.deshazer@gmail.com>
wrote:

> Yes. I can confirm that I had to reset it at every stage. That’s not a
> horrible solution, but I was worried that there might be some things under
> the hood that would not set correctly if the value wasn't  retained between
> configuration files. I’ll go ahead and try to do a custom development build
> to see if I can help with the resolution. Im just happy to be validated. I
> was sure it was my ignorance. I appreciate the support.
>
> Will
>
> On Nov 4, 2024, at 12:31, Mark J. Reed <markjreed@gmail.com> wrote:
>
> Looks like once it's munged, it doesn't change any more. So the re-munging
> (getsparam_u() or whatever) must be idempotent.
>
> <image.png>
>
> I also tried adding corrections immediately after the echos. It was fine
> at the start of /etc/zshenv, corrupted at the start of ~/.zshenv, back to
> fine at the start of /etc/zshrc, and corrupted again at the start of
> ~/.zshrc, which makes sense if the problem is sourcehome().
>
> On Mon, Nov 4, 2024 at 3:07 PM Bart Schaefer <schaefer@brasslantern.com>
> wrote:
>
>> On Mon, Nov 4, 2024 at 11:15 AM Mark J. Reed <markjreed@gmail.com> wrote:
>> >
>> > It does seem to get munged a second time. If I put a correcting
>> ZDOTDIR=$(printenv ZDOTDIR) at the top of /etc/zshrc, it's again unechoable
>> by the time I get a prompt.
>>
>> If you DO NOT correct it in /etc/zshrc, just echo it at every point,
>> is the pattern different each time? (Try "noglob echo ..." for the raw
>> string?)
>>
>
>
> --
> Mark J. Reed <markjreed@gmail.com>
>
>
>

-- 
Mark J. Reed <markjreed@gmail.com>

[-- Attachment #2: Type: text/html, Size: 3697 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 21:10                             ` Mark J. Reed
@ 2024-11-04 23:27                               ` Bart Schaefer
  2024-11-04 23:46                                 ` William DeShazer
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 23:27 UTC (permalink / raw)
  To: zsh-workers; +Cc: William DeShazer

On Mon, Nov 4, 2024 at 1:11 PM Mark J. Reed <markjreed@gmail.com> wrote:
>
> I'm not sure why unmetafy() is munging it at all ... Possibly because it proceeds right on past the nul terminator if the string has no Meta in it.

valgrind reports no OOB errors, FWIW.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 23:27                               ` Bart Schaefer
@ 2024-11-04 23:46                                 ` William DeShazer
  2024-11-04 23:52                                   ` Bart Schaefer
  2024-11-04 23:59                                   ` Bart Schaefer
  0 siblings, 2 replies; 30+ messages in thread
From: William DeShazer @ 2024-11-04 23:46 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

One interesting observation, by the way, is that it is only ZDOTDIR that is getting munged.

ZSH_CUSTOM
ZSH
XDG_CONFIG_HOME
XDG_DATA_HOME
XDG_CACHE_HOME

FPATH/fpath
PATH/path

all retain the Crystal Ball Unicode character cleanly through initialization. It is just ZDOTDIR that exhibits the behavior. We will know more soon enough, but I look forward to knowing why someone is picking on ZDOTDIR. It is one of my favorite environment variables, so I take it personally. 😊

With regard to your Valgrind observation, I am not surprised. Few people would hazard to put Unicode characters in their path name, especially non-Portable ones. This may not be tested for. I don’t know Valgrind very well, so I probably shouldn’t speculate too much.

I am surprised it hasn’t shown up in one of the other languages, but maybe the unicode characters for languages were constructed to purposefully keep away from these code-points. These seem to be reserved for all of the coolest characters, if I do say so myself. 

Will

> On Nov 4, 2024, at 15:27, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Mon, Nov 4, 2024 at 1:11 PM Mark J. Reed <markjreed@gmail.com> wrote:
>> 
>> I'm not sure why unmetafy() is munging it at all ... Possibly because it proceeds right on past the nul terminator if the string has no Meta in it.
> 
> valgrind reports no OOB errors, FWIW.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 23:46                                 ` William DeShazer
@ 2024-11-04 23:52                                   ` Bart Schaefer
  2024-11-05  0:10                                     ` William DeShazer
  2024-11-04 23:59                                   ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 23:52 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Mon, Nov 4, 2024 at 3:46 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> With regard to your Valgrind observation, I am not surprised. Few people would hazard to put Unicode characters in their path name

I meant that I specifically tested your example with valgrind, and
witnessed the problem repeat, without getting any reported errors or
leaks.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 23:46                                 ` William DeShazer
  2024-11-04 23:52                                   ` Bart Schaefer
@ 2024-11-04 23:59                                   ` Bart Schaefer
  2024-11-05  0:19                                     ` William DeShazer
  1 sibling, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-04 23:59 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Mon, Nov 4, 2024 at 3:46 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> One interesting observation by the way is that it is on ZDOTDIR that is getting munged.

*IF* the problem really is getsparam_u(), then the other parameters
that are fetched that way are:
LC_ALL
LC_COLLATE
LC_CTYPE
LC_MESSAGES
LANG
ZSH_DEBUG_LOG
ZBEEP

The latter will be re-fetched every time the shell would beep (e.g.,
on a failed completion or undefined keystroke) so it might be the
easiest to test against multiple interpretation without having to
restart the shell.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 23:52                                   ` Bart Schaefer
@ 2024-11-05  0:10                                     ` William DeShazer
  2024-11-05  0:58                                       ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-05  0:10 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

Please educate me if I am wrong about Valgrind, but I believe the reason Valgrind didn’t catch it is that, without an assertion that the value you get out matches the value you put in, nothing bad happened as far as Valgrind is concerned. For all it knows, munging the variable is what you intended to do. As you pointed out earlier, you can still access the variable with printenv, so the munged variable only hides access to the original. In that regard it is not leaking either, or it is leaking the same amount to the same variable, because you can still find it.

This was probably missed because few are bold (stupid?) enough to put non-portable Unicode in their path.

> On Nov 4, 2024, at 15:52, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Mon, Nov 4, 2024 at 3:46 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>> 
>> With regard to your Valgrind observation, I am not surprised. Few people would hazard to put Unicode characters in their path name
> 
> I meant that I specifically tested your example with valgrind, and
> witnessed the problem repeat, without getting any reported errors or
> leaks.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-04 23:59                                   ` Bart Schaefer
@ 2024-11-05  0:19                                     ` William DeShazer
  0 siblings, 0 replies; 30+ messages in thread
From: William DeShazer @ 2024-11-05  0:19 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1083 bytes --]

This is a fantastic Observation. Easy to test.

I just ran it with LC_COLLATE. 

Last login: Mon Nov  4 16:16:46 on ttys001
ZDOTDIR is 
Assigning LC_COLLATE=🔮
After assignment: 🔮
In zshrc_Apple_Terminal Before Mangling: ~/App🔮Bundles/Configurations/Zsh/•Z
In .zshrc LC_COLLATE=🔮
compinit:200: bad pattern: ~/App\M-p<\M-.Bundles/Configurations/Zsh/•Z/.zshrc

> On Nov 4, 2024, at 15:59, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Mon, Nov 4, 2024 at 3:46 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>> 
>> One interesting observation by the way is that it is on ZDOTDIR that is getting munged.
> 
> *IF* the problem really is getsparam_u(), then the other parameters
> that are fetched that way are:
> LC_ALL
> LC_COLLATE
> LC_CTYPE
> LC_MESSAGES
> LANG
> ZSH_DEBUG_LOG
> ZBEEP
> 
> The latter will be re-fetched every time the shell would beep (e.g.,
> on a failed completion or undefined keystroke) so it might be the
> easiest to test against multiple interpretation without having to
> restart the shell.


[-- Attachment #2: Type: text/html, Size: 5593 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  0:10                                     ` William DeShazer
@ 2024-11-05  0:58                                       ` Bart Schaefer
  2024-11-05  1:28                                         ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-05  0:58 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Mon, Nov 4, 2024 at 4:11 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> Please educate me if I am wrong about Valgrind, but I believe the reason that Valgrind didn’t catch it is ...

Valgrind looks for pointers running into memory outside the bounds of
a string.  Mark suggested that the reason unmetafy() was having an
effect is because it was failing to stop at the end of the string.  My
only point was that valgrind did not find THAT to be the problem.

Subsequently ...

I put this through gdb and found that getsparam("ZDOTDIR") appears to
be returning a consistent value all the way through the point of
failure, and further that getsparam_u() also appears to return that
same value.  So I went looking for the actual error occurrence in
paramsubst().

What I found is that in the erroring case, when we get to subst.c:169
in prefork(), the value returned from paramsubst() contains

App\360\224\256Bundles/Configurations/Zsh/•Z

However, if I then "fix" ZDOTDIR by assigning from $(printenv), upon
getting back to that line in prefork() the value instead contains

App\360\203\277\203\264\256Bundles/Configurations/Zsh/ \202Z

I haven't found where the internal representation is getting changed,
but this is looking a lot like the same oddity discussed in "ZLE
character width with emoji presentation variation selectors in
Unicode"
  https://www.zsh.org/mla/workers/2024/msg00481.html (and followups)
wherein we seem to have concluded that MacOS has a broken wcwidth()
implementation but never arrived at a resolution for it.

I seem to recall that the MacOS file system automatically converts
between the two formats in order to standardize on the one used for
file names, so you'll never see this in context of finding files in
the ZDOTDIR directory.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  0:58                                       ` Bart Schaefer
@ 2024-11-05  1:28                                         ` Bart Schaefer
  2024-11-05  2:17                                           ` William DeShazer
  0 siblings, 1 reply; 30+ messages in thread
From: Bart Schaefer @ 2024-11-05  1:28 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Mon, Nov 4, 2024 at 4:58 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> What I found is that in the erroring case, when we get to subst.c:169
> in prefork(), the value returned from paramsubst() contains
>
> App\360\224\256Bundles/Configurations/Zsh/•Z

It may be remnulargs() that then mangles this into something that
attempts globbing.

On the Ubuntu filesystem when I "mkdir" with the string copy-pasted
from William's original message, I get a third variation:

App\360\237\224\256Bundles

> However, if I then "fix" ZDOTDIR by assigning from $(printenv), upon
> getting back to that line in prefork() the value instead contains
>
> App\360\203\277\203\264\256Bundles/Configurations/Zsh/ \202Z

I still get this transformation on Ubuntu, though, so it's not
entirely a MacOS wcwidth thing.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  1:28                                         ` Bart Schaefer
@ 2024-11-05  2:17                                           ` William DeShazer
  2024-11-05  4:16                                             ` Bart Schaefer
  0 siblings, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-05  2:17 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

You mentioned, or the issue that you referenced, mentioned that it was a zle interaction. I think candidates for odd insertions would be insert_unicode_char, insert-composed-char, expand-absolute-path, bracketed-paste-magic, send-invisible. While those look interesting as candidates, the zle is only involved with handling strings if it has been invoked, correct? I’m not sure there is a logic path to trigger these kinds of events. I figured I should mention it, since you have a better knowledge of the code-base than I do.

On a separate but possibly related note, I have lately been having a heck of a time with a tty interrupt that happens when I do command-K and then ls. The l gets gobbled, and then the tab completion on the ’s’ engages the tty interrupt. I have been reluctant to explore it; however, if the zle is injecting/transforming characters randomly, these issues could be related. I can’t reliably reproduce it, and if we solve this issue, maybe the other will mysteriously go away. Wishful thinking.


> On Nov 4, 2024, at 17:28, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Mon, Nov 4, 2024 at 4:58 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>> 
>> What I found is that in the erroring case, when we get to subst.c:169
>> in prefork(), the value returned from paramsubst() contains
>> 
>> App\360\224\256Bundles/Configurations/Zsh/•Z
> 
> It may be remnulargs() that then mangles this into something that
> attempts globbing.
> 
> On the Ubuntu filesystem when I "mkdir" with the string copy-pasted
> from William's original message, I get a third variation:
> 
> App\360\237\224\256Bundles
> 
>> However, if I then "fix" ZDOTDIR by assigning from $(printenv), upon
>> getting back to that line in prefork() the value instead contains
>> 
>> App\360\203\277\203\264\256Bundles/Configurations/Zsh/ \202Z
> 
> I still get this transformation on Ubuntu, though, so it's not
> entirely a MacOS wcwidth thing.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  2:17                                           ` William DeShazer
@ 2024-11-05  4:16                                             ` Bart Schaefer
  2024-11-05  9:19                                               ` Peter Stephenson
  2024-11-05 23:05                                               ` William DeShazer
  0 siblings, 2 replies; 30+ messages in thread
From: Bart Schaefer @ 2024-11-05  4:16 UTC (permalink / raw)
  To: zsh-workers; +Cc: William DeShazer

On Mon, Nov 4, 2024 at 6:17 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> You mentioned, or the issue that you referenced, mentioned that it was a zle interaction.

Just a similarity, pointing to a potential issue with wcwidth(), not a
suggestion that zle is actually involved in this case.

But that turns out to be a red herring ... it is getsparam_u() after all.

getsparam_u() calls getsparam() which calls fetchvalue(), which
returns a Value with a pointer to the Param struct in the global
parameter hash, then calls getstrvalue() on that, which returns a
pointer to the string in the Param, which unmetafy() then modifies in
place.  Subsequent unmetafy() are no-ops during following calls to
sourcehome() but actual parameter expansions expect the value in the
global to be stored metafied.

Possible solutions are to use unmeta(), which is more efficient but
returns a pointer to a single-use buffer (that is, a subsequent call
to unmeta() will clobber it):

diff --git a/Src/params.c b/Src/params.c
index acd577527..25a831ed7 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -3065,7 +3065,7 @@ mod_export char *
 getsparam_u(char *s)
 {
     if ((s = getsparam(s)))
-    return unmetafy(s, NULL);
+    return unmeta(s);
     return s;
 }

Or to explicitly allocate space on the heap:

diff --git a/Src/params.c b/Src/params.c
index acd577527..99c979b85 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -3065,7 +3065,7 @@ mod_export char *
 getsparam_u(char *s)
 {
     if ((s = getsparam(s)))
-    return unmetafy(s, NULL);
+    return unmetafy(dupstring(s), NULL);
     return s;
 }

Looking at the uses of getsparam_u() the returned pointer is always
immediately used and discarded so I think unmeta() is safe.  Thoughts,
-workers?
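
The difference between the in-place call and the copying variant can be modelled outside zsh with a short C sketch. Everything here is a stand-in under stated assumptions: struct sketch_param, get_unmeta_in_place() and get_unmeta_copy() are hypothetical names, malloc() takes the place of zsh's dupstring() heap allocation, and the unmetafy logic is simplified to the "Meta (0x83), then byte XOR 32" rule.

```c
#include <stdlib.h>
#include <string.h>

#define META 0x83 /* zsh's Meta byte, as cited in this thread */

/* Hypothetical model of a parameter whose value is stored metafied. */
struct sketch_param {
    unsigned char *value;
};

/* Simplified stand-in for unmetafy(): decodes buf in place. */
static unsigned char *sketch_unmetafy(unsigned char *buf)
{
    unsigned char *src = buf, *dst = buf;

    while (*src) {
        if (*src == META && src[1] != '\0') {
            *dst++ = src[1] ^ 32;
            src += 2;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
    return buf;
}

/* Models the original behaviour: the stored value itself is rewritten,
 * so later readers no longer see the metafied form they expect. */
unsigned char *get_unmeta_in_place(struct sketch_param *pm)
{
    return sketch_unmetafy(pm->value);
}

/* Models the copying approach: the stored value is left untouched and
 * the caller receives a decoded duplicate, which it must free(). */
unsigned char *get_unmeta_copy(const struct sketch_param *pm)
{
    size_t len = strlen((const char *)pm->value);
    unsigned char *dup = malloc(len + 1);

    if (!dup)
        return NULL;
    memcpy(dup, pm->value, len + 1);
    return sketch_unmetafy(dup);
}
```

In this model, get_unmeta_copy() leaves the stored bytes exactly as they were, while get_unmeta_in_place() returns the stored pointer with its contents already altered, which is the side effect described above.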


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  4:16                                             ` Bart Schaefer
@ 2024-11-05  9:19                                               ` Peter Stephenson
  2024-11-05 23:05                                               ` William DeShazer
  1 sibling, 0 replies; 30+ messages in thread
From: Peter Stephenson @ 2024-11-05  9:19 UTC (permalink / raw)
  To: zsh-workers; +Cc: William DeShazer

> On 05/11/2024 04:16 GMT Bart Schaefer <schaefer@brasslantern.com> wrote:
> Looking at the uses of getsparam_u() the returned pointer is always
> immediately used and discarded so I think unmeta() is safe.  Thoughts,
> -workers?

Sounds OK to me, probably worth a note in the code for getsparam_u() to
warn future hackers who might not immediately recognise the effect of
the unmeta().
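
The effect such a note would warn about can be sketched in C. This is an illustration only, assuming a function that always decodes into one static buffer; sketch_unmeta() is a hypothetical name, and the real unmeta() may differ in detail (for example in how it treats strings that contain no Meta byte, or strings longer than its buffer).

```c
#include <stddef.h>
#include <string.h>

#define META 0x83 /* zsh's Meta byte, as cited in this thread */

/* Decodes `in` into a single static buffer and returns a pointer to it.
 * Input that does not fit is truncated in this sketch.
 * The result is only valid until the next call. */
const unsigned char *sketch_unmeta(const unsigned char *in)
{
    static unsigned char buf[256];
    size_t n = 0;

    while (*in && n < sizeof buf - 1) {
        if (*in == META && in[1] != '\0') {
            buf[n++] = in[1] ^ 32;
            in += 2;
        } else {
            buf[n++] = *in++;
        }
    }
    buf[n] = '\0';
    return buf;
}
```

A caller that keeps the pointer from a first call and then makes a second call will find the first result overwritten, because both calls return the same buffer. Using the result immediately, or copying it, avoids that.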

cheers
pws


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05  4:16                                             ` Bart Schaefer
  2024-11-05  9:19                                               ` Peter Stephenson
@ 2024-11-05 23:05                                               ` William DeShazer
  2024-11-06 19:30                                                 ` Bart Schaefer
  1 sibling, 1 reply; 30+ messages in thread
From: William DeShazer @ 2024-11-05 23:05 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 2696 bytes --]

I am not sure what the proper protocol is on such an issue. I intend to use the https://github.com/apple-oss-distributions/zsh distribution to apply your patch, but the README has these instructions:

> To upgrade to a new version of zsh:

Checkout dev/UPSTREAM.
Replace its contents with the new version.
Rename zsh/.gitignore to zsh/.gitignore.dist.
Commit and push the new content.
Create an eng/ branch.
git merge dev/UPSTREAM into your branch and resolve any conflicts.
Review, test, and nominate that eng/ branch.

In this case this is your change. I don’t think I should push it back up to that channel as my change.

Do you have a recommendation for how I should test this?

Will

> On Nov 4, 2024, at 20:16, Bart Schaefer <schaefer@brasslantern.com> wrote:
> 
> On Mon, Nov 4, 2024 at 6:17 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>> 
>> You mentioned, or the issue that you referenced, mentioned that it was a zle interaction.
> 
> Just a similarity, pointing to a potential issue with wcwidth(), not a
> suggestion that zle is actually involved in this case.
> 
> But that turns out to be a red herring ... it is getsparam_u() after all.
> 
> getsparam_u() calls getsparam() which calls fetchvalue(), which
> returns a Value with a pointer to the Param struct in the global
> parameter hash, then calls getstrvalue() on that, which returns a
> pointer to the string in the Param, which unmetafy() then modifies in
> place.  Subsequent unmetafy() are no-ops during following calls to
> sourcehome() but actual parameter expansions expect the value in the
> global to be stored metafied.
> 
> Possible solutions are to use unmeta(), which is more efficient but
> returns a pointer to a single-use buffer (that is, a subsequent call
> to unmeta() will clobber it):
> 
> diff --git a/Src/params.c b/Src/params.c
> index acd577527..25a831ed7 100644
> --- a/Src/params.c
> +++ b/Src/params.c
> @@ -3065,7 +3065,7 @@ mod_export char *
> getsparam_u(char *s)
> {
>     if ((s = getsparam(s)))
> -    return unmetafy(s, NULL);
> +    return unmeta(s);
>     return s;
> }
> 
> Or to explicitly allocate space on the heap:
> 
> diff --git a/Src/params.c b/Src/params.c
> index acd577527..99c979b85 100644
> --- a/Src/params.c
> +++ b/Src/params.c
> @@ -3065,7 +3065,7 @@ mod_export char *
> getsparam_u(char *s)
> {
>     if ((s = getsparam(s)))
> -    return unmetafy(s, NULL);
> +    return unmetafy(dupstring(s), NULL);
>     return s;
> }
> 
> Looking at the uses of getsparam_u() the returned pointer is always
> immediately used and discarded so I think unmeta() is safe.  Thoughts,
> -workers?


[-- Attachment #2: Type: text/html, Size: 3459 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Unicode allowables in Environment variables
  2024-11-05 23:05                                               ` William DeShazer
@ 2024-11-06 19:30                                                 ` Bart Schaefer
  0 siblings, 0 replies; 30+ messages in thread
From: Bart Schaefer @ 2024-11-06 19:30 UTC (permalink / raw)
  To: William DeShazer; +Cc: zsh-workers

On Tue, Nov 5, 2024 at 3:06 PM William DeShazer <earl.deshazer@gmail.com> wrote:
>
> I am not sure what the proper protocol is on such an issue. I intend to use the https://github.com/apple-oss-distributions/zsh distribution to apply your patch, but the README has these instructions:
>
> > To upgrade to a new version of zsh:

Those instructions imply that you're downloading a full new set of
sources from the zsh upstream.  If you just want to apply a single
patch, you can just create a branch, patch the source on that branch,
compile, and test the resulting binary.  Obviously some patches won't
apply if you don't have the upstream source, but in this case the
change is small and isolated, so patching should not present an issue
... you could even make the edit by hand.

> diff --git a/Src/params.c b/Src/params.c
> index acd577527..25a831ed7 100644
> --- a/Src/params.c
> +++ b/Src/params.c
> @@ -3065,7 +3065,7 @@ mod_export char *
> getsparam_u(char *s)
> {
>     if ((s = getsparam(s)))
> -    return unmetafy(s, NULL);
> +    return unmeta(s);
>     return s;
> }


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2024-11-06 19:30 UTC | newest]

Thread overview: 30+ messages
2024-11-03 10:44 Unicode allowables in Environment variables William DeShazer
2024-11-04  2:18 ` Mikael Magnusson
2024-11-04  4:13   ` Mark J. Reed
2024-11-04  6:27     ` William DeShazer
2024-11-04 15:29       ` William DeShazer
2024-11-04 18:12         ` Bart Schaefer
2024-11-04 18:30           ` Mark J. Reed
2024-11-04 18:35             ` Bart Schaefer
2024-11-04 18:50               ` Mark J. Reed
2024-11-04 18:56                 ` Bart Schaefer
2024-11-04 19:00                   ` Bart Schaefer
2024-11-04 19:14                     ` Mark J. Reed
2024-11-04 20:06                       ` Bart Schaefer
2024-11-04 20:31                         ` Mark J. Reed
2024-11-04 20:53                           ` William DeShazer
2024-11-04 21:10                             ` Mark J. Reed
2024-11-04 23:27                               ` Bart Schaefer
2024-11-04 23:46                                 ` William DeShazer
2024-11-04 23:52                                   ` Bart Schaefer
2024-11-05  0:10                                     ` William DeShazer
2024-11-05  0:58                                       ` Bart Schaefer
2024-11-05  1:28                                         ` Bart Schaefer
2024-11-05  2:17                                           ` William DeShazer
2024-11-05  4:16                                             ` Bart Schaefer
2024-11-05  9:19                                               ` Peter Stephenson
2024-11-05 23:05                                               ` William DeShazer
2024-11-06 19:30                                                 ` Bart Schaefer
2024-11-04 23:59                                   ` Bart Schaefer
2024-11-05  0:19                                     ` William DeShazer
2024-11-04 19:13           ` William DeShazer
