zsh-workers
From: Mikael Magnusson <mikachu@gmail.com>
To: Advait Maybhate <advait@warp.dev>
Cc: zsh-workers@zsh.org
Subject: Re: [BUG] ZLE character width with emoji presentation variation selectors in Unicode
Date: Fri, 10 May 2024 11:37:50 +0200
Message-ID: <CAHYJk3Rbpnqgjp04jXT4QKZG-7qxzz9PceEX7YG-ion04icUFA@mail.gmail.com>
In-Reply-To: <CAN+tYMf4fH2Lkww5nzAB24fGZ6uJAt7r_FpRcFocYpaYOD=1Yw@mail.gmail.com>

On Thu, May 9, 2024 at 4:46 PM Advait Maybhate <advait@warp.dev> wrote:
>
> Hey folks!
>
>
> Wanted to file a bug report/get a discussion going on the best way to handle emoji variation selectors with Unicode characters.
>
>
> Metadata:
>
> Zsh version: zsh 5.9 (x86_64-apple-darwin23.0), OS version: macOS Sonoma 14.3.1
>
> Terminal: tested across Warp, Kitty, default Mac terminal, Alacritty, iTerm 2
>
>
> ZLE incorrectly treats characters followed by the emoji variation selector as 1 cell wide instead of 2 cells wide, causing off-by-one cursor movement issues in terminals that (correctly) treat them as 2 cells wide.
>
>
> This is most easily reproduced in Kitty (v0.34), which renders and calculates these emojis as 2 cells (most terminal emulators seem to incorrectly handle this case of Unicode).
>
>
> To repro:
>
> Paste the command “echo ☁️” into Kitty (the last character is U+2601 followed by U+FE0F). Note that this results in bracketed paste mode in Zsh.
>
>
> Expected behavior:
>
> ZLE contains “echo ☁️”.
>
>
> Actual behavior:
>
> ZLE contains “eecho ☁️” (note the additional “e” at the beginning here - inverted colors from the bracketed paste). Confirmed that this is due to an off-by-one on the cursor instruction, from the PTY recording.
>
>
> Screenshot: link
>
>
> I’d love to discuss how to fix this for terminals that do respect variation selectors. One way to do this could be via a new `terminfo` entry, but I’d love to know what ZSH devs think! I’m an engineer building the Warp terminal, so I’d be happy to work on any terminal-side changes of this with `terminfo` (we actually use bracketed paste mode for all commands, to best support multiline commands with Warp's input editor)!
>
>
> Notably, Fish 3.6 seems to calculate the width correctly as 2 cells (this is what originally prompted my investigation, due to the Starship prompt - see fish-shell/issues/10461), along with Bash (using bracketed paste with Bash 5.2).
>
>
> I’ve seen 2017/msg00432, which is related to this but deals with U+FE0E, not U+FE0F.

Generally speaking, it is impossible to handle combining emoji: since
the specification allows the rendering to either combine or not
combine the glyphs, zsh cannot know how much space they will take up.
Of course, your problem isn't even about combining emoji, but as far
as I can see the same conceptual problem applies here: there is no way
for zsh to know what "render as an image" implies for glyph width; all
we can do is call wcwidth. I took a quick look at some Unicode emoji
standards pages and none of them even mention the word "width". If you
can find an authoritative part of the standard talking about emoji
width, feel free to link it. In my terminal your example renders 1
cell wide, which agrees with zsh's guess, and I don't get any display
errors.
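
To make the disagreement concrete, here is a rough Python sketch (not
zsh's code, and not a real wcwidth). It approximates a per-code-point
width from Python's bundled Unicode tables and compares it with a
hypothetical terminal-style rule that widens a narrow base character
to 2 cells when U+FE0F follows it. All function names are made up for
illustration, and the widening rule is an assumption about what
terminals like Kitty do, not something taken from a standard.

```python
import unicodedata

VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (emoji presentation)


def codepoint_width(ch: str) -> int:
    """Crude stand-in for wcwidth(3); real libc tables vary by platform.

    Simplification: control characters are not treated specially here.
    """
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # combining marks and format characters occupy no cell
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # East Asian Wide / Fullwidth
    return 1


def per_codepoint_width(s: str) -> int:
    """Sum the widths one code point at a time, with no look-ahead."""
    return sum(codepoint_width(ch) for ch in s)


def vs16_aware_width(s: str) -> int:
    """Assumed terminal behaviour: VS16 widens a preceding 1-cell base.

    Simplification: any narrow character is widened, whereas a real
    terminal would only do this for valid emoji variation sequences.
    """
    total = 0
    prev = 0  # width of the most recent character that took up cells
    for ch in s:
        if ch == VS16 and prev == 1:
            total += 1  # the base now counts as 2 cells instead of 1
            prev = 2
            continue
        w = codepoint_width(ch)
        total += w
        if w:
            prev = w
    return total


if __name__ == "__main__":
    line = "echo \u2601\ufe0f"
    print(per_codepoint_width(line), vs16_aware_width(line))
```

With the Unicode data shipped in current Python versions, the two
functions should disagree by exactly one cell for "echo ☁️", which is
the kind of mismatch that would show up as an off-by-one in cursor
positioning.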

-- 
Mikael Magnusson

