From: Mikael Magnusson
Date: Fri, 10 May 2024 11:54:33 +0200
Subject: Re: [BUG] ZLE character width with emoji presentation variation selectors in Unicode
To: Advait Maybhate
Cc: zsh-workers@zsh.org
X-Seq: 52923

On Fri, May 10, 2024 at 11:37 AM Mikael Magnusson wrote:
>
> On Thu, May 9, 2024 at 4:46 PM Advait Maybhate wrote:
> >
> > Hey folks!
> >
> > Wanted to file a bug report/get a discussion going on the best way to handle emoji variation selectors with Unicode characters.
> >
> > Metadata:
> >
> > Zsh version: zsh 5.9 (x86_64-apple-darwin23.0), OS version: macOS Sonoma 14.3.1
> >
> > Terminal: tested across Warp, Kitty, default Mac terminal, Alacritty, iTerm 2
> >
> > ZLE incorrectly treats characters with the emoji variation selector as 1 character instead of 2 characters, causing off-by-one cursor movement issues in terminals that (correctly) treat it as 2 characters.
> >
> > This is most easily reproduced in Kitty (v0.34), which renders and calculates these emojis as 2 cells (most terminal emulators seem to incorrectly handle this case of Unicode).
> >
> > To repro:
> >
> > Paste in the command “echo ☁️” into Kitty (the last character is \0x2601 followed by \0xFE0F). Note that this results in bracketed paste mode in Zsh.
> >
> > Expected behavior:
> >
> > ZLE contains “echo ☁️”.
> >
> > Actual behavior:
> >
> > ZLE contains “eecho ☁️” (note the additional “e” at the beginning here - inverted colors from the bracketed paste). Confirmed that this is due to an off-by-one on the cursor instruction, from the PTY recording.
> >
> > Screenshot: link
> >
> > I’d love to discuss how to fix this for terminals that do respect variation selectors. One way to do this could be via a new `terminfo` entry, but I’d love to know what ZSH devs think! I’m an engineer building the Warp terminal, so I’d be happy to work on any terminal-side changes of this with `terminfo` (we actually use bracketed paste mode for all commands, to best support multiline commands with Warp's input editor)!
> >
> > Notably, Fish 3.6 seems to calculate the width correctly as 2 cells (this is what originally prompted my investigation, due to the Starship prompt - see fish-shell/issues/10461), along with Bash (using bracketed paste with Bash 5.2).
> >
> > I’ve seen 2017/msg00432 which is related to this, but deals with 0xFE0E not 0xFE0F.
>
> Generally speaking it is impossible to handle combining emoji, since
> the specification allows the rendering to either combine or not
> combine the glyphs, it is not possible for zsh to know how much space
> they will take up. Of course, your problem isn't even about combining
> emoji, but as far as I can see the same conceptual problem applies
> here; there is no way for zsh to know what "render as an image"
> implies for glyph width, all we can do is call wcwidth.

I also meant to say: if wcwidth for the base glyph is 1, then adding a
combining character with a width of 0 after it will not magically
change the width of the base glyph, and cannot do so.

https://www.unicode.org/reports/tr51/ does mention that "Current
practice is for emoji to have a square aspect ratio, deriving from
their origin in Japanese. For interoperability, it is recommended that
this practice be continued with current and future emoji. They will
typically have about the same vertical placement and advance width as
CJK ideographs." but zsh cannot carry its own custom tables of emoji
widths; either wcwidth works correctly or it doesn't.

-- 
Mikael Magnusson
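
To illustrate the point about per-character widths, here is a rough
Python sketch. It is not zsh's code. codepoint_width is a hypothetical
stand-in for wcwidth() built on the unicodedata module (real wcwidth
results depend on the C library and the locale), and vs16_aware_width
is a hypothetical rule of the kind a terminal that honours U+FE0F
might apply. It only shows why a per-code-point sum gives 1 cell for
U+2601 U+FE0F while such a terminal would use 2.

```python
import unicodedata

VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (emoji presentation)


def codepoint_width(ch: str) -> int:
    """Rough, assumed approximation of wcwidth() for one code point."""
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0  # nonspacing/enclosing marks and format characters
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2  # wide and fullwidth characters
    return 1


def summed_width(text: str) -> int:
    """Width as a plain per-code-point sum, as a wcwidth()-based caller sees it."""
    return sum(codepoint_width(ch) for ch in text)


def vs16_aware_width(text: str) -> int:
    """Hypothetical rule: a narrow base followed by VS16 takes 2 cells."""
    total = 0
    last = 0  # width of the most recent base character
    for ch in text:
        if ch == VS16:
            if last == 1:
                total += 1  # widen the preceding narrow base to 2 cells
                last = 2
            continue
        w = codepoint_width(ch)
        total += w
        if w > 0:
            last = w
    return total


cloud = "\u2601\ufe0f"  # the character from the bug report
print(summed_width(cloud), vs16_aware_width(cloud))  # 1 2
```

With "echo " in front, the two rules give 6 and 7 cells respectively,
which is the off-by-one described in the report.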