From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
From: Advait Maybhate
Date: Thu, 9 May 2024 10:45:32 -0400
Subject: [BUG] ZLE character width with emoji presentation variation selectors in Unicode
To: zsh-workers@zsh.org
Cc: Aloke Desai, Zach Bai
Content-Type: multipart/alternative; boundary="0000000000004fd3d506180679bd"
X-Seq: 52919
X-Loop: zsh-workers@zsh.org
Sender: zsh-workers-request@zsh.org
Precedence: list

--0000000000004fd3d506180679bd
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Hey folks!

Wanted to file a bug report and get a discussion going on the best way to handle emoji variation selectors with Unicode characters.

Metadata:
Zsh version: zsh 5.9 (x86_64-apple-darwin23.0)
OS version: macOS Sonoma 14.3.1
Terminal: tested across Warp, Kitty, default Mac terminal, Alacritty, iTerm 2

ZLE incorrectly treats a character followed by the emoji variation selector as 1 cell wide instead of 2 cells, causing off-by-one cursor movement issues in terminals that (correctly) treat it as 2 cells. This is most easily reproduced in Kitty (v0.34), which renders and calculates these emojis as 2 cells (most terminal emulators seem to handle this case of Unicode incorrectly).

To repro:
- Paste the command “echo ☁️” into Kitty (the last character is U+2601 followed by U+FE0F). Note that this results in bracketed paste mode in Zsh.

Expected behavior:
- ZLE contains “echo ☁️”.

Actual behavior:
- ZLE contains “eecho ☁️” (note the additional “e” at the beginning here - inverted colors from the bracketed paste). Confirmed from the PTY recording that this is due to an off-by-one in the cursor instruction.

Screenshot: link

I’d love to discuss how to fix this for terminals that do respect variation selectors. One way to do this could be via a new `terminfo` entry, but I’d love to know what Zsh devs think! I’m an engineer building the Warp terminal, so I’d be happy to work on any terminal-side changes for this with `terminfo` (we actually use bracketed paste mode for all commands, to best support multiline commands with Warp's input editor)!

Notably, Fish 3.6 seems to calculate the width correctly as 2 cells (this is what originally prompted my investigation, due to the Starship prompt - see https://github.com/fish-shell/fish-shell/issues/10461), along with Bash (using bracketed paste with Bash 5.2).

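To make the width question concrete, here is a minimal Python sketch of the cell-width rule that terminals like Kitty appear to follow. It is not ZLE's actual code: the `cell_width` helper is hypothetical, and it assumes that any base character followed by U+FE0F takes emoji presentation and occupies 2 cells. A real implementation would likely check the base character against Unicode's list of valid emoji variation sequences and handle control characters properly.

```python
import unicodedata

VS16 = "\ufe0f"  # emoji presentation selector (U+FE0F)


def cell_width(text: str) -> int:
    """Estimate how many terminal cells `text` occupies (hypothetical helper)."""
    total = 0
    prev = 0  # width assigned to the most recent base character
    for ch in text:
        if ch == VS16:
            # Assumption: VS16 promotes a narrow base character to 2 cells.
            if prev == 1:
                total += 1
                prev = 2
            continue
        # Combining marks and format characters (including U+FE0E) add no width.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # Wide and fullwidth characters take 2 cells, everything else 1.
        prev = 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        total += prev
    return total


print(cell_width("echo \u2601\ufe0f"))
```

For the repro string this prints 7: five cells for "echo " plus two for the cloud emoji. A shell that counts the emoji as a single cell would place the cursor one column to the left of where such a terminal puts it.
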
I’ve seen 2017/msg00432, which is related to this but deals with U+FE0E (the text presentation selector), not U+FE0F.

Thanks!

Best,
Advait

--0000000000004fd3d506180679bd
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--0000000000004fd3d506180679bd--