From: Mikael Magnusson
Date: Mon, 24 Oct 2022 03:27:50 +0200
Subject: Re: UNICODE Private Use Area characters in BUFFER
To: Bart Schaefer
Cc: Zsh hackers list
X-Seq: 50826

On 10/24/22, Bart Schaefer wrote:
> On Sun, Oct 23, 2022 at 4:35 PM Bart Schaefer wrote:
>>
>> Asserting that zsh "handles" those characters in other
>> contexts isn't indicative of anything beyond demonstrating that
>> terminal "handling" is a special case.
>
> Seems to me we've got the following options:
>
> 1. Do nothing.
> 2. Presume Roman is correct that these characters can always be
> treated as printable and narrow. (Still no answer as to how best to
> change this?)
> 3. Add an option UNICODE_PRINTABLE_NARROW that when set, asserts all
> these characters to be printable and narrow. Default ... on?
> 4. Add special variable(s) (perhaps via module?) to allow remapping
> the wcwidth9.h lookup tables to make individual characters printable
> and set their width.

I think if we do anything with wcwidth9.h, it should be to remove it.
Since it was added there have been 6 subsequent Unicode standards, the
latest one adding over 4000 ideographs alone[1] (I don't know what
width the version 9 wcwidth gives for this range). It is probably
returning wrong values for many more thousands of characters on
systems where the libc has newer tables than Unicode 9. I suppose it
could be useful to enable when remoting into old systems from a modern
one.

We should probably at least mark it as deprecated: glibc 2.26 added
support for Unicode 9 and was released in August 2017, and the Unicode
9 wcwidth9.h was added to zsh in November 2016, a rather small window
where it mattered. What happened in Unicode 9 was that the
presentation width for all emoji was changed to 2[2]. I'm not sure how
this motivated people to add custom tables to every program they used
instead of simply updating glibc and having every program be correct
at once...

[1] https://home.unicode.org/announcing-the-unicode-standard-version-15-0/
[2] I couldn't find a more official reference than this atm,
https://github.com/irssi/irssi/issues/720

-- 
Mikael Magnusson
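
For anyone who wants to look at the character properties behind the width discussion, here is a rough Python sketch. It is only an illustration: approx_width and describe are hypothetical helpers, not anything zsh, wcwidth9.h or glibc actually does, and the answers come from the Unicode tables bundled with the Python interpreter (unicodedata.unidata_version), not from the system libc. It assumes a width of 2 for East Asian Width "W"/"F", 0 for combining and format characters, and 1 otherwise, and it does not treat control characters specially.

```python
import unicodedata


def approx_width(ch: str) -> int:
    """Rough column width derived from Unicode properties (an assumption,
    not the algorithm used by zsh or any libc)."""
    # Combining marks and format characters take no column of their own.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide / Fullwidth characters occupy two columns.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    # Everything else, including Ambiguous ("A"), is treated as narrow here.
    return 1


def describe(ch: str) -> tuple:
    """Return (code point, general category, East Asian Width, approx width)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        unicodedata.east_asian_width(ch),
        approx_width(ch),
    )


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    # A plain letter, an emoji, and a Private Use Area code point.
    for ch in ("a", "\U0001F600", "\uE0B0"):
        print(describe(ch))
```

With tables from Unicode 9 or later the emoji comes out as wide, while the Private Use Area code point has category Co and East Asian Width "A" (ambiguous), which is why the tables alone cannot settle how such characters should be displayed.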