zsh-workers
From: Jun T <takimoto-j@kba.biglobe.ne.jp>
To: zsh-workers@zsh.org
Subject: Re: UNICODE Private Use Area characters in BUFFER
Date: Fri, 4 Nov 2022 18:55:42 +0900
Message-ID: <C0B808D6-F3A4-4567-B8B3-FBF5687D8B20@kba.biglobe.ne.jp>
In-Reply-To: <CAN=4vMohKT=CAx5XoSAHrDvx4J--58535h-ZgCn3dkSqvZKDZg@mail.gmail.com>


> 2022/10/24 2:29, Roman Perepelitsa <roman.perepelitsa@gmail.com> wrote:
> 
> You are right, iswprint(0xE0B0) returns 0.
> 
> I'm compiling zsh with --enable-unicode9, so instead of iswprint() it
> goes into u9_iswprint(). This function explicitly handles this case
> and returns 0, just like iswprint(). So we get this:
> 
>    WCWIDTH(0xE0B0) => 1
>    WC_ISPRINT(0xE0B0) => 0
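
The result of iswprint() depends on the LC_CTYPE locale in effect when it is called, so the value reported for 0xE0B0 can be checked per locale. The following is only a minimal sketch: printable_in_locale() is a hypothetical helper that is not part of zsh, and it uses only the standard setlocale() and iswprint() calls.

```c
#include <locale.h>
#include <stddef.h>
#include <wctype.h>

/*
 * Hypothetical helper (not part of zsh): switch LC_CTYPE to the named
 * locale and report what iswprint() says about wc.
 * Returns 1 or 0 for the iswprint() result, or -1 if setlocale()
 * rejects the locale name on this system.
 */
static int printable_in_locale(const char *locale_name, wint_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return iswprint(wc) != 0;
}
```

Calling printable_in_locale("en_US.UTF-8", 0xE0B0) and printable_in_locale("C", 0xE0B0) would show whether the two reports differ because of the locale; the locale name "en_US.UTF-8" is an assumption and may not be installed everywhere, and the answer depends on the C library.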

I think iswprint(0xe0b0) (or WC_ISPRINT()) returns 1 (in a UTF-8 locale).
The reason that it doesn't work in Zle seems to be in Zle/zle_refresh.c:

/* Zle/zle_refresh.c, lines 1328-1333 */
#ifdef MULTIBYTE_SUPPORT
        else if (
#ifdef __STDC_ISO_10646__
                 !ZSH_INVALID_WCHAR_TEST(*t) &&
#endif
                 WC_ISPRINT(*t) && (width = WCWIDTH(*t)) > 0) {

__STDC_ISO_10646__ is defined on (probably all) Linux systems (but not on
macOS), and ZSH_INVALID_WCHAR_TEST() is defined in Zle/zle.h:

/* Zle/zle.h, lines 512-517 */
/* The start of the private range we use, for 256 characters */
#define ZSH_INVALID_WCHAR_BASE  (0xe000U)
/* Detect a wide character within our range */
#define ZSH_INVALID_WCHAR_TEST(x)                       \
    ((unsigned)(x) >= ZSH_INVALID_WCHAR_BASE &&         \
     (unsigned)(x) <= (ZSH_INVALID_WCHAR_BASE + 255u))

ZSH_INVALID_WCHAR_TEST() returns true for a wide character wc in the
range 0xe000 <= wc <= 0xe0ff. It seems zsh assumes that this range
is not used by users and uses it to represent "invalid" (or incomplete)
characters (see line 452 in Zle/zle_utils.c).
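
As a quick check of the range described above, here is a minimal sketch that reuses the two macros exactly as quoted from Zle/zle.h. The wrapper in_zsh_invalid_range() is a hypothetical name introduced only for this illustration; it does not exist in zsh.

```c
#include <stdbool.h>
#include <wchar.h>

/* Copied from the Zle/zle.h excerpt quoted in this message. */
#define ZSH_INVALID_WCHAR_BASE  (0xe000U)
#define ZSH_INVALID_WCHAR_TEST(x)                       \
    ((unsigned)(x) >= ZSH_INVALID_WCHAR_BASE &&         \
     (unsigned)(x) <= (ZSH_INVALID_WCHAR_BASE + 255u))

/*
 * Hypothetical helper (not in zsh): true if wc lies in the 256-character
 * block 0xe000..0xe0ff that the macro treats as reserved.
 */
static bool in_zsh_invalid_range(wchar_t wc)
{
    return ZSH_INVALID_WCHAR_TEST(wc);
}
```

With this helper, 0xe0b0 falls inside the reserved block, while 0xe100 (also a Private Use Area code point) falls outside it.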

If characters in this range need to be output as is, then we need an
option or something similar to disable this feature.

On macOS __STDC_ISO_10646__ is not defined (I think this is a bug in
macOS), and the character U+E0B0 is output as is. But on standard
macOS there is no font that has a glyph for this character, and
it is rendered as "a square with ? inside" (double width).
If you install a font that has a glyph for this character, and if the
glyph is single width, then I guess it will work OK in Zle.
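
The difference between the two platforms can be modelled with a small sketch of the condition quoted from Zle/zle_refresh.c. This is not zsh code: zle_outputs_as_is() and its parameters are hypothetical, and the results of WC_ISPRINT() and WCWIDTH() are passed in as plain arguments so the example does not depend on any locale.

```c
#include <stdbool.h>
#include <wchar.h>

/* Copied from the Zle/zle.h excerpt quoted in this message. */
#define ZSH_INVALID_WCHAR_BASE  (0xe000U)
#define ZSH_INVALID_WCHAR_TEST(x)                       \
    ((unsigned)(x) >= ZSH_INVALID_WCHAR_BASE &&         \
     (unsigned)(x) <= (ZSH_INVALID_WCHAR_BASE + 255u))

/*
 * Hypothetical model of the quoted "else if" condition.
 *   printable: stands in for the result of WC_ISPRINT(wc)
 *   width:     stands in for the result of WCWIDTH(wc)
 *   iso10646:  true where __STDC_ISO_10646__ is defined, so the
 *              ZSH_INVALID_WCHAR_TEST() check is compiled in
 */
static bool zle_outputs_as_is(wchar_t wc, bool printable, int width,
                              bool iso10646)
{
    if (iso10646 && ZSH_INVALID_WCHAR_TEST(wc))
        return false;   /* treated as zsh's own "invalid" marker */
    return printable && width > 0;
}
```

Under this model, 0xe0b0 with printable = true and width = 1 is rejected when iso10646 is true (the Linux case) and accepted when it is false (the macOS case).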

