From: Jun T
Subject: Re: UNICODE Private Use Area characters in BUFFER
Date: Fri, 4 Nov 2022 18:55:42 +0900
To: zsh-workers@zsh.org
X-Seq: 50865

> 2022/10/24 2:29, Roman Perepelitsa wrote:
>
> You are right, iswprint(0xE0B0) returns 0.
>
> I'm compiling zsh with --enable-unicode9, so instead of iswprint() it
> goes into u9_iswprint(). This function explicitly handles this case
> and returns 0, just like iswprint(). So we get this:
>
>     WCWIDTH(0xE0B0) => 1
>     WC_ISPRINT(0xE0B0) => 0

I think iswprint(0xe0b0) (or WC_ISPRINT()) returns 1 (in a UTF-8 locale).
The reason that it doesn't work in Zle seems to be in Zle/zle_refresh.c:

    #ifdef MULTIBYTE_SUPPORT
        else if (
    #ifdef __STDC_ISO_10646__
             !ZSH_INVALID_WCHAR_TEST(*t) &&
    #endif
             WC_ISPRINT(*t) && (width = WCWIDTH(*t)) > 0) {

__STDC_ISO_10646__ is defined on (probably all) Linux systems (but not on
macOS), and ZSH_INVALID_WCHAR_TEST() is defined in Zle/zle.h:

    /* The start of the private range we use, for 256 characters */
    #define ZSH_INVALID_WCHAR_BASE (0xe000U)
    /* Detect a wide character within our range */
    #define ZSH_INVALID_WCHAR_TEST(x) \
        ((unsigned)(x) >= ZSH_INVALID_WCHAR_BASE && \
         (unsigned)(x) <= (ZSH_INVALID_WCHAR_BASE + 255u))

ZSH_INVALID_WCHAR_TEST() returns true for a wide character wc in the
range 0xe000 <= wc <= 0xe0ff. It seems zsh assumes that this range is
not used by users and uses it for representing "invalid" (or incomplete)
characters (see line 452 in Zle/zle_utils.c). If characters in this range
need to be output as is, then we need some option or such to disable this
feature.

On macOS __STDC_ISO_10646__ is not defined (I think this is a bug in
macOS), and the character U+e0b0 is output as is. But on standard macOS
there is no font that has a glyph for this character, and it is rendered
as "a square with ? inside" (double width). If you install a font that
has a glyph for this character, and if the glyph is single width, then
I guess it will work OK in Zle.