From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 31111 invoked from network); 2 May 2022 09:56:01 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 2 May 2022 09:56:01 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 2A1CC9D469; Mon, 2 May 2022 19:55:57 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 66EDF9D431; Mon, 2 May 2022 19:55:23 +1000 (AEST) Received: by minnie.tuhs.org (Postfix, from userid 112) id 19A119D431; Mon, 2 May 2022 19:55:17 +1000 (AEST) Received: from relay05.pair.com (relay05.pair.com [216.92.24.67]) by minnie.tuhs.org (Postfix) with ESMTPS id 9956D9CE23 for ; Mon, 2 May 2022 19:55:16 +1000 (AEST) Received: from orac.inputplus.co.uk (unknown [87.112.46.167]) by relay05.pair.com (Postfix) with ESMTP id 6BA801A2660 for ; Mon, 2 May 2022 05:55:15 -0400 (EDT) Received: from orac.inputplus.co.uk (orac.inputplus.co.uk [IPv6:::1]) by orac.inputplus.co.uk (Postfix) with ESMTP id 7C2BC21547 for ; Mon, 2 May 2022 10:55:14 +0100 (BST) To: tuhs@minnie.tuhs.org From: Ralph Corderoy MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit In-reply-to: References: <20220430104546.1C13022135@orac.inputplus.co.uk> <20220430125207.4DDE8200B5@orac.inputplus.co.uk> Date: Mon, 02 May 2022 10:55:14 +0100 Message-Id: <20220502095514.7C2BC21547@orac.inputplus.co.uk> Subject: Re: [TUHS] Aleph Null in Software Practice & Experience. X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" Hi Rob, > The output of "unicode 5d0-5e7" (robpike.io/cmd/unicode has the > command) is fun. > > 05d0 א 05d1 ב 05d2 ג 05d3 ד > 05d4 ה 05d5 ו 05d6 ז 05d7 ח > 05d8 ט 05d9 י 05da ך 05db כ > 05dc ל 05dd ם 05de מ 05df ן > 05e0 נ 05e1 ס 05e2 ע 05e3 ף > 05e4 פ 05e5 ץ 05e6 צ 05e7 ק > > For comparison, here is "unicode 3d0-3e7". It will be fun to watch how > it's rendered. > > 03d0 ϐ 03d1 ϑ 03d2 ϒ 03d3 ϓ > 03d4 ϔ 03d5 ϕ 03d6 ϖ 03d7 ϗ > 03d8 Ϙ 03d9 ϙ 03da Ϛ 03db ϛ > 03dc Ϝ 03dd ϝ 03de Ϟ 03df ϟ > 03e0 Ϡ 03e1 ϡ 03e2 Ϣ 03e3 ϣ > 03e4 Ϥ 03e5 ϥ 03e6 Ϧ 03e7 ϧ In the terminal where I read and write email, they're all as if ‘0041 A’. But save the email's text/plain to foo.txt and foo.html, add a little HTML to foo.html, and the browser, here Firefox, presents the Hebrew in both as 05d0 05 אd1 05 בd2 05 גd3 ד 05d4 05 הd5 05 וd6 05 זd7 ח 05d8 05 טd9 05 יda 05 ךdb כ 05dc 05 לdd 05 םde 05 מdf ן 05e0 05 נe1 05 סe2 05 עe3 ף 05e4 05 פe5 05 ץe6 05 צe7 ק due to the mix of Unicode's strong, weak, and neutral bi-directional character types. To see what I intend above needs a ‘broken’ renderer, like a terminal. For those with more intelligent renderers, it's as if runes normally drawn as 00c0 À 00c1 Á 00c2 Â 00c3 Ã became 00c0 00 Àc1 00 Ác2 00 Âc3 Ã Wrapping each of the Hebrew characters in the text and HTML files in LRI...PDI, LRI U+2066 Left-to-right isolate PDI U+2069 Pop directional isolate so the first row becomes 0030 0035 0064 0030 0020 2066 05d0 2069 0020 0030 0035 0064 0031 0020 2066 05d1 2069 0020 0030 0035 0064 0032 0020 2066 05d2 2069 0020 0030 0035 0064 0033 0020 2066 05d3 2069 000a has Firefox display the tables as intended. Perhaps the unicode command should do this to ensure correct display, especially if some terminals ever start to improve? I note that vim(1) here doesn't realise LRI and PDI are zero width so the cursor position drifts past the end of the visible line. ed(1) copes without a murmur. -- Cheers, Ralph.