From: Rob Pike <robpike@gmail.com>
Date: Mon, 20 Mar 2023 20:22:52 +1100
To: arnold@skeeve.com
CC: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
List-Id: The Unix Heritage Society mailing list

Exactly the way we did it in Plan 9, and published in the paper cited earlier. In fact, it's possible the library work was done as early as 1989, but I'm not sure. Certainly by 1990.

-rob


On Mon, Mar 20, 2023 at 6:55 PM <arnold@skeeve.com> wrote:
Hi Rob.

Rob Pike <robpike@gmail.com> wrote:

> (Speaking of design by committee, the multibyte stuff in C89 was atrocious,
> and I heard was done in committee to get someone, perhaps the Japanese, to
> sign off.)

It's not lovely, but I wouldn't call it atrocious. It gets the job
done; code using it can handle multibyte encodings while being totally
character-set agnostic. I speak from experience; gawk does this.
(I use the "restartable" routines - mbrlen() and so on.)
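
A minimal sketch of the restartable style mentioned above, assuming a hosted C implementation that provides mbrlen() in <wchar.h>. The function name count_mb_chars is hypothetical, and this is not gawk's code; it only illustrates how a caller can step through a buffer one character at a time without knowing which multibyte encoding the locale uses.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in the first n bytes of s,
 * decoding with whatever multibyte encoding the current locale selects.
 * Returns (size_t)-1 if the bytes are invalid or end mid-character. */
size_t count_mb_chars(const char *s, size_t n)
{
    mbstate_t state;
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* start in the initial conversion state */

    while (n > 0) {
        size_t len = mbrlen(s, n, &state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;          /* invalid or truncated sequence */
        if (len == 0)
            len = 1;                    /* NUL character: step over its single byte */

        s += len;                       /* advance past the character just measured */
        n -= len;
        count++;
    }
    return count;
}
```

A program would normally call setlocale(LC_ALL, "") before using such a helper so that the environment's encoding is in effect; without that call the default "C" locale applies.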

I understand that Unicode + UTF-8 solve the issue completely. But I'd like to ask, in all seriousness and so that I can learn, given the world as it was in 1989, how would you solve the problem? If you had designed
the C level routines, what would they have looked like?

Thanks,

Arnold