From: Larry McVoy
To: Rob Pike
CC: tuhs@tuhs.org
Date: Tue, 21 Mar 2023 19:25:26 -0700
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Message-ID: <20230322022526.GI3779@mcvoy.com>
References: <20230319134701.3A262220F7@orac.inputplus.co.uk>
List-Id: The Unix Heritage Society mailing list

The brilliance of UTF-8 was to encode ASCII as is. That seems obvious
in retrospect but, as Rob says, the multibyte crud in C89 was just awful,
and that was the answer at the time.
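To make the "ASCII as is" point concrete, here is a minimal sketch in
Python. It is an illustration only, not anything from Ken's or Rob's
code; the names utf8_encode, encode_text and byte_grep are made up for
this example, and the encoder skips validation such as rejecting
surrogates.

```python
def utf8_encode(cp: int) -> bytes:
    """Hand-rolled UTF-8 encoding of one code point (illustrative only)."""
    if cp < 0x80:
        # ASCII: a single byte, identical to the ASCII byte itself.
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp < 0x110000:
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("code point out of range")


def encode_text(text: str) -> bytes:
    """Encode a whole string with the hand-rolled encoder above."""
    return b"".join(utf8_encode(ord(ch)) for ch in text)


def byte_grep(pattern: bytes, line: bytes) -> bool:
    """A byte-oriented substring match, knowing nothing about UTF-8."""
    return pattern in line


# Pure ASCII input comes out byte-for-byte unchanged.
print(encode_text("grep sed awk"))

# Every byte of a non-ASCII character has the high bit set, so an ASCII
# pattern cannot match in the middle of a multibyte character.
print([hex(b) for b in encode_text("日本語")])
print(byte_grep(b"awk", encode_text("日本語 awk")))
```

Because lead and continuation bytes are all 0x80 or above, a tool that
only looks for ASCII bytes (newlines, slashes, an ASCII search pattern)
can keep treating its input as plain bytes.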
Fitting ASCII in as is meant that all of the Unix utilities, sed, grep,
awk, etc., had close to no performance hit if you were processing ASCII.
That's pretty cool when you get that and you can process Japanese et al
as well.

I kind of cringe when I say it is brilliant to not break what exists
already; to me, that's just part of what you do as an engineer. But
history has shown that not breaking stuff, fitting the new into the old,
is brilliant.

So kudos to Rob and Ken for doing that (but truth be told, I'd be stunned
if they didn't, they are great engineers).

On Mon, Mar 20, 2023 at 07:27:34AM +1100, Rob Pike wrote:
> As my mail quoted in
> https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt says,
> Ken worked out a new packing that avoided all the problems with the
> existing ones. He didn't alter Prosser's encoding. UTF-8, as it was later
> called, was not based on anything but it was deeply informed by a couple of
> years of work coming to grips with the problem of programming with
> multibyte characters. What Prosser did do, and what we - all of us - are
> very grateful for, is start the conversation about replacing UTF with
> something practical.
>
> (Speaking of design by committee, the multibyte stuff in C89 was atrocious,
> and I heard was done in committee to get someone, perhaps the Japanese, to
> sign off.)
>
> Regarding Windows, Nathan Myhrvold visited Bell Labs around this time, and
> we tried to talk to him about this, but he wasn't interested, claiming they
> had it all worked out. We later learned what he meant, and lamented. Not
> the only time someone wasn't open to hear an idea that might be worth
> hearing, but an educational one.
>
> It's important historically to understand how all the forces came together
> that day. The world was ready for a solution to international text, the
> proposed character set was acceptable to most but the ASCII compatibility
> issues were unbearable, the proposed solution to that was noxious, various
> committees were starting to solve the problem in committee, leading to
> technical briefs of varying quality, none right, and somehow a phone call
> was made one afternoon to a couple of people who had been thinking and
> working these issues for ages, one of whom was a genius. And it all worked
> out, which is truly unusual.
>
> -rob

--
---
Larry McVoy    Retired to fishing    http://www.mcvoy.com/lm/boat