From mboxrd@z Thu Jan 1 00:00:00 1970 From: bqt@update.uu.se (Johnny Billquist) Date: Sun, 27 Mar 2016 23:53:32 +0200 Subject: [TUHS] Character sets In-Reply-To: <20160327214943.GS3766@eureka.lemis.com> References: <56F7B14A.7040201@update.uu.se> <20160327112920.GH12921@mercury.ccil.org> <56F7C85F.6060503@update.uu.se> <20160327214943.GS3766@eureka.lemis.com> Message-ID: <56F8565C.3080704@update.uu.se> On 2016-03-27 23:49, Greg 'groggy' Lehey wrote: > On Sunday, 27 March 2016 at 13:47:43 +0200, Johnny Billquist wrote: >> On 2016-03-27 13:29, John Cowan wrote: >>> Johnny Billquist scripsit: >>> >>>> On 2016-03-27 08:18, Greg 'groggy' Lehey wrote: >>>>> Isn't it wonderful that we no longer have issues with character >>>>> representation? >>>> >>>> I hope that comment was meant as a joke, ironic, cynical, or whatever... >>> >>> Undoubtedly. But things *are* much better than they used to be: >>> we can now do everything within a single character set, and convert >>> only at the boundaries (and increasingly, only in one direction). >> >> Haha. Yes... Except that you now have multiple representations of each >> character within one character set. So what has improved??? > > In the Good Old Days, characters were all the same size, and you could > do nice, simple things like > > while (*c && *c++ != " "); > > Now you need a whole library to do the same thing. Another one I noted a while ago was that functions and command in Unix, such as lpq, which try to print things in nice columns now fail, because the code don't actually know how many characters have been output. And let's not even talk about such wonderful concepts as colors in the character set definition... Unicode seems to have it all... I wonder how many code points exist for 'A'. It's definitely more than one... Johnny -- Johnny Billquist || "I'm on a bus || on a psychedelic trip email: bqt at softjar.se || Reading murder books pdp is alive! || tryin' to stay hip" - B. Idol