From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [PATCH] Update ctype data to Unicode 12.1.0
Date: Fri, 25 Oct 2019 10:15:14 -0400
Message-ID: <20191025141514.GU16318@brightrain.aerifal.cx>

On Wed, Oct 23, 2019 at 07:21:35PM +0300, Eleftherios Kritikos wrote:
> Hi all,
>
> I wanted to mention that I have used the code for `wcwidth`[1] and for
> generating Unicode data tables[2] from musl in the Haskell library
> vty[3] (a ncurses style library).
>
> Relevant files in the MR:
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-ab3908e00d1c13397ed03e5c2213ad8bR5
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-a06fd5aeeca6d7dac0278c2537eb1950R1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-86acb7ffecd1a09c5f55892bd0ce13b1R1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-dc77683ad25ad6f509fb58a397c93f4aR1
> * https://github.com/jtdaugherty/vty/pull/179/files#diff-9879d6db96fd29134fc802214163b95aR32
>
> Thanks Rich Felker and everyone else for all the good work that has
> gone into musl!
>
> Please let me know if you think attribution was not properly given.
>
> 1. http://git.musl-libc.org/cgit/musl/tree/src/ctype/wcwidth.c?id=9b2921bea1d5017832e1b45d1fd64220047a9802
> 2. https://github.com/richfelker/musl-chartable-tools/tree/master/ctype
> 3. https://github.com/jtdaugherty/vty

Great! I love seeing code/concepts from musl getting adopted elsewhere
especially in places where the classic solutions were all much larger.

Just a quick update on why I haven't merged this yet: I went to do the
case mappings too, and found that at least one range, I believe the
one that would be CASEMAP(0x1c90,0x1cba,0x10d0), is not representable
in the current code that requires updating by hand (it could be done
on a char-by-char basis but continuing to expand that part makes the
file grow larger and slower very quickly).

So, I'm pulling back up the proposed replacement code from April 2018
that never got finished and merged. The old thread is here:

https://www.openwall.com/lists/musl/2018/04/05/1

It's moderately larger -- ~4.8k instead of ~1.5k for Unicode 10 -- but
O(1) rather than O(n) (n = # of case mappings), about 10x faster, and
programmatically generated from UnicodeData.txt. I'll add the (awful,
ugly, just like everything else in musl-chartable-tools) code for
generating the table to musl-chartable-tools when I merge it so it's
not a black box.
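
To make the representability problem concrete, here is a minimal
sketch. It is not the actual musl code: the entry layout (a 16-bit
range start, a signed 8-bit upper-to-lower offset, an 8-bit length)
and the names casemap_entry, fits_compact, demo_table and demo_tolower
are all assumptions made up for illustration. Under that assumed
layout, a range whose lowercase block sits far away from its uppercase
block cannot be stored in one entry, and lookup is a linear scan over
the entries, which is the O(n) behavior mentioned above.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical compact range entry; the field widths are an assumption. */
struct casemap_entry {
	unsigned short upper; /* first uppercase code point of the range */
	signed char delta;    /* lowercase minus uppercase, 8 bits only */
	unsigned char len;    /* number of code points in the range */
};

/* Returns 1 if the range u1..u2, mapping to l onward, fits one entry. */
static int fits_compact(unsigned u1, unsigned u2, unsigned l)
{
	long delta = (long)l - (long)u1;
	long len = (long)u2 - (long)u1 + 1;
	return u1 <= USHRT_MAX
	    && delta >= SCHAR_MIN && delta <= SCHAR_MAX
	    && len >= 1 && len <= UCHAR_MAX;
}

/* Tiny demo table: ASCII A-Z and Cyrillic U+0410..U+042F, both +32. */
static const struct casemap_entry demo_table[] = {
	{ 0x0041, 32, 26 },
	{ 0x0410, 32, 32 },
};

/* Linear scan over the ranges: cost grows with the number of entries. */
static unsigned demo_tolower(unsigned c)
{
	size_t i;
	for (i = 0; i < sizeof demo_table / sizeof *demo_table; i++) {
		/* Unsigned wraparound makes this false when c < upper. */
		if (c - demo_table[i].upper < demo_table[i].len)
			return c + demo_table[i].delta;
	}
	return c; /* no mapping known: return the input unchanged */
}
```

For the range in question the offset would be 0x10d0 - 0x1c90 = -3008,
so fits_compact(0x1c90, 0x1cba, 0x10d0) returns 0, while the Cyrillic
range with offset +32 fits. A table indexed directly by code point
blocks avoids both the width limit and the scan, at the cost of size.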

I have it working now, so as long as I don't hit any unexpected
problems testing I'll get this (and your patch, and updating case
mappings to Unicode 12) merged soon.

Thanks again for sending the patch and pinging this.

Rich