mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH] Update ctype data to Unicode 12.1.0
Date: Fri, 25 Oct 2019 10:15:14 -0400
Message-ID: <20191025141514.GU16318@brightrain.aerifal.cx>
In-Reply-To: <CAAw4D02dGKZ_OcHNsrxhyfc18OTgjWGGK1aqH4GxVTOChgrR5A@mail.gmail.com>

On Wed, Oct 23, 2019 at 07:21:35PM +0300, Eleftherios Kritikos wrote:
> Hi all,
> 
> I wanted to mention that I have used the code for `wcwidth`[1] and for
> generating Unicode data tables[2] from musl in the Haskell library
> vty[3] (a ncurses style library).
> 
> Relevant files in the MR:
>  * https://github.com/jtdaugherty/vty/pull/179/files#diff-ab3908e00d1c13397ed03e5c2213ad8bR5
>  * https://github.com/jtdaugherty/vty/pull/179/files#diff-a06fd5aeeca6d7dac0278c2537eb1950R1
>  * https://github.com/jtdaugherty/vty/pull/179/files#diff-86acb7ffecd1a09c5f55892bd0ce13b1R1
>  * https://github.com/jtdaugherty/vty/pull/179/files#diff-dc77683ad25ad6f509fb58a397c93f4aR1
>  * https://github.com/jtdaugherty/vty/pull/179/files#diff-9879d6db96fd29134fc802214163b95aR32
> 
> Thanks Rich Felker and everyone else for all the good work that has
> gone into musl!
> 
> Please let me know if you think attribution was not properly given.
> 
> 1. http://git.musl-libc.org/cgit/musl/tree/src/ctype/wcwidth.c?id=9b2921bea1d5017832e1b45d1fd64220047a9802
> 2. https://github.com/richfelker/musl-chartable-tools/tree/master/ctype
> 3. https://github.com/jtdaugherty/vty

Great! I love seeing code/concepts from musl getting adopted elsewhere,
especially in places where the classic solutions were all much larger.

Just a quick update on why I haven't merged this yet: I went to update
the case mappings too, and found that at least one range, I believe
the one that would be CASEMAP(0x1c90,0x1cba,0x10d0), is not
representable in the current code, which has to be updated by hand.
It could be done on a char-by-char basis, but continuing to expand
that part makes the file grow larger and slower very quickly.
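
To illustrate why a range like this might not fit, here is a minimal
sketch. It assumes a range entry that stores a 16-bit base code point,
a signed 8-bit offset to the lowercase counterpart, and an 8-bit run
length. The struct layout and the name range_fits are hypothetical and
chosen for illustration; they are not quoted from musl. Under that
assumption, the offset 0x10d0 - 0x1c90 = -3008 is far outside the
signed 8-bit range.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical compact range entry (an assumption, not musl's code). */
struct caserange {
	uint16_t upper;  /* first uppercase code point in the run */
	int8_t delta;    /* lowercase minus uppercase, signed 8-bit */
	uint8_t len;     /* number of code points in the run */
};

/* Returns 1 if uppercase run u1..u2 mapping to lowercase l.. fits
 * in a struct caserange as laid out above, 0 otherwise. */
static int range_fits(long u1, long u2, long l)
{
	long delta = l - u1;
	long len = u2 - u1 + 1;
	return u1 >= 0 && u1 <= UINT16_MAX
		&& delta >= INT8_MIN && delta <= INT8_MAX
		&& len >= 1 && len <= UINT8_MAX;
}
```

With this layout, range_fits(0x41, 0x5a, 0x61) returns 1 (offset 32,
length 26), while range_fits(0x1c90, 0x1cba, 0x10d0) returns 0.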

So, I'm pulling back up the proposed replacement code from April 2018
that never got finished and merged. The old thread is here:
https://www.openwall.com/lists/musl/2018/04/05/1

It's moderately larger -- ~4.8k instead of ~1.5k for Unicode 10 -- but
O(1) rather than O(n) (n = # of case mappings), about 10x faster, and
programmatically generated from UnicodeData.txt. I'll add the (awful,
ugly, just like everything else in musl-chartable-tools) code for
generating the table to musl-chartable-tools when I merge it so it's
not a black box.
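
For readers unfamiliar with the O(1) idea, the following is a generic
two-stage table sketch, not the code from the 2018 thread. All names
(stage1, stage2, add_range, init_tables, sketch_tolower) are
hypothetical, it covers only the BMP, and it fills the tables at
runtime from two example ranges, where a real generator would emit
static const data from UnicodeData.txt. Lookup cost is two array reads
regardless of how many mappings exist.

```c
#include <stdint.h>

#define NBLOCKS 4  /* block 0 means "no mappings"; room for 3 more */

static uint8_t stage1[256];           /* high byte -> block number */
static int16_t stage2[NBLOCKS][256];  /* block, low byte -> offset */
static int nblocks = 1;

/* Record that u1..u2 lowercase to l..l+(u2-u1). BMP only; the sketch
 * does not check for running out of blocks. */
static void add_range(unsigned u1, unsigned u2, unsigned l)
{
	for (unsigned c = u1; c <= u2; c++) {
		unsigned hi = c >> 8;
		if (!stage1[hi]) stage1[hi] = (uint8_t)nblocks++;
		stage2[stage1[hi]][c & 255] = (int16_t)((int)l - (int)u1);
	}
}

/* Example data only: ASCII A-Z and the range mentioned above. */
static void init_tables(void)
{
	add_range(0x41, 0x5a, 0x61);
	add_range(0x1c90, 0x1cba, 0x10d0);
}

/* Constant-time lookup: unmapped code points have offset 0. */
static unsigned sketch_tolower(unsigned c)
{
	if (c > 0xffff) return c;
	return (unsigned)((int)c + stage2[stage1[c >> 8]][c & 255]);
}
```

After calling init_tables(), sketch_tolower(0x1c90) yields 0x10d0 and
code points without a mapping are returned unchanged. The price is
table size, which matches the size-for-speed trade described above.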

I have it working now, so as long as I don't hit any unexpected
problems testing I'll get this (and your patch, and updating case
mappings to Unicode 12) merged soon.

Thanks again for sending the patch and pinging this.

Rich

