On Wed, Sep 13, 2017 at 8:13 PM, Rich Felker wrote:

> On Wed, Sep 13, 2017 at 12:05:19PM +0200, Reini Urban wrote:
> > Wait a bit with that. I think I found some more Unicode 9.0 issues
> > with the tables, and I’ve found a huge performance opportunity by
> > sorting the 3 tables (mostly pairs) and breaking the loops earlier.
> > This should come close to glibc table performance then, without the
> > huge memory costs they have.
> >
> > I’ll write a perl regression testing script not to miss any more
> > mappings, and maybe improve the current musl logic. This will need
> > 1-2 days. I’ll also use it for cperl then.
>
> Thanks for the update. I still need to publish the table generation
> code for all the other tables -- I got it mostly dug up and cleaned up
> but got interrupted last time so it's still not posted. With that it
> will be possible to update other things too, not just case mappings.
>
> A few of the existing tables are using an older version of the
> tabulation code that formats the big arrays differently, so I'll
> probably first make a commit to reformat them, so that it's possible
> to mechanically check that this commit does not change the generated
> .o files, then use the uniform formatting as the basis for the
> subsequent update to Unicode 9.0. That should not affect the case
> mapping file though since it's not machine-generated.

I haven't seen your table generator yet, so I updated the tables with
my own version, since I use them in safeclib. The update adds Unicode
10.0 support and sorts the tables to double the search speed.

I also added a harmless patch that adds a check-syntax target for
Emacs flymake support.

--
Reini
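
The idea of sorting the pair tables so that a lookup can stop early can be sketched roughly as follows. This is only an illustration under assumptions, not the code from the patch: the table `pairs`, its element type, and the function names `upper_linear` and `upper_bsearch` are hypothetical, and the five entries are a tiny subset of real Unicode lowercase-to-uppercase mappings chosen for the example.

```c
#include <stddef.h>

/* Hypothetical pair table {lowercase, uppercase}, sorted by the first
 * column. A small illustrative subset, not musl's actual data. */
static const unsigned short pairs[][2] = {
    { 0x00B5, 0x039C },  /* MICRO SIGN -> GREEK CAPITAL MU */
    { 0x00FF, 0x0178 },  /* y with diaeresis */
    { 0x0131, 0x0049 },  /* dotless i -> I */
    { 0x017F, 0x0053 },  /* long s -> S */
    { 0x01C6, 0x01C4 },  /* dz with caron digraph */
};
#define NPAIRS (sizeof pairs / sizeof pairs[0])

/* Linear scan over the sorted table: the loop stops as soon as the
 * table key exceeds c, because no later entry can match. */
static unsigned upper_linear(unsigned c)
{
    for (size_t i = 0; i < NPAIRS && pairs[i][0] <= c; i++)
        if (pairs[i][0] == c)
            return pairs[i][1];
    return c;  /* no mapping: return the input unchanged */
}

/* Binary search over the same table, which the sort order also allows. */
static unsigned upper_bsearch(unsigned c)
{
    size_t lo = 0, hi = NPAIRS;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (pairs[mid][0] == c)
            return pairs[mid][1];
        if (pairs[mid][0] < c)
            lo = mid + 1;
        else
            hi = mid;
    }
    return c;  /* no mapping: return the input unchanged */
}
```

Both functions return the same result for any input. With an unsorted table, a miss has to scan every entry, whereas here a miss ends at the first key larger than the code point (or after a logarithmic number of probes with the binary search), and the table itself stays the same size.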