From: erik quanstrom
Date: Mon, 24 Jun 2013 11:11:42 -0400
To: 9fans@9fans.net
Subject: Re: [9fans] Character case mappings
Message-ID: <34723d59b4618c0a19b67299d8c27dc6@ladd.quanstro.net>
In-Reply-To: <20130624141503.pffQxijUoC6mzgT/cF2fnZTk@dietcurd.local>
References: <20130624141503.pffQxijUoC6mzgT/cF2fnZTk@dietcurd.local>

> My S-CText (on code SLASH>) tests all 0x10FFFF code points correct with the
> above. Now when i look at the sys/src/libc/port/runetype.c (of
> plan9front) then i think this one is generated, but i cannot find
> the creating script or program, which would be of interest to me.
> And maybe Plan9 would be interested to see the above patched into
> that, at some later time. ?
> Thank you and ciao,

that's close to the approach taken. the difference is that a binary
search needs a table sorted on its lookup key, so each sort order
needs its own table. simple tables of (various width) integers
were made instead.

it was also noted that bursting the tables at the junction of the
basic and extended planes was possible in many cases. for example,
for decompositions, if r is a precombined form and r is in the
basic plane, then for r = r' + c, r' and c are both in the basic
plane. thus we can burst this table, and put the basic plane
mappings (1000 of them) in a more compact table that doesn't use
vlongs.

the extended plane table is tiny (18 entries). it's only worth
using a binary search for symmetry.

	static uint __decompose2[] =
	{
		0x00c0, 0x00410300,	/* À -> A 0300 */
		[... 998 entries skipped ... ]
		0xfb4e, 0x05e405bf,	/* פֿ -> פ 05bf */
	};

	static uvlong __decompose264[] =
	{
		0x1109a, 0x11099110baull,	/* 𑂚 -> 𑂙 + 110ba */
		[... 16 entries skipped ...]
		0x1d1c0, 0x1d1bc1d16full,	/* 𝆺𝅥𝅯 -> 𝆺𝅥 + 1d16f */
	};

	/* binary search of n sorted entries, each ne words wide, keyed on word 0 */
	static uint*
	bsearch32(uint c, uint *t, int n, int ne)
	{
		uint *p;
		int m;

		while(n > 1) {
			m = n/2;
			p = t + m*ne;
			if(c >= p[0]) {
				t = p;
				n = n-m;
			} else
				n = m;
		}
		if(n && c == t[0])
			return t;
		return 0;
	}

	[bsearch64 omitted]

	/* returns 0 and fills d[0], d[1] if a decomposes; -1 otherwise */
	int
	runedecompose(Rune a, Rune *d)
	{
		uint *p;
		uvlong *q;

		if(a <= 0xffff){
			p = bsearch32(a, __decompose2, nelem(__decompose2)/2, 2);
			if(p){
				d[0] = p[1] >> 16;
				d[1] = p[1] & 0xffff;
				return 0;
			}
		}else{
			q = bsearch64(a, __decompose264, nelem(__decompose264)/2, 2);
			if(q){
				/* each half is five hex digits (20 bits) in the literals above */
				d[0] = q[1] >> 20;
				d[1] = q[1] & 0xfffff;
				return 0;
			}
		}
		return -1;
	}

all the other rune tables work this way. there is one table per
property. a structure fits neither the current programming
interface nor its usage.

- erik
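
for anyone who wants to try the split-table idea outside plan 9, here is a minimal sketch in portable c. it is not the libc code. the names decomp2, decomp264, search32, search64, decompose and NELEM are made up for illustration, the tables hold only a handful of sample entries, and search64 is a guess that simply mirrors the 32-bit search, since the original bsearch64 was omitted. the 64-bit values are unpacked as two 20-bit fields, which is an assumption based on the five-hex-digit halves of the literals shown in the message.

```c
#include <stddef.h>
#include <stdint.h>

#define NELEM(a) (sizeof(a) / sizeof((a)[0]))

/* sample basic-plane pairs, sorted by key: key, (base << 16 | mark) */
static const uint32_t decomp2[] = {
	0x00c0, 0x00410300,	/* U+00C0 -> 0041 + 0300 */
	0x00c1, 0x00410301,	/* U+00C1 -> 0041 + 0301 */
	0x00e9, 0x00650301,	/* U+00E9 -> 0065 + 0301 */
	0xfb4e, 0x05e405bf,	/* U+FB4E -> 05e4 + 05bf */
};

/* sample extended-plane pairs, sorted by key: key, (base << 20 | mark) */
static const uint64_t decomp264[] = {
	0x1109a, 0x11099110baull,	/* U+1109A -> 11099 + 110ba */
	0x1d1c0, 0x1d1bc1d16full,	/* U+1D1C0 -> 1d1bc + 1d16f */
};

/* return the pair whose key is c among n sorted pairs, or 0 if absent */
static const uint32_t*
search32(uint32_t c, const uint32_t *t, size_t n)
{
	size_t m;

	while(n > 1){
		m = n/2;
		if(c >= t[2*m]){
			t += 2*m;
			n -= m;
		}else
			n = m;
	}
	if(n && c == t[0])
		return t;
	return 0;
}

/* hypothetical 64-bit twin of search32 */
static const uint64_t*
search64(uint64_t c, const uint64_t *t, size_t n)
{
	size_t m;

	while(n > 1){
		m = n/2;
		if(c >= t[2*m]){
			t += 2*m;
			n -= m;
		}else
			n = m;
	}
	if(n && c == t[0])
		return t;
	return 0;
}

/* fill d[0], d[1] and return 0 if a has a two-part decomposition; else -1 */
int
decompose(uint32_t a, uint32_t d[2])
{
	const uint32_t *p;
	const uint64_t *q;

	if(a <= 0xffff){
		p = search32(a, decomp2, NELEM(decomp2)/2);
		if(p == 0)
			return -1;
		d[0] = p[1] >> 16;
		d[1] = p[1] & 0xffff;
	}else{
		q = search64(a, decomp264, NELEM(decomp264)/2);
		if(q == 0)
			return -1;
		d[0] = (uint32_t)(q[1] >> 20);
		d[1] = (uint32_t)(q[1] & 0xfffff);
	}
	return 0;
}
```

calling decompose(0x00e9, d) takes the 32-bit path and decompose(0x1d1c0, d) takes the 64-bit path. a rune that is in neither table returns -1 and leaves d untouched.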