From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eleftherios Kritikos
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] Update ctype data to Unicode 12.1.0
Date: Mon, 14 Oct 2019 14:57:13 +0100
References: <20191012212742.29880-1-el01049@gmail.com> <20191012223947.GH16318@brightrain.aerifal.cx> <20191014130709.GL16318@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: Rich Felker
Cc: musl@lists.openwall.com
Xref: news.gmane.org gmane.linux.lib.musl.general:14810
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary="000000000000508faa0594df3ff4"

--000000000000508faa0594df3ff4
Content-Type: text/plain; charset="UTF-8"

Actually, now that I read the ISO spec again, it seems to correspond to
Unicode 11 :( so I think ISO/IEC 10646 has not caught up with Unicode 12
yet. Not sure what we should do in this case.

On Mon, Oct 14, 2019 at 2:51 PM Eleftherios Kritikos wrote:

> From what I read from here:
>
> https://stackoverflow.com/questions/12590255/what-does-stdc-iso-10646-exactly-mean
>
> and here:
>
> https://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
>
> it seems like the latest ISO/IEC 10646 standard that most closely matches
> Unicode 12.1.0 is ISO/IEC 10646:2017/Amd 2:2019 (fifth edition 2017,
> amendment 2).
>
> From what I read in the document here:
> https://standards.iso.org/ittf/PubliclyAvailableStandards/c073773_ISO_IEC_10646_2017_Amd_2_2019%20(E).zip
>
> on the first page, this amendment was made in 2019-06. So I would guess
> that the correct value should be:
>
> ```
> #define __STDC_ISO_10646__ 201906L
> ```
>
> All this with a grain of salt, as this is the first time I am looking at
> ISO/IEC 10646.
>
> Thanks for looking into this!
>
> Regards,
> Lefteris
>
> On Mon, Oct 14, 2019 at 2:07 PM Rich Felker wrote:
>
>> On Sat, Oct 12, 2019 at 11:56:44PM +0100, Eleftherios Kritikos wrote:
>> > Yes. I also created a merge request for musl-chartable-tools
>> > https://github.com/richfelker/musl-chartable-tools/pull/2
>>
>> Thanks. stdc-predef.h also needs to be updated with a new value for
>> __STDC_ISO_10646__. Do you know the right yyyymm value it should have
>> for this version of Unicode?
>>
>> Rich
>>
>>
>> > On Sat, 12 Oct 2019, 11:40 pm Rich Felker, wrote:
>> >
>> > > On Sat, Oct 12, 2019 at 10:27:42PM +0100, u_quark wrote:
>> > > > ---
>> > > >  src/ctype/alpha.h      | 159 +++++++++++++++++++++-------------------
>> > > >  src/ctype/nonspacing.h |  88 ++++++++++++-----------
>> > > >  src/ctype/punct.h      | 160 ++++++++++++++++++++++-------------------
>> > > >  src/ctype/wide.h       |  26 +++----
>> > > >  4 files changed, 232 insertions(+), 201 deletions(-)
>> > >
>> > > Is this done just by dropping the new Unicode files into
>> > > musl-chartable-tools and running make?
>> > >
>> > > Rich
>> > >
>> >

--000000000000508faa0594df3ff4
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000508faa0594df3ff4--