From mboxrd@z Thu Jan 1 00:00:00 1970
From: Eleftherios Kritikos
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] Update ctype data to Unicode 12.1.0
Date: Sun, 20 Oct 2019 11:53:15 +0300
References: <20191012212742.29880-1-el01049@gmail.com> <20191012223947.GH16318@brightrain.aerifal.cx> <20191014130709.GL16318@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: Rich Felker
Cc: musl@lists.openwall.com

Is there anything else we could do to get this merged?

Kind regards,
Lefteris

On Mon, Oct 14, 2019 at 4:57 PM Eleftherios Kritikos wrote:

> Actually now that I read the ISO spec again, it seems to correspond to
> Unicode 11 :( so I think ISO/IEC 10646 has not caught up with Unicode 12
> yet. Not sure what we should do in this case.
>
> On Mon, Oct 14, 2019 at 2:51 PM Eleftherios Kritikos
> wrote:
>
>> From what I read from here:
>>
>> https://stackoverflow.com/questions/12590255/what-does-stdc-iso-10646-exactly-mean
>>
>> and here:
>>
>> https://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
>>
>> it seems like the latest ISO/IEC 10646 standard that most closely matches
>> Unicode 12.1.0 is ISO/IEC 10646:2017/Amd 2:2019 (fifth edition 2017,
>> amendment 2).
>>
>> From what I read in the document here:
>> https://standards.iso.org/ittf/PubliclyAvailableStandards/c073773_ISO_IEC_10646_2017_Amd_2_2019%20(E).zip
>>
>> on the first page, this amendment was made in 2019-06. So I would guess that
>> the correct value should be:
>>
>> ```
>> #define __STDC_ISO_10646__ 201906L
>> ```
>>
>> Take all this with a grain of salt, as this is the first time I am looking at
>> ISO/IEC 10646.
>>
>> Thanks for looking into this!
>>
>> Regards,
>> Lefteris
>>
>> On Mon, Oct 14, 2019 at 2:07 PM Rich Felker wrote:
>>
>>> On Sat, Oct 12, 2019 at 11:56:44PM +0100, Eleftherios Kritikos wrote:
>>> > Yes. I also created a merge request for musl-chartable-tools
>>> > https://github.com/richfelker/musl-chartable-tools/pull/2
>>>
>>> Thanks. stdc-predef.h also needs to be updated with a new value for
>>> __STDC_ISO_10646__. Do you know the right yyyymm value it should have
>>> for this version of Unicode?
>>>
>>> Rich
>>>
>>> > On Sat, 12 Oct 2019, 11:40 pm Rich Felker, wrote:
>>> >
>>> > > On Sat, Oct 12, 2019 at 10:27:42PM +0100, u_quark wrote:
>>> > > > ---
>>> > > >  src/ctype/alpha.h      | 159 +++++++++++++++++++++-------------------
>>> > > >  src/ctype/nonspacing.h |  88 ++++++++++++-----------
>>> > > >  src/ctype/punct.h      | 160 ++++++++++++++++++++++-------------------
>>> > > >  src/ctype/wide.h       |  26 +++----
>>> > > >  4 files changed, 232 insertions(+), 201 deletions(-)
>>> > >
>>> > > Is this done just by dropping the new Unicode files into
>>> > > musl-chartable-tools and running make?
>>> > >
>>> > > Rich
>>> > >
>>>
>>
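For reference, here is a minimal sketch (not part of the patch) of how a program could read the macro discussed in this thread. C defines `__STDC_ISO_10646__`, when present, as an integer constant of the form yyyymmL. The helper names `iso10646_version`, `iso10646_year`, `iso10646_month` and `print_iso10646` are hypothetical and exist only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: returns the implementation's __STDC_ISO_10646__
 * value (yyyymmL), or 0 if the macro is not defined. */
long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Hypothetical helpers: split a yyyymm value into year and month. */
int iso10646_year(long v)  { return (int)(v / 100); }
int iso10646_month(long v) { return (int)(v % 100); }

/* Prints the value the current toolchain/libc advertises, if any. */
void print_iso10646(void)
{
    long v = iso10646_version();
    if (v == 0)
        puts("__STDC_ISO_10646__ is not defined");
    else
        printf("__STDC_ISO_10646__ = %ldL (%04d-%02d)\n",
               v, iso10646_year(v), iso10646_month(v));
}
```

Calling `print_iso10646()` from `main` shows whatever value the installed headers provide. With the value guessed above, `iso10646_year(201906L)` gives 2019 and `iso10646_month(201906L)` gives 6.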