From: Dong Brett
Date: Mon, 30 Nov 2020 18:41:33 +0800
To: musl@lists.openwall.com
Cc: Binrui Dong
Subject: [musl] Question on C++ locale
Message-Id: <62BA8BC9-9943-418A-8349-2B1FB962EDE9@gmail.com>

Hi all,

I am troubleshooting a locale-related issue in our C++ software when building with musl. With some effort I narrowed the problem down to the inability to set a UTF-8 locale in the C++ standard library.

The following C code prints UTF-8 characters correctly:
#include <ncurses.h>
#include <langinfo.h>
#include <locale.h>

int main()
{
    setlocale(LC_ALL, "");
    initscr();
    printw("LC_ALL: %s\n", setlocale(LC_ALL, NULL));
    printw("CODESET: %s\n", nl_langinfo(CODESET));
    printw("Hello, world!\n");
    printw("你好，世界!\n");
    refresh();
    getch();
    endwin();
    return 0;
}

This gives the output:
LC_ALL: C.UTF-8;C;C;C;C;C
CODESET: UTF-8
Hello, world!
你好，世界!

However, the following C++ code does not work (our software uses std::locale from the C++ standard library for locale-related functionality):
#include <ncurses.h>
#include <langinfo.h>
#include <locale.h>
#include <locale>
using namespace std;
int main()
{
    std::locale::global(locale(""));
    initscr();
    printw("LC_ALL: %s\n", setlocale(LC_ALL, NULL));
    printw("C++ locale: %s\n", locale().name().c_str());
    printw("CODESET: %s\n", nl_langinfo(CODESET));
    printw("Hello, world!\n");
    printw("你好，世界!\n");
    refresh();
    getch();
    endwin();
    return 0;
}

This gives corrupted output:
LC_ALL: C
C++ locale: C
CODESET: ASCII
Hello, world!
你好?~L?~V?~U~L!

It seems that only the ASCII C locale is available in C++. If I run the C++ program above with LANG="C.UTF-8", an exception is thrown and the program aborts:
terminate called after throwing an instance of 'std::runtime_error'
  what():  locale::facet::_S_create_c_locale name not valid
Aborted
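
For illustration only, here is a minimal sketch of one possible workaround, not a confirmed fix. It assumes that the C-level locale is what printw relies on (as the working C program above suggests) and that the rest of the program can live with the C++ global locale staying at "C". The helper name init_locale is hypothetical.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of the original report): try to adopt the
// environment's locale through the C++ API, tolerate a failure there, and
// then apply the environment's locale at the C level in any case.
// Returns the string reported by setlocale(LC_ALL, nullptr).
std::string init_locale()
{
    try {
        // Throws std::runtime_error if the C++ runtime cannot construct
        // the locale named by the environment.
        std::locale::global(std::locale(""));
    } catch (const std::runtime_error&) {
        // Assumption: it is acceptable to keep the classic "C" locale as
        // the C++ global locale when the named one is rejected.
    }

    // Apply the environment's locale through the C library. This is
    // redundant if the C++ call above already did so, and is ignored here
    // if the C library rejects the name.
    std::setlocale(LC_ALL, "");

    const char* current = std::setlocale(LC_ALL, nullptr);
    return current ? current : "";
}
```

The helper would be called once at the start of main(), before initscr(). Facets obtained through std::locale (for example in iostreams) would still behave as in the "C" locale, so this only helps code paths that go through the C library.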

I also tried LANG="UTF-8" and LANG="en_US.UTF-8", but neither works. Only LANG="C" lets the program run, but then only ASCII characters are supported.
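
To make the ASCII-only situation visible before any text is drawn, a small check along these lines could be run after locale setup. It is a sketch that uses only nl_langinfo(CODESET), the same call as in the programs above. The helper name locale_is_utf8 is hypothetical.

```cpp
#include <clocale>
#include <cstring>
#include <langinfo.h>

// Hypothetical helper (not part of the original report): reports whether
// the C library's current character encoding is UTF-8.
bool locale_is_utf8()
{
    const char* codeset = nl_langinfo(CODESET);
    return codeset != nullptr && std::strcmp(codeset, "UTF-8") == 0;
}
```

A program could call this right before initscr() and print a warning to stderr when it returns false, which would turn the garbled characters into an explicit diagnostic.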

My question: is there a way to make locales in the C++ standard library work with musl? Or have I done something wrong?

Regards,
Brett