mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: "Érico Nogueira" <ericonr@disroot.org>
Cc: musl@lists.openwall.com, Dong Brett <brett.browning.dong@gmail.com>
Subject: Re: [musl] Question on C++ locale
Date: Wed, 9 Dec 2020 11:41:03 -0500
Message-ID: <20201209164102.GM534@brightrain.aerifal.cx>
In-Reply-To: <C7OBWAZMT7TN.1LYHQ9QM3WW6M@mussels>

On Wed, Dec 09, 2020 at 11:35:57AM -0300, Érico Nogueira wrote:
> On Mon Nov 30, 2020 at 11:51 AM -03, Rich Felker wrote:
> > On Mon, Nov 30, 2020 at 06:41:33PM +0800, Dong Brett wrote:
> > > Hi all,
> > > 
> > > I am troubleshooting a locale-related issue in our C++ software when building with musl. With some effort I narrowed the problem down to the inability to set a UTF-8 locale in the C++ standard library.
> > > 
> > > The following C code prints UTF-8 characters correctly:
> > > #include <ncurses.h>
> > > #include <langinfo.h>
> > > #include <locale.h>
> > > 
> > > int main()
> > > {
> > >     setlocale(LC_ALL, "");
> > >     initscr();
> > >     printw("LC_ALL: %s\n", setlocale(LC_ALL, NULL));
> > >     printw("CODESET: %s\n", nl_langinfo(CODESET));
> > >     printw("Hello, world!\n");
> > >     printw("你好,世界!\n");
> > >     refresh();
> > >     getch();
> > >     endwin();
> > >     return 0;
> > > }
> > > 
> > > Giving the output of
> > > LC_ALL: C.UTF-8;C;C;C;C;C
> > > CODESET: UTF-8
> > > Hello, world!
> > > 你好,世界!
> > > 
> > > However, the following C++ code does not work (our software uses std::locale from the C++ standard library for locale handling):
> > > #include <ncurses.h>
> > > #include <langinfo.h>
> > > #include <locale.h>
> > > #include <locale>
> > > using namespace std;
> > > int main()
> > > {
> > >     std::locale::global(locale(""));
> > >     initscr();
> > >     printw("LC_ALL: %s\n", setlocale(LC_ALL, NULL));
> > >     printw("C++ locale: %s\n", locale().name().c_str());
> > >     printw("CODESET: %s\n", nl_langinfo(CODESET));
> > >     printw("Hello, world!\n");
> > >     printw("你好,世界!\n");
> > >     refresh();
> > >     getch();
> > >     endwin();
> > >     return 0;
> > > }
> > > 
> > > Giving a corrupted output:
> > > LC_ALL: C
> > > C++ locale: C
> > > CODESET: ASCII
> > > Hello, world!
> > > 你好?~L?~V?~U~L!
> > > 
> > > It seems only the ASCII C locale is available in C++. If I run the above C++ code with LANG="C.UTF-8", an exception is thrown and the program aborts:
> > > terminate called after throwing an instance of 'std::runtime_error'
> > >   what():  locale::facet::_S_create_c_locale name not valid
> > > Aborted
> > > 
> > > I also tried LANG="UTF-8" and LANG="en_US.UTF-8", but neither of
> > > those works. Only LANG="C" lets the program run, but then only ASCII
> > > characters are supported.
> > > 
> > > My question: is there a way to make locales in the C++ standard
> > > library work with musl? Or have I done something wrong?
> >
> > Thanks for raising this. Indeed you've uncovered a (pile of) bug(s) in
> > libstdc++, but they don't seem to be relevant to your usage with
> > ncurses. Being a C library, not a C++ one, curses behavior depends on
> > the locale as set through the C/POSIX mechanisms, setlocale and/or
> > newlocale/uselocale. You shouldn't be using C++'s locale framework for
> > this. Any program using ncurses should start with either
> > setlocale(LC_ALL,"") or setlocale(LC_CTYPE,"") (depending on whether
> > you want the behavior of the other categories).
> >
> > I'll try to figure out what we need to do to get this fixed in
> > libstdc++. Since it's never been reported before, I suspect just very
> > few programs are using the C++ locale API so hopefully at least the
> > problem is low-impact.
> 
> As another data point for an application that uses C++ locales, there is
> snapper. From [1]:
> 
>     try
>     {
>         locale::global(locale(""));
>     }
>     catch (const runtime_error& e)
>     {
>         cerr << _("Failed to set locale. Fix your system.") << endl;
>     }
> 
> Fortunately, they have a try-catch around the call, which will also
> catch other errors like bad LANG values, if I understand correctly.  I
> wonder if other applications that make use of the API usually have this
> block, which can mask the error for the user.

On musl there are no bad LANG values. setlocale to "" (and likewise
newlocale for LC_ALL/"" or LC_CTYPE/"") can never fail. But this could
matter on other systems.

Rich
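
Editorial sketch, not code posted in this thread: the advice above (set the locale through the C/POSIX API for C libraries such as ncurses, and do not let a failing C++ locale constructor abort the program) might be combined roughly as follows. The names init_locales and LocaleStatus are hypothetical. The ordering assumes the standard behavior that std::locale::global() with a named locale also calls setlocale(LC_ALL, name), so the C-level setlocale(LC_ALL, "") is done last.

```cpp
#include <clocale>      // std::setlocale
#include <langinfo.h>   // nl_langinfo, CODESET (POSIX)
#include <locale>       // std::locale
#include <stdexcept>    // std::runtime_error
#include <string>

// Hypothetical helper and struct, named here for illustration only.
struct LocaleStatus {
    bool cxx_locale_ok;   // std::locale("") was constructed and installed
    bool c_locale_ok;     // setlocale(LC_ALL, "") returned non-null
    std::string codeset;  // nl_langinfo(CODESET) after both steps
};

LocaleStatus init_locales()
{
    LocaleStatus st{false, false, ""};

    // Step 1: try the C++ locale. With the libstdc++ behavior reported in
    // this thread, this can throw std::runtime_error on musl.
    try {
        std::locale::global(std::locale(""));
        st.cxx_locale_ok = true;
    } catch (const std::runtime_error&) {
        // Keep the default "C" C++ locale and carry on.
    }

    // Step 2: set the C locale last. locale::global() with a named locale
    // also calls setlocale(LC_ALL, name), so doing this afterwards makes
    // sure the C locale reflects the environment.
    st.c_locale_ok = (std::setlocale(LC_ALL, "") != nullptr);

    // Record the codeset that C libraries will see.
    st.codeset = nl_langinfo(CODESET);
    return st;
}
```

A program like the ncurses example quoted above would call init_locales() before initscr(), and could log cxx_locale_ok instead of aborting. Checking c_locale_ok only matters on systems where setlocale(LC_ALL, "") can fail.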
