mailing list of musl libc
From: "Érico Nogueira" <ericonr@disroot.org>
To: <musl@lists.openwall.com>
Cc: "Samuel Holland" <samuel@sholland.org>,
	"Dong Brett" <brett.browning.dong@gmail.com>
Subject: Re: [musl] Question on C++ locale
Date: Mon, 30 Nov 2020 14:14:15 -0300
Message-ID: <C7GRMLMLTEL6.2CDPLZ37X1R89@mussels>
In-Reply-To: <20201130153503.GP534@brightrain.aerifal.cx>

On Mon Nov 30, 2020 at 12:35 PM -03, Rich Felker wrote:
> On Mon, Nov 30, 2020 at 12:12:50PM -0300, Érico Nogueira wrote:
> > On Mon Nov 30, 2020 at 11:39 AM -03, Samuel Holland wrote:
> > > On 11/30/20 7:44 AM, Érico Nogueira wrote:
> > > > On Mon Nov 30, 2020 at 8:35 AM -03, Szabolcs Nagy wrote:
> > > >> * Dong Brett <brett.browning.dong@gmail.com> [2020-11-30 18:41:33 +0800]:
> > > >>> However, the following C++ code does not work (our software uses std::locale in C++ standard library for locale related stuff):
> > > >>> #include <curses.h>
> > > >>> #include <langinfo.h>
> > > >>> #include <locale.h>
> > > >>> #include <locale>
> > > >>> using namespace std;
> > > >>> int main()
> > > >>> {
> > > >>>     std::locale::global(locale(""));
> > > >>>     initscr();
> > > >>>     printw("LC_ALL: %s\n", setlocale(LC_ALL, NULL));
> > > >>>     printw("C++ locale: %s\n", locale().name().c_str());
> > > >>>     printw("CODESET: %s\n", nl_langinfo(CODESET));
> > > >>>     printw("Hello, world!\n");
> > > >>>     printw("你好,世界!\n");
> > > >>>     refresh();
> > > >>>     getch();
> > > >>>     endwin();
> > > >>>     return 0;
> > > >>> }
> > > >>
> > > >> fwiw for me even the first line fails.
> > > >> i don't know how c++ locales are supposed to work.
> > > > 
> > > > From [1], it seems that C++ locales are supposed to affect the global
> > > > locale as well, so they should call setlocale() when appropriate.
> > > > 
> > > > - [1] https://www.cplusplus.com/reference/locale/locale/
> > > > 
> > > > Unfortunately, I assume libstdc++ uses their generic locale support on
> > > > musl...  From gcc-10.2.0/libstdc++-v3/config/locale/generic/c_locale.cc:
> > > > 
> > > >   void
> > > >   locale::facet::_S_create_c_locale(__c_locale& __cloc, const char* __s,
> > > > 				    __c_locale)
> > > >   {
> > > >     // Currently, the generic model only supports the "C" locale.
> > > >     // See http://gcc.gnu.org/ml/libstdc++/2003-02/msg00345.html
> > > >     __cloc = 0;
> > > >     if (strcmp(__s, "C"))
> > > >       __throw_runtime_error(__N("locale::facet::_S_create_c_locale "
> > > > 			    "name not valid"));
> > > >   }
> > > > 
> > >
> > > I don't know for sure that it's the right thing to do, but I have been
> > > patching out that error for the last several years[1] and so far I have
> > > not noticed any negative effects. Adelie, which is very thorough about
> > > testing, has also carried the patch for a while[2].
> > >
> > > Samuel
> > >
> > > [1]: https://github.com/smaeul/portage/blob/c744774a/patches/sys-devel/gcc/gcc-5.4.0-locale.patch
> > > [2]: https://code.foxkit.us/adelie/packages/-/commit/d09b437d
> > 
> > Are those patches correct in functionality? The GNU version is:
> > 
> >   void
> >   locale::facet::_S_create_c_locale(__c_locale& __cloc, const char* __s,
> > 				    __c_locale __old)
> >   {
> >     __cloc = __newlocale(1 << LC_ALL, __s, __old);
> >     if (!__cloc)
> >       {
> > 	// This named locale is not supported by the underlying OS.
> > 	__throw_runtime_error(__N("locale::facet::_S_create_c_locale "
> > 				  "name not valid"));
> >       }
> >   }
> > 
> > It tries to create a locale object, which the generic code doesn't do.
> > In the generic case, _S_create_c_locale is basically a noop, and I'd
> > assume localization wouldn't work, even if it does avoid the runtime
> > abort.
> > 
> > I will try it out locally when I get the time.
>
> The code there in the GNU version is correct (the one without
> newlocale isn't correct) aside from having the __ prefix, but other
> parts of the GNU version are wrong in that they poke at glibc
> internals to "optimize" useless byte-based ctype functions (useless
> because they can't operate on the only characters whose properties
> could vary by locale, the non-ASCII ones). There should probably be a
> new "posix" directory here based on the GNU one but with all the
> GNUisms removed. If it's not hard to backport that to older GCC
> versions maybe we should do that.

C++ is a bit mysterious to me; do you think there's a chance that
changing the libstdc++ locale implementation could break programs
that were built against the old (generic) version?

I also wonder what the configure script should look for in order to
choose which version to use.

From a really quick look at _S_create_c_locale, the dragonfly version
might be usable for this purpose, although it uses some non-standard
headers.
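
To make the idea concrete, here is a rough sketch of what a POSIX-only
locale creation step could look like. This is not libstdc++ code:
create_c_locale is a hypothetical stand-in for _S_create_c_locale, and
it uses the standard LC_ALL_MASK instead of the `1 << LC_ALL` spelling
from the GNU version, which I assume is something only glibc accepts.

```cpp
#include <cassert>
#include <locale.h>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for _S_create_c_locale, using only POSIX 2008
// interfaces. Returns a new locale object that the caller must release
// with freelocale().
locale_t create_c_locale(const char* name)
{
    assert(name != nullptr);

    // LC_ALL_MASK asks for every category; a zero base means "start
    // from scratch" rather than modifying an existing locale object.
    locale_t loc = newlocale(LC_ALL_MASK, name, (locale_t)0);
    if (!loc)
        throw std::runtime_error(
            std::string("create_c_locale: name not valid: ") + name);
    return loc;
}
```

With this, create_c_locale("C") should succeed on any POSIX 2008
system, and whether other names are accepted is up to the libc.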

>
> One thing: I think in order for std::locale::global to be able to
> work, the locale creation code also needs to store the name (string)
> passed to locale() constructor, since there's no way to setlocale to a
> locale_t. Instead you need to remember the name so you can setlocale()
> to the same name. Perhaps NL_LOCALE_NAME would suffice, but I don't
> think it can easily give the exact same behavior since it's
> per-category.
>
> Rich
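
If I understand the suggestion correctly, it could be sketched roughly
as below. All names here (named_locale, make_named_locale,
install_global) are made up for illustration and are not part of
libstdc++; the only point is that the name string travels with the
locale_t so that setlocale() can be called with it later.

```cpp
#include <cassert>
#include <locale.h>
#include <stdexcept>
#include <string>

// Hypothetical pairing of a locale object with the name it was
// created from, since a locale_t alone cannot be passed to setlocale().
struct named_locale {
    std::string name;   // exact string given at construction time
    locale_t    handle; // object from newlocale(); free with freelocale()
};

named_locale make_named_locale(const char* name)
{
    assert(name != nullptr);
    locale_t h = newlocale(LC_ALL_MASK, name, (locale_t)0);
    if (!h)
        throw std::runtime_error("make_named_locale: name not valid");
    return named_locale{name, h};
}

// What std::locale::global() would need to do for the C side: replay
// the remembered name through setlocale(). Returns true if the libc
// accepted the name.
bool install_global(const named_locale& loc)
{
    return setlocale(LC_ALL, loc.name.c_str()) != nullptr;
}
```

This only covers the case where a single name is used for all
categories; a locale assembled per-category would need to remember
more than one string.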

