From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: printf doesn't respect locale
Date: Wed, 11 Sep 2019 11:38:53 -0400
Message-ID: <20190911153853.GY9017@brightrain.aerifal.cx>
In-Reply-To: <20190911171545.437a2033@inria.fr>

On Wed, Sep 11, 2019 at 05:15:45PM +0200, Jens Gustedt wrote:
> Hello Rich,
>
> On Wed, 11 Sep 2019 09:47:27 -0400 Rich Felker <dalias@libc.org> wrote:
>
> > > > An alternative/additional solution, which I actually might like
> > > > better, is having a function which sets a thread-local flag to
> > > > treat certain locale properties (at least the problematic
> > > > LC_NUMERIC ones) as if the current locale were "C". This is
> > > > weaker than the uselocale API from POSIX, but doesn't have the
> > > > problems with the possibility of failure (likely with no way to
> > > > make forward progress) like it does, and more importantly, would
> > > > avoid *breaking* m17n/i18n functionality by turning off other
> > > > unrelated, non-problematic locale features. Application or
> > > > library code could then just set/restore this flag around
> > > > *printf/*scanf/strto*/etc calls, or could set it and leave it if
> > > > they never want to see ',' again.
> > >
> > > Interesting.
> > >
> > > Would this be difficult to implement in musl? (I guess not)
> >
> > I would think not, but I'd have to look at the details a little more.
> >
> > One other advantage of this approach is that it has a more graceful
> > fallback. If an application needs portable LC_NUMERIC behavior, it can
> > check at build time for the presence of the new interface. If present,
> > LC_NUMERIC can be set to "" (user's preference) and the new interface
> > can be used to get the needed behavior. If absent, the application can
> > refrain from setting LC_NUMERIC, only setting the other categories and
> > leaving it as "C" (default).
> >
> > Note that having it be thread-locally stateful is, in my opinion, much
> > better than having new variants of the affected functions or new
> > formats, since a caller using LC_NUMERIC can set/restore the state to
> > safely call library code that's completely unaware of the new
> > interfaces.
> >
> > Of course there may be complications I haven't thought of. One that
> > comes to mind right away is what localeconv() should return under such
> > conditions.
>
> Ok, yes so this path sounds much more promising than to concur with
> all the different parties to find a free modification character, and
> agree on the semantics.
>
> > > Would you be willing to write this up?
> >
> > What form would it need to be in?
>
> At the end this should be an N-document to submit to WG14, but that is
> really at the end. Just one or two pages would be good to get perhaps
> some discussion going, first, and also make it clear what it would
> imply for and need from musl.
>
> Do you think that a high-level implementation using _Thread_local (or
> tss calls) and setlocale would be doable, such that we could even
> provide a reference implementation for all POSIX systems that also
> implement some form of thread local variables?

It can't be done in terms of setlocale because setlocale is not
thread-safe or thread-local. It could be done in terms of POSIX
uselocale, but such an implementation would not be fail-safe -- it
needs to be able to allocate a locale_t object via duplocale, since
the uselocale API works with a locale_t object that describes the
value of *all* locale categories, rather than the categories being
individually settable on a per-thread basis (this is a design flaw in
the POSIX interfaces, and the historic xlocale ones they were based
on, IMO).

So such an implementation could be a pseudo-code/demo of the
functionality, but I think I'd want the proposed functionality to be
always-succeeds to discourage erroneous code that ignores the result
(resulting in wrong formatting/parsing, which is unsafe) or aborts the
program (eew).

Rich
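
[Editorial illustration, not part of the original message.] The
uselocale-based demo discussed in the message can be sketched roughly
as follows. This is only an assumption-laden sketch of the non-fail-safe
variant, not the proposed always-succeeds interface: the names
numeric_c_enter, numeric_c_leave and format_double_c are hypothetical
and appear nowhere in the thread. It uses only the POSIX.1-2008 calls
uselocale, duplocale, newlocale and freelocale, and it shows the two
allocation points where the approach can fail.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: make LC_NUMERIC behave as "C" for the calling
 * thread only. On success, stores the previous per-thread locale in
 * *saved and returns 0. Returns -1 if a locale object could not be
 * allocated, which is exactly the failure mode the message objects to. */
int numeric_c_enter(locale_t *saved)
{
    /* Query the thread's current locale without changing it. */
    locale_t cur = uselocale((locale_t)0);

    /* A locale_t describes all categories at once, so a full copy is
     * needed even though only LC_NUMERIC changes. May fail (ENOMEM). */
    locale_t base = duplocale(cur);
    if (base == (locale_t)0)
        return -1;

    /* Override only LC_NUMERIC in the copy. On success newlocale
     * consumes base; on failure base is still valid and must be freed. */
    locale_t mod = newlocale(LC_NUMERIC_MASK, "C", base);
    if (mod == (locale_t)0) {
        freelocale(base);
        return -1;
    }

    *saved = uselocale(mod);
    return 0;
}

/* Hypothetical helper: restore the locale saved by numeric_c_enter and
 * release the temporary locale object. */
void numeric_c_leave(locale_t saved)
{
    locale_t mod = uselocale(saved);
    freelocale(mod);
}

/* Usage example: format a double with '.' as the radix character
 * regardless of the user's LC_NUMERIC. Returns the snprintf result,
 * or -1 if the locale switch failed. */
int format_double_c(char *buf, size_t n, double v)
{
    locale_t saved;
    if (numeric_c_enter(&saved) != 0)
        return -1;
    int len = snprintf(buf, n, "%g", v);
    numeric_c_leave(saved);
    return len;
}
```

Every caller has to handle the -1 return somehow, which illustrates
the concern above: ignoring it gives wrongly formatted output, and
there is no obviously good recovery. A thread-local flag that needs no
allocation would not have this problem.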