mailing list of musl libc
From: Pablo Correa Gomez <pabloyoyoista@postmarketos.org>
To: Rich Felker <dalias@libc.org>
Cc: musl@lists.openwall.com
Subject: Re: [musl] [PATCH 0/2] Support printing localized RADIXCHAR
Date: Sun, 17 Dec 2023 23:25:38 +0100
Message-ID: <2f2bd83ea63dba54ce4f69906a086cb539cdfd74.camel@postmarketos.org>
In-Reply-To: <20231216231037.GG4163@brightrain.aerifal.cx>

On Sat, 16-12-2023 at 18:10 -0500, Rich Felker wrote:
> On Sat, Dec 16, 2023 at 08:36:42PM +0100, Pablo Correa Gómez wrote:
> > From: Pablo Correa Gómez <ablocorrea@hotmail.com>
> > 
> > Since we've been discussing translations, I've been looking around a
> > bit, and have found some low-hanging fruit, in the form of improving
> > printf-family output for localized systems.
> > 
> > I've tried to do the same for the strtof family of functions, but I
> > was not completely sure how to approach that. Forcing the radix char
> > there has the problem that numeric values as written for programming
> > stop being supported, and treating a "." and the localized case
> > equally seems to not be supported by POSIX. Does anybody have any
> > thoughts about this? Without that, this patch series might be a bit
> > incomplete, since certain localized printf outputs would not be
> > possible to ingest in strtof. Although I'm equally unsure if that's a
> > requirement.
> > 
> > Pablo Correa Gómez (2):
> >   langinfo: add support for LC_NUMERIC translations
> >   printf: translate RADIXCHAR for floating-point numbers
> > 
> >  src/locale/langinfo.c | 2 +-
> >  src/stdio/vfprintf.c  | 5 +++--
> >  2 files changed, 4 insertions(+), 3 deletions(-)
> > 
> > --
> > 2.43.0
> 
> This is a topic that's been controversial. I have always been against
> having variable radix character, but I've also been seeking input from
> users who want localized output whether the lack of this functionality
> is a serious problem that needs revisiting.
> 
> Last time it was discussed, I believe my position was that, if we do
> this, it needs to be a 1-bit setting, where a locale necessarily has
> either '.' or ',' as the radix. No other values actually appear in
> real-world conventions, and on other implementations such as glibc,
> the allowance for arbitrary characters allows doing some ~nasty~ stuff
> with output and input processing. For example, you could define the
> radix character to be '1' or something that makes conversions fail to
> round-trip.
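
To make the 1-bit idea concrete for myself, here is a minimal sketch in C.
The helper name format_with_radix and the use_comma flag are hypothetical
(they are not musl code and not part of the patch), and the sketch assumes
the program runs in the default "C" locale, so snprintf itself emits '.'.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of musl: formats `value` with `prec`
 * fractional digits and uses either '.' or ',' as the radix.
 * `use_comma` stands in for the 1-bit per-locale setting.
 * Assumes the "C" locale, so snprintf emits '.' as the radix. */
static int format_with_radix(char *buf, size_t size, double value,
                             int prec, int use_comma)
{
    int n = snprintf(buf, size, "%.*f", prec, value);
    if (n < 0 || (size_t)n >= size)
        return -1;          /* encoding error or truncated output */
    if (use_comma) {
        char *dot = strchr(buf, '.');
        if (dot)
            *dot = ',';     /* one byte replaces one byte: width unchanged */
    }
    return n;
}
```

With use_comma set, 3.25 formatted with two decimals comes out as "3,25",
with the same byte length as "3.25". If the radix could instead be '1', the
same value would print as "3125", which cannot be told apart from the
integer 3125 when read back.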

Makes total sense. I started from the wrong assumption that Spanish
uses an apostrophe as the decimal separator. It seems that has changed
since I went to primary school, and the comma is certainly what I'm
used to seeing online in Spanish. All your technical comments make
sense. I certainly put this together a bit too fast, but I'm happy that
it sparked a discussion on how to do it right.

> 
> As written to support arbitrary radix characters, the patch also fails
> to handle the case where the radix character is multi-byte, copying
> only a single byte of it and thereby producing broken output. This is
> actually a nasty case where printf semantics for field width are not
> what the caller is likely to expect, and it breaks our wide printf
> implementation, which assumes when it uses byte-based printf for
> numbers that the byte count and character count are the same.
> Supporting only '.' and ',' avoids all of these issues, too.
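
A small sketch of the byte count versus character count mismatch described
here. The function utf8_char_count is a hypothetical helper, and the
two-byte radix in the usage note (U+066B, encoded in UTF-8 as 0xD9 0xAB) is
chosen purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters in a valid UTF-8 string by
 * skipping continuation bytes, which all have the form 10xxxxxx. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

For the string "3\xD9\xAB" "14", strlen reports 5 bytes while
utf8_char_count reports 4 characters, so a field width computed in bytes no
longer matches what the reader sees. Copying only the first byte of the
radix would leave a lone 0xD9 in the output, which is not valid UTF-8.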
> 
> Another detail you've overlooked is that scanf/strto{d,ld,f}/atof need
> to process the radix point character. This in turn requires making the
> _l wrappers for strto{d,ld,f} so that they actually apply the locale
> argument rather than ignoring it.
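
As a rough illustration of the input side, the sketch below parses a number
with a 1-bit radix choice. The name parse_with_radix, the use_comma flag and
the fixed 64-byte buffer are all assumptions made for the example; a real
change would presumably live inside the float scanner rather than wrap
strtod. It also assumes the "C" locale, so strtod itself only accepts '.'.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not musl code: parses `s` treating either '.'
 * or ',' as the radix, selected by `use_comma`. Input longer than 63
 * bytes is truncated, which is acceptable only for a sketch. */
static double parse_with_radix(const char *s, int use_comma)
{
    char buf[64];
    size_t len = strlen(s);
    if (len >= sizeof buf)
        len = sizeof buf - 1;
    memcpy(buf, s, len);
    buf[len] = '\0';

    if (use_comma) {
        /* Find the first '.' or ',' (or the end of the string). */
        char *p = buf + strcspn(buf, ".,");
        if (*p == ',')
            *p = '.';       /* locale radix: hand it to strtod as '.' */
        else
            *p = '\0';      /* '.' is not the radix here, stop before it */
    }
    return strtod(buf, NULL);
}
```

In this sketch "3,25" parses as 3.25 when use_comma is set, while "3.25"
parses only as 3 because the '.' is no longer a radix. That is exactly the
case from the cover letter where numbers written the programming way stop
being accepted.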
> 
> Before proceeding on all of this we should probably try to reach a
> decision on whether it's really needed/wanted functionality.

I really think so. This was indeed a part of Alastair's original
comment on setlocale (https://www.openwall.com/lists/musl/2023/08/10/3).
So it's a thing in French, as well as in Spanish, where we have the same
problem that Markus mentions in German.

For me personally, I really think getting this sort of thing
functional and well integrated in musl (the way you want to do it) is
pretty important for the postmarketOS project to be able to reach a
wider audience :)

So is this convincing enough that a well-put patch with the changes
you request here and in the other message would make it in? If so, I'm
happy to give this a try once the setlocale changes from Alastair get
merged (I already contacted a Polish user from postmarketOS with whom
we're going to test a protocol to help users add support to
musl-locales for their language).

It would certainly be my first time trying to write something this
low-level, so I might need some guidance on how to approach changes
like the ones you've described in your other message.

Best,
Pablo.

> 
> Rich


