mailing list of musl libc
* [musl] Would it to be possible to get strtoll_l?
@ 2020-10-01  0:34 ell1e
  2020-10-01  2:35 ` Rich Felker
  0 siblings, 1 reply; 11+ messages in thread
From: ell1e @ 2020-10-01  0:34 UTC (permalink / raw)
  To: musl

Hi everyone,

I'm working on a project and since the global state setlocale() seems to
be a bit of a mess to rely on, I'm using the *_l() string functions
instead. However, musl libc appears to lack strtoll_l() right now, so
I'm wondering if that'll be added any time soon?

Best regards,

ell1e


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01  0:34 [musl] Would it to be possible to get strtoll_l? ell1e
@ 2020-10-01  2:35 ` Rich Felker
  2020-10-01  4:36   ` ell1e
  0 siblings, 1 reply; 11+ messages in thread
From: Rich Felker @ 2020-10-01  2:35 UTC (permalink / raw)
  To: ell1e; +Cc: musl

On Thu, Oct 01, 2020 at 02:34:47AM +0200, ell1e wrote:
> Hi everyone,
> 
> I'm working on a project and since the global state setlocale() seems to
> be a bit of a mess to rely on, I'm using the *_l() string functions
> instead. However, musl libc appears to lack strtoll_l() right now, so
> I'm wondering if that'll be added any time soon?

The portable way to do this is just calling uselocale() rather than
passing the locale_t to individual *_l functions. You can even
implement a fallback strtoll_l as:

locale_t old = uselocale(l);
result = strtoll(a,b,c);
uselocale(old);

It's slightly more efficient if you keep the uselocale across multiple
calls, but not that big a deal; uselocale is an extremely light
operation.
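
For illustration, the fallback above could be fleshed out roughly as
follows. This is only a sketch: the name my_strtoll_l is made up for
this example and is not something musl provides, and preserving errno
around the second uselocale() call is a cautious assumption rather
than something the snippet above requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical fallback; the name is invented for this example. */
static long long my_strtoll_l(const char *s, char **end, int base, locale_t loc)
{
    /* Switch the calling thread's locale and remember the previous one. */
    locale_t old = uselocale(loc);
    long long result = strtoll(s, end, base);

    /* Assumption: keep strtoll's errno (e.g. ERANGE) visible to the caller
     * even if restoring the locale were to touch errno. */
    int saved_errno = errno;
    uselocale(old);
    errno = saved_errno;

    return result;
}
```

A caller would create the locale_t once with newlocale() and reuse it
for every conversion.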

But is there a reason you don't just want plain strtoll? C allows that
"additional locale-specific subject sequence forms may be accepted" in
locales other than the C locale, but does not permit standard
sequences to be interpreted differently, and in practice I'm not aware
of implementations that do anything funny here.

Rich


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01  2:35 ` Rich Felker
@ 2020-10-01  4:36   ` ell1e
  2020-10-01  5:24     ` Ellie
  0 siblings, 1 reply; 11+ messages in thread
From: ell1e @ 2020-10-01  4:36 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Thinking more about this, this seems like it might be the less
performant option to me, although I'd be happy to take your thoughts on
this:

I'm thinking, it would require me to reset the thread locale before and
after each C call (I'm working in a bytecode VM here), and it seems like
just passing an additional locale parameter is going to be faster. I
haven't benchmarked this, however; if you doubt this assumption, I will
look into it. But it seems to me that an additional parameter is preferable
for a few of the string operations to making 2+ additional calls around
each call into C.
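
To make the trade-off concrete, here is a rough sketch of the
alternative: switching the thread locale once around a batch of
conversions instead of around every single call. The helper name
parse_all and its signature are invented for this example, and whether
a bytecode VM can actually batch its calls like this is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: convert n decimal strings with a single locale switch. */
static void parse_all(const char *const *in, long long *out, size_t n,
                      locale_t loc)
{
    locale_t old = uselocale(loc);   /* one switch for the whole batch */
    for (size_t i = 0; i < n; i++)
        out[i] = strtoll(in[i], NULL, 10);
    uselocale(old);                  /* restore the previous thread locale */
}
```

With this shape the two extra calls are paid once per batch rather than
once per conversion.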

But given the above guess, my spontaneous preference would still be to
rather use strtoll_l(). It is also available in glibc, it's just missing
in musl libc, which is why I sent this e-mail.


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01  4:36   ` ell1e
@ 2020-10-01  5:24     ` Ellie
  2020-10-01  8:08       ` Ellie
  0 siblings, 1 reply; 11+ messages in thread
From: Ellie @ 2020-10-01  5:24 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Just to add to this after even more thinking, it also seems conceptually
weird to me: why litter additional calls at all when a parameter would
do? Why "fight" any untrusted function by reverting locale changes, or
alternatively "protect" all my own calls against unexpected locale
evildoers when I could just specify all the info with parameters instead
of half of it? It's not like the string functions are already
overflowing with parameters right now.

It just seems like a weird design to me, personally. On the BSDs, the
*_l() functions seem to commonly exist by default (not guarded with
_GNU_SOURCE or other extension macros); I assume it's because they
likely thought similarly that the state machine setlocale/uselocale
approach just is oddly clumsy in comparison.

But maybe that's just me?

Regards,

ell1e


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01  5:24     ` Ellie
@ 2020-10-01  8:08       ` Ellie
  2020-10-01 15:47         ` Rich Felker
  0 siblings, 1 reply; 11+ messages in thread
From: Ellie @ 2020-10-01  8:08 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Hah, sorry for the e-mail spam, I'm only now just realizing I read over
your latest remark that strtoll doesn't really change in behavior.

Yeah, I have actually been wondering how strtoll even could be
locale-specific, but assumed surely there'd be some corner case I don't
know about.

But if that makes it just an alias of strtoll effectively, would it be
possible to add strtoll_l to musl just for the sake of completeness? Since
it exists on most platforms (glibc, BSD, Windows), and who knows how this
will be changed to behave in the future, I'd rather use the "proper"
function and be on the safe side.

In any case, thanks for the insightful response!


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01  8:08       ` Ellie
@ 2020-10-01 15:47         ` Rich Felker
  2020-10-07 13:44           ` ell1e
  0 siblings, 1 reply; 11+ messages in thread
From: Rich Felker @ 2020-10-01 15:47 UTC (permalink / raw)
  To: Ellie; +Cc: musl

On Thu, Oct 01, 2020 at 10:08:17AM +0200, Ellie wrote:
> Hah, sorry for the e-mail spam, I'm only now just realizing I read over
> your latest remark that strtoll doesn't really change in behavior.
> 
> Yeah, I have actually been wondering how strtoll even could be
> locale-specific, but assumed surely there'd be some corner case I don't
> know about.
> 
> But if that makes it just an alias of strtoll effectively, would it be
> possible to add strtoll_l to musl just for the sake of completeness? Since
> it exists on most platforms (glibc, BSD, Windows), and who knows how this
> will be changed to behave in the future, I'd rather use the "proper"
> function and be on the safe side.
> 
> In any case, thanks for the insightful response!

A *complete* set of *_l functions (for all operations that are
locale-dependent) would be rather large, and would include a lot that
don't really admit such thin implementations e.g. because they'd have
variadic signatures. It looks like the set chosen for standardization
mostly covered just the ones where the function itself was expected to
be so small/fast that setting the thread-local locale around the call
would be relatively expensive, but some don't fit that pattern, like
strftime_l. And then on top of that, we seem to have a somewhat
inconsistent coverage set for non-standardized BSD/GNU extensions --
wcsftime_l, strtod_l, etc.

A big part of maintainership, especially for libc, is saying no. In
this case, it might make sense to add a few more nonstandard ones,
especially if we already have most but not all of them and there's a
clear bounded set of what would be supported and they're all things
that admit ultra-thin wrappers. Would you be interested in
investigating that and following up?
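
As a rough illustration of what an ultra-thin wrapper could look like
for a function whose standard input forms do not vary by locale: the
sketch below simply ignores the locale argument and forwards to
strtoll. The name sketch_strtoll_l is hypothetical, and this is not
taken from musl's source.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <stdlib.h>

/* Hypothetical thin wrapper: the locale argument is accepted but unused. */
static long long sketch_strtoll_l(const char *restrict s, char **restrict end,
                                  int base, locale_t loc)
{
    (void)loc;  /* assumption: integer parsing does not depend on the locale */
    return strtoll(s, end, base);
}
```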

Rich


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-01 15:47         ` Rich Felker
@ 2020-10-07 13:44           ` ell1e
  2020-10-07 13:52             ` Ellie
                               ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: ell1e @ 2020-10-07 13:44 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Hi Rich,

I admit I have a very biased view. Let me first preface this by saying
that for my program, this issue is now solved. However, I think there
might be some use if I explain further where I am coming from, and what
I think about adding more *_l() functions:

This is how I came to be here, asking about strtoll_l():

1. I wouldn't necessarily know that strtoll doesn't really PRACTICALLY
do locale-dependent changes, and the same might apply to other people. I did
however realize that in general, process-wide setlocale() doesn't seem
to have died out yet, and for some programs written to work with 3rd
party plugins (that may basically do whatever) this is a big problem. As
a result, picking strtoll_l was a best effort decision with limited
knowledge, based on a default careful approach (which has proven useful
writing C over the years) to rather pick non-locale dependent stuff out
of principle, just to be safe. It wasn't because I identified
strtoll_l() in particular to be necessary, or even useful.

2. All platforms including even Windows 10, and macOS, seem to have
realized that locale-dependent basic string formatting is occasionally a
giant annoying headache and a mess, and therefore offer *_l functions
for pretty much everything. And so does glibc. So again it seemed only
natural to me to pick strtoll_l(), as a spontaneous decision. That this
might cause a problem with musl wasn't something that occurred to me at
first.

3. I only then HAPPENED to even check with musl libc. (Since I run it on
my linux phone.) Then I only happened to go as far as to ask here on the
mailing list, instead of not bothering for now and just writing off musl
libc compatibility. Of course the most "proper" way to deal with this
would be feature detection, but for some smaller projects that don't
otherwise rely on much "arcane" functionality, that may seem like
overkill.

4. I then also just happened to have an overmotivated friend who was
bored enough to write a custom str-to-int and contributed it, so that I
no longer depend on either of strtoll/strtoll_l, which is why this is
solved for me now.

However, I think this chain of events still shows that *_l() coverage
would be useful, in general, for a majority of string functions, even in
cases where it might seem technically useless. This is of course from
the angle of app developers and maintainers, who make best-effort
but ultimately not always best-informed decisions when picking string
formatting functions, and then possibly run into this *_l() issue on
musl when porting.

But then again, most of these things can be worked around when porting
to musl. But how many people will bother, and how many programs will be left
behind as a result and not work on Alpine etc.? Is that some quantity
that will bother anyone? (My plan before I got the helpful contribution
for example was to just skip musl compatibility for a while before I had
some more time to either add feature detection, or revisit this issue in
other ways. But would anyone have cared not being able to run my program
in particular? I honestly can't tell you.)

I don't think I can give you further insight than this. Nevertheless, I
hope that wall of text had something of use for you.

Regards,

ell1e


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-07 13:44           ` ell1e
@ 2020-10-07 13:52             ` Ellie
  2020-10-07 14:58               ` 罗勇刚(Yonggang Luo)
  2020-10-07 15:41             ` Ariadne Conill
  2020-10-07 19:37             ` Rich Felker
  2 siblings, 1 reply; 11+ messages in thread
From: Ellie @ 2020-10-07 13:52 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Just to add this, since I think I could have been clearer on this:

On 10/7/20 3:44 PM, ell1e wrote:
> mostly covered just the ones where the function itself was expected to
> be so small/fast that setting the thread-local locale around the call
> would be relatively expensive

I think as an app dev this just naturally expands to everything, always.
I just don't see a point to ever go uselocale+call, since that is just a
slower way of doing the same. So why bother with a suboptimal way? And
then there will be the natural point where people try their program on
musl, some *_l() is missing, and the questions start: Feature detection?
Do we really need this anyway in that case? Just use uselocale+call...?
Do we care about musl enough to even spend time on this? And it'll cause
friction and lost thinking time. But maintaining all the wrappers will
eat up your time and resources instead. So no easy decision in any case,
I'm afraid, as for what to add or whether to add anything at all.


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-07 13:52             ` Ellie
@ 2020-10-07 14:58               ` 罗勇刚(Yonggang Luo)
  0 siblings, 0 replies; 11+ messages in thread
From: 罗勇刚(Yonggang Luo) @ 2020-10-07 14:58 UTC (permalink / raw)
  To: Musl; +Cc: Rich Felker


I would suggest maintaining a downstream library that bundles musl together
with glibc compatibility functions, and such a library already exists.




--
Yours sincerely,
Yonggang Luo (罗勇刚)


* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-07 13:44           ` ell1e
  2020-10-07 13:52             ` Ellie
@ 2020-10-07 15:41             ` Ariadne Conill
  2020-10-07 19:37             ` Rich Felker
  2 siblings, 0 replies; 11+ messages in thread
From: Ariadne Conill @ 2020-10-07 15:41 UTC (permalink / raw)
  To: Rich Felker, musl; +Cc: musl, ell1e

Hello,

On Wednesday, October 7, 2020 7:44:11 AM MDT ell1e wrote:
[...]
> But then again, most of these things can be worked around when porting
> to musl. But how many people will bother, and how many programs will be left
> behind as a result and not work on Alpine etc.? Is that some quantity
> that will bother anyone?

Alpine is in an atypical position of both being a distribution and an overall 
platform, which just happens to use musl as one of its components.  We extend 
musl with other libraries that implement functionality considered out of scope 
for musl; the strtoll_l() and related functions could be provided in this way.  
Other examples of such extensions include libucontext, musl-obstack, etc.

I don't think that all situations require musl to provide functionality in
order to solve the problem -- in fact, in general, I think that the number of
situations where that is actually required is minimal.  I would rather musl
focus on providing a high quality core libc implementation instead of
implementing things that they don't want to implement and can be provided
elsewhere.

At any rate, the point here being that simply because musl does not implement 
something does not mean it cannot be implemented in Alpine at large -- and 
yes, this means that sometimes programs built on Alpine require the other
runtime components (like libucontext or musl-obstack or whatever) alongside
musl.  I don't consider that a problem, since those components are readily
available for any other distribution to ship if they wish to.

Ariadne




* Re: [musl] Would it to be possible to get strtoll_l?
  2020-10-07 13:44           ` ell1e
  2020-10-07 13:52             ` Ellie
  2020-10-07 15:41             ` Ariadne Conill
@ 2020-10-07 19:37             ` Rich Felker
  2 siblings, 0 replies; 11+ messages in thread
From: Rich Felker @ 2020-10-07 19:37 UTC (permalink / raw)
  To: ell1e; +Cc: musl

On Wed, Oct 07, 2020 at 03:44:11PM +0200, ell1e wrote:
> Hi Rich,
> 
> I admit I have a very biased view. Let me first preface this with saying
> that for my program, this issue is now solved. However, I think there
> might be some use if I explain further where I am coming from, and what
> I think about adding more *_l() functions:
> 
> This is how I came to here, asking about strtoll_l():
> 
> [...]
> 
> I don't think I can give you further insight than this. Nevertheless, I
> hope that wall of text had something of use for you.

Really all I'm looking for is some investigation into what the set of
potentially-wanted extended *_l functions that other implementations
have looks like -- whether you think you/users would want all of those
or some clearly-scoped subset of them that makes sense -- rather than
just asking for one random function at a time with no clear idea of
where that will go.

Rich

