mailing list of musl libc
* [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
@ 2020-04-09 10:29 Norbert Lange
  2020-04-09 18:18 ` Szabolcs Nagy
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert Lange @ 2020-04-09 10:29 UTC (permalink / raw)
  To: musl

Hello,

I ran into a bug with trace-cmd when compiled against musl.
Turns out musl just returns the affinity mask in both cases.

I know those functions are not standard, but the irony is that if they
are implemented, they prevent applications from using fallbacks.

See the trace-cmd bugreport:
https://bugzilla.kernel.org/show_bug.cgi?id=206817
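
The reported symptom can be checked by comparing both sysconf values with the size of the affinity mask. The following is a minimal sketch, assuming Linux and a libc that exposes sched_getaffinity and CPU_COUNT under _GNU_SOURCE; the helper names are made up for illustration and do not come from musl or trace-cmd.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <unistd.h>

/* Hypothetical helper: number of CPUs in the calling thread's affinity
 * mask, or -1 if sched_getaffinity fails (for example because the mask
 * does not fit into a fixed-size cpu_set_t). */
static int affinity_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Hypothetical helper: returns 1 when both sysconf values equal the
 * affinity count, which is consistent with the behavior reported here. */
static int sysconf_mirrors_affinity(void)
{
    long conf = sysconf(_SC_NPROCESSORS_CONF);
    long onln = sysconf(_SC_NPROCESSORS_ONLN);
    int mask = affinity_cpu_count();

    return mask > 0 && conf == mask && onln == mask;
}
```

The check is only meaningful when the process runs with a restricted mask, for example under `taskset -c 0` on a multi-core machine; on an unrestricted process all three numbers are normally equal anyway.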

Norbert


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-09 10:29 [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly Norbert Lange
@ 2020-04-09 18:18 ` Szabolcs Nagy
  2020-04-09 18:31   ` Florian Weimer
  0 siblings, 1 reply; 13+ messages in thread
From: Szabolcs Nagy @ 2020-04-09 18:18 UTC (permalink / raw)
  To: musl; +Cc: Norbert Lange

* Norbert Lange <nolange79@gmail.com> [2020-04-09 12:29:20 +0200]:
> Hello,
> 
> I ran into a bug with trace-cmd when compiled against musl.
> Turns out musl just returns the affinity mask in both cases.
> 
> I know those functions are not standard, but the irony is that if they
> are implemented, they prevent applications from using fallbacks.
> 
> See the trace-cmd bugreport:
> https://bugzilla.kernel.org/show_bug.cgi?id=206817

i think there are open, unanswered questions about the right
semantics: it's not clear what user code may expect.

https://www.openwall.com/lists/musl/2019/03/16/1
https://www.openwall.com/lists/musl/2019/03/19/1


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-09 18:18 ` Szabolcs Nagy
@ 2020-04-09 18:31   ` Florian Weimer
  2020-04-10  1:02     ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Weimer @ 2020-04-09 18:31 UTC (permalink / raw)
  To: musl; +Cc: Norbert Lange

* Szabolcs Nagy:

> * Norbert Lange <nolange79@gmail.com> [2020-04-09 12:29:20 +0200]:
>> Hello,
>> 
>> I ran into a bug with trace-cmd when compiled against musl.
>> Turns out musl just returns the affinity mask in both cases.
>> 
>> I know those functions are not standard, but the irony is that if they
>> are implemented, they prevent applications from using fallbacks.
>> 
>> See the trace-cmd bugreport:
>> https://bugzilla.kernel.org/show_bug.cgi?id=206817
>
> i think there are open, unanswered questions about the right
> semantics: it's not clear what user code may expect.
>
> https://www.openwall.com/lists/musl/2019/03/16/1
> https://www.openwall.com/lists/musl/2019/03/19/1

Still, returning 1 if the sched_getaffinity system call fails
(because the affinity mask is unexpectedly large) will break some
software that assumes a true uniprocessor system if the processor
count is zero.  (OpenJDK is an example.)

This can also happen if there is some external affinity mask manager.

For glibc, we had to change our logic to artificially inflate the CPU
count to 2 if we cannot determine it, as the more conservative choice.
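
The conservative choice described above can be sketched as follows. This is not glibc's actual code; it is a minimal illustration, assuming Linux and _GNU_SOURCE, with a made-up function name. The point is only that a failed query yields 2 rather than 1, so a caller cannot mistake "unknown" for a uniprocessor system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>

/* Hypothetical helper: CPU count from the affinity mask, or 2 when the
 * count cannot be determined. A genuine single-CPU mask still yields 1. */
static int cpu_count_or_two(void)
{
    cpu_set_t set;
    int n;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof set, &set) != 0)
        return 2;               /* unknown: do not claim a uniprocessor */
    n = CPU_COUNT(&set);
    return n > 0 ? n : 2;       /* an empty mask is also treated as unknown */
}
```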


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-09 18:31   ` Florian Weimer
@ 2020-04-10  1:02     ` Rich Felker
  2020-04-14 10:08       ` Florian Weimer
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2020-04-10  1:02 UTC (permalink / raw)
  To: Florian Weimer; +Cc: musl, Norbert Lange

On Thu, Apr 09, 2020 at 08:31:30PM +0200, Florian Weimer wrote:
> * Szabolcs Nagy:
> 
> > * Norbert Lange <nolange79@gmail.com> [2020-04-09 12:29:20 +0200]:
> >> Hello,
> >> 
> >> I ran into a bug with trace-cmd when compiled against musl.
> >> Turns out musl just returns the affinity mask in both cases.
> >> 
> >> I know those functions are not standard, but the irony is that if they
> >> are implemented, they prevent applications from using fallbacks.
> >> 
> >> See the trace-cmd bugreport:
> >> https://bugzilla.kernel.org/show_bug.cgi?id=206817
> >
> > i think there are open, unanswered questions about the right
> > semantics: it's not clear what user code may expect.
> >
> > https://www.openwall.com/lists/musl/2019/03/16/1
> > https://www.openwall.com/lists/musl/2019/03/19/1
> 
> Still, returning 1 if the sched_getaffinity system call fails
> (because the affinity mask is unexpectedly large) will break some
> software that assumes a true uniprocessor system if the processor
> count is zero.  (OpenJDK is an example.)
> 
> This can also happen if there is some external affinity mask manager.
> 
> For glibc, we had to change our logic to artificially inflate the CPU
> count to 2 if we cannot determine it, as the more conservative choice.

Wait, you mean some software is abusing these interfaces to omit
memory barriers or something? *facepalm* *sigh*

Yes, we should probably do something better to implement these but I'm
not sure what.

Rich


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-10  1:02     ` Rich Felker
@ 2020-04-14 10:08       ` Florian Weimer
  2020-04-14 15:55         ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Weimer @ 2020-04-14 10:08 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Norbert Lange

* Rich Felker:

>> For glibc, we had to change our logic to artificially inflate the CPU
>> count to 2 if we cannot determine it, as the more conservative choice.
>
> Wait, you mean some software is abusing these interfaces to omit
> memory barriers or something? *facepalm* *sigh*

Yes, indeed.  glibc itself parses uname -v output for this purpose
(something we should probably remove, too).


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-14 10:08       ` Florian Weimer
@ 2020-04-14 15:55         ` Rich Felker
  2020-04-14 16:55           ` Florian Weimer
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2020-04-14 15:55 UTC (permalink / raw)
  To: Florian Weimer; +Cc: musl, Norbert Lange

On Tue, Apr 14, 2020 at 12:08:52PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> >> For glibc, we had to change our logic to artificially inflate the CPU
> >> count to 2 if we cannot determine it, as the more conservative choice.
> >
> > Wait, you mean some software is abusing these interfaces to omit
> > memory barriers or something? *facepalm* *sigh*
> 
> Yes, indeed.  glibc itself parses uname -v output for this purpose
> (something we should probably remove, too).

I don't understand. Certainly it's not executing a child process at
runtime. Do you mean SYS_uname or are you talking about guessing
number of cpus for parallel build at make time or something?

Rich


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-14 15:55         ` Rich Felker
@ 2020-04-14 16:55           ` Florian Weimer
  2020-04-15  9:38             ` Norbert Lange
  0 siblings, 1 reply; 13+ messages in thread
From: Florian Weimer @ 2020-04-14 16:55 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl, Norbert Lange

* Rich Felker:

> On Tue, Apr 14, 2020 at 12:08:52PM +0200, Florian Weimer wrote:
>> * Rich Felker:
>> 
>> >> For glibc, we had to change our logic to artificially inflate the CPU
>> >> count to 2 if we cannot determine it, as the more conservative choice.
>> >
>> > Wait, you mean some software is abusing these interfaces to omit
>> > memory barriers or something? *facepalm* *sigh*
>> 
>> Yes, indeed.  glibc itself parses uname -v output for this purpose
>> (something we should probably remove, too).
>
> I don't understand. Certainly it's not executing a child process at
> runtime. Do you mean SYS_uname or are you talking about guessing
> number of cpus for parallel build at make time or something?

I meant the string that is printed by uname -v.  The internal
implementation is of course different.


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-14 16:55           ` Florian Weimer
@ 2020-04-15  9:38             ` Norbert Lange
  2020-04-15  9:50               ` Florian Weimer
  0 siblings, 1 reply; 13+ messages in thread
From: Norbert Lange @ 2020-04-15  9:38 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Rich Felker, musl

How should one deal with this?
I understand that the semantics are vague, but given that musl now
implements this function, detection and fallback become hard
(especially as musl doesn't want to be identified by the likes of
macros).

As it is now, just using the affinity mask definitely can't be useful;
an application wanting that behavior should be patched to call
sched_getaffinity directly.
If musl did not define the _SC_NPROCESSORS_* macros (but still kept
the implementation), this could at least be used for compile-time
detection. Enabling the current implementation would then be just a
matter of explicitly defining those macros.
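
The compile-time detection meant here could look roughly like the sketch below. The fallback shown is an assumption: it counts the entries in /sys/devices/system/cpu/possible, which is a Linux sysfs file, and is not necessarily what trace-cmd or any other application actually does. All function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Count the CPUs named in a kernel CPU-list string such as "0-3,8-11".
 * Returns -1 if the string cannot be parsed. */
static int count_cpu_list(const char *s)
{
    int total = 0;

    while (*s && *s != '\n') {
        char *end;
        long lo = strtol(s, &end, 10);
        long hi;

        if (end == s || lo < 0)
            return -1;
        hi = lo;
        if (*end == '-') {              /* a range "lo-hi" */
            s = end + 1;
            hi = strtol(s, &end, 10);
            if (end == s || hi < lo)
                return -1;
        }
        total += (int)(hi - lo + 1);
        s = end;
        if (*s == ',')
            s++;
        else if (*s && *s != '\n')
            return -1;
    }
    return total > 0 ? total : -1;
}

/* Assumed fallback: read the list of possible CPUs from sysfs. */
static long cpus_from_sysfs(void)
{
    char buf[256];
    FILE *f = fopen("/sys/devices/system/cpu/possible", "r");
    long n = -1;

    if (!f)
        return -1;
    if (fgets(buf, sizeof buf, f))
        n = count_cpu_list(buf);
    fclose(f);
    return n;
}

/* Uses sysconf only when the libc headers advertise the macro. */
static long configured_cpus(void)
{
#ifdef _SC_NPROCESSORS_CONF
    return sysconf(_SC_NPROCESSORS_CONF);
#else
    return cpus_from_sysfs();
#endif
}
```

With the headers as they are today, the #ifdef branch is always taken on musl, which is exactly why the fallback never gets a chance to run.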

Norbert

* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-15  9:38             ` Norbert Lange
@ 2020-04-15  9:50               ` Florian Weimer
  2020-04-15  9:57                 ` Norbert Lange
  2020-04-15 15:58                 ` Rich Felker
  0 siblings, 2 replies; 13+ messages in thread
From: Florian Weimer @ 2020-04-15  9:50 UTC (permalink / raw)
  To: Norbert Lange; +Cc: musl, Rich Felker

* Norbert Lange:

> How should one deal with this?
> I understand that the semantics are vague, but given that musl now
> implements this function, detection and fallback become hard
> (especially as musl doesn't want to be identified by the likes of
> macros).
>
> As it is now, just using the affinity mask definitely can't be useful;
> an application wanting that behavior should be patched to call
> sched_getaffinity directly.
> If musl did not define the _SC_NPROCESSORS_* macros (but still kept
> the implementation), this could at least be used for compile-time
> detection. Enabling the current implementation would then be just a
> matter of explicitly defining those macros.

_SC_NPROCESSORS_* as implemented in glibc is bad because those values
are not adjusted by cgroups, so it can grossly overestimate available
resources.

The cgroups interfaces themselves are not stable and very complicated.
I don't think it's a good idea to target them, especially not from
code that is expected to be linked statically into applications.

Given that, I'm not sure that glibc's way is a significant
improvement.  musl should perhaps be changed to cope more gracefully
with a sched_getaffinity failure, though (by not reporting a UP
environment by accident).
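
One way to cope with a mask that is larger than a fixed-size cpu_set_t is to retry with a dynamically sized set. The following is a minimal sketch, assuming Linux and the CPU_ALLOC family of macros available under _GNU_SOURCE; it is not musl's implementation, and the function name and the upper bound of 2^20 CPUs are arbitrary choices for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical helper: count the CPUs in the affinity mask, doubling the
 * buffer while the kernel rejects it with EINVAL. Returns -1 when the
 * count cannot be determined, leaving the policy to the caller. */
static int affinity_count_dynamic(void)
{
    int ncpus;

    for (ncpus = 1024; ncpus <= (1 << 20); ncpus *= 2) {
        cpu_set_t *set = CPU_ALLOC(ncpus);
        size_t size;
        int err;

        if (!set)
            return -1;
        size = CPU_ALLOC_SIZE(ncpus);
        CPU_ZERO_S(size, set);
        if (sched_getaffinity(0, size, set) == 0) {
            int n = CPU_COUNT_S(size, set);
            CPU_FREE(set);
            return n;
        }
        err = errno;            /* save before freeing the buffer */
        CPU_FREE(set);
        if (err != EINVAL)      /* only EINVAL suggests a too-small buffer */
            return -1;
    }
    return -1;
}
```

A caller that gets -1 back can then decide what to report instead of silently treating the failure as a single-CPU system.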


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-15  9:50               ` Florian Weimer
@ 2020-04-15  9:57                 ` Norbert Lange
  2020-04-15 10:04                   ` Szabolcs Nagy
  2020-04-15 15:58                 ` Rich Felker
  1 sibling, 1 reply; 13+ messages in thread
From: Norbert Lange @ 2020-04-15  9:57 UTC (permalink / raw)
  To: Florian Weimer; +Cc: musl, Rich Felker

I can't comment on whether glibc should be emulated. The point I am trying
to make is that it might be better to let the compilation fail by default,
or not provide the function at all.

The implementation right now doesn't seem sufficient (to put it mildly) and
it prevents detection and automatic fallbacks. For example, trace-cmd would
do this and would work nicely, but instead it gets musl's
implementation, which is defeated by setting an affinity mask.

* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-15  9:57                 ` Norbert Lange
@ 2020-04-15 10:04                   ` Szabolcs Nagy
  2020-04-15 16:01                     ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Szabolcs Nagy @ 2020-04-15 10:04 UTC (permalink / raw)
  To: musl; +Cc: Florian Weimer, Rich Felker, Norbert Lange

* Norbert Lange <nolange79@gmail.com> [2020-04-15 11:57:16 +0200]:
> I can't comment on whether glibc should be emulated. The point I am trying
> to make is that it might be better to let the compilation fail by default,
> or not provide the function at all.
> 
> The implementation right now doesn't seem sufficient (to put it mildly) and
> it prevents detection and automatic fallbacks. For example, trace-cmd would
> do this and would work nicely, but instead it gets musl's
> implementation, which is defeated by setting an affinity mask.

the point is that the glibc implementation is not sufficient either.

you don't get what you think you get as a result, so you are better
off just always doing the fallback.

identifying musl via a macro would be extremely bad in this case,
since we are discussing changing the implementation and the
macro would not reflect that, so a wrong default would be baked
into the source (which shows why it is a good idea not to provide
such a macro at all: most developers don't understand how to use
such macros, and by now there would be a lot of broken musl
workarounds that are not relevant to the latest musl version).

* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-15  9:50               ` Florian Weimer
  2020-04-15  9:57                 ` Norbert Lange
@ 2020-04-15 15:58                 ` Rich Felker
  1 sibling, 0 replies; 13+ messages in thread
From: Rich Felker @ 2020-04-15 15:58 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Norbert Lange, musl

On Wed, Apr 15, 2020 at 11:50:36AM +0200, Florian Weimer wrote:
> * Norbert Lange:
> 
> > How should one deal with this?
> > I understand that the semantics are vague, but given that musl now
> > implements this function, detection and fallback become hard
> > (especially as musl doesn't want to be identified by the likes of
> > macros).
> >
> > As it is now, just using the affinity mask definitely can't be useful;
> > an application wanting that behavior should be patched to call
> > sched_getaffinity directly.
> > If musl did not define the _SC_NPROCESSORS_* macros (but still kept
> > the implementation), this could at least be used for compile-time
> > detection. Enabling the current implementation would then be just a
> > matter of explicitly defining those macros.
> 
> _SC_NPROCESSORS_* as implemented in glibc is bad because those values
> are not adjusted by cgroups, so it can grossly overestimate available
> resources.
> 
> The cgroups interfaces themselves are not stable and very complicated.
> I don't think it's a good idea to target them, especially not from
> code that is expected to be linked statically into applications.
> 
> Given that, I'm not sure that glibc's way is a significant
> improvement.  musl should perhaps be changed to cope more gracefully
> with a sched_getaffinity failure, though (by not reporting a UP
> environment by accident).

For what it's worth, even without the sched_getaffinity failure, it's
still problematic for programs linked to musl to be using the values
obtained to omit memory barriers since they may be restricted to a
single core themselves but communicating over shared memory with
another process that's not restricted or restricted to a different
core.
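
The alternative to the hack is to keep the ordering unconditional, whatever CPU count is reported. A minimal sketch using C11 atomics follows; the mailbox type and function names are hypothetical, and the example assumes atomic_int is lock-free on the target, which matters if the structure lives in memory shared between processes.

```c
#include <stdatomic.h>

/* Hypothetical one-slot mailbox that may live in shared memory. */
struct mailbox {
    int payload;
    atomic_int ready;
};

/* Writer: store the payload, then publish it with release ordering.
 * The ordering is never skipped based on a CPU count. */
static void publish(struct mailbox *m, int value)
{
    m->payload = value;
    atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/* Reader: returns 1 and fills *out once the payload is visible,
 * otherwise returns 0. The acquire load pairs with the release store. */
static int try_consume(struct mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return 0;
    *out = m->payload;
    return 1;
}
```

Because the reader and the writer may be pinned to different cores by different affinity masks or resource constraints, neither side can tell from its own view of the CPU count whether the ordering is needed.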

There really should be some documented meaning for the return values,
whereby we decide either that such sketchy application usage is
supported (e.g. document that values less than 2 are never returned,
so that applications doing the hack always use barriers and they have
no remaining documented way to determine it's really a UP environment)
or declare the application usage incorrect/buggy (i.e. that the values
may be specific to the cgroup or other resource-constraints (possibly
virtualized) and can't be relied on if you're communicating with
processes that might live outside those resource constraints).

Rich


* Re: [musl] [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN) incorrectly
  2020-04-15 10:04                   ` Szabolcs Nagy
@ 2020-04-15 16:01                     ` Rich Felker
  0 siblings, 0 replies; 13+ messages in thread
From: Rich Felker @ 2020-04-15 16:01 UTC (permalink / raw)
  To: musl, Florian Weimer, Norbert Lange

On Wed, Apr 15, 2020 at 12:04:43PM +0200, Szabolcs Nagy wrote:
> * Norbert Lange <nolange79@gmail.com> [2020-04-15 11:57:16 +0200]:
> > I can't comment on whether glibc should be emulated. The point I am trying
> > to make is that it might be better to let the compilation fail by default,
> > or not provide the function at all.
> > 
> > The implementation right now doesn't seem sufficient (to put it mildly) and
> > it prevents detection and automatic fallbacks. For example, trace-cmd would
> > do this and would work nicely, but instead it gets musl's
> > implementation, which is defeated by setting an affinity mask.
> 
> the point is that the glibc implementation is not sufficient either.
> 
> you don't get what you think you get as a result, so you are better
> off just always doing the fallback.
> 
> identifying musl via a macro would be extremely bad in this case,
> since we are discussing changing the implementation and the
> macro would not reflect that, so a wrong default would be baked
> into the source (which shows why it is a good idea not to provide
> such a macro at all: most developers don't understand how to use
> such macros, and by now there would be a lot of broken musl
> workarounds that are not relevant to the latest musl version).

Note that this could be represented by the sort of macro exposure I
want to propose on libc-coord: not __MUSL__ but something like
_EXT_SC_...NPROC_REFLECTS_RESOURCE_CONSTRAINTS. Of course it would
then document a specific, permanent behavior (unless the _SC_* macros
were redefined to different values for a new one), so this may not be
a good choice. In the worst case, the behavior could be documented to
the application via another sysconf variable. :-P

Rich
