mailing list of musl libc
* [musl] getrandom fallback - wrapper functions dilema
@ 2022-09-16 18:05 Lance Fredrickson
  2022-09-17  5:58 ` Markus Wichmann
  2022-09-17 20:43 ` Rich Felker
  0 siblings, 2 replies; 5+ messages in thread
From: Lance Fredrickson @ 2022-09-16 18:05 UTC
  To: musl

I'm using musl on an ARM embedded router (Netgear R7000) running an old
kernel, 2.6.36.4. I compiled an application using the meson build system,
which checks for the getentropy function. It of course finds it in musl,
and ultimately the program aborts. I see getentropy uses getrandom, which
is a wrapper around the syscall that arrived around kernel version 3.17.
On the mailing list I saw a discussion from way back about adding a
fallback to getrandom, maybe after integrating arc4random, which doesn't
seem to have ever happened.

I appreciate that musl strives for correctness, so what is the correct 
solution for this issue?
I think meson checks for the function availability, but I'm not sure 
that it checks for valid output. Is this a meson issue?

Should a libc be compiling in syscalls and functions the running kernel 
can't support?
Help my lack of understanding, but I think at least syscalls will return
"not supported", right? So maybe the bigger issue is these syscall
wrappers? I know that if down the road I try to run musl on another
router, mipsel & kernel 2.6.22.19, I'm going to run into prlimit issues
because prlimit came after this kernel version, but the prlimit function
will be unconditionally compiled in. And it seems the autoconfs and
cmakes and mesons are only really checking for function availability and
not so much whether the syscall they're wrapping is actually going to
work. getentropy is even further removed because it's a function that
relies on a syscall wrapped in another function.

So I really hope the solution isn't bumping up the minimum kernel
requirement. Sure, I'm using an old kernel and maybe I should upgrade,
but in this case I can't because I'm vendor locked. This type of issue
will still arise down the road, however. Say kernel 6.3 adds a new
syscall and musl adds a syscall wrapper; then your shiny 6.1 kernel
running musl 1.2.4 (or whatever future version) might claim it has
functionality it really doesn't, and that could trip something up.

I know uclibc-ng tracks syscalls/functions against kernel availability
in the kernel-features.h that they carry, but I don't know what is
correct for musl. Unconditionally including every feature regardless of
kernel support doesn't feel correct, and in practice causes issues like
this. My only other option is to start ripping functionality out of musl
to match the functionality of that particular kernel, and I know that
really doesn't feel correct either.
Or do the software authors and build systems need better
syscall/function availability checks?

respectfully,
Lance Fredrickson


* Re: [musl] getrandom fallback - wrapper functions dilema
  2022-09-16 18:05 [musl] getrandom fallback - wrapper functions dilema Lance Fredrickson
@ 2022-09-17  5:58 ` Markus Wichmann
  2022-09-17 20:43 ` Rich Felker
  1 sibling, 0 replies; 5+ messages in thread
From: Markus Wichmann @ 2022-09-17  5:58 UTC
  To: musl

On Fri, Sep 16, 2022 at 12:05:02PM -0600, Lance Fredrickson wrote:
> I'm using musl on an arm embedded router (netgear R7000) running an old
> kernel, 2.6.36.4. I compiled an application using the meson build system
> which does a check for the getentropy function which it does of course find
> in musl and ultimately the program aborts. I see getentropy uses getrandom
> which is a wrapper around the syscall which came around kernel version 3.17
> . In the mailing list I saw in one discussion way back about adding a
> fallback to getrandom, maybe after integrating arc4random which doesn't seem
> to have ever happened.
>
> I appreciate that musl strives for correctness, so what is the correct
> solution for this issue?
> I think meson checks for the function availability, but I'm not sure that it
> checks for valid output. Is this a meson issue?
>

I think the application that aborts if getentropy() fails is in the
wrong here. Well, possibly. It is possible that the application treats
kernel 3.17 as its minimum requirement. In that case, it is doing fine
(although then I would question why it detects the function during
configuration). If not, then aborting when a system call fails seems
like the wrong thing to do. The application should instead attempt to
fall back to a different method, e.g. opening /dev/urandom, trying
getauxval(AT_RANDOM) and seeding an RNG with it, or anything of the
sort.
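
A minimal sketch of such a fallback, assuming /dev/urandom exists and
is readable on the target. The names app_random_bytes() and
urandom_bytes() are hypothetical and come from neither musl nor the
application in question:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from /dev/urandom. */
static int urandom_bytes(void *buf, size_t len)
{
	unsigned char *p = buf;
	int fd;

	do fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
	while (fd < 0 && errno == EINTR);
	if (fd < 0) return -1;

	while (len) {
		ssize_t n = read(fd, p, len);
		if (n < 0 && errno == EINTR) continue;
		if (n <= 0) {
			/* Preserve the read error across close(). */
			int e = n < 0 ? errno : EIO;
			close(fd);
			errno = e;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	close(fd);
	return 0;
}

/* Hypothetical entry point: prefer getentropy(), fall back at run time. */
int app_random_bytes(void *buf, size_t len)
{
	/* getentropy() rejects requests larger than 256 bytes. */
	if (len <= 256) {
		if (getentropy(buf, len) == 0) return 0;
		/* Only "kernel lacks the syscall" triggers the fallback. */
		if (errno != ENOSYS) return -1;
	}
	return urandom_bytes(buf, len);
}
```

A caller would check the return value of app_random_bytes() and report
an error rather than abort if both paths fail.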

The fundamental disconnect here is that just because a function is
available doesn't mean it will succeed at run-time.

And no, the build system isn't doing anything wrong. The most it can do
is compile a test binary and if that worked, it has to be good enough.
In case of cross-compilation, it cannot run the binary. And anyway, the
build system is not necessarily the run-time system.

> Should a libc be compiling in syscalls and functions the running kernel
> can't support?

Yes. libc and kernel are always linked together dynamically through the
syscall interface. In general, libc cannot know what syscalls the kernel
will support. So musl uses the newest syscall interface and falls back
to older ones as necessary. In case of getentropy(), however, no
fallback was ever implemented.

> Help my lack of understanding but I think at least syscalls will return not
> supported right? So maybe the bigger issue are these syscall wrappers?

Yes, unsupported system calls will return failure with ENOSYS. And I
just checked getentropy(), and it too will report getrandom() failure.
So the application should see failure with errno set to ENOSYS and act
accordingly. And that doesn't mean abort.
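
To illustrate what "failure with ENOSYS" looks like from user space,
here is a small probe that issues the raw syscall once and inspects
errno. kernel_has_getrandom() is a hypothetical name, and treating
every error other than ENOSYS as "the syscall exists" is an assumption
of this sketch:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/random.h>   /* GRND_NONBLOCK */
#include <sys/syscall.h>  /* SYS_getrandom, if the headers know it */
#include <unistd.h>

/* Hypothetical helper: 1 if the running kernel implements getrandom,
 * 0 if it does not. */
int kernel_has_getrandom(void)
{
#ifdef SYS_getrandom
	unsigned char byte;
	long r = syscall(SYS_getrandom, &byte, sizeof byte, GRND_NONBLOCK);

	/* EAGAIN and friends mean the syscall exists but could not
	 * deliver right now; only ENOSYS means the kernel lacks it. */
	return r >= 0 || errno != ENOSYS;
#else
	/* Headers too old to know the syscall number. */
	return 0;
#endif
}
```

An application could call this once at startup and select its
randomness source accordingly, instead of deciding at configure time.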

> I know that if down the road I try to run musl on another router, mipsel &
> kernel 2.6.22.19, I'm going to run into prlimit issues because prlimit came
> after this kernel version, but the prlimit function will be unconditionally
> compiled in. And it seems the autoconfs and cmakes and mesons are only
> really checking for the function availability and not so much if the syscall
> they're wrapping is actually going to work.
> getentropy is even more removed because it's a  function that relies on a
> syscall wrapped in another function.
>

It is possible that lots of open source code out there is badly made. In
this case, they assume that libc defining a certain function means the
run-time kernel will also support the underlying system call, and
absolutely nothing can fail. But it is also illogical to have a
configure option for a function and then abort if it fails. I thought
the function was optional?

Your options are:
1) Bump up the kernel version. Apparently not an option for you.
2) Patch the application to deal with failures appropriately.
3) Patch musl to fall back on failure.

Whether to pursue 2 or 3 depends highly on the applications involved and
whether a change to the libc or the application is more appropriate. For
instance, instead of getentropy(), you can open /dev/urandom. Now, the
application might be the better place to contain that change, since it
can more easily manage the file descriptor life cycle. Changing it in
libc would mean you open the file on each call to getrandom() and close
it again at the end. Or else you use a static variable for the FD and
then the application gets messed with in other ways.
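
A sketch of the application-side variant, where the program owns the
descriptor and decides when it is opened and closed. struct rand_src
and the rand_src_*() functions are hypothetical names used only for
illustration:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical application-owned handle around /dev/urandom. */
struct rand_src { int fd; };

/* Open the device once, e.g. at program startup. */
int rand_src_open(struct rand_src *rs)
{
	do rs->fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
	while (rs->fd < 0 && errno == EINTR);
	return rs->fd < 0 ? -1 : 0;
}

/* Fill buf with len bytes, retrying short and interrupted reads. */
int rand_src_read(struct rand_src *rs, void *buf, size_t len)
{
	unsigned char *p = buf;

	while (len) {
		ssize_t n = read(rs->fd, p, len);
		if (n < 0 && errno == EINTR) continue;
		if (n < 0) return -1;
		if (n == 0) { errno = EIO; return -1; }
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Release the descriptor when the application no longer needs it. */
void rand_src_close(struct rand_src *rs)
{
	if (rs->fd >= 0) close(rs->fd);
	rs->fd = -1;
}
```

Because the handle is explicit, the application can open it before
dropping privileges or entering a chroot, which a libc-internal
fallback could not know to do.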

> Or do the software authors and build systems need better syscall/function
> availability checks?
>

They need better run-time logic to deal with failures. Function
availability does not mean the function will succeed.

Ciao,
Markus


* Re: [musl] getrandom fallback - wrapper functions dilema
  2022-09-16 18:05 [musl] getrandom fallback - wrapper functions dilema Lance Fredrickson
  2022-09-17  5:58 ` Markus Wichmann
@ 2022-09-17 20:43 ` Rich Felker
  2022-09-19 20:56   ` Lance Fredrickson
  1 sibling, 1 reply; 5+ messages in thread
From: Rich Felker @ 2022-09-17 20:43 UTC
  To: Lance Fredrickson; +Cc: musl

On Fri, Sep 16, 2022 at 12:05:02PM -0600, Lance Fredrickson wrote:
> I'm using musl on an arm embedded router (netgear R7000) running an
> old kernel, 2.6.36.4. I compiled an application using the meson
> build system which does a check for the getentropy function which it
> does of course find in musl and ultimately the program aborts. I see
> getentropy uses getrandom which is a wrapper around the syscall
> which came around kernel version 3.17 . In the mailing list I saw in
> one discussion way back about adding a fallback to getrandom, maybe
> after integrating arc4random which doesn't seem to have ever
> happened.
> 
> I appreciate that musl strives for correctness, so what is the
> correct solution for this issue?
> I think meson checks for the function availability, but I'm not sure
> that it checks for valid output. Is this a meson issue?

No, it's not a meson issue. You cannot build-time test properties that
are variable at runtime because you're not (necessarily) building on
the system you're running on. You may be cross compiling, or building
native binaries for the system you're on but planning to run them on a
different system.

If you want to make software that behaves gracefully across a range of
old systems, you need to do *runtime* tests for the specific optional
functionality that might or might not be present. Normally this means
just checking failure returns for ENOSYS, EINVAL, etc. and falling
back to doing something else or reporting that the needed
functionality is not available. Or, if the functionality isn't
actually needed to begin with -- like if you're gratuitously using
getrandom for Monte Carlo stuff or for salting a hash table hash
function to make it collision-resistant -- then *don't*, and instead
use a deterministic function.
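
Applied to the prlimit case from the original message, such a runtime
check could look like the sketch below. It assumes the caller only
needs its own RLIMIT_NOFILE values, so the older getrlimit() is an
adequate substitute; app_get_nofile_limit() is a hypothetical name:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical helper: query the open-file limit of this process. */
int app_get_nofile_limit(struct rlimit *out)
{
	/* pid 0 means "the calling process". */
	if (prlimit(0, RLIMIT_NOFILE, NULL, out) == 0) return 0;

	/* Anything other than "kernel lacks the syscall" is a real error. */
	if (errno != ENOSYS) return -1;

	/* Older interface, sufficient when only the caller's own
	 * limits are needed. */
	return getrlimit(RLIMIT_NOFILE, out);
}
```

The fallback only works because the request targets the calling
process. Querying another process's limits has no equivalent older
interface, so there the honest outcome is to report that the
functionality is unavailable.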

> Should a libc be compiling in syscalls and functions the running
> kernel can't support?

Yes. There is no concept of "the running kernel" in musl. musl is not
built for any particular kernel version, only to run on top of the
Linux syscall API/ABI as the underlying layer, with syscalls present
in 2.6.0 as the baseline for providing the majority of the standard
functionality (all that's possible to implement with that), later 2.6
series for a few POSIX conformance things that early 2.6 couldn't
supply, and everything else as extension functionality that might or
might not be available at runtime.

> Help my lack of understanding but I think at least syscalls will
> return not supported right? So maybe the bigger issue are these
> syscall wrappers?
> I know that if down the road I try to run musl on another router,
> mipsel & kernel 2.6.22.19, I'm going to run into prlimit issues
> because prlimit came after this kernel version, but the prlimit
> function will be unconditionally compiled in. And it seems the
> autoconfs and cmakes and mesons are only really checking for the
> function availability and not so much if the syscall they're
> wrapping is actually going to work.
> getentropy is even more removed because it's a  function that relies
> on a syscall wrapped in another function.
> 
> So I really hope the solution isn't bumping up the minimum kernel
> requirement. Sure I'm using an old kernel and maybe I should
> upgrade, but in this case I can't because I'm vendor locked.  This
> type of issue will still arise down the road however. Say  kernel
> 6.3 adds a new syscall and musl adds a syscall wrapper, well then
> your shiny 6.1 kernel running musl 1.2.4 (or whatever future
> version) might claim it has functionality it really doesn't, and
> that could trip something up.
> 
> I know uclibc-ng tracks syscalls/functions to kernel availability in
> kernel-features.h that they carry,  but I don't know what is correct
> for musl.

uclibc has a very different philosophy with a combinatoric explosion
of build configurations, no officially stable ABI, and an intent that
you build a version for your particular hardware+kernel target.
Rejecting this philosophy was one of the big differences (and, in my
opinion, the big successes) of musl.

> Unconditionally included every feature regardless of
> kernel support doesn't feel correct, and in practice causes issue
> like this. My only other option is to start ripping functionality
> out of musl to match the functionality of that particular kernel,
> and I know that really doesn't feel correct either.

Ripping things out is not the right solution at all.

> Or do the software authors and build systems need better
> syscall/function availability checks?

Nothing to do with build systems. The applications just need to be
checking (at runtime) error returns for functions which are not
guaranteed-not-to-fail. This includes any Linux extensions not present
in the minimum kernel version they require.

All of this should be documented better on musl's side too -- what the
actual (non-)guarantees for availability of functionality are.

Rich


* Re: [musl] getrandom fallback - wrapper functions dilema
  2022-09-17 20:43 ` Rich Felker
@ 2022-09-19 20:56   ` Lance Fredrickson
  2022-09-19 21:41     ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Lance Fredrickson @ 2022-09-19 20:56 UTC
  To: Rich Felker; +Cc: musl



Thanks for the response! Would having a getrandom fallback still be on
the table for musl? It's not the first time I've hit this issue, so
having the libc automatically take care of things would be a
nice-to-have, especially as I've seen getrandom become more prevalent in
coding projects.

Lance


* Re: [musl] getrandom fallback - wrapper functions dilema
  2022-09-19 20:56   ` Lance Fredrickson
@ 2022-09-19 21:41     ` Rich Felker
  0 siblings, 0 replies; 5+ messages in thread
From: Rich Felker @ 2022-09-19 21:41 UTC
  To: Lance Fredrickson; +Cc: musl

On Mon, Sep 19, 2022 at 02:56:32PM -0600, Lance Fredrickson wrote:
>  Thanks for the response! Would having a getrandom fallback still be
> on the table for musl? It's not the first time I've hit this issue
> so having the libc automatically take care of things would be a
> nice-to-have, especially as I've seen getrandom become more
> prevalent in coding projects.

Yes, absolutely. It's on the wishlist and I have a draft of the core
backend using sysctl, but it still needs some work hooking it up.
Hopefully this will meet your needs. The (long deprecated, later
removed) SYS__sysctl syscall is really the only way to get reliable
randomness on these old kernels that's not dependent on being able to
open device nodes (which requires available fd slots and system open
file slots, mitigations for the entropy-pool-not-ready condition, etc.).
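
The draft itself is not shown in this thread. Purely as an
illustration of the technique, the sketch below asks the old binary
sysctl interface for the kernel.random.uuid node, which is an
assumption about which node such a backend would read. The constants
and the argument struct are copied from the historical
<linux/sysctl.h>, which newer kernel headers no longer ship, and the
x_ names and sysctl_random_bytes() are hypothetical:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values from the historical <linux/sysctl.h>. */
enum { X_CTL_KERN = 1, X_KERN_RANDOM = 40, X_RANDOM_UUID = 6 };

/* Layout of the historical struct __sysctl_args. */
struct x_sysctl_args {
	int *name;
	int nlen;
	void *oldval;
	size_t *oldlenp;
	void *newval;
	size_t newlen;
	unsigned long unused[4];
};

/* Hypothetical helper: fill buf from repeated random-UUID queries.
 * Fails with ENOSYS where the syscall is gone or never existed. */
int sysctl_random_bytes(void *buf, size_t len)
{
#ifdef SYS__sysctl
	static int mib[] = { X_CTL_KERN, X_KERN_RANDOM, X_RANDOM_UUID };
	unsigned char *p = buf;

	while (len) {
		unsigned char uuid[16];
		size_t got = sizeof uuid;
		struct x_sysctl_args args = {
			.name = mib, .nlen = 3,
			.oldval = uuid, .oldlenp = &got,
		};

		if (syscall(SYS__sysctl, &args) != 0) return -1;
		if (got != sizeof uuid) { errno = EIO; return -1; }

		/* Each UUID has 6 fixed version/variant bits, so real
		 * code should condition the output, not use it raw. */
		size_t n = len < got ? len : got;
		memcpy(p, uuid, n);
		p += n;
		len -= n;
	}
	return 0;
#else
	(void)buf; (void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```

On kernels where the syscall has been removed this returns -1 with
errno set to ENOSYS, so it only makes sense as the last step after
getrandom has already failed the same way.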

Rich
