mailing list of musl libc
* Feature request: building musl in a portable way
@ 2017-12-21 16:25 ardi
  2017-12-21 21:38 ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: ardi @ 2017-12-21 16:25 UTC
  To: musl

Hi,

Related (and as an alternative) to a previous post I made asking about
a way of isolating direct syscalls, I'm thinking about the possibility
of building musl in a way where functions that need to perform
syscalls aren't compiled, so this special compiled version of musl
would have only the functions that don't make syscalls
themselves.

The purpose is to be able to run code on systems other than Linux,
replacing such functions with calls to the corresponding functions of the
host system (provided that those functions follow POSIX requirements, of
course).

Obviously, I can get this feature by modifying musl, but I'd prefer
not to modify it, because I'd like to be able to update musl to the
latest version easily, and if I use a modified/customized musl version,
updating it would require merging, and possibly hard work.

If there were some way of having a switch in the build system so that
all functions that make syscalls are not compiled, I could use musl
without modifying it. Maybe the most elegant way of doing this would
be by tagging such functions with a special tag, like
"__function_makes_syscall__" or whatever. But I'm not sure.

Cheers, and thanks a lot,

ardi


* Re: Feature request: building musl in a portable way
  2017-12-21 16:25 Feature request: building musl in a portable way ardi
@ 2017-12-21 21:38 ` Rich Felker
  2017-12-22 16:09   ` ardi
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2017-12-21 21:38 UTC
  To: musl

On Thu, Dec 21, 2017 at 05:25:31PM +0100, ardi wrote:
> Hi,
> 
> Related (and as an alternative) to a previous post I made asking about
> a way of isolating direct syscalls, I'm thinking about the possibility
> of building musl in a way where functions that need to perform
> syscalls aren't compiled, so this special compiled version of musl
> would have only the functions that don't make syscalls
> themselves.
>
> The purpose is to be able to run code on systems other than Linux,
> replacing such functions with calls to the corresponding functions of the
> host system (provided that those functions follow POSIX requirements, of
> course).
>
> Obviously, I can get this feature by modifying musl, but I'd prefer
> not to modify it, because I'd like to be able to update musl to the
> latest version easily, and if I use a modified/customized musl version,
> updating it would require merging, and possibly hard work.
>
> If there were some way of having a switch in the build system so that
> all functions that make syscalls are not compiled, I could use musl
> without modifying it. Maybe the most elegant way of doing this would
> be by tagging such functions with a special tag, like
> "__function_makes_syscall__" or whatever. But I'm not sure.
> 
> Cheers, and thanks a lot,

I'm not clear what you want to do. A program that doesn't make
syscalls has no input except argv[] and environ[], does not terminate,
and has no output. So such a build of musl is certainly not useful as
a libc. Even if it were, configurable builds that exclude
functionality are intentionally outside the scope of musl; instead,
the project provides fine-grained linking so that you just get what
you need; ports to systems where some underlying functionality is not
possible simply need to make the relevant syscalls fail with ENOSYS.
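
The ENOSYS approach mentioned above can be pictured with a small sketch. This is not musl code: the table, the `port_*` names, the `PORT_NR_MAX` bound, and the `demo_add` handler are all hypothetical, chosen only to show unsupported calls failing with a negated ENOSYS in the Linux return convention.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handler type: six register-sized arguments, Linux-style result
 * (a non-negative value on success, a negated errno value on failure). */
typedef long (*sys_handler)(long, long, long, long, long, long);

#define PORT_NR_MAX 512                 /* assumed bound on syscall numbers */

static sys_handler port_table[PORT_NR_MAX];   /* NULL means "not supported" */

/* Register an implementation for one syscall number; -1 if out of range. */
static int port_register(long nr, sys_handler fn)
{
    if (nr < 0 || nr >= PORT_NR_MAX)
        return -1;
    port_table[nr] = fn;
    return 0;
}

/* Single entry point: any number without a handler fails with -ENOSYS. */
static long port_syscall(long nr, long a, long b, long c,
                         long d, long e, long f)
{
    if (nr < 0 || nr >= PORT_NR_MAX || !port_table[nr])
        return -ENOSYS;
    return port_table[nr](a, b, c, d, e, f);
}

/* Made-up handler used only to exercise the table. */
static long demo_add(long a, long b, long c, long d, long e, long f)
{
    (void)c; (void)d; (void)e; (void)f;
    return a + b;
}
```

Registering `demo_add` under some number makes `port_syscall` forward to it; every other number keeps returning `-ENOSYS`, which callers can turn into `errno = ENOSYS` and a `-1` return.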

With that said, my guess is that you're really asking for a way to
take the "pure" code out of musl and make it a library that you can
use on an existing C/POSIX (or non-POSIX C) implementation. This is
interesting, but currently outside the scope of musl, and probably
covers less interesting code than you might expect -- mainly:

- charset conversion (iconv, esp. utf-8 encoder and decoder)
- strstr
- qsort
- tsearch
- math (including complex, except complex isn't very good anyway yet)
- strtod family
- snprintf (but not usable independently of musl's stdio framework)
- mo file (gettext) lookup core (but frontend is not at all pure)

The rest of the pure code is almost entirely uninteresting, I think
(unlikely to provide any advantage over what's almost certainly
already present).

For now, the only way to use this code outside of musl is to copy it
(and possibly rename the identifiers if you need to avoid clashing
with the standard library on an existing system). Roughly half of the
above list are easy to do this with (no reserved-namespace
identifiers, single files or a few isolated related files, etc.) while
the rest have issues that make them invalid in application code
without nontrivial changes. Making it easier to use them outside musl
is an interesting problem but I'm afraid not one I have resources to
devote to at present.

Rich


* Re: Feature request: building musl in a portable way
  2017-12-21 21:38 ` Rich Felker
@ 2017-12-22 16:09   ` ardi
  2017-12-22 16:43     ` Rich Felker
  2017-12-22 17:10     ` Nicholas Wilson
  0 siblings, 2 replies; 13+ messages in thread
From: ardi @ 2017-12-22 16:09 UTC
  To: musl

On Thu, Dec 21, 2017 at 10:38 PM, Rich Felker <dalias@libc.org> wrote:
[...]
>
> I'm not clear what you want to do.

I'm looking for a C runtime with an MIT-like license that can be
compiled for several architectures, in 32-bit and 64-bit, mostly Intel,
PowerPC, ARM, and MIPS, and that is endian-safe and written in tidy
code. I need such a runtime to be retargetable to different
OSs by changing the layer where syscalls are made. At the time of
writing, the OSs I'm interested in are Linux and MacOS. In the
future I'll likely be interested in other OSs as well.

I don't know of any C runtime that meets all these requirements. The
only two that come close are the C runtimes of the various BSDs, and musl,
but both lack the last requirement (i.e.: syscalls are not
encapsulated in a confined set of files so that you could rewrite such a
"syscall layer" for each OS --instead, syscalls can be issued from any
place in the code, and the only way to locate and encapsulate the relevant
functions is to manually search for invocations of syscalls in the
source tree).

So the only options are the BSDs and musl (unless I'm forgetting any). But both
the BSDs and musl require "heavy editing" if you want to encapsulate
syscalls, and by doing such editing, you lose the ability to easily
update to newer versions without considerable merging work.


> A program that doesn't make
> syscalls has no input except argv[] and environ[], does not terminate,
> and has no output. So such a build of musl is certainly not useful as
> a libc.

Yes, I didn't explain it well. Of course the program will make
syscalls. But they will happen only within a confined set of functions
that I can rewrite for different OSs.


> Even if it were, configurable builds that exclude
> functionality are intentionally outside the scope of musl; instead,
> the project provides fine-grained linking so that you just get what
> you need; ports to systems where some underlying functionality is not
> possible simply need to make the relevant syscalls fail with ENOSYS.

There's more to it than that: musl assumes syscalls are invoked following the
Linux kernel protocol for syscalls. It's not only a matter of
translating syscall numbers and their arguments, but also of how the
syscalls are triggered. So, writing the "compatibility layer" that I
explained in the previous paragraphs is much harder if it has to
intercept and translate syscalls, than if you could edit the musl
files where syscalls are invoked, redirecting them in a more
comfortable way, and without the tough code for intercepting syscalls.


> With that said, my guess is that you're really asking for a way to
> take the "pure" code out of musl and make it a library that you can
> use on an existing C/POSIX (or non-POSIX C) implementation. This is
> interesting, but currently outside the scope of musl, and probably
> covers less interesting code than you might expect [...]

I'm not interested in taking the "pure" code only, but as much code as
possible, only having to rewrite the syscall retargeting layer.


> [...] Making it easier to use them outside musl
> is an interesting problem but I'm afraid not one I have resources to
> devote to at present.

A somewhat crazy idea came to my mind: libraries like ElectricFence,
which replace libc functions with their own versions if you link them
before the C runtime. This would let me replace functions that make
syscalls with my own versions. However, it sounds quite hack-ish and
error-prone in the long run.

After thinking about this, I believe the best way would be to be able
to confine all of musl's syscall invocations to a small area that I
can take control of. In the current directory tree, musl has a
function for making syscalls, and this would be exactly what I need: a
place for easily intercepting syscalls and redirecting them. However,
several functions in musl invoke syscalls directly in assembly,
without going through that function, so it's not easy to intercept all
syscalls at the moment.

Do you foresee a possibility of building musl so that all syscalls go
through some function that I can easily intercept **before** the
syscall is actually invoked?

Thanks!
ardi


* Re: Feature request: building musl in a portable way
  2017-12-22 16:09   ` ardi
@ 2017-12-22 16:43     ` Rich Felker
  2017-12-22 19:04       ` ardi
  2017-12-22 17:10     ` Nicholas Wilson
  1 sibling, 1 reply; 13+ messages in thread
From: Rich Felker @ 2017-12-22 16:43 UTC
  To: musl

On Fri, Dec 22, 2017 at 05:09:06PM +0100, ardi wrote:
> On Thu, Dec 21, 2017 at 10:38 PM, Rich Felker <dalias@libc.org> wrote:
> [...]
> >
> > I'm not clear what you want to do.
> 
> I'm looking for a C runtime with an MIT-like license that can be
> compiled for several architectures, in 32-bit and 64-bit, mostly Intel,
> PowerPC, ARM, and MIPS, and that is endian-safe and written in tidy
> code. I need such a runtime to be retargetable to different
> OSs by changing the layer where syscalls are made. At the time of
> writing, the OSs I'm interested in are Linux and MacOS. In the
> future I'll likely be interested in other OSs as well.

Changing the layer by which syscalls are made is a lot different from
building with no syscalls. You can change the syscall layer just by
making a new arch that defines (in syscall_arch.h and bits/syscall.h)
your mechanism. See the recent wasm thread or midipix for examples of
more exotic ways this can be done. But be aware that the ability to
get a POSIX-conforming implementation where EINTR and thread
cancellation work requires some sort of atomic boundary that
determines when a syscall has passed the point of having successful
side effects, which makes implementing the syscalls just as functions
hard.
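
As a rough picture of "a new arch that defines your mechanism", the sketch below uses per-arity inline wrappers in the style of the `__syscallN` functions found in musl's `syscall_arch.h` files, plus a result conversion from the Linux convention (negated errno in the range -4095..-1) to the C convention (-1 with `errno` set). All names here (`port_syscallN`, `host_dispatch`, `port_syscall_ret`) and the syscall number 1000 are hypothetical, and the sketch does nothing about the EINTR and cancellation boundary described above.

```c
#include <errno.h>

/* Hypothetical host entry point. A real port would implement this against
 * the host OS; here it knows one made-up call so the sketch can be tested. */
static long host_dispatch(long n, long a, long b, long c)
{
    if (n == 1000)              /* made-up number: add three values */
        return a + b + c;
    return -ENOSYS;             /* everything else is unsupported */
}

/* One inline wrapper per argument count; the rest of the library would only
 * ever go through these, so the mechanism lives in a single header. */
static inline long port_syscall0(long n)
{
    return host_dispatch(n, 0, 0, 0);
}

static inline long port_syscall1(long n, long a)
{
    return host_dispatch(n, a, 0, 0);
}

static inline long port_syscall3(long n, long a, long b, long c)
{
    return host_dispatch(n, a, b, c);
}

/* Convert a Linux-style result into the C convention: values in
 * -4095..-1 become errno plus a -1 return, everything else passes through. */
static long port_syscall_ret(unsigned long r)
{
    if (r > -4096UL) {
        errno = (int)-r;
        return -1;
    }
    return (long)r;
}
```

A public function would then be written as something like `return port_syscall_ret(port_syscall3(NR, a, b, c));`, with only `host_dispatch` differing between hosts.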

Stepping back a bit, I suspect what you want is to just be able to
implement functions like open(), read(), etc. on your own instead of
implementing something like actual syscalls, based on a notion that
open(), read(), etc. "are the syscalls". This is a classical notion
(e.g. based on the organization of man pages into section 2 vs 3) but
it doesn't really correspond to reality. Many/most of the functions
you think of as syscalls _need to actually be library functions_ for
various conformance/subtle-behavior reasons, and the syscalls by the
same names are low-level primitives that are useful in implementing
them. Also, it would be impossible to have a (valid) implementation
where it works to just replace these functions, because for example
stdio _can't_ call open(), read(), etc. for namespace reasons.

> I don't know of any C runtime that meets all these requirements. The
> only two that come close are the C runtimes of the various BSDs, and musl,
> but both lack the last requirement (i.e.: syscalls are not
> encapsulated in a confined set of files so that you could rewrite such a
> "syscall layer" for each OS --instead, syscalls can be issued from any
> place in the code, and the only way to locate and encapsulate the relevant
> functions is to manually search for invocations of syscalls in the
> source tree).
>
> So the only options are the BSDs and musl (unless I'm forgetting any). But both
> the BSDs and musl require "heavy editing" if you want to encapsulate
> syscalls, and by doing such editing, you lose the ability to easily
> update to newer versions without considerable merging work.

As long as you do it the way that acknowledges the above, the "heavy
editing" is isolated to the arch directories you add.

> > Even if it were, configurable builds that exclude
> > functionality are intentionally outside the scope of musl; instead,
> > the project provides fine-grained linking so that you just get what
> > you need; ports to systems where some underlying functionality is not
> > possible simply need to make the relevant syscalls fail with ENOSYS.
> 
> There's more to it than that: musl assumes syscalls are invoked following the
> Linux kernel protocol for syscalls. It's not only a matter of
> translating syscall numbers and their arguments, but also of how the
> syscalls are triggered. So, writing the "compatibility layer" that I
> explained in the previous paragraphs is much harder if it has to
> intercept and translate syscalls, than if you could edit the musl
> files where syscalls are invoked, redirecting them in a more
> comfortable way, and without the tough code for intercepting syscalls.

There are indeed a small number of places where workarounds or other
considerations for Linux-specific parts of the syscall interface
boundary are in general source files rather than in the syscall glue
layer, but I think the number is quite small. If there are
particularly egregious ones that you think could be improved upon,
please let me know.

> > With that said, my guess is that you're really asking for a way to
> > take the "pure" code out of musl and make it a library that you can
> > use on an existing C/POSIX (or non-POSIX C) implementation. This is
> > interesting, but currently outside the scope of musl, and probably
> > covers less interesting code than you might expect [...]
> 
> I'm not interested in taking the "pure" code only, but as much code as
> possible, only having to rewrite the syscall retargeting layer.

Yes, I see now that my guess was wrong.

> > [...] Making it easier to use them outside musl
> > is an interesting problem but I'm afraid not one I have resources to
> > devote to at present.
> 
> A somewhat crazy idea came to my mind: libraries like ElectricFence,
> which replace libc functions with their own versions if you link them
> before the C runtime. This would let me replace functions that make
> syscalls with my own versions. However, it sounds quite hack-ish and
> error-prone in the long run.

It wouldn't do what you want for one of the reasons I described above.
The important things in musl that want to open files, for instance,
don't call open(). They use sys_open which expands to a syscall.

> After thinking about this, I believe the best way would be to be able
> to confine all of musl's syscall invocations to a small area that I
> can take control of. In the current directory tree, musl has a
> function for making syscalls, and this would be exactly what I need: a
> place for easily intercepting syscalls and redirecting them. However,
> several functions in musl invoke syscalls directly in assembly,
> without going through that function, so it's not easy to intercept all
> syscalls at the moment.

In a few places, for some archs, that's necessary, but I'm trying to
reduce that so that the amount of required arch-specific code is
minimized. An irreducible example is vfork which fundamentally cannot
be done from C because it has to return twice. clone is treated that
way too, but I think it's wrong; it should be possible to do clone
from C since the function (not the syscall) never has to return twice.

Rich


* Re: Feature request: building musl in a portable way
  2017-12-22 16:09   ` ardi
  2017-12-22 16:43     ` Rich Felker
@ 2017-12-22 17:10     ` Nicholas Wilson
  2017-12-22 17:49       ` Rich Felker
  2017-12-22 19:27       ` ardi
  1 sibling, 2 replies; 13+ messages in thread
From: Nicholas Wilson @ 2017-12-22 17:10 UTC
  To: musl

On 22 December 2017 16:09, ardi wrote:
>> On Thu, Dec 21, 2017 at 10:38 PM, Rich Felker <dalias@libc.org> wrote:
>> I'm not clear what you want to do.
> I'm looking for a C runtime with an MIT-like license that can be
> compiled for several architectures, in 32-bit and 64-bit, mostly Intel,
> PowerPC, ARM, and MIPS, and that is endian-safe and written in tidy
> code. I need such a runtime to be retargetable to different
> OSs by changing the layer where syscalls are made. At the time of
> writing, the OSs I'm interested in are Linux and MacOS. In the
> future I'll likely be interested in other OSs as well.

> I don't know of any C runtime that meets all these requirements. The
> only two that come close are the C runtimes of the various BSDs, and musl,
> but both lack the last requirement (i.e.: syscalls are not
> encapsulated in a confined set of files so that you could rewrite such a
> "syscall layer" for each OS --instead, syscalls can be issued from any
> place in the code, and the only way to locate and encapsulate the relevant
> functions is to manually search for invocations of syscalls in the
> source tree).

This still doesn't explain what you want to *do* - this explains what you *want* but not how you're intending to use it.

If you want a libc for MacOS, I would suggest using Apple's libc. On Linux, you may want to use glibc (or Musl, on distros that have chosen that).

As you've discovered, a libc implementation is tightly coupled to the OS - in a sense it really is a component of the OS itself. Writing a libc that targets different kernels would be quite an undertaking, but possible - but of much more importance are the "upwards" dependencies. You're thinking about "how many kernels can a libc support", but of more importance is, "how many platform ABIs can the libc support upwards". For your hypothetical libc that runs on Linux and MacOS, do you want it to be usable on MacOS with applications compiled against Apple's headers, and usable on Linux with applications compiled against glibc's headers? That's just as much a challenge, and would be perhaps the major reason no-one's tried to create a "universal" libc: established applications use slightly different ABIs on each platform. If you want to call into any MacOS userland functionality, you'll need to have a libc that's fully ABI-compatible with those MacOS components.

If you insist on using Musl on MacOS, your route forwards would be to implement the Linux syscall ABI using MacOS syscalls, effectively emulating Linux on each platform where you want to run Musl.

That's how we're using Musl on WebAssembly. Musl uses Linux syscalls, so we implement Linux syscalls to keep Musl happy.
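
To give a flavour of what "implementing Linux syscalls" on a POSIX host can look like, here is a minimal sketch. It is not taken from the WebAssembly port: `emu_linux_syscall` and `host_result` are hypothetical names, only two calls are handled, and the numbers are the Linux x86_64 ones. It also assumes the host's errno values equal Linux's, which is not true on every host, so a real layer would translate them.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Linux x86_64 syscall numbers for the two calls handled in this sketch. */
enum { LINUX_SYS_write = 1, LINUX_SYS_close = 3 };

/* Turn a host libc result (-1 plus errno) into the Linux kernel convention
 * (negated errno). Assumption: host errno values match Linux's. */
static long host_result(long r)
{
    return r < 0 ? -(long)errno : r;
}

/* Emulate a Linux syscall by calling the host's own POSIX functions. */
static long emu_linux_syscall(long n, long a, long b, long c)
{
    switch (n) {
    case LINUX_SYS_write:
        return host_result(write((int)a, (const void *)b, (size_t)c));
    case LINUX_SYS_close:
        return host_result(close((int)a));
    default:
        return -ENOSYS;     /* not emulated */
    }
}
```

The shape is the point rather than the two cases: each Linux syscall number maps to host functionality, arguments are converted on the way in, and the result is converted back to what the caller expects from a Linux kernel.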

Nick

* Re: Feature request: building musl in a portable way
  2017-12-22 17:10     ` Nicholas Wilson
@ 2017-12-22 17:49       ` Rich Felker
  2017-12-22 18:01         ` Nicholas Wilson
  2017-12-22 19:27       ` ardi
  1 sibling, 1 reply; 13+ messages in thread
From: Rich Felker @ 2017-12-22 17:49 UTC
  To: musl

On Fri, Dec 22, 2017 at 05:10:20PM +0000, Nicholas Wilson wrote:
> That's how we're using Musl on WebAssembly. Musl uses Linux
> syscalls, so we implement Linux syscalls to keep Musl happy.

A bit of a historical note on this: in the late 80s and 90s there was
an effort called "iBCS" to make a unified ABI for Intel-based unices.
I believe a common syscall layer was part of it. It was abandoned
after everybody realized that the Linux syscall ABI _was_, for all
practical purposes, the unified ABI they wanted.

Rich


* Re: Feature request: building musl in a portable way
  2017-12-22 17:49       ` Rich Felker
@ 2017-12-22 18:01         ` Nicholas Wilson
  2017-12-22 18:08           ` Rich Felker
  0 siblings, 1 reply; 13+ messages in thread
From: Nicholas Wilson @ 2017-12-22 18:01 UTC
  To: musl

On 22 December 2017 17:49, Rich Felker wrote:
> A bit of a historical note on this: in the late 80s and 90s there was
> an effort called "iBCS" to make a unified ABI for Intel-based unices.
> I believe a common syscall layer was part of it. It was abandoned
> after everybody realized that the Linux syscall ABI _was_, for all
> practical purposes, the unified ABI they wanted.

That's pretty much where WebAssembly is going too! At the moment, the WebAssembly "embedding environment" (the webpage) has to provide a JavaScript implementation of the external dependencies of the WebAssembly module.

There is a desire to eventually standardise that a bit - at the moment it's "whatever Musl wants". I think the conclusion will be "emulate Linux everywhere". I'm expecting some small tweaks though. For example, traditionally timezone information is stored in userland and not available via a syscall: rather than special-case SYS_open for "/etc/localtime", we might add a Wasm-specific syscall for doing it. (Naturally, this would all be done in Wasm via an override in a Wasm-specific directory.)

So it will be "Linux syscalls" - but probably with a few tweaks (as indeed Linux syscalls already differ very slightly between architectures).

Nick

* Re: Feature request: building musl in a portable way
  2017-12-22 18:01         ` Nicholas Wilson
@ 2017-12-22 18:08           ` Rich Felker
  2017-12-22 19:06             ` Nicholas Wilson
  0 siblings, 1 reply; 13+ messages in thread
From: Rich Felker @ 2017-12-22 18:08 UTC
  To: musl

On Fri, Dec 22, 2017 at 06:01:35PM +0000, Nicholas Wilson wrote:
> On 22 December 2017 17:49, Rich Felker wrote:
> > A bit of a historical note on this: in the late 80s and 90s there was
> > an effort called "iBCS" to make a unified ABI for Intel-based unices.
> > I believe a common syscall layer was part of it. It was abandoned
> > after everybody realized that the Linux syscall ABI _was_, for all
> > practical purposes, the unified ABI they wanted.
> 
> That's pretty much where WebAssembly is going too! At the moment,
> the WebAssembly "embedding environment" (the webpage) has to provide
> a JavaScript implementation of the external dependencies of the
> WebAssembly module.
> 
> There is a desire to eventually standardise that a bit - at the
> moment it's "whatever Musl wants". I think the conclusion will be
> "emulate Linux everywhere". I'm expecting some small tweaks though.
> For example, traditionally timezone information is stored in
> userland and not available via a syscall: rather than special-case
> SYS_open for "/etc/localtime", we might add a Wasm-specific syscall
> for doing it. (Naturally, this would all be done in Wasm via an
> override in a Wasm-specific directory.)

Wouldn't just exporting a TZ variable be easier?

Rich


* Re: Feature request: building musl in a portable way
  2017-12-22 16:43     ` Rich Felker
@ 2017-12-22 19:04       ` ardi
  2017-12-23  8:18         ` Markus Wichmann
  0 siblings, 1 reply; 13+ messages in thread
From: ardi @ 2017-12-22 19:04 UTC
  To: musl

On Fri, Dec 22, 2017 at 5:43 PM, Rich Felker <dalias@libc.org> wrote:
> [...] You can change the syscall layer just by
> making a new arch that defines (in syscall_arch.h and bits/syscall.h)
> your mechanism. See the recent wasm thread or midipix for examples of
> more exotic ways this can be done.

Thanks a lot!! I'll try to follow this path. It looks clean.


> But be aware that the ability to
> get a POSIX-conforming implementation where EINTR and thread
> cancellation work requires some sort of atomic boundary that
> determines when a syscall has passed the point of having successful
> side effects, which makes implementing the syscalls just as functions
> hard.

I'm not sure if I can hit this scenario but I'll research this. Thanks!


> Stepping back a bit, I suspect what you want is to just be able to
> implement functions like open(), read(), etc. on your own instead of
> implementing something like actual syscalls, based on a notion that
> open(), read(), etc. "are the syscalls".

Not based on that notion, but thinking that the number of functions
issuing syscalls would be reduced, and substituting such functions
would be less work than translating syscalls. But the approach you
suggested above is better.


> [...]
> As long as you do it the way that acknowledges the above, the "heavy
> editing" is isolated to the arch directories you add.

That's really what I want!


> [...]
> There are indeed a small number of places where workarounds or other
> considerations for Linux-specific parts of the syscall interface
> boundary are in general source files rather than in the syscall glue
> layer, but I think the number is quite small. If there are
> particularly egregious ones that you think could be improved upon,
> please let me know.

Yes, I believe that whenever there are assembly source files in some
directory in the musl tree, there're functions there that make
syscalls without going through the interface you defined above. I'll
look at this and I'll see if it can be improved somehow.

Thanks a lot!

ardi


* Re: Feature request: building musl in a portable way
  2017-12-22 18:08           ` Rich Felker
@ 2017-12-22 19:06             ` Nicholas Wilson
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Wilson @ 2017-12-22 19:06 UTC
  To: musl

On 22 December 2017 18:08, Rich Felker wrote:
> Wouldn't just exporting a TZ variable be easier?

(Sorry ardi for hijacking your thread! Just a brief response.)

That gives you the current timezone, true. But to make localtime() work, you of course need historical timezone information - a list of timestamps when the timezone offset changed. Getting that information from the browser is actually rather hard. Of course the browser internally has the list of timezone data, but it doesn't expose it via an API - all you can do is basically call a JavaScript equivalent of localtime() and find the timezone offset at specific points you sample. So we can't easily extract the current timezone's full data, and use Musl's implementation. Our current solution is simple, and does a wholesale redirection of localtime() to a browser-based version. Implementing localtime() in a browser is easy, but extracting zoneinfo is hard.
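
The sampling approach described above can be illustrated in C, as an analogue of what one would do against a browser API. The sketch assumes only a working `localtime_r()` and `gmtime_r()`: it derives the UTC offset at a given instant by comparing the two broken-down times, then counts offset changes over a range. The function names are hypothetical, and any transitions closer together than the sampling step are missed, which is part of why recovering full zoneinfo this way is awkward.

```c
#include <time.h>

/* UTC offset in seconds at instant t, computed by comparing local and UTC
 * broken-down time. The two can differ by at most one calendar day. */
static long utc_offset_at(time_t t)
{
    struct tm lt, gt;
    localtime_r(&t, &lt);
    gmtime_r(&t, &gt);

    int day_diff = lt.tm_yday - gt.tm_yday;
    if (lt.tm_year != gt.tm_year)       /* crossed a year boundary */
        day_diff = lt.tm_year > gt.tm_year ? 1 : -1;

    return day_diff * 86400L
         + (lt.tm_hour - gt.tm_hour) * 3600L
         + (lt.tm_min  - gt.tm_min)  * 60L
         + (lt.tm_sec  - gt.tm_sec);
}

/* Count how often the offset changes between start and end when sampled
 * every step seconds. */
static int count_offset_changes(time_t start, time_t end, long step)
{
    int changes = 0;
    long prev = utc_offset_at(start);

    for (time_t t = start + step; t <= end; t += step) {
        long cur = utc_offset_at(t);
        if (cur != prev)
            changes++;
        prev = cur;
    }
    return changes;
}
```

Once a change is detected between two samples, the exact transition time could be narrowed down by bisecting that interval, but the result is still only as complete as the range that was sampled.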

Nick

* Re: Feature request: building musl in a portable way
  2017-12-22 17:10     ` Nicholas Wilson
  2017-12-22 17:49       ` Rich Felker
@ 2017-12-22 19:27       ` ardi
  1 sibling, 0 replies; 13+ messages in thread
From: ardi @ 2017-12-22 19:27 UTC
  To: musl

On Fri, Dec 22, 2017 at 6:10 PM, Nicholas Wilson
<nicholas.wilson@realvnc.com> wrote:
> [...]
> If you insist on using Musl on MacOS, your route forwards would be to implement the Linux syscall ABI using MacOS syscalls, effectively emulating Linux on each platform where you want to run Musl.
>
> That's how we're using Musl on WebAssembly. Musl uses Linux syscalls, so we implement Linux syscalls to keep Musl happy.

Exactly. I'm designing an embedded system (not web-related, though)
and the thoughts are similar to what you describe (it will be open
source if I succeed, so don't worry, you'll know if I succeed, and
I'll keep a cowardly silence if I fail miserably :-))))

(BTW: don't worry, you didn't hijack the thread, your comments have
been very instructive too)

Thanks!
ardi


* Re: Feature request: building musl in a portable way
  2017-12-22 19:04       ` ardi
@ 2017-12-23  8:18         ` Markus Wichmann
  2017-12-23 20:57           ` ardi
  0 siblings, 1 reply; 13+ messages in thread
From: Markus Wichmann @ 2017-12-23  8:18 UTC
  To: musl

On Fri, Dec 22, 2017 at 08:04:31PM +0100, ardi wrote:
> On Fri, Dec 22, 2017 at 5:43 PM, Rich Felker <dalias@libc.org> wrote:
> > [...] You can change the syscall layer just by
> > making a new arch that defines (in syscall_arch.h and bits/syscall.h)
> > your mechanism. See the recent wasm thread or midipix for examples of
> > more exotic ways this can be done.
> 
> Thanks a lot!! I'll try to follow this path. It looks clean.
> 
> 

Clean it might be, but it's also long and stony. Linux currently
supports ca. 300 syscalls.

> > But be aware that the ability to
> > get a POSIX-conforming implementation where EINTR and thread
> > cancellation work requires some sort of atomic boundary that
> > determines when a syscall has passed the point of having successful
> > side effects, which makes implementing the syscalls just as functions
> > hard.
> 
> I'm not sure if I can hit this scenario but I'll research this. Thanks!
> 

That's just one quirk of musl's use of syscalls, though. Here are some
others, off the top of my head:

- musl requires mmap() with MAP_FIXED on a previously allocated area to
  work for shared libraries. In fact, musl itself will use mmap() with
  MAP_FIXED _only_ on previously allocated areas. There are reasons for
  that, but suffice it to say that for instance Cygwin fails these
  calls.
- musl requires the close() syscall to always release the file descriptor
  if it was allocated before. Even if the call itself fails for any
  reason.
- musl assumes the credential setting functions to have thread-local
  effect. Since POSIX defines them to have a process-global effect, it
  goes to some length to match them up. I am not certain every OS is as
  quirky in that respect as Linux (that's the real issue).
- musl assumes it can read the instruction pointer from the
  arguments to a signal handler, and that it can set it.

[...]
> > There are indeed a small number of places where workarounds or other
> > considerations for Linux-specific parts of the syscall interface
> > boundary are in general source files rather than in the syscall glue
> > layer, but I think the number is quite small. If there are
> > particularly egregious ones that you think could be improved upon,
> > please let me know.
> 
> Yes, I believe that whenever there are assembly source files in some
> directory in the musl tree, there're functions there that make
> syscalls without going through the interface you defined above. I'll
> look at this and I'll see if it can be improved somehow.
> 

Ooh, thanks, that reminded me: the assembly files do make syscalls
wildly, usually for control of the stack or because other archs
need it. For instance src/thread/i386/__set_thread_area.s does nothing
but invoke two syscalls. But it needs to be an assembly file, since
for some other archs (e.g. PowerPC), only a register move is required.

> Thanks a lot!
> 
> ardi

Ciao,
Markus


* Re: Feature request: building musl in a portable way
  2017-12-23  8:18         ` Markus Wichmann
@ 2017-12-23 20:57           ` ardi
  0 siblings, 0 replies; 13+ messages in thread
From: ardi @ 2017-12-23 20:57 UTC
  To: musl

On Sat, Dec 23, 2017 at 9:18 AM, Markus Wichmann <nullplan@gmx.net> wrote:
> Clean it might be, but it's also long and stony. Linux currently
> supports ca. 300 syscalls.

Does musl use all Linux syscalls? In the musl headers I found traces
of about 60 or so syscall ID definitions, IIRC (or even fewer, I don't
remember now).

[...]
> - musl requires mmap() with MAP_FIXED on a previously allocated area to
>   work for shared libraries. In fact, musl itself will use mmap() with
>   MAP_FIXED _only_ on previously allocated areas. There are reasons for
>   that, but suffice it to say that for instance Cygwin fails these
>   calls.

Does musl explicitly pass MAP_FIXED in the syscall arguments when it
expects MAP_FIXED behavior, or do you have to guess it? If
musl passes MAP_FIXED explicitly through the syscall arguments, I
don't see any problem here: just parse the arguments and pass the
MAP_FIXED requirement on to the host syscall.


> - musl requires the close() syscall to always release the file descriptor
>   if it was allocated before. Even if the call itself fails for any
>   reason.
> - musl assumes the credential setting functions to have thread-local
>   effect. Since POSIX defines them to have a process-global effect, it
>   goes to some length to match them up. I am not certain every OS is as
>   quirky in that respect as Linux (that's the real issue).
> - musl assumes it can read the instruction pointer from the
>   arguments to a signal handler, and that it can set it.

Thanks a lot for all this advice!!

[...]
>> Yes, I believe that whenever there are assembly source files in some
>> directory in the musl tree, there're functions there that make
>> syscalls without going through the interface you defined above. I'll
>> look at this and I'll see if it can be improved somehow.
>>
>
> Ooh, thanks, that reminded me: the assembly files do make syscalls
> wildly, usually for control of the stack or because other archs
> need it. For instance src/thread/i386/__set_thread_area.s does nothing
> but invoke two syscalls. But it needs to be an assembly file, since
> for some other archs (e.g. PowerPC), only a register move is required.

Yeah, that's my main worry: the musl functions that issue syscalls
directly in assembly on their own, bypassing the musl syscall main
interface. I still need to look at this.

Thanks!
ardi


end of thread (last message 2017-12-23 20:57 UTC)

Thread overview: 13+ messages
2017-12-21 16:25 Feature request: building musl in a portable way ardi
2017-12-21 21:38 ` Rich Felker
2017-12-22 16:09   ` ardi
2017-12-22 16:43     ` Rich Felker
2017-12-22 19:04       ` ardi
2017-12-23  8:18         ` Markus Wichmann
2017-12-23 20:57           ` ardi
2017-12-22 17:10     ` Nicholas Wilson
2017-12-22 17:49       ` Rich Felker
2017-12-22 18:01         ` Nicholas Wilson
2017-12-22 18:08           ` Rich Felker
2017-12-22 19:06             ` Nicholas Wilson
2017-12-22 19:27       ` ardi

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/