mailing list of musl libc
 help / color / mirror / code / Atom feed
* internal header proposal
@ 2018-09-07 17:23 Rich Felker
  2018-09-08  3:27 ` Rich Felker
  2018-09-09 22:27 ` Alexander Monakov
  0 siblings, 2 replies; 4+ messages in thread
From: Rich Felker @ 2018-09-07 17:23 UTC (permalink / raw)
  To: musl

I'm presently working on moving most or all inline-in-source-file
declarations of internal-use interfaces to header files, so that type
mismatches between points of use and points of declaration can be
caught, and so that I can switch them over to hidden visibility
without having to worry about inconsistent application of visibility,
where the following errors are easy to make:

1. On definition, missing from declaration at site of use: definition
   binds at link time, but caller may generate inefficient code using
   GOT/PLT unnecessarily.

2. On declaration at site of use, missing from definition: depending
   on arch and linker version, linker may produce an error about
   unsatisfiable relocation and refuse to link.

The second is a big problem (regression risk) applying visibility to
internal interfaces, so a good method to preclude it is needed.
Putting the hidden attribute only on the declaration in the headers,
and omitting it everywhere else, should avoid it entirely, and also
avoids the first problem as long as -Wmissing-declarations passes.

Anyway, it turns out we have roughly two distinct types of internal
interfaces:

1. Namespace-safe versions of standard/public interfaces that allow
   parts of one subsystem to be used to implement another in cases
   where the namespace rules would not allow the normal public
   interfaces to be used. This includes things like pthread functions
   used to implement C11 threads or thread-safety in plain-C
   interfaces, __strchrnul, resolv.h functions used in getaddrinfo,
   mman.h functions used in malloc, etc.

2. Interfaces that are private to a particular subsystem. This
   includes things like the timezone functions from __tz.c and related
   files, all the internal stdio and pthread and locale glue, etc.

The reason I've broken them down into these two categories is that the
latter already have appropriate places to declare them: the
corresponding *_impl.h header files (sometimes named differently) for
their subsystems, but the former don't. Putting the former group in
with the latter would just massively balloon the set of source files
that need to include some *_impl.h header, and thereby obscure which
files are really intended/allowed to poke at internals of a subsystem
vs just needing access to namespace-safe public or semi-public
interfaces from that subsystem.

So, we need a new place to declare the first group, and I have two
possible ways to do it:


Option 1: The big fancy header wrapping

Add a new tree of "wrapper headers" for public headers (let's call it
$(srcdir)/src/include), and -I it before the real public ones
($(srcdir)/include). These new headers include their corresponding
public header (../../include/[self].h) then add anything else that's
supposed to be "public within musl". For example sys/mman.h would have
stuff like:

hidden void __vm_wait(void);
hidden void __vm_lock(void);
hidden void __vm_unlock(void);

hidden void *__mmap(void *, size_t, int, int, int, off_t);
hidden int __munmap(void *, size_t);
hidden void *__mremap(void *, size_t, size_t, int, ...);
hidden int __madvise(void *, size_t, int);
hidden int __mprotect(void *, size_t, int);

hidden const unsigned char *__map_file(const char *, size_t *);

Now, every file that needs to use mman.h functions without violating
namespace can just #include <sys/mman.h> and use the above. If we
wanted, at some point we could even #define the unprefixed names to
remap to the prefixed ones, and only #undef them in the files that
define them, so that everything automatically gets the namespace-safe,
low-call-overhead names. This idea is a lot like how
syscall()/__syscall() work now -- the musl source files get programmed
with familiar interfaces, and a small amount of header magic makes
them do the right thing rather than depending on a public namespace
violation.

If this all seems too radical, or like it has potential pitfalls we
need to think about before committing to it, I have a less invasive
proposal too:


Option 2: New namespaced.h header

Introduce a single new header that declares all of the namespace-safe
interfaces across all subsystems, with minimal dependencies on other
headers so that it can be included everywhere it's needed with low
cost. Unfortunately some functions need types exposed, but
<sys/types.h> would probably suffice to get just those without pulling
in lots of other stuff.


I think the second option is actually more invasive to the source
tree, in terms of adding #include lines to files. Option 1 has
slightly more hidden complexity, but leads to simplification of the
source, and the complexity does not significantly detract from
readability of the source, in my opinion.

Thoughts on any of this? So far I've been staging commits moving the
subsystem-private internal declarations to appropriate headers (type 2
above), but doing nothing with the namespace-safe versions of public
interfaces (type 1 above). But I'd like to start on them soon too.
When this is all over, I'll be able to add hidden visibility on all of
these, and most of the efficiency lost in having to drop vis.h (see
commit dc2f368e565c37728b0d620380b849c3a1ddd78f) will be regained.
Dynamic linking performance should also slightly increase.

Rich


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: internal header proposal
  2018-09-07 17:23 internal header proposal Rich Felker
@ 2018-09-08  3:27 ` Rich Felker
  2018-09-09 22:27 ` Alexander Monakov
  1 sibling, 0 replies; 4+ messages in thread
From: Rich Felker @ 2018-09-08  3:27 UTC (permalink / raw)
  To: musl

On Fri, Sep 07, 2018 at 01:23:12PM -0400, Rich Felker wrote:
> Option 1: The big fancy header wrapping
> 
> Add a new tree of "wrapper headers" for public headers (let's call it
> $(srcdir)/src/include), and -I it before the real public ones
> ($(srcdir)/include). These new headers include their corresponding
> public header (../../include/[self].h) then add anything else that's
> supposed to be "public within musl". For example sys/mman.h would have
> stuff like:
> 
> hidden void __vm_wait(void);
> hidden void __vm_lock(void);
> hidden void __vm_unlock(void);
> 
> hidden void *__mmap(void *, size_t, int, int, int, off_t);
> hidden int __munmap(void *, size_t);
> hidden void *__mremap(void *, size_t, size_t, int, ...);
> hidden int __madvise(void *, size_t, int);
> hidden int __mprotect(void *, size_t, int);
> 
> hidden const unsigned char *__map_file(const char *, size_t *);
> 
> Now, every file that needs to use mman.h functions without violating
> namespace can just #include <sys/mman.h> and use the above. If we
> wanted, at some point we could even #define the unprefixed names to
> remap to the prefixed ones, and only #undef them in the files that
> define them, so that everything automatically gets the namespace-safe,
> low-call-overhead names. This idea is a lot like how
> syscall()/__syscall() work now -- the musl source files get programmed
> with familiar interfaces, and a small amount of header magic makes
> them do the right thing rather than depending on a public namespace
> violation.
> 
> If this all seems too radical, or like it has potential pitfalls we
> need to think about before committing to it, I have a less invasive
> proposal too:

I have this option implemented and it's working out really well, with
just the following headers:

src/include/arpa/inet.h
src/include/langinfo.h
src/include/pthread.h
src/include/resolv.h
src/include/signal.h
src/include/stdlib.h
src/include/string.h
src/include/sys/mman.h
src/include/time.h
src/include/unistd.h

This list tells a lot about what parts (how little) of libc are
useful/necessary for implementing other parts.

So far I've dropped the number of inline-in-source declarations down
from over 160 to 41, and most of the ones left are either ABI/linking
glue stuff, or internal interfaces with a single consumer. Nothing
left is purely a namespace-protected version of a public function.

I'll wrap up the rest soon and get all this ready to push. Already
found and fixed a few small bugs in the process. :)

Rich


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: internal header proposal
  2018-09-07 17:23 internal header proposal Rich Felker
  2018-09-08  3:27 ` Rich Felker
@ 2018-09-09 22:27 ` Alexander Monakov
  2018-09-10  0:47   ` Rich Felker
  1 sibling, 1 reply; 4+ messages in thread
From: Alexander Monakov @ 2018-09-09 22:27 UTC (permalink / raw)
  To: musl

I've written the following response a couple of days ago (before seeing your
more recent follow-up), but it bounced due to disk space issues.

On Fri, 7 Sep 2018, Rich Felker wrote:
> Option 1: The big fancy header wrapping

This looks nice to me, except ...

> If we
> wanted, at some point we could even #define the unprefixed names to
> remap to the prefixed ones, and only #undef them in the files that
> define them, so that everything automatically gets the namespace-safe,
> low-call-overhead names.

... for this: I have reservations about such #defines. Preprocessing
works on a wrong level to have such redirections as general policy (e.g.
if redirections are done via function-like macros, their effect won't
apply when taking the address). It creates an unnecessary distraction
for people inspecting generated asm ("hm, where did this __ prefix came
from?"), and I can imagine it causing problems for debugging (e.g. if
one wanted to breakpoint on strchrnul in libc built with LTO, they'd
be surprised to see their breakpoint didn't hit and they needed to
breakpoint on __strchrnul instead).

> Option 2: New namespaced.h header

Just want to remark that internal/libc.h already hosts a few such
declarations, so it might make sense to either grow the kitchen sink
further or aim to dismantle it altogether.

Thank you for writing this up!

Alexander


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: internal header proposal
  2018-09-09 22:27 ` Alexander Monakov
@ 2018-09-10  0:47   ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2018-09-10  0:47 UTC (permalink / raw)
  To: musl

On Mon, Sep 10, 2018 at 01:27:20AM +0300, Alexander Monakov wrote:
> I've written the following response a couple of days ago (before seeing your
> more recent follow-up), but it bounced due to disk space issues.
> 
> On Fri, 7 Sep 2018, Rich Felker wrote:
> > Option 1: The big fancy header wrapping
> 
> This looks nice to me, except ...
> 
> > If we
> > wanted, at some point we could even #define the unprefixed names to
> > remap to the prefixed ones, and only #undef them in the files that
> > define them, so that everything automatically gets the namespace-safe,
> > low-call-overhead names.
> 
> .... for this: I have reservations about such #defines. Preprocessing

Yes, me too. It's not something I want to do now ("at some point we
could even...") and I think there are probably good reasons not to do
it at all.

On the other hand I would potentially like to #define memcpy, memset,
etc. back to __builtin_* in string.h, just undefining them in their
respective source files, because -ffreestanding is gratuitously
pessimizing a lot of code using them. In some cases just using struct
assignments would solve the problem, but not all.

> works on a wrong level to have such redirections as general policy (e.g.
> if redirections are done via function-like macros, their effect won't
> apply when taking the address). It creates an unnecessary distraction

I had object-like macros in mind; there's no need for them to be
function-like.

> for people inspecting generated asm ("hm, where did this __ prefix came
> from?"), and I can imagine it causing problems for debugging (e.g. if
> one wanted to breakpoint on strchrnul in libc built with LTO, they'd
> be surprised to see their breakpoint didn't hit and they needed to
> breakpoint on __strchrnul instead).

Actually in this case it'd work because they're just aliases.

> > Option 2: New namespaced.h header
> 
> Just want to remark that internal/libc.h already hosts a few such
> declarations, so it might make sense to either grow the kitchen sink
> further or aim to dismantle it altogether.
> 
> Thank you for writing this up!

Yay, glad someone appreciates it! Sometimes I write up proposals or
documentation of development direction and get no response, and sure
it's nice to have them to look back to, but it's much nicer to have
someone reading and responding.

Rich


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2018-09-10  0:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-07 17:23 internal header proposal Rich Felker
2018-09-08  3:27 ` Rich Felker
2018-09-09 22:27 ` Alexander Monakov
2018-09-10  0:47   ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).