mailing list of musl libc
* About those weak aliases
@ 2019-09-02 19:04 Markus Wichmann
  2019-09-02 20:10 ` Szabolcs Nagy
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Wichmann @ 2019-09-02 19:04 UTC (permalink / raw)
  To: musl

Hi all,

I'd like to know what those weak aliases are for in the many cases where
they are used to define a public interface. Or, more to the point, by
what criteria they are handed out, and by what logic the internal
symbols are used.

For instance, pthread_mutex_lock() et al. are weakly defined, but
pthread_cond_wait() is not. Unlike pthread_cond_timedwait(), which is
called from pthread_cond_wait() by the public symbol that might be
interposed. Makes sense, since pthread_cond_wait() does not depend on
mutex internals (pthread_cond_timedwait() does).

I found no C standard function with a weak definition. But I did find
crypt() being strongly defined, but it calls the internal (strong)
definition of crypt_r(), rather than the weak one.

So I thought maybe the C standard functions get strong definitions and
all others get weak ones. But open(), close(), etc. are also defined
strongly, while fdopen() gets a weak definition. And those are all in
POSIX. Meanwhile, adjtime() gets a strong definition, as does
getdents(), and those are Linux specialities.

So yeah, I have so far failed to identify any rhyme or reason to these
definitions. Can anyone help me?

Ciao,
Markus


* Re: About those weak aliases
  2019-09-02 19:04 About those weak aliases Markus Wichmann
@ 2019-09-02 20:10 ` Szabolcs Nagy
  2019-09-02 23:01   ` Rich Felker
  2019-09-05 16:50   ` Markus Wichmann
  0 siblings, 2 replies; 9+ messages in thread
From: Szabolcs Nagy @ 2019-09-02 20:10 UTC (permalink / raw)
  To: musl

* Markus Wichmann <nullplan@gmx.net> [2019-09-02 21:04:48 +0200]:
> I'd like to know what those weak aliases are for in the many cases where
> they are used to define a public interface. Or, more to the point, by
> what criteria they are handed out, and by what logic the internal
> symbols are used.
> 
> For instance, pthread_mutex_lock() et al. are weakly defined, but

it's a weak alias for __pthread_mutex_lock which can be used
to implement iso c apis (where pthread* is not reserved and
thus may conflict with user defined symbols)

__pthread_mutex_lock is not used internally right now, but
e.g. __pthread_mutex_timedlock is.

(could be a strong alias, weakness of public api symbols
doesn't matter, you can only observe the difference by
getting a link error when static linking a conflicting
definition, but that is non-standard: when the symbol is
reserved for the implementation user code must not use it)

so following namespace rules for static linking is one
reason for aliases. and musl only uses weak aliases.
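
a minimal sketch of the pattern, assuming gcc or clang on an ELF
target. the macro is modelled on the kind of helper musl uses, but its
exact spelling here is an assumption. impl_mutex_lock stands in for
__pthread_mutex_lock and demo_mutex_lock for pthread_mutex_lock; both
names and struct demo_mutex are hypothetical.

```c
/* Declare `new` as a weak alias of `old`, which must be defined in this
 * translation unit. Relies on GCC/Clang attributes and an ELF toolchain. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical stand-in for a mutex object. */
struct demo_mutex { int locked; };

/* Strong definition under the internal name. Library code that has to
 * stay namespace-clean calls this name, never the public one. */
int impl_mutex_lock(struct demo_mutex *m)
{
	if (m->locked) return 1;   /* already held: report busy */
	m->locked = 1;
	return 0;
}

/* Public name, weak. A program that defines its own strong
 * demo_mutex_lock links without a duplicate-definition error. */
weak_alias(impl_mutex_lock, demo_mutex_lock);
```

both names resolve to the same code unless something else in the link
provides a strong demo_mutex_lock.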

there are other uses of weak symbols, there was a patch
that tried to categorize them:

https://www.openwall.com/lists/musl/2013/02/15/1

> pthread_cond_wait() is not. Unlike pthread_cond_timedwait(), which is
> called from pthread_cond_wait() by the public symbol that might be
> interposed. Makes sense, since pthread_cond_wait() does not depend on
> mutex internals (pthread_cond_timedwait() does).
> 
> I found no C standard function with a weak definition. But I did find
> crypt() being strongly defined, but it calls the internal (strong)
> definition of crypt_r(), rather than the weak one.
> 
> So I thought maybe the C standard functions get strong definitions and
> all others get weak ones. But open(), close(), etc. are also defined
> strongly, while fdopen() gets a weak definition. And those are all in
> POSIX. Meanwhile, adjtime() gets a strong definition, as does
> getdents(), and those are Linux specialities.
> 
> So yeah, I have so far failed to identify any rhyme or reason to these
> definitions. Can anyone help me?
> 
> Ciao,
> Markus


* Re: About those weak aliases
  2019-09-02 20:10 ` Szabolcs Nagy
@ 2019-09-02 23:01   ` Rich Felker
  2019-09-03 10:13     ` Szabolcs Nagy
  2019-09-05 16:50   ` Markus Wichmann
  1 sibling, 1 reply; 9+ messages in thread
From: Rich Felker @ 2019-09-02 23:01 UTC (permalink / raw)
  To: musl

On Mon, Sep 02, 2019 at 10:10:10PM +0200, Szabolcs Nagy wrote:
> * Markus Wichmann <nullplan@gmx.net> [2019-09-02 21:04:48 +0200]:
> > I'd like to know what those weak aliases are for in the many cases where
> > they are used to define a public interface. Or, more to the point, by
> > what criteria they are handed out, and by what logic the internal
> > symbols are used.
> > 
> > For instance, pthread_mutex_lock() et al. are weakly defined, but
> 
> it's a weak alias for __pthread_mutex_lock which can be used
> to implement iso c apis (where pthread* is not reserved and
> thus may conflict with user defined symbols)
> 
> __pthread_mutex_lock is not used internally right now, but
> e.g. __pthread_mutex_timedlock is.

Indeed, it looks like commit df7d0dfb9c686df31149d09008ba92834bed9803
added it with an expectation that C11 threads would use it, but
instead mtx_lock just calls mtx_timedlock with a null timeout. Having
it around may be useful at some point though so I don't think it makes
sense to add noise removing it and possibly adding it back later.

> (could be a strong alias, weakness of public api symbols
> doesn't matter, you can only observe the difference by
> getting a link error when static linking a conflicting
> definition, but that is non-standard: when the symbol is
> reserved for the implementation user code must not use it)

I don't follow here. There are very few if any places where strong
alias would be a valid substitute for weak. Where weak aliases provide
dummy implementations of functionality that's only needed if something
else is linked, strong would be a link error if both were linked.
Where weak aliases are used because the identifier being defined is
reserved to the application in some or all standard profiles, a strong
alias would produce a link error if the application actually made use
of its reservation and the file defining the alias got linked (and the
whole point is that this can and does happen).
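
The first kind (a dummy that only matters if something else is linked)
might look roughly like the sketch below. It assumes gcc or clang on an
ELF target; dummy_flush, demo_flush_all and demo_exit_path are
hypothetical names, not symbols from musl.

```c
/* Weak alias helper; relies on GCC/Clang attributes and ELF. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Do-nothing fallback, private to this file. */
static int dummy_flush(void)
{
	return 0;
}

/* demo_flush_all resolves to dummy_flush unless another linked object
 * provides a strong demo_flush_all. With a strong alias here, linking
 * both objects would be a duplicate-definition error. */
weak_alias(dummy_flush, demo_flush_all);

/* The caller does not need to know whether the real one was linked. */
int demo_exit_path(void)
{
	return demo_flush_all();
}
```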

Rich


* Re: About those weak aliases
  2019-09-02 23:01   ` Rich Felker
@ 2019-09-03 10:13     ` Szabolcs Nagy
  2019-09-03 12:08       ` Rich Felker
  0 siblings, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2019-09-03 10:13 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2019-09-02 19:01:18 -0400]:
> On Mon, Sep 02, 2019 at 10:10:10PM +0200, Szabolcs Nagy wrote:
> > (could be a strong alias, weakness of public api symbols
> > doesn't matter, you can only observe the difference by
> > getting a link error when static linking a conflicting
> > definition, but that is non-standard: when the symbol is
> > reserved for the implementation user code must not use it)
> 
> I don't follow here. There are very few if any places where strong
> alias would be a valid substitute for weak. Where weak aliases provide
> dummy implementations of functionality that's only needed if something
> else is linked, strong would be a link error if both were linked.
> Where weak aliases are used because the identifier being defined is
> reserved to the application in some or all standard profiles, a strong
> alias would produce a link error if the application actually made use
> of its reservation and the file defining the alias got linked (and the
> whole point is that this can and does happen).

you are right. sorry


* Re: About those weak aliases
  2019-09-03 10:13     ` Szabolcs Nagy
@ 2019-09-03 12:08       ` Rich Felker
  0 siblings, 0 replies; 9+ messages in thread
From: Rich Felker @ 2019-09-03 12:08 UTC (permalink / raw)
  To: musl

On Tue, Sep 03, 2019 at 12:13:39PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2019-09-02 19:01:18 -0400]:
> > On Mon, Sep 02, 2019 at 10:10:10PM +0200, Szabolcs Nagy wrote:
> > > (could be a strong alias, weakness of public api symbols
> > > doesn't matter, you can only observe the difference by
> > > getting a link error when static linking a conflicting
> > > definition, but that is non-standard: when the symbol is
> > > reserved for the implementation user code must not use it)
> > 
> > I don't follow here. There are very few if any places where strong
> > alias would be a valid substitute for weak. Where weak aliases provide
> > dummy implementations of functionality that's only needed if something
> > else is linked, strong would be a link error if both were linked.
> > Where weak aliases are used because the identifier being defined is
> > reserved to the application in some or all standard profiles, a strong
> > alias would produce a link error if the application actually made use
> > of its reservation and the file defining the alias got linked (and the
> > whole point is that this can and does happen).
> 
> you are right. sorry

No problem. It's informative for uncovering if there are such cases
and what they're for.

I think the only place strong aliases would be okay is when the alias
is providing a public interface and it's in the same namespace (or a
more restrictive implementation namespace) as the symbol it's
aliasing. These are mostly glibc ABI-compat symbols where the glibc
ABI had __-prefixed versions of some public function as a public ABI
(e.g. the __-prefixed versions of the 'xlocale' functions,
__isoc99_*scanf, __getdelim; maybe nothing else).
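
As a rough sketch of that case, assuming gcc or clang on an ELF target:
strong_alias is a hypothetical helper (not a musl macro),
demo_count_digits stands in for a public function, and
demo_abi_count_digits for its ABI-compat twin, which in a real libc
would carry a __ prefix.

```c
/* Strong alias helper: same as a weak alias but without __weak__.
 * Hypothetical; relies on GCC/Clang attributes and ELF. */
#define strong_alias(old, new) \
	extern __typeof(old) new __attribute__((__alias__(#old)))

/* Public function. */
int demo_count_digits(const char *s)
{
	int n = 0;
	for (; *s; s++)
		if (*s >= '0' && *s <= '9') n++;
	return n;
}

/* Compat name in the same (or a more restrictive) namespace. Nobody
 * who may define the first name may define this one, so a strong
 * definition cannot clash with valid application code. */
strong_alias(demo_count_digits, demo_abi_count_digits);
```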

Rich


* Re: About those weak aliases
  2019-09-02 20:10 ` Szabolcs Nagy
  2019-09-02 23:01   ` Rich Felker
@ 2019-09-05 16:50   ` Markus Wichmann
  2019-09-05 16:58     ` Szabolcs Nagy
  1 sibling, 1 reply; 9+ messages in thread
From: Markus Wichmann @ 2019-09-05 16:50 UTC (permalink / raw)
  To: musl

On Mon, Sep 02, 2019 at 10:10:10PM +0200, Szabolcs Nagy wrote:
> * Markus Wichmann <nullplan@gmx.net> [2019-09-02 21:04:48 +0200]:
> > I'd like to know what those weak aliases are for in the many cases where
> > they are used to define a public interface. Or, more to the point, by
> > what criteria they are handed out, and by what logic the internal
> > symbols are used.
> >
> > For instance, pthread_mutex_lock() et al. are weakly defined, but
>
> it's a weak alias for __pthread_mutex_lock which can be used
> to implement iso c apis (where pthread* is not reserved and
> thus may conflict with user defined symbols)
>

Yes, namespacing, I thought so. But this style is not used consistently.
For example, open() does not go that route, even though the name is not
reserved in ISO 9899.

The other issue is, if two versions of a symbol exist, which one is
referenced internally. It seems musl mostly tries to use the internal
(strong) symbol, but not always. mmap() has the same mechanism in use,
but the dynamic linker references the weak version.

> there are other uses of weak symbols, there was a patch
> that tried to categorize them:
>
> https://www.openwall.com/lists/musl/2013/02/15/1
>

That thread talks about pretty much every use of weak aliases except the
type at issue here.

Ciao,
Markus


* Re: About those weak aliases
  2019-09-05 16:50   ` Markus Wichmann
@ 2019-09-05 16:58     ` Szabolcs Nagy
  2019-09-05 17:29       ` Markus Wichmann
  0 siblings, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2019-09-05 16:58 UTC (permalink / raw)
  To: musl

* Markus Wichmann <nullplan@gmx.net> [2019-09-05 18:50:08 +0200]:

> On Mon, Sep 02, 2019 at 10:10:10PM +0200, Szabolcs Nagy wrote:
> > * Markus Wichmann <nullplan@gmx.net> [2019-09-02 21:04:48 +0200]:
> > > I'd like to know what those weak aliases are for in the many cases where
> > > they are used to define a public interface. Or, more to the point, by
> > > what criteria they are handed out, and by what logic the internal
> > > symbols are used.
> > >
> > > For instance, pthread_mutex_lock() et al. are weakly defined, but
> >
> > it's a weak alias for __pthread_mutex_lock which can be used
> > to implement iso c apis (where pthread* is not reserved and
> > thus may conflict with user defined symbols)
> >
> 
> Yes, namespacing, I thought so. But this style is not used consistently.
> For example, open() does not go that route, even though the name is not
> reserved in ISO 9899.

can you show an example use of open in musl code
where it is called from an api implementation
that is defined by iso c?

> 
> The other issue is, if two versions of a symbol exist, which one is
> referenced internally. It seems musl mostly tries to use the internal
> (strong) symbol, but not always. mmap() has the same mechanism in use,
> but the dynamic linker references the weak version.

since it is for namespacing, which one is used
is determined by the namespace rules.

for the dynamic linker it does not matter which
one is used, unless that code can get static
linked into an executable (dlstart.c or in the
future if dlopen is supported with static linking),
then the namespace clean variant (__mmap) must
be used.

> 
> > there are other uses of weak symbols, there was a patch
> > that tried to categorize them:
> >
> > https://www.openwall.com/lists/musl/2013/02/15/1
> >
> 
> That thread talks about pretty much every use of weak aliases except the
> type at issue here.
> 
> Ciao,
> Markus


* Re: About those weak aliases
  2019-09-05 16:58     ` Szabolcs Nagy
@ 2019-09-05 17:29       ` Markus Wichmann
  2019-09-05 18:18         ` Rich Felker
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Wichmann @ 2019-09-05 17:29 UTC (permalink / raw)
  To: musl

On Thu, Sep 05, 2019 at 06:58:22PM +0200, Szabolcs Nagy wrote:
> can you show an example use of open in musl code
> where it is called from an api implementation
> that is defined by iso c?
>

No, I can't. And I think I understand now.

musl is trying to prevent linker errors from namespace pollution. More
specifically, to prevent double definition errors. Such an error would
happen during static linking, if a strong symbol from an unrelated
standard were pulled in. To that end, weak aliases are handed out on an
as-needed basis. open() is not needed to implement any interface from a
standard it is not a part of (fopen() inlines the syscall), so it gets
no alias. mmap() is needed to implement malloc(), so it gets one. Repeat
for all other functions.
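
To make sure I have the shape right, here is a toy version of the
mmap()/malloc() case. It is only a sketch, assuming gcc or clang on an
ELF target: impl_map_pages stands in for __mmap, demo_map_pages for
mmap, and demo_alloc for malloc. All three names, and the fixed pool
that replaces the real syscall, are made up, and alignment is ignored.

```c
#include <stddef.h>

/* Weak alias helper; relies on GCC/Clang attributes and ELF. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Toy backing store standing in for memory obtained from the kernel. */
static unsigned char demo_pool[4096];
static size_t demo_pool_used;

/* Internal, namespace-clean name (the role __mmap plays). */
void *impl_map_pages(size_t len)
{
	if (len > sizeof demo_pool - demo_pool_used) return NULL;
	void *p = demo_pool + demo_pool_used;
	demo_pool_used += len;
	return p;
}

/* Public name from the "other" standard, weak so that a program which
 * never asked for that standard can define the name itself. */
weak_alias(impl_map_pages, demo_map_pages);

/* Interface from the base standard (the role malloc plays). It calls
 * the internal name, so a user-defined demo_map_pages cannot hijack it. */
void *demo_alloc(size_t n)
{
	return impl_map_pages(n);
}
```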

How close am I?

Ciao,
Markus


* Re: About those weak aliases
  2019-09-05 17:29       ` Markus Wichmann
@ 2019-09-05 18:18         ` Rich Felker
  0 siblings, 0 replies; 9+ messages in thread
From: Rich Felker @ 2019-09-05 18:18 UTC (permalink / raw)
  To: musl

On Thu, Sep 05, 2019 at 07:29:04PM +0200, Markus Wichmann wrote:
> On Thu, Sep 05, 2019 at 06:58:22PM +0200, Szabolcs Nagy wrote:
> > can you show an example use of open in musl code
> > where it is called from an api implementation
> > that is defined by iso c?
> >
> 
> No, I can't. And I think I understand now.
> 
> musl is trying to prevent linker errors from namespace pollution. More
> specifically, to prevent double definition errors. Such an error would
> happen during static linking, if a strong symbol from an unrelated
> standard were pulled in. To that end, weak aliases are handed out on an
> as-needed basis. open() is not needed to implement any interface from a
> standard it is not a part of (fopen() inlines the syscall), so it gets
> no alias. mmap() is needed to implement malloc(), so it gets one. Repeat
> for all other functions.
> 
> How close am I?

Correct, but not complete. It's a matter of preventing not only
diagnosable linker errors, but also calls to the *wrong function*, in
the case of both static and dynamic linking.

If for example fopen called open but the application defined open,
that would not be a linker error. For static linking, the file
defining open from libc.a (open.o) simply wouldn't get pulled into the
link since there would be no outstanding unresolved reference to open;
another .o file already defined it. For dynamic linking, semantically
the same thing would happen -- ELF semantics duplicate static linking
semantics, requiring that functions from shared libraries be
interposable. Now, musl uses --dynamic-list (previously
-Bsymbolic-functions) which binds these symbols at libc.so link-time,
but that's just an optimization, relying on the fact that redefining
them would be UB so we can assume it doesn't happen. It's not required
and won't be done if configure determines that the tooling doesn't
support it.
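
A small probe for the "wrong function" case, written as an ISO C
program that uses its right to the name open. This is a sketch that
assumes a hosted system where /dev/null may or may not exist;
app_open_calls and count_open_calls_during_fopen are hypothetical
names. On a libc that keeps fopen free of references to the public
open symbol, the counter should stay at zero. If fopen did call open
through the public name, this definition would be called instead and
the counter would go up.

```c
#include <stdio.h>

/* Number of times the application's own open() was entered. */
static int app_open_calls;

/* Application-defined function that happens to be called open. ISO C
 * does not reserve this name, so a strictly conforming program may
 * define it. The signature matches POSIX in case a header declares it. */
int open(const char *path, int flags, ...)
{
	(void)path;
	(void)flags;
	app_open_calls++;
	return -1;
}

/* Call fopen and report how often our open() ran as a side effect. */
int count_open_calls_during_fopen(void)
{
	FILE *f = fopen("/dev/null", "r");
	if (f) fclose(f);
	return app_open_calls;
}
```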

The case where a link error would come up is rarer, and happens when
the reserved symbol is defined in the same file or pulled in by
something else. For example, if you use mtx_lock, that will pull in
__pthread_mutex_timedlock. If pthread_mutex_timedlock were also
defined in the same file and were not weak, this could be a duplicate
definition error at link time when the main program defines its own
pthread_mutex_timedlock.

One example where they appear in the same file is pthread_join with
the pthread_*join_np functions in the same file -- if they were
strong definitions rather than just weak aliases that would produce
link errors on certain valid usage.
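
A sketch of that same-file layout, assuming gcc or clang on an ELF
target. It is loosely modelled on the description above rather than
copied from musl: the impl_* names stand in for the __-prefixed
internals, the demo_* names for the public ones, and struct
demo_thread is a made-up placeholder that never actually blocks.

```c
/* Weak alias helper; relies on GCC/Clang attributes and ELF. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical thread record. */
struct demo_thread { int done; int result; };

/* Core routine. The timeout is ignored in this sketch. */
static int impl_timedjoin(struct demo_thread *t, int *res, int timeout_ms)
{
	(void)timeout_ms;
	if (!t->done) return 1;        /* not finished yet */
	if (res) *res = t->result;
	return 0;
}

/* Standard interface, built on the core routine. */
int impl_join(struct demo_thread *t, int *res)
{
	return impl_timedjoin(t, res, -1);
}

/* Nonstandard extension, also built on the core routine. */
static int impl_tryjoin(struct demo_thread *t, int *res)
{
	return impl_timedjoin(t, res, 0);
}

/* All public names live in this one file. A program that uses only
 * demo_join pulls in the whole object; because the extension names are
 * weak, its own demo_tryjoin_np would not be a duplicate definition. */
weak_alias(impl_tryjoin, demo_tryjoin_np);
weak_alias(impl_timedjoin, demo_timedjoin_np);
weak_alias(impl_join, demo_join);
```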

Rich

