mailing list of musl libc
* [musl] Care about Symbol Namespacing?
@ 2023-11-14  2:32 Eleanor Bartle
  2023-11-14  3:10 ` Rich Felker
  0 siblings, 1 reply; 9+ messages in thread
From: Eleanor Bartle @ 2023-11-14  2:32 UTC (permalink / raw)
  To: musl

[please cc]

ELF doesn't have a standard equivalent of Mach-O's Two-Level Namespace, but one can be grafted on, as Solaris does with Direct Binding. I've inquired about this on IRC and the objections raised against it concern moving symbols between or coalescing shared objects without breaking dependent binaries. What I'm wondering is, is it worth thinking about a symbol namespacing system that accounts for this? Would the robustness benefits of such a system be worth the specification complexity?

To be clear, I don't have such a proposal on hand, and it would take me a while to get one ready (and a while more to work out all the kinks I'll inevitably miss); I have the ghost of an idea involving components specifying interface names rather than filenames, which ld.so could then map to shared objects potentially non-injectively, but I don't know the fine details of implementation. This message is mainly to gauge if leadership is at all interested in the broad idea, to determine if even thinking about it is worth my time.


* Re: [musl] Care about Symbol Namespacing?
  2023-11-14  2:32 [musl] Care about Symbol Namespacing? Eleanor Bartle
@ 2023-11-14  3:10 ` Rich Felker
  2023-11-14  3:33   ` Eleanor Bartle
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Felker @ 2023-11-14  3:10 UTC (permalink / raw)
  To: Eleanor Bartle; +Cc: musl

On Tue, Nov 14, 2023 at 01:32:02PM +1100, Eleanor Bartle wrote:
> [please cc]
> 
> ELF doesn't have a standard equivalent of Mach-O's Two-Level
> Namespace, but one can be grafted on, as Solaris does with Direct
> Binding. I've inquired about this on IRC and the objections raised
> against it concern moving symbols between or coalescing shared
> objects without breaking dependent binaries. What I'm wondering is,
> is it worth thinking about a symbol namespacing system that accounts
> for this? Would the robustness benefits of such a system be worth
> the specification complexity?
> 
> To be clear, I don't have such a proposal on hand, and it would take
> me a while to get one ready (and a while more to work out all the
> kinks I'll inevitably miss); I have the ghost of an idea involving
> components specifying interface names rather than filenames, which
> ld.so could then map to shared objects potentially non-injectively,
> but I don't know the fine details of implementation. This message is
> mainly to gauge if leadership is at all interested in the broad
> idea, to determine if even thinking about it is worth my time.

The lack of this in ELF was by design, with the intent to give dynamic
linking semantics equivalent to static linking. This is also aligned
with the musl values of treating static linking as first-class (not
having functionality that doesn't work, or behaves wrong/differently,
in static-linked programs). I don't want to see something like this in
ELF, and it's not something I would support adding to musl even if
there were an ELF extension for it.

As you noted, there are also concrete things that would have been
impossible (or at least difficult, and contingent on details) to fix
with such a system, like glibc moving symbols that wrongly ended up in
librt.so or libpthread.so to libc.so.

Rich
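
To make the symbol-move concern concrete, here is a toy Python model, not real loader code. The library contents and the helpers `resolve_flat` and `resolve_bound` are hypothetical names invented for this illustration, and the bound variant assumes a naive scheme in which each reference records the single library expected to define it.

```python
# Toy model: each loaded library is a name plus the set of symbols it defines.
# The contents below are illustrative, not an accurate inventory of glibc.

def resolve_flat(symbol, load_order, libs):
    """Flat lookup: the first library in load order defining the symbol wins."""
    for name in load_order:
        if symbol in libs[name]:
            return name
    return None

def resolve_bound(symbol, bound_lib, libs):
    """Hypothetical naive per-symbol binding: look only in the recorded library."""
    if symbol in libs.get(bound_lib, set()):
        return bound_lib
    return None

# Before the move: pthread_create is defined in libpthread.so.
old_libs = {"libc.so": {"printf"}, "libpthread.so": {"pthread_create"}}
# After the move: libc.so defines it and libpthread.so is an empty stub.
new_libs = {"libc.so": {"printf", "pthread_create"}, "libpthread.so": set()}

order = ["libpthread.so", "libc.so"]

flat_before = resolve_flat("pthread_create", order, old_libs)
flat_after = resolve_flat("pthread_create", order, new_libs)
bound_after = resolve_bound("pthread_create", "libpthread.so", new_libs)

print(flat_before, flat_after, bound_after)
```

In this model a binary linked before the move keeps resolving the symbol under flat lookup, while the naive bound lookup fails unless the scheme adds some kind of forwarding.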


* Re: [musl] Care about Symbol Namespacing?
  2023-11-14  3:10 ` Rich Felker
@ 2023-11-14  3:33   ` Eleanor Bartle
  2023-11-14 15:35     ` Markus Wichmann
  0 siblings, 1 reply; 9+ messages in thread
From: Eleanor Bartle @ 2023-11-14  3:33 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

I see. So to justify such a feature there'd need to be an analogous one for static archives. Yeah, that's...ugly. I can begin to imagine such a mechanism but it twists everything out of shape. Not worth it.


* Re: [musl] Care about Symbol Namespacing?
  2023-11-14  3:33   ` Eleanor Bartle
@ 2023-11-14 15:35     ` Markus Wichmann
  2023-11-15  6:11       ` Eleanor Bartle
  0 siblings, 1 reply; 9+ messages in thread
From: Markus Wichmann @ 2023-11-14 15:35 UTC (permalink / raw)
  To: musl; +Cc: Eleanor Bartle

On Tue, Nov 14, 2023 at 02:33:15PM +1100, Eleanor Bartle wrote:
> I see. So to justify such a feature there'd need to be an analogous
> one for static archives. Yeah, that's...ugly. I can begin to imagine
> such a mechanism but it twists everything out of shape. Not worth it.

Actually, no. The big overarching question is what you would hope to
achieve with that feature. As I understand it, it is essentially what
Windows does with the Import Directory, where you specify for each
symbol what object it comes from.

This would completely break linking semantics as of today. It's not that
it isn't supported with static linking, it's that it would break
existing workflows. Currently, in dynamically linked applications you
can set LD_PRELOAD to overload symbols existing in otherwise loaded
libraries, even libc symbols. This is useful to temporarily run an
application with a different malloc() implementation, for example, or
try out how much vectorized mem* functions would impact the run time.
You can also use these approaches with static linking, but it would
require re-linking each time.

Adding such an extension would break this. Now libc symbols could only
come from libc, and LD_PRELOAD wouldn't work anymore. And for no real
benefit, or at least I can't see one.

Ciao,
Markus
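
The LD_PRELOAD workflow described above can be sketched with a small Python model. This is only an illustration of search-order semantics: `search_order`, `resolve`, and the library `libfastmalloc.so` are hypothetical names, and the model leaves out the executable itself, which a real dynamic linker searches first.

```python
def search_order(needed, preload=()):
    """Preloaded objects are searched before the program's needed libraries."""
    return list(preload) + list(needed)

def resolve(symbol, order, libs):
    """Return the first object in the search order that defines the symbol."""
    for name in order:
        if symbol in libs[name]:
            return name
    return None

libs = {
    "libc.so": {"malloc", "free", "memcpy"},
    "libfastmalloc.so": {"malloc", "free"},  # hypothetical replacement allocator
}

plain = search_order(["libc.so"])
with_preload = search_order(["libc.so"], preload=["libfastmalloc.so"])

default = resolve("malloc", plain, libs)
preloaded = resolve("malloc", with_preload, libs)
untouched = resolve("memcpy", with_preload, libs)

print(default, preloaded, untouched)
```

Only the symbols the preloaded object defines are taken over; everything else still resolves to libc. A scheme that pinned each reference to one named library would need an explicit exception for this case.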


* Re: [musl] Care about Symbol Namespacing?
  2023-11-14 15:35     ` Markus Wichmann
@ 2023-11-15  6:11       ` Eleanor Bartle
  2023-11-15 15:20         ` Rich Felker
  0 siblings, 1 reply; 9+ messages in thread
From: Eleanor Bartle @ 2023-11-15  6:11 UTC (permalink / raw)
  To: Markus Wichmann, musl

That's about it, yes. Though I will point out that Solaris supports LD_PRELOAD just fine -- the preloads just need to be marked as such. For calls between components there's really no way to structurally prevent interposition.

The benefit is faster inter-component symbol lookup, as well as sanity in the face of an _accidental_ name collision. The tradeoff is complexity of specification to support all existing use cases. If the standard were being designed from scratch it might not be too hard to accomplish cleanly; to graft on to an existing model is a nightmare.


* Re: [musl] Care about Symbol Namespacing?
  2023-11-15  6:11       ` Eleanor Bartle
@ 2023-11-15 15:20         ` Rich Felker
  2023-11-28  5:17           ` Fangrui Song
  0 siblings, 1 reply; 9+ messages in thread
From: Rich Felker @ 2023-11-15 15:20 UTC (permalink / raw)
  To: Eleanor Bartle; +Cc: Markus Wichmann, musl

On Wed, Nov 15, 2023 at 05:11:02PM +1100, Eleanor Bartle wrote:
> That's about it, yes. Though I will point out that Solaris supports
> LD_PRELOAD just fine -- the preloads just need to be marked as such.
> For calls between components there's really no way to structurally
> prevent interposition.
> 
> The benefit is faster inter-component symbol lookup, as well as
> sanity in the face of an _accidental_ name collision. The tradeoff
> is complexity of specification to support all existing use cases. If
> the standard were being designed from scratch it might not be too
> hard to accomplish cleanly; to graft on to an existing model is a
> nightmare.

If your intent is just to check for accidental name collision, you can
do this with diagnostic tooling not runtime semantic changes. And this
is what you want to know, and what I mean by static linking being
first-class. Making accidental name collisions transparently work
would make it so things break horribly when someone decides they want
to static link, and you as the author don't realize this because you
never tried static linking. What's better would be running a tool that
basically just does ldd and looks for multiple definitions of the same
symbol, and tells you "you've got something wrong! you need to fix
that!"

Rich
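
A diagnostic of the kind suggested above could look roughly like the following Python sketch. It assumes the exported symbols of each dependency have already been collected into a dictionary (for example from each library's dynamic symbol table); the function `find_collisions` and the library and symbol names are made up for illustration.

```python
from collections import defaultdict

def find_collisions(exports):
    """Map each symbol defined by more than one library to its sorted definers."""
    definers = defaultdict(list)
    for lib, symbols in exports.items():
        for sym in symbols:
            definers[sym].append(lib)
    return {sym: sorted(libs) for sym, libs in definers.items() if len(libs) > 1}

# Hypothetical input: library name -> set of symbols it defines.
exports = {
    "libfoo.so": {"foo_init", "log_message"},
    "libbar.so": {"bar_init", "log_message"},
    "libc.so": {"malloc", "free"},
}

collisions = find_collisions(exports)
for sym, libs in sorted(collisions.items()):
    print(f"duplicate definition of {sym}: {', '.join(libs)}")
```

Because the check runs on the set of dependencies rather than at load time, it reports the same problem whether the program is later linked dynamically or statically.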


* Re: [musl] Care about Symbol Namespacing?
  2023-11-15 15:20         ` Rich Felker
@ 2023-11-28  5:17           ` Fangrui Song
  2023-11-29 13:45             ` Carlos O'Donell
  0 siblings, 1 reply; 9+ messages in thread
From: Fangrui Song @ 2023-11-28  5:17 UTC (permalink / raw)
  To: Eleanor Bartle; +Cc: musl, Markus Wichmann

On Mon, Nov 13, 2023 at 7:02 PM Eleanor Bartle <eleanor@eleanor-nb.com> wrote:
>
> [please cc]
>
> ELF doesn't have a standard equivalent of Mach-O's Two-Level Namespace, but one can be grafted on, as Solaris does with Direct Binding. I've inquired about this on IRC and the objections raised against it concern moving symbols between or coalescing shared objects without breaking dependent binaries. What I'm wondering is, is it worth thinking about a symbol namespacing system that accounts for this? Would the robustness benefits of such a system be worth the specification complexity?

I would be very interested in knowing the speedup of such a system.
If Solaris folks could share performance numbers with direct bindings
disabled, that would give us some idea of how much to expect.

I am also curious how much we can achieve by utilizing -Bsymbolic
family linker options:
https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic

On Wed, Nov 15, 2023 at 7:20 AM Rich Felker <dalias@libc.org> wrote:
>
> On Wed, Nov 15, 2023 at 05:11:02PM +1100, Eleanor Bartle wrote:
> > That's about it, yes. Though I will point out that Solaris supports
> > LD_PRELOAD just fine -- the preloads just need to be marked as such.
> > For calls between components there's really no way to structurally
> > prevent interposition.
> >
> > The benefit is faster inter-component symbol lookup, as well as
> > sanity in the face of an _accidental_ name collision. The tradeoff
> > is complexity of specification to support all existing use cases. If
> > the standard were being designed from scratch it might not be too
> > hard to accomplish cleanly; to graft on to an existing model is a
> > nightmare.
>
> If your intent is just to check for accidental name collision, you can
> do this with diagnostic tooling not runtime semantic changes. And this
> is what you want to know, and what I mean by static linking being
> first-class. Making accidental name collisions transparently work
> would make it so things break horribly when someone decides they want
> to static link, and you as the author don't realize this because you
> never tried static linking. What's better would be running a tool that
> basically just does ldd and looks for multiple definitions of the same
> symbol, and tells you "you've got something wrong! you need to fix
> that!"
>
> Rich

For name collision issues, I am thinking of a one-definition-rule
violation checking feature:
https://maskray.me/blog/2022-11-13-odr-violation-detection#future-direction

---

Mach-O's Two-Level Namespace introduced several linker options to
enable symbol moving from one library to another.
https://blog.darlinghq.org/2018/07/mach-o-linking-and-loading-tricks.html
"To make that possible, Mach-O, ld, and dyld provide a few additional
features, namely, sub-libraries, reexporting symbols, and
meta-symbols."

GNU symbol versioning is actually a system that provides the import
file information: vn_file.
However, glibc rtld does not utilize vn_file to speed up symbol searches.
In addition, quoting
https://maskray.me/blog/2020-11-26-all-about-symbol-versioning#version-script :

> vn_file is essentially ignored for symbol search since glibc 2.30
> (https://sourceware.org/bugzilla/show_bug.cgi?id=24741). Previously
> during relocation resolving, after an object failed to provide a
> match, if it matched vn_file, rtld would report an error `symbol %s
> version %s not defined in file %s with link time reference`.


* Re: [musl] Care about Symbol Namespacing?
  2023-11-28  5:17           ` Fangrui Song
@ 2023-11-29 13:45             ` Carlos O'Donell
  2023-12-03 15:54               ` Fangrui Song
  0 siblings, 1 reply; 9+ messages in thread
From: Carlos O'Donell @ 2023-11-29 13:45 UTC (permalink / raw)
  To: musl, Fangrui Song, Eleanor Bartle; +Cc: Markus Wichmann

On 11/28/23 00:17, Fangrui Song wrote:
> GNU symbol versioning is actually a system that provides the import 
> file information: vn_file. However, glibc rtld does not utilize
> vn_file to speed up symbol searches. In addition,
> 
>> https://maskray.me/blog/2020-11-26-all-about-symbol-versioning#version-script
>> vn_file is essentially ignored for symbol search since glibc 2.30
>> https://sourceware.org/bugzilla/show_bug.cgi?id=24741 . Previously
>> during relocation resolving, after an object failed to provide a
>> match, if it matched vn_file, rtld would report an error `symbol %s
>> version %s not defined in file %s with link time reference`.
 
This change in glibc was intentional. I agree with Rich here that static linking
should be treated as a first class feature and glibc has moved towards ensuring
that dynamic and static linking behaviour is more similar. The exception here is
that in glibc the goal will be to give developers the option to disallow
dlopen() from a statically linked application; thus providing the developer
assurances that nothing else will be loaded (important when crossing namespace
boundaries, particularly mount namespaces).

-- 
Cheers,
Carlos.



* Re: [musl] Care about Symbol Namespacing?
  2023-11-29 13:45             ` Carlos O'Donell
@ 2023-12-03 15:54               ` Fangrui Song
  0 siblings, 0 replies; 9+ messages in thread
From: Fangrui Song @ 2023-12-03 15:54 UTC (permalink / raw)
  To: musl, Carlos O'Donell; +Cc: Eleanor Bartle, Markus Wichmann

On Wed, Nov 29, 2023 at 5:46 AM Carlos O'Donell <carlos@redhat.com> wrote:
>
> On 11/28/23 00:17, Fangrui Song wrote:
> > GNU symbol versioning is actually a system that provides the import
> > file information: vn_file. However, glibc rtld does not utilize
> > vn_file to speed up symbol searches. In addition,
> >
> >> https://maskray.me/blog/2020-11-26-all-about-symbol-versioning#version-script
> >> vn_file is essentially ignored for symbol search since glibc 2.30
> >> https://sourceware.org/bugzilla/show_bug.cgi?id=24741 . Previously
> >> during relocation resolving, after an object failed to provide a
> >> match, if it matched vn_file, rtld would report an error `symbol %s
> >> version %s not defined in file %s with link time reference`.
>
> This change in glibc was intentional.

Yes. I agree that dropping the error is useful for symbol versioning.

> I agree with Rich here that static linking
> should be treated as a first class feature and glibc has moved towards ensuring
> that dynamic and static linking behaviour is more similar.

The similarity between archives and shared objects is a vague concept.
That said, I have tried to figure out the similar parts at
https://maskray.me/blog/2021-05-16-elf-interposition-and-bsymbolic#elf-interposition

* If a dynamic symbol is defined by multiple components, they don't conflict.
* For a symbol lookup (due to a relocation like
R_*_JUMP_SLOT/R_*_GLOB_DAT/absolute relocation/etc), the definition
from the first component wins.
* Definitions from subsequent components are overridden.

We can still add Solaris direct bindings style symbol search while
preserving these properties.

> The exception here is
> that in glibc the goal will be to give developers the option to disallow
> dlopen() from a statically linked application; thus providing the developer
> assurances that nothing else will be loaded (important when crossing namespace
> boundaries, particularly mount namespaces).
>
> --
> Cheers,
> Carlos.
>
