mailing list of musl libc
* [musl] friendly errors for ABI mismatch
@ 2020-07-27 15:27 Ariadne Conill
  2020-07-27 16:03 ` Rich Felker
  2020-07-27 21:16 ` Florian Weimer
  0 siblings, 2 replies; 8+ messages in thread
From: Ariadne Conill @ 2020-07-27 15:27 UTC (permalink / raw)
  To: musl

Hello,

On 32-bit systems, musl 1.2 has a new ABI (due to time64).  This results in 
programs built against musl 1.2 failing to run against musl 1.1.  That part is 
fine, but you get an error message about being unable to relocate symbols, 
which is not really insightful if you don't know about the ABI break.

glibc, on the other hand, has a minimum version specified in every binary, and 
prints an error message saying the glibc is too old if this situation is 
encountered.

I think we should add this feature to musl, so that in the future if we have 
another ABI break, users will be given useful advice about how to fix it.  Due 
to the relocation error message, a few Alpine contributors have been tripped 
up while trying to debug their work...

Ariadne




* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 15:27 [musl] friendly errors for ABI mismatch Ariadne Conill
@ 2020-07-27 16:03 ` Rich Felker
  2020-07-27 20:54   ` A. Wilcox
  2020-07-27 20:57   ` Ariadne Conill
  2020-07-27 21:16 ` Florian Weimer
  1 sibling, 2 replies; 8+ messages in thread
From: Rich Felker @ 2020-07-27 16:03 UTC (permalink / raw)
  To: musl

On Mon, Jul 27, 2020 at 09:27:28AM -0600, Ariadne Conill wrote:
> Hello,
> 
> On 32-bit systems, musl 1.2 has a new ABI (due to time64).  This results in 
> programs built against musl 1.2 failing to run against musl 1.1.  That part is 
> fine, but you get an error message about being unable to relocate symbols, 
> which is not really insightful if you don't know about the ABI break.
> 
> glibc, on the other hand, has a minimum version specified in every binary, and 
> prints an error message saying the glibc is too old if this situation is 
> encountered.
> 
> I think we should add this feature to musl, so that in the future if we have 
> another ABI break, users will be given useful advice about how to fix it.  Due 
> to the relocation error message, a few Alpine contributors have been tripped 
> up while trying to debug their work...

What you're seeing here is just a special case of the general property
that, if you've linked to a version of libc (or any library) that has
a new symbol and attempt to run with an older version, you'll get a
missing symbol error. It's very intentional (see libc comparison and
"forward compatibility") that we don't encode "minimum version number"
required anywhere. If you attempt to run with a library that has all
the symbols, it will run, subject to any bugs in the library version
you have and any functionality that returns with failure because it's
not supported in the version you have, etc.
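
To make the property above concrete, here is a minimal sketch in C. It
is a toy model, not musl's dynamic linker: a "library" is reduced to
the list of names it exports, and all function and array names are
hypothetical. "__clock_gettime64" merely stands in for a time64-era
symbol. Note that no version number is consulted anywhere.

```c
#include <stddef.h>
#include <string.h>

/* Toy model: a "library" is just the list of symbol names it exports.
 * No version number is stored or consulted anywhere. */
static int exports_symbol(const char *const *exported, size_t n,
                          const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(exported[i], name) == 0)
            return 1;
    return 0;
}

/* Returns the first symbol the program needs that the library lacks,
 * or NULL if every needed symbol is present (the program can load). */
const char *first_missing_symbol(const char *const *needed, size_t n_needed,
                                 const char *const *exported,
                                 size_t n_exported)
{
    for (size_t i = 0; i < n_needed; i++)
        if (!exports_symbol(exported, n_exported, needed[i]))
            return needed[i];
    return NULL;
}

/* Hypothetical export lists and program requirements. */
const char *const old_libc[] = { "printf", "clock_gettime" };
const char *const new_libc[] = { "printf", "clock_gettime",
                                 "__clock_gettime64" };
const char *const prog_time64[] = { "printf", "__clock_gettime64" };
const char *const prog_plain[]  = { "printf" };
```

In this model a program built against the new library still loads on
the old one as long as it only needs names the old one exports
(prog_plain); only prog_time64 fails, and the failure can name nothing
more specific than the missing symbol.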

There is no way to give a more high-level reason for the runtime link
failure like "your program needs time64 and you're running with an old
musl" because the code reporting the error *is the old musl* that's
not aware of whatever it is that the new binary is missing. Maybe you
have something else in mind that I don't fully understand, but
whatever it is it would only address future missing symbol errors, not
the ones you're seeing right now.

With regard to Alpine, I get this kind of error *all the time* with all
sorts of non-libc libraries while using edge. It's a consequence of
the package management system not encoding a dependency on the version
of the library package it was built against, which can be a good thing
(see "forward compatibility" above -- it avoids the need to
unnecessarily upgrade a library package just because the system the
dependent package was built on happened to have a newer version
installed) but inherently leads to this sort of issue. Even if we
could "fix" it in libc somehow, it would still happen all over the
place with other libraries.

Rich


* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 16:03 ` Rich Felker
@ 2020-07-27 20:54   ` A. Wilcox
  2020-07-27 20:57   ` Ariadne Conill
  1 sibling, 0 replies; 8+ messages in thread
From: A. Wilcox @ 2020-07-27 20:54 UTC (permalink / raw)
  To: musl


On 27/07/2020 11:03, Rich Felker wrote:
> There is no way to give a more high-level reason for the runtime link
> failure like "your program needs time64 and you're running with an old
> musl" because the code reporting the error *is the old musl* that's
> not aware of whatever it is that the new binary is missing. Maybe you
> have something else in mind that I don't fully understand, but
> whatever it is it would only address future missing symbol errors, not
> the ones you're seeing right now.


I think the request here is to have a "minimum musl version" encoded in
the binary, so that the error would say "Sorry, this binary requires a
newer musl version than you have."

This is similar to the Win32 SUBSYSTEM property in PE.  "The specified
program requires a newer version of Windows."

The problem is that this would lead to the same issue that prevents musl
from defining a compiler macro with its number (#define __MUSL__
0x010201 or such) - those that backport patches and/or features to older
versions would necessarily be reporting a version number that is older
than the patch/feature.

Therefore, I don't see this working out for musl for the same reason.
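
A small sketch in C of the backport problem described above. The
structure, the macro and the version encoding are hypothetical (musl
defines no version macro, which is the point); the encoding follows the
0x010201 style used in the example above.

```c
#include <stdbool.h>

/* Hypothetical description of a libc build. */
struct libc_build {
    unsigned version;   /* e.g. 0x010200 for "1.2.0" */
    bool has_feature;   /* whether the needed feature is actually present */
};

#define REQUIRED_VERSION 0x010200u

/* Check by version number: rejects anything older than 1.2.0. */
bool ok_by_version(const struct libc_build *b)
{
    return b->version >= REQUIRED_VERSION;
}

/* Check by what the program actually needs. */
bool ok_by_feature(const struct libc_build *b)
{
    return b->has_feature;
}

/* A hypothetical distribution build: 1.1.24 with the feature backported. */
const struct libc_build backported = { 0x010118u, true };
```

For the backported build, ok_by_version() says no while ok_by_feature()
says yes: the version number under-reports what the library can do.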

The bug here is trying to run musl 1.2 programs on musl 1.1 at all; this
shouldn't even be possible.  I've certainly never hit this, even while
upgrading Adélie beta4 systems to RC1 on 32-bit computers.

Best,
--arw

-- 
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org




* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 16:03 ` Rich Felker
  2020-07-27 20:54   ` A. Wilcox
@ 2020-07-27 20:57   ` Ariadne Conill
  2020-07-27 21:50     ` Rich Felker
  1 sibling, 1 reply; 8+ messages in thread
From: Ariadne Conill @ 2020-07-27 20:57 UTC (permalink / raw)
  To: musl

Hello,

On Monday, July 27, 2020 10:03:30 AM MDT Rich Felker wrote:
> What you're seeing here is just a special case of the general property
> that, if you've linked to a version of libc (or any library) that has
> a new symbol and attempt to run with an older version, you'll get a
> missing symbol error. It's very intentional (see libc comparison and
> "forward compatibility") that we don't encode "minimum version number"
> required anywhere. If you attempt to run with a library that has all
> the symbols, it will run, subject to any bugs in the library version
> you have and any functionality that returns with failure because it's
> not supported in the version you have, etc.
> 
> There is no way to give a more high-level reason for the runtime link
> failure like "your program needs time64 and you're running with an old
> musl" because the code reporting the error *is the old musl* that's
> not aware of whatever it is that the new binary is missing. Maybe you
> have something else in mind that I don't fully understand, but
> whatever it is it would only address future missing symbol errors, not
> the ones you're seeing right now.

What I have in mind is simply having friendly errors in the future;
obviously we cannot do it for time64.

Ariadne




* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 15:27 [musl] friendly errors for ABI mismatch Ariadne Conill
  2020-07-27 16:03 ` Rich Felker
@ 2020-07-27 21:16 ` Florian Weimer
  1 sibling, 0 replies; 8+ messages in thread
From: Florian Weimer @ 2020-07-27 21:16 UTC (permalink / raw)
  To: Ariadne Conill; +Cc: musl

* Ariadne Conill:

> On 32-bit systems, musl 1.2 has a new ABI (due to time64).  This
> results in programs built against musl 1.2 failing to run against
> musl 1.1.  That part is fine, but you get an error message about
> being unable to relocate symbols, which is not really insightful if
> you don't know about the ABI break.

Are you concerned about static linking across musl versions, or the
dynamic run-time behavior?

There are limits to what you can do without changing the entire
toolchain.

> glibc, on the other hand, has a minimum version specified in every
> binary, and prints an error message saying the glibc is too old if
> this situation is encountered.

Do you mean symbol versioning?  It's a bit more flexible than that.
Symbol versions are only required if a symbol with that version is
actually used.  And the symbol version does not have to be a number or
(in the GNU implementation) imply some sort of linear order.

Using symbol versioning has the advantage that the rest of the
toolchain already supports it.


* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 20:57   ` Ariadne Conill
@ 2020-07-27 21:50     ` Rich Felker
  2020-07-28  8:40       ` Florian Weimer
  0 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2020-07-27 21:50 UTC (permalink / raw)
  To: musl

On Mon, Jul 27, 2020 at 02:57:10PM -0600, Ariadne Conill wrote:
> Hello,
> 
> On Monday, July 27, 2020 10:03:30 AM MDT Rich Felker wrote:
> > What you're seeing here is just a special case of the general property
> > that, if you've linked to a version of libc (or any library) that has
> > a new symbol and attempt to run with an older version, you'll get a
> > missing symbol error. It's very intentional (see libc comparison and
> > "forward compatibility") that we don't encode "minimum version number"
> > required anywhere. If you attempt to run with a library that has all
> > the symbols, it will run, subject to any bugs in the library version
> > you have and any functionality that returns with failure because it's
> > not supported in the version you have, etc.
> > 
> > There is no way to give a more high-level reason for the runtime link
> > failure like "your program needs time64 and you're running with an old
> > musl" because the code reporting the error *is the old musl* that's
> > not aware of whatever it is that the new binary is missing. Maybe you
> > have something else in mind that I don't fully understand, but
> > whatever it is it would only address future missing symbol errors, not
> > the ones you're seeing right now.
> 
> What I have in mind is simply having friendly errors in the future;
> obviously we cannot do it for time64.

I'm still not sure what that would look like. ELF dynamic linking,
modeling C static linking semantics, does not bind symbol resolution
to a particular library, so there's no way to know that an unresolved
symbol was "supposed to be defined in libc" and that this means your
libc.so is too old. All the dynamic linker can tell is that the
program being loaded needs the symbol to be defined and that it's not
defined in any libraries present. And that's what the existing error
message tells you.
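
A toy illustration in C of the lookup described above, under the
assumption that resolution is a plain search by name across everything
loaded. This is not musl's code; the struct, the function and the
library names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of a loaded library: a name plus a NULL-terminated
 * list of exported symbol names. */
struct toy_lib {
    const char *name;
    const char *const *exports;
};

/* Search every loaded library in load order. The request carries only
 * the symbol name; nothing says which library was "supposed" to define
 * it. Returns the defining library's name, or NULL if nobody does. */
const char *toy_resolve(const struct toy_lib *libs, size_t n_libs,
                        const char *sym)
{
    for (size_t i = 0; i < n_libs; i++)
        for (const char *const *e = libs[i].exports; *e; e++)
            if (strcmp(*e, sym) == 0)
                return libs[i].name;
    return NULL;
}

static const char *const shim_exports[] = { "stat", NULL };
static const char *const libc_exports[] = { "stat", "printf", NULL };

/* Hypothetical load order: an interposing shim first, then libc. */
const struct toy_lib toy_loaded[] = {
    { "libshim.so", shim_exports },
    { "libc.so",    libc_exports },
};
```

Here "stat" resolves to the shim and "printf" to libc, while a name
nobody exports just yields NULL. At that point the resolver has no
basis for blaming libc.so rather than any other library.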

Symbol versioning, if used, changes this somewhat by binding to a
particular version string (which by convention usually contains a
library name too) *if* the library used to resolve it at runtime has
versioning, but for very good reasons we have not used and do not want
to use symbol versioning. (In short, like here it's an "approximate
solution" for most things people want to use it for, doesn't actually
achieve those things precisely, messes other things up in the process,
and has really really bad tooling support.)

Rich


* Re: [musl] friendly errors for ABI mismatch
  2020-07-27 21:50     ` Rich Felker
@ 2020-07-28  8:40       ` Florian Weimer
  2020-07-29  1:16         ` Rich Felker
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Weimer @ 2020-07-28  8:40 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

* Rich Felker:

> Symbol versioning, if used, changes this somewhat by binding to a
> particular version string (which by convention usually contains a
> library name too) *if* the library used to resolve it at runtime has
> versioning, but for very good reasons we have not used and do not want
> to use symbol versioning. (In short, like here it's an "approximate
> solution" for most things people want to use it for, doesn't actually
> achieve those things precisely, messes other things up in the process,
> and has really really bad tooling support.)

I think you should look at this from a different angle.  You could use
it just to produce an error message in case there is an ABI change,
but not for backwards compatibility with old binaries or enabling
otherwise ABI-incompatible changes without rebuilding the world.

With this approach, all symbols would have a single, default version.
New releases do not add new symbol version strings in general, except
when there is something like time64_t, in which case the default (and
only) version for those symbols changes.  Over time, you will end up
with a few symbol versions, but at a much slower pace than what glibc
does.
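
A rough sketch in C of how such a scheme could tell an ABI mismatch
apart from a plain missing symbol. It is a toy model only: the version
strings "ABI_1" and "ABI_2", the types and the function are all
hypothetical, and real symbol versioning lives in ELF sections handled
by the linker and loader rather than in tables like these.

```c
#include <stddef.h>
#include <string.h>

enum toy_status { TOY_OK, TOY_MISSING, TOY_ABI_MISMATCH };

struct toy_versym {
    const char *name;
    const char *version;   /* hypothetical strings: "ABI_1", "ABI_2" */
};

/* Look up one (name, version) requirement in a library's export table.
 * Version strings are compared for equality only; no ordering is
 * implied. */
enum toy_status toy_check(const struct toy_versym *exports, size_t n,
                          const struct toy_versym *want)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(exports[i].name, want->name) != 0)
            continue;
        return strcmp(exports[i].version, want->version) == 0
             ? TOY_OK : TOY_ABI_MISMATCH;
    }
    return TOY_MISSING;
}

/* Old library: everything at the single default version. */
const struct toy_versym toy_old_exports[] = {
    { "printf", "ABI_1" }, { "time", "ABI_1" },
};
/* New library: only the symbol whose ABI changed gets a new version. */
const struct toy_versym toy_new_exports[] = {
    { "printf", "ABI_1" }, { "time", "ABI_2" },
};

/* What a program built against the new library would require. */
const struct toy_versym toy_want_time   = { "time",   "ABI_2" };
const struct toy_versym toy_want_printf = { "printf", "ABI_1" };
```

Against toy_old_exports the "time" requirement comes back as
TOY_ABI_MISMATCH rather than TOY_MISSING, which is the hook a loader
could use to print a more specific message. The old library has to
contain this check already for that to work.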


* Re: [musl] friendly errors for ABI mismatch
  2020-07-28  8:40       ` Florian Weimer
@ 2020-07-29  1:16         ` Rich Felker
  0 siblings, 0 replies; 8+ messages in thread
From: Rich Felker @ 2020-07-29  1:16 UTC (permalink / raw)
  To: Florian Weimer; +Cc: musl

On Tue, Jul 28, 2020 at 10:40:35AM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > Symbol versioning, if used, changes this somewhat by binding to a
> > particular version string (which by convention usually contains a
> > library name too) *if* the library used to resolve it at runtime has
> > versioning, but for very good reasons we have not used and do not want
> > to use symbol versioning. (In short, like here it's an "approximate
> > solution" for most things people want to use it for, doesn't actually
> > achieve those things precisely, messes other things up in the process,
> > and has really really bad tooling support.)
> 
> I think you should look at this from a different angle.  You could use
> it just to produce an error message in case there is an ABI change,
> but not for backwards compatibility with old binaries or enabling
> otherwise ABI-incompatible changes without rebuilding the world.

The only ABI change here is in the ABI defined between libc consumers
using the time_t-derived libc types. From the standpoint of musl's ABI
surface, the change here was similar to any other instance of adding
new interfaces for new functionality except that the new symbols get
used implicitly via redirection rather than only when directly
referenced by the application.
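
A minimal sketch in C of the kind of redirection meant here, assuming
GCC or Clang on an ELF target (the asm-label extension is
compiler-specific, and other object formats may mangle names
differently). The names demo_now and demo_now64 are hypothetical; this
illustrates the general technique and is not a copy of musl's headers.

```c
#include <stdint.h>

/* The "new" implementation, exported under a new symbol name.
 * The value is 2100-01-01 00:00:00 UTC, beyond the 32-bit time_t range. */
int64_t demo_now64(void)
{
    return INT64_C(4102444800);
}

/* Redirection: source code keeps calling demo_now(), but the asm label
 * makes the compiler emit a reference to the symbol "demo_now64".
 * The application never names the new symbol directly. */
int64_t demo_now(void) __asm__("demo_now64");

/* Example caller, written as if nothing had changed. */
int64_t seconds_since_epoch(void)
{
    return demo_now();
}
```

An object file built from this references demo_now64 only, so running
it against a library that exports just the old name fails with an
ordinary missing-symbol error, exactly as for any newly added
interface.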

> With this approach, all symbols would have a single, default version.
> New releases do not add new symbol version strings in general, except
> when there is something like time64_t, in which case the default (and
> only) version for those symbols changes.  Over time, you will end up
> with a few symbol versions, but at a much slower pace than what glibc
> does.

I didn't want to get into a detailed discussion of how symbol
versioning is broken, but it is broken, and using it in place of
symbol redirection for time64 would not have met the needs here. In
particular, without hacking on nonstandard semantics for version
resolution, the following would not work:

- Mixing old object files (static libraries) in code built with new
  musl and time64, or vice versa.

- Supporting call-intercepting interposition libraries like fakeroot
  in a way that's safe for both time32 and time64 binaries.

I suspect there are others that I'm not recalling right off. At least
dlsym would have needed to be handled differently but I suspect it
would still be possible to make it work.

Rich
