mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] [PATCH] Add REL_COPY size change detection
@ 2020-02-26  5:24 Markus Wichmann
  2020-02-26 17:36 ` Rich Felker
  0 siblings, 1 reply; 10+ messages in thread
From: Markus Wichmann @ 2020-02-26  5:24 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1118 bytes --]

Hi all,

I was recently reading the Oracle docs about the ELF, and I came across
their chapter about the COPY relocation. They discuraged its use, since
with those relocations, a binding exists between importing and exporting
module. If the semantics of the imported object changes, then this is an
ABI mismatch.

So I looked at the musl source code and noticed that COPY relocations
are simply processed, and an ABI mismatch is simply accepted. So, since
I am of the opinion that detectable errors should be detected, rather
than left to fester and spring a hard-to-explain bug on you, usually
five minutes before deadline, I wrote the attached patch to add
detection for at least a changed size. This won't detect all changes to
ABI regarding COPY relocation (e.g.int-->float, or in an array of
structs, a change to the struct size and to the array size cancelling
each other out), but it should find most of them.

Also, I wondered whether COPY relocations are even still in use. But on
my system (currently some Ubuntu version) I found over 15000 of the
things. Mostly for stdout and stderr, though.

Ciao,
Markus

[-- Attachment #2: 0001-Add-detection-for-changed-size-of-a-COPY-relocation.patch --]
[-- Type: text/x-diff, Size: 1631 bytes --]

From 75e98f4e4cef2eb2b867062aebc481c3b1f66498 Mon Sep 17 00:00:00 2001
From: Markus Wichmann <nullplan@gmx.net>
Date: Wed, 26 Feb 2020 06:09:14 +0100
Subject: [PATCH] Add detection for changed size of a COPY relocation.

COPY relocations create an ABI binding between importing and exporting
module. Should anything about the object in question change, that would
be an ABI change, and therefore incompatible. While the dynamic linker
is not capable of detecting all changes, it can detect most of them by
detecting a changed size between import and export. Any change is a
problem, since the source buffer will be either overread or underread.
In any case, if the semantics of the imported object changed, the ABI
contract is broken, and it is better to detect this than to silently
allow it and inexplicably crash later on.
---
 ldso/dynlink.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/ldso/dynlink.c b/ldso/dynlink.c
index afec985a..618c2cbd 100644
--- a/ldso/dynlink.c
+++ b/ldso/dynlink.c
@@ -435,6 +435,15 @@ static void do_relocs(struct dso *dso, size_t *rel, size_t rel_size, size_t stri
 			else *reloc_addr = (size_t)base + addend;
 			break;
 		case REL_COPY:
+			if (def.sym && sym->st_size != def.sym->st_size) {
+				error("Error relocating %s: %s: Size mismatch in COPY"
+						" relocation (exp %lu, got %lu)",
+						dso->name, name
+						sym->st_size + 0ul,
+						def.sym->st_size + 0ul);
+				if (runtime) longjmp(*rtld_fail, 1);
+				continue;
+			}
 			memcpy(reloc_addr, (void *)sym_val, sym->st_size);
 			break;
 		case REL_OFFSET32:
--
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26  5:24 [musl] [PATCH] Add REL_COPY size change detection Markus Wichmann
@ 2020-02-26 17:36 ` Rich Felker
  2020-02-26 18:38   ` Florian Weimer
  2020-02-26 20:12   ` Markus Wichmann
  0 siblings, 2 replies; 10+ messages in thread
From: Rich Felker @ 2020-02-26 17:36 UTC (permalink / raw)
  To: musl

On Wed, Feb 26, 2020 at 06:24:48AM +0100, Markus Wichmann wrote:
> Hi all,
> 
> I was recently reading the Oracle docs about the ELF, and I came across
> their chapter about the COPY relocation. They discuraged its use, since
> with those relocations, a binding exists between importing and exporting
> module. If the semantics of the imported object changes, then this is an
> ABI mismatch.
> 
> So I looked at the musl source code and noticed that COPY relocations
> are simply processed, and an ABI mismatch is simply accepted. So, since
> I am of the opinion that detectable errors should be detected, rather
> than left to fester and spring a hard-to-explain bug on you, usually
> five minutes before deadline, I wrote the attached patch to add
> detection for at least a changed size. This won't detect all changes to
> ABI regarding COPY relocation (e.g.int-->float, or in an array of
> structs, a change to the struct size and to the array size cancelling
> each other out), but it should find most of them.
> 
> Also, I wondered whether COPY relocations are even still in use. But on
> my system (currently some Ubuntu version) I found over 15000 of the
> things. Mostly for stdout and stderr, though.

In theory copy relocations should have been gone for everyone who
switched to PIE, but then the gcc/binutils folks brought them back as
an "optimization" (allowing pc-rel access to external symbols). :-(

Anyway, I'm in agreement that we should avoid clobbering memory. The
old code did that already by using sym->st_size rather than
def.sym->st_size, which is why I hadn't seen this as an issue. But it
can over-read (and potentially fault) if the copy reloc has a larger
size than the definition, and if it's smaller the library might
wrongly access past the end of the copy, reading or clobbering
unrelated data.

At the very least I think we ought to catch and error on the case
where def.sym->st_size>sym->st_size, since we can't honor it and
failure to honor it can produce silent memory corruption. I'm less
sure about what to do if def.sym->st_size<sym->st-size; this case
seems safe and might be desirable not to break (I vaguely recall an
intent that it be ok), but if you think there are reasons it's
dangerous I'm ok with disallowing it too. I'm having a hard time now
thinking of a reason it would really help to support that, anyway.

> From 75e98f4e4cef2eb2b867062aebc481c3b1f66498 Mon Sep 17 00:00:00 2001
> From: Markus Wichmann <nullplan@gmx.net>
> Date: Wed, 26 Feb 2020 06:09:14 +0100
> Subject: [PATCH] Add detection for changed size of a COPY relocation.
> 
> COPY relocations create an ABI binding between importing and exporting
> module. Should anything about the object in question change, that would
> be an ABI change, and therefore incompatible. While the dynamic linker
> is not capable of detecting all changes, it can detect most of them by
> detecting a changed size between import and export. Any change is a
> problem, since the source buffer will be either overread or underread.
> In any case, if the semantics of the imported object changed, the ABI
> contract is broken, and it is better to detect this than to silently
> allow it and inexplicably crash later on.
> ---
>  ldso/dynlink.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/ldso/dynlink.c b/ldso/dynlink.c
> index afec985a..618c2cbd 100644
> --- a/ldso/dynlink.c
> +++ b/ldso/dynlink.c
> @@ -435,6 +435,15 @@ static void do_relocs(struct dso *dso, size_t *rel, size_t rel_size, size_t stri
>  			else *reloc_addr = (size_t)base + addend;
>  			break;
>  		case REL_COPY:
> +			if (def.sym && sym->st_size != def.sym->st_size) {
> +				error("Error relocating %s: %s: Size mismatch in COPY"
> +						" relocation (exp %lu, got %lu)",
> +						dso->name, name
> +						sym->st_size + 0ul,
> +						def.sym->st_size + 0ul);

Missing , after name, and the indent is weird (2 extra levels). I'd
either align (spaces) with the opening paren or use just one indent
level, and fewer lines.

Semantically, st_size is size_t not ulong, so I'd probably lean
towards a cast to (size_t) here with %zu.

> +				if (runtime) longjmp(*rtld_fail, 1);
> +				continue;
> +			}
>  			memcpy(reloc_addr, (void *)sym_val, sym->st_size);
>  			break;
>  		case REL_OFFSET32:
> --

Otherwise LGTM.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 17:36 ` Rich Felker
@ 2020-02-26 18:38   ` Florian Weimer
  2020-02-26 19:00     ` Rich Felker
  2020-02-26 20:12   ` Markus Wichmann
  1 sibling, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2020-02-26 18:38 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

* Rich Felker:

> At the very least I think we ought to catch and error on the case
> where def.sym->st_size>sym->st_size, since we can't honor it and
> failure to honor it can produce silent memory corruption. I'm less
> sure about what to do if def.sym->st_size<sym->st-size; this case
> seems safe and might be desirable not to break (I vaguely recall an
> intent that it be ok), but if you think there are reasons it's
> dangerous I'm ok with disallowing it too. I'm having a hard time now
> thinking of a reason it would really help to support that, anyway.

Unfortunately the Mozilla NSS people disagree that size mismatches for
global symbols are an ABI break.  I don't know if this is relevant in
the musl context, but it means that for glibc, we probably can't make
it a hard error.

I want to have better diagnostics for this in glibc, but the current
warning (which is poorly worded at that) is in the
architecture-specific code, and I got side-tracked when I tried to
clean this up the last time.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 18:38   ` Florian Weimer
@ 2020-02-26 19:00     ` Rich Felker
  2020-02-26 19:53       ` Florian Weimer
  0 siblings, 1 reply; 10+ messages in thread
From: Rich Felker @ 2020-02-26 19:00 UTC (permalink / raw)
  To: musl

On Wed, Feb 26, 2020 at 07:38:31PM +0100, Florian Weimer wrote:
> * Rich Felker:
> 
> > At the very least I think we ought to catch and error on the case
> > where def.sym->st_size>sym->st_size, since we can't honor it and
> > failure to honor it can produce silent memory corruption. I'm less
> > sure about what to do if def.sym->st_size<sym->st-size; this case
> > seems safe and might be desirable not to break (I vaguely recall an
> > intent that it be ok), but if you think there are reasons it's
> > dangerous I'm ok with disallowing it too. I'm having a hard time now
> > thinking of a reason it would really help to support that, anyway.
> 
> Unfortunately the Mozilla NSS people disagree that size mismatches for
> global symbols are an ABI break.  I don't know if this is relevant in
> the musl context, but it means that for glibc, we probably can't make
> it a hard error.
> 
> I want to have better diagnostics for this in glibc, but the current
> warning (which is poorly worded at that) is in the
> architecture-specific code, and I got side-tracked when I tried to
> clean this up the last time.

Thanks for the feedback. Do you have a source where we could read more
about this? What non-broken behavior do they expect to get when sizes
don't match?

As an aside, I think we should be encouraging distros that are using
PIE to get rid of copy relocations by passing whatever options are
needed (or building gcc with whatever options are needed) to avoid
emitting them in PIE. IIRC I looked this up once but I can't remember
what I found.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 19:00     ` Rich Felker
@ 2020-02-26 19:53       ` Florian Weimer
  2020-02-26 22:48         ` Rich Felker
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2020-02-26 19:53 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

* Rich Felker:

> On Wed, Feb 26, 2020 at 07:38:31PM +0100, Florian Weimer wrote:
>> * Rich Felker:
>> 
>> > At the very least I think we ought to catch and error on the case
>> > where def.sym->st_size>sym->st_size, since we can't honor it and
>> > failure to honor it can produce silent memory corruption. I'm less
>> > sure about what to do if def.sym->st_size<sym->st-size; this case
>> > seems safe and might be desirable not to break (I vaguely recall an
>> > intent that it be ok), but if you think there are reasons it's
>> > dangerous I'm ok with disallowing it too. I'm having a hard time now
>> > thinking of a reason it would really help to support that, anyway.
>> 
>> Unfortunately the Mozilla NSS people disagree that size mismatches for
>> global symbols are an ABI break.  I don't know if this is relevant in
>> the musl context, but it means that for glibc, we probably can't make
>> it a hard error.
>> 
>> I want to have better diagnostics for this in glibc, but the current
>> warning (which is poorly worded at that) is in the
>> architecture-specific code, and I got side-tracked when I tried to
>> clean this up the last time.
>
> Thanks for the feedback. Do you have a source where we could read more
> about this? What non-broken behavior do they expect to get when sizes
> don't match?

There's an NSS bug report:

  <https://bugzilla.mozilla.org/show_bug.cgi?id=1201900>

It seems that the NSS situation is better than what I remembered.

> As an aside, I think we should be encouraging distros that are using
> PIE to get rid of copy relocations by passing whatever options are
> needed (or building gcc with whatever options are needed) to avoid
> emitting them in PIE. IIRC I looked this up once but I can't remember
> what I found.

If I recall correctly, the optimization was a factor when rolling out
PIE-by-default in Fedora.  I do not know if we can revert it without
switching back to fixed-address builds.

Even if we did that, the ABI incompatibility will still be there.
There is also a similar truncation issue for TLS variables, I think.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 17:36 ` Rich Felker
  2020-02-26 18:38   ` Florian Weimer
@ 2020-02-26 20:12   ` Markus Wichmann
  2020-02-26 22:02     ` Rich Felker
  1 sibling, 1 reply; 10+ messages in thread
From: Markus Wichmann @ 2020-02-26 20:12 UTC (permalink / raw)
  To: musl

On Wed, Feb 26, 2020 at 12:36:21PM -0500, Rich Felker wrote:
> In theory copy relocations should have been gone for everyone who
> switched to PIE, but then the gcc/binutils folks brought them back as
> an "optimization" (allowing pc-rel access to external symbols). :-(
>

Several quotes about optimization come to mind. Especially ones with the
word "premature" in them.

I just checked and I saw that there are no copy relocs in my libraries,
but my binaries are indeed PIEs. Therefore I have to conclude that gcc
compiles PIE and PIC differently. Weren't they meant to be the same,
except maybe for the selection of object files during the linker phase?

> At the very least I think we ought to catch and error on the case
> where def.sym->st_size>sym->st_size, since we can't honor it and
> failure to honor it can produce silent memory corruption. I'm less
> sure about what to do if def.sym->st_size<sym->st-size; this case
> seems safe and might be desirable not to break (I vaguely recall an
> intent that it be ok), but if you think there are reasons it's
> dangerous I'm ok with disallowing it too. I'm having a hard time now
> thinking of a reason it would really help to support that, anyway.
>

If the application did import sym->st_size bytes, it probably wanted to
do something with them. Finding less than that at the symbol and garbage
afterwards (and, from the point of view of the dynlinker, you have no
way of knowing what counts as garbage and what doesn't) might break the
program. Sometimes. That is exactly the kind of rare subtle breakage I
added the error for.

> Missing , after name, and the indent is weird (2 extra levels). I'd
> either align (spaces) with the opening paren or use just one indent
> level, and fewer lines.
>

The missing comma is weird. Must have gone missing as I was cutting that
line down to size (originally, I had this as one line, but then it was
around 200 columns long, and I thought that was a bit excessive). So
then I tried to stay below 80 and you see the result before you.

The indent was just vim's default setting. Usually this suits me, but
usually I have shiftwidth=4 and expandtab. For musl, it's sw=8 and noet.

> Semantically, st_size is size_t not ulong, so I'd probably lean
> towards a cast to (size_t) here with %zu.
>

Isn't on Linux size_t always a ulong? I will admit I only went for ulong
because it's easier to write the cast.

Ciao,
Markus

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 20:12   ` Markus Wichmann
@ 2020-02-26 22:02     ` Rich Felker
  0 siblings, 0 replies; 10+ messages in thread
From: Rich Felker @ 2020-02-26 22:02 UTC (permalink / raw)
  To: musl

On Wed, Feb 26, 2020 at 09:12:02PM +0100, Markus Wichmann wrote:
> On Wed, Feb 26, 2020 at 12:36:21PM -0500, Rich Felker wrote:
> > In theory copy relocations should have been gone for everyone who
> > switched to PIE, but then the gcc/binutils folks brought them back as
> > an "optimization" (allowing pc-rel access to external symbols). :-(
> >
> 
> Several quotes about optimization come to mind. Especially ones with the
> word "premature" in them.
> 
> I just checked and I saw that there are no copy relocs in my libraries,
> but my binaries are indeed PIEs. Therefore I have to conclude that gcc
> compiles PIE and PIC differently. Weren't they meant to be the same,
> except maybe for the selection of object files during the linker phase?

Libraries inherently can't have copy relocations. Only the main
program can.

> > At the very least I think we ought to catch and error on the case
> > where def.sym->st_size>sym->st_size, since we can't honor it and
> > failure to honor it can produce silent memory corruption. I'm less
> > sure about what to do if def.sym->st_size<sym->st-size; this case
> > seems safe and might be desirable not to break (I vaguely recall an
> > intent that it be ok), but if you think there are reasons it's
> > dangerous I'm ok with disallowing it too. I'm having a hard time now
> > thinking of a reason it would really help to support that, anyway.
> >
> 
> If the application did import sym->st_size bytes, it probably wanted to
> do something with them.

Not really. For example:

extern char *strings[];
extern size_t nstrings;

This type of thing was common with historical mistakes like
sys_errlist, the equivalent for signals, etc., where the application
either had an out-of-band way of knowing the runtime-provided size, or
it could just assume all indices it got from the library were
in-bounds to access the array. It went out of fashion entirely because
of copy relocations breaking it.

> > Missing , after name, and the indent is weird (2 extra levels). I'd
> > either align (spaces) with the opening paren or use just one indent
> > level, and fewer lines.
> 
> The missing comma is weird. Must have gone missing as I was cutting that
> line down to size (originally, I had this as one line, but then it was
> around 200 columns long, and I thought that was a bit excessive). So
> then I tried to stay below 80 and you see the result before you.
> 
> The indent was just vim's default setting. Usually this suits me, but
> usually I have shiftwidth=4 and expandtab. For musl, it's sw=8 and noet.
> 
> > Semantically, st_size is size_t not ulong, so I'd probably lean
> > towards a cast to (size_t) here with %zu.
> 
> Isn't on Linux size_t always a ulong? I will admit I only went for ulong
> because it's easier to write the cast.

Yes, but the ldso code doesn't use that assumption, or at least it's
not intended to. It's used to some extent elsewhere in the syscall
glue framework.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 19:53       ` Florian Weimer
@ 2020-02-26 22:48         ` Rich Felker
  2020-02-27  5:00           ` Fangrui Song
  2020-04-17 10:58           ` Florian Weimer
  0 siblings, 2 replies; 10+ messages in thread
From: Rich Felker @ 2020-02-26 22:48 UTC (permalink / raw)
  To: musl

On Wed, Feb 26, 2020 at 08:53:03PM +0100, Florian Weimer wrote:
> * Rich Felker:
> 
> > On Wed, Feb 26, 2020 at 07:38:31PM +0100, Florian Weimer wrote:
> >> * Rich Felker:
> >> 
> >> > At the very least I think we ought to catch and error on the case
> >> > where def.sym->st_size>sym->st_size, since we can't honor it and
> >> > failure to honor it can produce silent memory corruption. I'm less
> >> > sure about what to do if def.sym->st_size<sym->st-size; this case
> >> > seems safe and might be desirable not to break (I vaguely recall an
> >> > intent that it be ok), but if you think there are reasons it's
> >> > dangerous I'm ok with disallowing it too. I'm having a hard time now
> >> > thinking of a reason it would really help to support that, anyway.
> >> 
> >> Unfortunately the Mozilla NSS people disagree that size mismatches for
> >> global symbols are an ABI break.  I don't know if this is relevant in
> >> the musl context, but it means that for glibc, we probably can't make
> >> it a hard error.
> >> 
> >> I want to have better diagnostics for this in glibc, but the current
> >> warning (which is poorly worded at that) is in the
> >> architecture-specific code, and I got side-tracked when I tried to
> >> clean this up the last time.
> >
> > Thanks for the feedback. Do you have a source where we could read more
> > about this? What non-broken behavior do they expect to get when sizes
> > don't match?
> 
> There's an NSS bug report:
> 
>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1201900>
> 
> It seems that the NSS situation is better than what I remembered.

Good to know.

> > As an aside, I think we should be encouraging distros that are using
> > PIE to get rid of copy relocations by passing whatever options are
> > needed (or building gcc with whatever options are needed) to avoid
> > emitting them in PIE. IIRC I looked this up once but I can't remember
> > what I found.
> 
> If I recall correctly, the optimization was a factor when rolling out
> PIE-by-default in Fedora.  I do not know if we can revert it without
> switching back to fixed-address builds.

I think this is almost surely premature optimization. In almost all
cases, if there's software where the performance impact makes a
difference it can be avoided by giving the affected global data
objects visibility of hidden (if it's not used outside the main
program anyway) or protected (if it needs to be externally visible).
But on x86_64 and aarch64, and to some extent on 32-bit arm as well,
the performance difference of accessing globals via the got vs
pc-relative is negligible.

> Even if we did that, the ABI incompatibility will still be there.

Yes. But fixing it would avoid any bugs from fallout of the full
object not being copyable at runtime in any newly build programs.

> There is also a similar truncation issue for TLS variables, I think.

TLS variables never use copy relocations, except for a short period of
time on riscv64, where thankfully it was realized to be a mistake and
reverted. So I don't think this issue applies to TLS.

Rich

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 22:48         ` Rich Felker
@ 2020-02-27  5:00           ` Fangrui Song
  2020-04-17 10:58           ` Florian Weimer
  1 sibling, 0 replies; 10+ messages in thread
From: Fangrui Song @ 2020-02-27  5:00 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On 2020-02-26, Rich Felker wrote:
>On Wed, Feb 26, 2020 at 08:53:03PM +0100, Florian Weimer wrote:
>> * Rich Felker:
>>
>> > On Wed, Feb 26, 2020 at 07:38:31PM +0100, Florian Weimer wrote:
>> >> * Rich Felker:
>> >>
>> >> > At the very least I think we ought to catch and error on the case
>> >> > where def.sym->st_size>sym->st_size, since we can't honor it and
>> >> > failure to honor it can produce silent memory corruption. I'm less
>> >> > sure about what to do if def.sym->st_size<sym->st-size; this case
>> >> > seems safe and might be desirable not to break (I vaguely recall an
>> >> > intent that it be ok), but if you think there are reasons it's
>> >> > dangerous I'm ok with disallowing it too. I'm having a hard time now
>> >> > thinking of a reason it would really help to support that, anyway.
>> >>
>> >> Unfortunately the Mozilla NSS people disagree that size mismatches for
>> >> global symbols are an ABI break.  I don't know if this is relevant in
>> >> the musl context, but it means that for glibc, we probably can't make
>> >> it a hard error.
>> >>
>> >> I want to have better diagnostics for this in glibc, but the current
>> >> warning (which is poorly worded at that) is in the
>> >> architecture-specific code, and I got side-tracked when I tried to
>> >> clean this up the last time.
>> >
>> > Thanks for the feedback. Do you have a source where we could read more
>> > about this? What non-broken behavior do they expect to get when sizes
>> > don't match?

+1 for a `def.sym->st_size!=sym->st-size` diagnostic.

>> There's an NSS bug report:
>>
>>   <https://bugzilla.mozilla.org/show_bug.cgi?id=1201900>
>>
>> It seems that the NSS situation is better than what I remembered.
>
>Good to know.

There may be instances where the code takes the address of a global variable
but does not actually care about the contents (st_size does not matter).

>
>> > As an aside, I think we should be encouraging distros that are using
>> > PIE to get rid of copy relocations by passing whatever options are
>> > needed (or building gcc with whatever options are needed) to avoid
>> > emitting them in PIE. IIRC I looked this up once but I can't remember
>> > what I found.
>>
>> If I recall correctly, the optimization was a factor when rolling out
>> PIE-by-default in Fedora.  I do not know if we can revert it without
>> switching back to fixed-address builds.
>
>I think this is almost surely premature optimization. In almost all
>cases, if there's software where the performance impact makes a
>difference it can be avoided by giving the affected global data
>objects visibility of hidden (if it's not used outside the main
>program anyway) or protected (if it needs to be externally visible).
>But on x86_64 and aarch64, and to some extent on 32-bit arm as well,
>the performance difference of accessing globals via the got vs
>pc-relative is negligible.

clang has an option -mpie-copy-relocations (the name could be improved),
which enables direct access (usually PC-relative) for -fPIE.

// -target aarch64 -fPIE -O3
        adrp    x8, :got:var
        ldr     x8, [x8, :got_lo12:var]
        ldr     w0, [x8]
        ret
// -target aarch64 -fPIE -O3 -mpie-copy-relocations
        adrp    x8, var
        ldr     w0, [x8, :lo12:var]
        ret

On x86, with R_X86_64_[REX_]GOTPCRELX, the option is still beneficial.

// -O3 -fPIE a.c -Wa,--mrelax-relocations=yes
    0:   48 8b 05 00 00 00 00    mov    0x0(%rip),%rax        # 7 <foo+0x7>
                         3: R_X86_64_REX_GOTPCRELX       var-0x4
    7:   8b 00                   mov    (%rax),%eax
    9:   c3                      retq
// -O3 -fPIE a.c -mpie-copy-relocations
    0:   8b 05 00 00 00 00       mov    0x0(%rip),%eax        # 6 <foo+0x6>
                         2: R_X86_64_PC32        var-0x4
    6:   c3                      retq


Users of -mpie-copy-relocations may compile their applications in a
mostly statically linking mode, with only a few definitions in DSOs. If some
variables are unfortunately really external, R_*_COPY will be produced
by the linker.

>> Even if we did that, the ABI incompatibility will still be there.
>
>Yes. But fixing it would avoid any bugs from fallout of the full
>object not being copyable at runtime in any newly build programs.
>
>> There is also a similar truncation issue for TLS variables, I think.
>
>TLS variables never use copy relocations, except for a short period of
>time on riscv64, where thankfully it was realized to be a mistake and
>reverted. So I don't think this issue applies to TLS.
>
>Rich

https://sourceware.org/bugzilla/show_bug.cgi?id=23825
(GCC 10 riscv should have fixed this.)
(glibc/sysdeps/riscv/dl-machine.h has a hack supporting R_RISCV_COPY on
STT_TLS. No other known ld.so supports it.)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [musl] [PATCH] Add REL_COPY size change detection
  2020-02-26 22:48         ` Rich Felker
  2020-02-27  5:00           ` Fangrui Song
@ 2020-04-17 10:58           ` Florian Weimer
  1 sibling, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2020-04-17 10:58 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

* Rich Felker:

>> If I recall correctly, the optimization was a factor when rolling out
>> PIE-by-default in Fedora.  I do not know if we can revert it without
>> switching back to fixed-address builds.
>
> I think this is almost surely premature optimization. In almost all
> cases, if there's software where the performance impact makes a
> difference it can be avoided by giving the affected global data
> objects visibility of hidden (if it's not used outside the main
> program anyway) or protected (if it needs to be externally visible).
> But on x86_64 and aarch64, and to some extent on 32-bit arm as well,
> the performance difference of accessing globals via the got vs
> pc-relative is negligible.

I think the main obstacle is that we do not have the required
annotations in the header files.

LTO only for deciding whether calls are local or not could perhaps
solve this.

>> There is also a similar truncation issue for TLS variables, I think.
>
> TLS variables never use copy relocations, except for a short period of
> time on riscv64, where thankfully it was realized to be a mistake and
> reverted. So I don't think this issue applies to TLS.

You are right, the optimization I feared is not actually implemented.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-04-17 10:58 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-26  5:24 [musl] [PATCH] Add REL_COPY size change detection Markus Wichmann
2020-02-26 17:36 ` Rich Felker
2020-02-26 18:38   ` Florian Weimer
2020-02-26 19:00     ` Rich Felker
2020-02-26 19:53       ` Florian Weimer
2020-02-26 22:48         ` Rich Felker
2020-02-27  5:00           ` Fangrui Song
2020-04-17 10:58           ` Florian Weimer
2020-02-26 20:12   ` Markus Wichmann
2020-02-26 22:02     ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).