mailing list of musl libc
* [musl] Getting access to section data during dynlink.c
@ 2023-10-16  1:06 Farid Zakaria
  2023-10-16 14:26 ` Rich Felker
  0 siblings, 1 reply; 10+ messages in thread
From: Farid Zakaria @ 2023-10-16  1:06 UTC (permalink / raw)
  To: musl

Hi!

I'd like to read some section data during dynlink.c
Does anyone have any good suggestions on the best way to do so?
I believe most ELF files ask for the load to start from the start of the
ELF file.

I see in dynlink.c the kernel sends AT_PHDR as an auxiliary vector --
Should I try applying a fixed offset from it to get to the start of the
ehdr ?

Any advice is appreciated.

Please include me in the CC for the reply.
I can't recall if I've subscribed.


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16  1:06 [musl] Getting access to section data during dynlink.c Farid Zakaria
@ 2023-10-16 14:26 ` Rich Felker
  2023-10-16 21:09   ` Farid Zakaria
  2023-10-16 21:53   ` Szabolcs Nagy
  0 siblings, 2 replies; 10+ messages in thread
From: Rich Felker @ 2023-10-16 14:26 UTC (permalink / raw)
  To: Farid Zakaria; +Cc: musl

On Sun, Oct 15, 2023 at 06:06:48PM -0700, Farid Zakaria wrote:
> Hi!
> 
> I'd like to read some section data during dynlink.c
> Does anyone have any good suggestions on the best way to do so?
> I believe most ELF files ask for the load to start from the start of the
> ELF file.
> 
> I see in dynlink.c the kernel sends AT_PHDR as an auxiliary vector --
> Should I try applying a fixed offset from it to get to the start of the
> ehdr ?
> 
> Any advice is appreciated.
> 
> Please include me in the CC for the reply.
> I can't recall if I've subscribed.

Neither the Ehdrs nor sections are "loadable" parts of an executable
ELF file. They may happen to be present in the mapped pages due to
page granularity of mappings, but that doesn't mean they're guaranteed
to be there; the Ehdrs are for the program loader's use, and the
sections are for the use of linker (non-dynamic), debugger, etc.

In musl we use Ehdrs in a couple places: the dynamic linker finds its
own program headers via assuming they're mapped, but this is rather
reasonable since we built it and it's either going to always-succeed
or always-fail and get caught before deployment if that build-time
assumption somehow isn't met. It's not contingent on properties of a
program encountered at runtime. We also use Ehdrs when loading a
program (invoking ldso as a command) or shared library, but in that
case we are the loader and have access to them via the file being
loaded.

Depending on what you want to do, and whether you just need to be
compatible with your own binaries or arbitrary ones, it may suffice to
do some sort of hack like rounding down from the program header
address to the start of the page and hoping the Ehdrs live there. But
it might make sense to look for other ways to do what you're trying to
do, without needing to access non-runtime data structures.
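
A minimal sketch of the rounding-down hack described above, for an
ordinary program rather than for dynlink.c itself. It assumes the
program headers sit in the same page as the Ehdr, which is common but
not guaranteed. The names page_floor() and guess_ehdr() are
hypothetical, and getauxval() stands in for the direct auxv access
that the dynamic linker would use.

```c
#include <elf.h>
#include <link.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>
#include <unistd.h>

/* Round addr down to the start of its page; page_size must be a power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t page_size)
{
	return addr & ~(page_size - 1);
}

/* Hypothetical helper (not part of musl): guess where the main program's
 * Ehdr lives by rounding AT_PHDR down to a page boundary. Returns NULL
 * when the guess does not look like an ELF header. */
const ElfW(Ehdr) *guess_ehdr(void)
{
	uintptr_t phdr = getauxval(AT_PHDR);
	if (!phdr) return NULL;

	/* The page holding the program headers is mapped, so reading its
	 * first bytes is safe even if they turn out not to be the Ehdr. */
	uintptr_t base = page_floor(phdr, (uintptr_t)sysconf(_SC_PAGESIZE));
	const ElfW(Ehdr) *ehdr = (const ElfW(Ehdr) *)base;

	if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) return NULL;

	/* Sanity check: the header's own e_phoff must lead back to AT_PHDR. */
	if (base + ehdr->e_phoff != phdr) return NULL;

	return ehdr;
}
```

Even when this succeeds it only locates the Ehdr; the section headers
and section contents may still be outside any mapped segment.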

Rich


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 14:26 ` Rich Felker
@ 2023-10-16 21:09   ` Farid Zakaria
  2023-10-16 21:16     ` Farid Zakaria
  2023-10-16 21:53   ` Szabolcs Nagy
  1 sibling, 1 reply; 10+ messages in thread
From: Farid Zakaria @ 2023-10-16 21:09 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Thanks for the advice.

I'm now looking at the following two options (seeing if they work):
1. I see that one of the auxiliary vector entries holds the FD or the
name of the file given to dynlink.
Can I mmap this file again in its entirety and read the section I care about?

2. Try to create a segment that maps to the sections I care about such
that they are loaded.
(Try to apply some heuristic to then identify it)

Early attempts at (1) have the mmap failing -- I need to debug further.



* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 21:09   ` Farid Zakaria
@ 2023-10-16 21:16     ` Farid Zakaria
  0 siblings, 0 replies; 10+ messages in thread
From: Farid Zakaria @ 2023-10-16 21:16 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

Okay -- the following works (see below).
At first I tried AT_EXECFD, but it looks like it's not set for
ELF -- instead I used the app.name variable.

```
int fd = open(app.name, O_RDONLY);
if (fd < 0) {
	dprintf(2, "failed to open %s\n", app.name);
	_exit(1);
}

struct stat st;
if (fstat(fd, &st) < 0) {
	dprintf(2, "failed to stat %s\n", app.name);
	_exit(1);
}

/* Map the whole file read-only so that section headers and data are reachable. */
const ElfW(Ehdr) *ehdr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (ehdr == MAP_FAILED) {
	dprintf(2, "failed to mmap %s\n", app.name);
	_exit(1);
}

if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) {
	dprintf(2, "Not a valid elf file\n");
	_exit(1);
}

const ElfW(Shdr) *section_header = find_section_by_name(ehdr, ".watever");
if (section_header == NULL) {
	dprintf(2, "Cannot find .watever section\n");
	_exit(1);
}
```
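
For anyone who wants to check the AT_EXECFD observation from an
ordinary program, here is a small sketch. As far as I know the kernel
only supplies that entry in special cases (for example a binfmt_misc
handler that asks for an open descriptor), so a normal ELF exec would
not see it. execfd_from_auxv() is a hypothetical name, and the code
assumes a libc whose getauxval() sets errno to ENOENT for a missing
entry (musl and recent glibc do).

```c
#include <elf.h>
#include <errno.h>
#include <sys/auxv.h>

/* Hypothetical helper: return the descriptor passed in AT_EXECFD,
 * or -1 when the auxiliary vector has no such entry. */
int execfd_from_auxv(void)
{
	errno = 0;
	unsigned long v = getauxval(AT_EXECFD);

	/* A zero result is ambiguous on its own; ENOENT marks a missing entry. */
	if (v == 0 && errno == ENOENT) return -1;

	return (int)v;
}
```

Inside dynlink.c the auxv is read directly rather than through
getauxval(), so this is only a way to observe the behaviour from the
outside.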


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 14:26 ` Rich Felker
  2023-10-16 21:09   ` Farid Zakaria
@ 2023-10-16 21:53   ` Szabolcs Nagy
  2023-10-16 22:04     ` Rich Felker
  1 sibling, 1 reply; 10+ messages in thread
From: Szabolcs Nagy @ 2023-10-16 21:53 UTC (permalink / raw)
  To: Rich Felker; +Cc: Farid Zakaria, musl

* Rich Felker <dalias@libc.org> [2023-10-16 10:26:04 -0400]:
> On Sun, Oct 15, 2023 at 06:06:48PM -0700, Farid Zakaria wrote:
> > Hi!
> > 
> > I'd like to read some section data during dynlink.c
> > Does anyone have any good suggestions on the best way to do so?
> > I believe most ELF files ask for the load to start from the start of the
> > ELF file.
> > 
> > I see in dynlink.c the kernel sends AT_PHDR as an auxiliary vector --
> > Should I try applying a fixed offset from it to get to the start of the
> > ehdr ?
> > 
> > Any advice is appreciated.
> > 
> > Please include me in the CC for the reply.
> > I can't recall if I've subscribed.
> 
> Neither the Ehdrs nor sections are "loadable" parts of an executable
> ELF file. They may happen to be present in the mapped pages due to
> page granularity of mappings, but that doesn't mean they're guaranteed
> to be there; the Ehdrs are for the program loader's use, and the
> sections are for the use of linker (non-dynamic), debugger, etc.
> 
> In musl we use Ehdrs in a couple places: the dynamic linker finds its
> own program headers via assuming they're mapped, but this is rather
> reasonable since we built it and it's either going to always-succeed
> or always-fail and get caught before deployment if that build-time
> assumption somehow isn't met. It's not contingent on properties of a
> program encountered at runtime. We also use Ehdrs when loading a
> program (invoking ldso as a command) or shared library, but in that
> case we are the loader and have access to them via the file being
> loaded.
> 
> Depending on what you want to do, and whether you just need to be
> compatible with your own binaries or arbitrary ones, it may suffice to
> do some sort of hack like rounding down from the program header
> address to the start of the page and hoping the Ehdrs live there. But
> it might make sense to look for other ways to do what you're trying to
> do, without needing to access non-runtime data structures.

note that (not too old) bfd ld and lld defines a hidden linker symbol
__ehdr_start that at runtime resolves to where the ehdr is.

example:

#include <elf.h>
#include <stdio.h>

__attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];

int main()
{
	if (__ehdr_start) {
		Elf64_Ehdr *ehdr = (void *)__ehdr_start;
		printf("ehdr %p\n", ehdr);
		Elf64_Phdr *phdr = (void *)(__ehdr_start + ehdr->e_phoff);
		printf("phdr %p\n", phdr);
	} else
		printf("__ehdr_start is undefined\n");

	// to compare against the actual mappings
	char buf[9999];
	FILE *f = fopen("/proc/self/maps","r");
	size_t n = fread(buf, 1, sizeof buf, f);
	fwrite(buf, 1, n, stdout);
}

this should work for 64bit elf exe if ehdr is mapped into memory.

if you want link time error on an old linker instead of 0 __ehdr_start,
then just drop "weak" and the runtime check. (the code as written assumes
ehdr is not at exact 0 address, which is guaranteed by usual linux setups)
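
a sketch of the non-weak variant mentioned above, written with ElfW()
so that it is not tied to 64bit. it assumes a linker that defines
__ehdr_start (otherwise the link fails) and that the program headers
are mapped along with the ehdr. own_ehdr() and count_load_segments()
are made-up names for illustration.

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>

/* Defined by the linker. Without "weak", a linker that lacks the
 * symbol produces a link-time error instead of a null address. */
__attribute__((visibility("hidden"))) extern const char __ehdr_start[];

/* Return this executable's own ELF header. */
const ElfW(Ehdr) *own_ehdr(void)
{
	return (const ElfW(Ehdr) *)__ehdr_start;
}

/* Count the PT_LOAD program headers of this executable.
 * Assumes e_phoff points into mapped memory. */
size_t count_load_segments(void)
{
	const ElfW(Ehdr) *ehdr = own_ehdr();
	const ElfW(Phdr) *phdr =
		(const ElfW(Phdr) *)(__ehdr_start + ehdr->e_phoff);
	size_t n = 0;

	for (size_t i = 0; i < ehdr->e_phnum; i++)
		if (phdr[i].p_type == PT_LOAD)
			n++;
	return n;
}
```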


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 21:53   ` Szabolcs Nagy
@ 2023-10-16 22:04     ` Rich Felker
  2023-10-17  3:39       ` Farid Zakaria
  2023-10-17  8:28       ` Szabolcs Nagy
  0 siblings, 2 replies; 10+ messages in thread
From: Rich Felker @ 2023-10-16 22:04 UTC (permalink / raw)
  To: Farid Zakaria, musl

On Mon, Oct 16, 2023 at 11:53:07PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2023-10-16 10:26:04 -0400]:
> > On Sun, Oct 15, 2023 at 06:06:48PM -0700, Farid Zakaria wrote:
> > > Hi!
> > > 
> > > I'd like to read some section data during dynlink.c
> > > Does anyone have any good suggestions on the best way to do so?
> > > I believe most ELF files ask for the load to start from the start of the
> > > ELF file.
> > > 
> > > I see in dynlink.c the kernel sends AT_PHDR as an auxiliary vector --
> > > Should I try applying a fixed offset from it to get to the start of the
> > > ehdr ?
> > > 
> > > Any advice is appreciated.
> > > 
> > > Please include me in the CC for the reply.
> > > I can't recall if I've subscribed.
> > 
> > Neither the Ehdrs nor sections are "loadable" parts of an executable
> > ELF file. They may happen to be present in the mapped pages due to
> > page granularity of mappings, but that doesn't mean they're guaranteed
> > to be there; the Ehdrs are for the program loader's use, and the
> > sections are for the use of linker (non-dynamic), debugger, etc.
> > 
> > In musl we use Ehdrs in a couple places: the dynamic linker finds its
> > own program headers via assuming they're mapped, but this is rather
> > reasonable since we built it and it's either going to always-succeed
> > or always-fail and get caught before deployment if that build-time
> > assumption somehow isn't met. It's not contingent on properties of a
> > program encountered at runtime. We also use Ehdrs when loading a
> > program (invoking ldso as a command) or shared library, but in that
> > case we are the loader and have access to them via the file being
> > loaded.
> > 
> > Depending on what you want to do, and whether you just need to be
> > compatible with your own binaries or arbitrary ones, it may suffice to
> > do some sort of hack like rounding down from the program header
> > address to the start of the page and hoping the Ehdrs live there. But
> > it might make sense to look for other ways to do what you're trying to
> > do, without needing to access non-runtime data structures.
> 
> note that (not too old) bfd ld and lld defines a hidden linker symbol
> __ehdr_start that at runtime resolves to where the ehdr is.
> 
> example:
> 
> #include <elf.h>
> #include <stdio.h>
> 
> __attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];
> 
> int main()
> {
> 	if (__ehdr_start) {
> 		Elf64_Ehdr *ehdr = (void *)__ehdr_start;
> 		printf("ehdr %p\n", ehdr);
> 		Elf64_Phdr *phdr = (void *)(__ehdr_start + ehdr->e_phoff);
> 		printf("phdr %p\n", phdr);
> 	} else
> 		printf("__ehdr_start is undefined\n");
> 
> 	// to compare against the actual mappings
> 	char buf[9999];
> 	FILE *f = fopen("/proc/self/maps","r");
> 	size_t n = fread(buf, 1, sizeof buf, f);
> 	fwrite(buf, 1, n, stdout);
> }
> 
> this should work for 64bit elf exe if ehdr is mapped into memory.
> 
> if you want link time error on an old linker instead of 0 __ehdr_start,
> then just drop "weak" and the runtime check. (the code as written assumes
> ehdr is not at exact 0 address, which is guaranteed by usual linux setups)

Interesting -- perhaps we should find a way to use this in ldso to
find its own ehdr.

Rich


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 22:04     ` Rich Felker
@ 2023-10-17  3:39       ` Farid Zakaria
  2023-10-17  8:28       ` Szabolcs Nagy
  1 sibling, 0 replies; 10+ messages in thread
From: Farid Zakaria @ 2023-10-17  3:39 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

For those reading the list, I ended up just opening the file named by
'app.name' and then mmapping it.

```
int fd = open(app.name, O_RDONLY);
if (fd < 0) {
	dprintf(2, "failed to open %s\n", app.name);
	_exit(1);
}

struct stat st;
if (fstat(fd, &st) < 0) {
	dprintf(2, "failed to stat %s\n", app.name);
	_exit(1);
}

/* Map the whole file read-only so that section headers and data are reachable. */
const ElfW(Ehdr) *ehdr = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (ehdr == MAP_FAILED) {
	dprintf(2, "failed to mmap %s\n", app.name);
	_exit(1);
}

if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) {
	dprintf(2, "Not a valid elf file\n");
	_exit(1);
}

const ElfW(Shdr) *section_header = find_section_by_name(ehdr, ".some-section");
if (section_header == NULL) {
	dprintf(2, "Cannot find .some-section section\n");
	_exit(1);
}
```
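
The find_section_by_name() helper is not shown in the thread. A
minimal sketch of what such a lookup might look like, assuming the
whole file is mapped at ehdr as above; this is an illustration, not
the code from the linked repository. It does no bounds checking
against the file size and ignores the extended section numbering
(SHN_XINDEX) used by files with very many sections.

```c
#include <elf.h>
#include <link.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the helper used above: walk the section
 * header table of a fully mapped ELF file and return the header whose
 * name matches, or NULL if there is none. */
const ElfW(Shdr) *find_section_by_name(const ElfW(Ehdr) *ehdr, const char *name)
{
	const char *base = (const char *)ehdr;

	/* No section header table or no section name string table. */
	if (ehdr->e_shoff == 0 || ehdr->e_shstrndx == SHN_UNDEF)
		return NULL;

	const ElfW(Shdr) *shdrs = (const ElfW(Shdr) *)(base + ehdr->e_shoff);

	/* Section names are offsets into the .shstrtab section's data. */
	const char *strtab = base + shdrs[ehdr->e_shstrndx].sh_offset;

	for (size_t i = 0; i < ehdr->e_shnum; i++)
		if (strcmp(strtab + shdrs[i].sh_name, name) == 0)
			return &shdrs[i];

	return NULL;
}
```

The section contents then start at (const char *)ehdr +
section_header->sh_offset and run for section_header->sh_size bytes.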


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-16 22:04     ` Rich Felker
  2023-10-17  3:39       ` Farid Zakaria
@ 2023-10-17  8:28       ` Szabolcs Nagy
  2023-10-17 12:24         ` Rich Felker
  2023-10-17 17:37         ` Farid Zakaria
  1 sibling, 2 replies; 10+ messages in thread
From: Szabolcs Nagy @ 2023-10-17  8:28 UTC (permalink / raw)
  To: Rich Felker; +Cc: Farid Zakaria, musl

* Rich Felker <dalias@libc.org> [2023-10-16 18:04:11 -0400]:
> On Mon, Oct 16, 2023 at 11:53:07PM +0200, Szabolcs Nagy wrote:
> > note that (not too old) bfd ld and lld defines a hidden linker symbol
> > __ehdr_start that at runtime resolves to where the ehdr is.
> > 
> > example:
> > 
> > #include <elf.h>
> > #include <stdio.h>
> > 
> > __attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];
> > 
> > int main()
> > {
> > 	if (__ehdr_start) {
> > 		Elf64_Ehdr *ehdr = (void *)__ehdr_start;
> > 		printf("ehdr %p\n", ehdr);
> > 		Elf64_Phdr *phdr = (void *)(__ehdr_start + ehdr->e_phoff);
> > 		printf("phdr %p\n", phdr);
> > 	} else
> > 		printf("__ehdr_start is undefined\n");
> > 
> > 	// to compare against the actual mappings
> > 	char buf[9999];
> > 	FILE *f = fopen("/proc/self/maps","r");
> > 	size_t n = fread(buf, 1, sizeof buf, f);
> > 	fwrite(buf, 1, n, stdout);
> > }
> > 
> > this should work for 64bit elf exe if ehdr is mapped into memory.
> > 
> > if you want link time error on an old linker instead of 0 __ehdr_start,
> > then just drop "weak" and the runtime check. (the code as written assumes
> > ehdr is not at exact 0 address, which is guaranteed by usual linux setups)
> 
> Interesting -- perhaps we should find a way to use this in ldso to
> find its own ehdr.

for that use it is a bit target specific:
the symbol address computation must be pc-relative with no dynamic reloc,
e.g. 'weak' would create a got reloc so not usable before relocs are done.

glibc switched to using it (but can use auxv too), requires binutils >= 2.23.
i think lld had issues with setting GOT[0] up with vaddr of _DYNAMIC
which is what glibc was relying on previously on many targets.


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-17  8:28       ` Szabolcs Nagy
@ 2023-10-17 12:24         ` Rich Felker
  2023-10-17 17:37         ` Farid Zakaria
  1 sibling, 0 replies; 10+ messages in thread
From: Rich Felker @ 2023-10-17 12:24 UTC (permalink / raw)
  To: musl

On Tue, Oct 17, 2023 at 10:28:00AM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2023-10-16 18:04:11 -0400]:
> > On Mon, Oct 16, 2023 at 11:53:07PM +0200, Szabolcs Nagy wrote:
> > > note that (not too old) bfd ld and lld defines a hidden linker symbol
> > > __ehdr_start that at runtime resolves to where the ehdr is.
> > > 
> > > example:
> > > 
> > > #include <elf.h>
> > > #include <stdio.h>
> > > 
> > > __attribute__((visibility("hidden"), weak)) extern char __ehdr_start[];
> > > 
> > > int main()
> > > {
> > > 	if (__ehdr_start) {
> > > 		Elf64_Ehdr *ehdr = (void *)__ehdr_start;
> > > 		printf("ehdr %p\n", ehdr);
> > > 		Elf64_Phdr *phdr = (void *)(__ehdr_start + ehdr->e_phoff);
> > > 		printf("phdr %p\n", phdr);
> > > 	} else
> > > 		printf("__ehdr_start is undefined\n");
> > > 
> > > 	// to compare against the actual mappings
> > > 	char buf[9999];
> > > 	FILE *f = fopen("/proc/self/maps","r");
> > > 	size_t n = fread(buf, 1, sizeof buf, f);
> > > 	fwrite(buf, 1, n, stdout);
> > > }
> > > 
> > > this should work for 64bit elf exe if ehdr is mapped into memory.
> > > 
> > > if you want link time error on an old linker instead of 0 __ehdr_start,
> > > then just drop "weak" and the runtime check. (the code as written assumes
> > > ehdr is not at exact 0 address, which is guaranteed by usual linux setups)
> > 
> > Interesting -- perhaps we should find a way to use this in ldso to
> > find its own ehdr.
> 
> for that use it is a bit target specific:
> the symbol address computation must be pc-relative with no dynamic reloc,

Indeed, that's what makes it difficult. crt_arch.h could compute it
along with _DYNAMIC, but that's more per-arch burden I would not like
to see, and it's not clear how it would distinguish the undefined
case if we're supporting that.

> e.g. 'weak' would create a got reloc so not usable before relocs are done.

A GOT reloc for a hidden symbol will be relative and already resolved
by dlstart.c. I'm not sure if we're making use of such a property
right now but it seems reasonable to do so; the symbol name cannot
exist in a form satisfiable by the symbolic relocations performed
later, so it must have been done at this point. At first I was
thinking of storing the address in a static var that dlstart.c would
have filled in, but this seems no better than (and equivalent to) just
letting the GOT do its thing.

Rich


* Re: [musl] Getting access to section data during dynlink.c
  2023-10-17  8:28       ` Szabolcs Nagy
  2023-10-17 12:24         ` Rich Felker
@ 2023-10-17 17:37         ` Farid Zakaria
  1 sibling, 0 replies; 10+ messages in thread
From: Farid Zakaria @ 2023-10-17 17:37 UTC (permalink / raw)
  To: Rich Felker, Farid Zakaria, musl

For posterity, and for those who are curious how I achieved it, here is
a link to the code:
https://github.com/fzakaria/musllibc/blob/ea4b030db9dfab2b6163883bf15e33f5b22d70f1/ldso/dynlink.c#L1997


Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
