From: musl <b.brezillon.musl@gmail.com>
To: musl@lists.openwall.com
Cc: Szabolcs Nagy <nsz@port70.net>
Subject: Re: ldso : dladdr support
Date: Wed, 08 Aug 2012 15:57:15 +0200
Message-ID: <5022703B.3090105@gmail.com>
In-Reply-To: <20120808115202.GL30810@port70.net>
[-- Attachment #1: Type: text/plain, Size: 3712 bytes --]
Same as before, except that I now use a bit mask to support multiple hash algorithms.
On 08/08/2012 13:52, Szabolcs Nagy wrote:
> * musl <b.brezillon.musl@gmail.com> [2012-08-08 11:55:45 +0200]:
>> Here is the new patch for dladdr + gnu hash support.
>>
>> On 08/08/2012 01:09, Rich Felker wrote:
>>> don't remember which) where I described some optimizations that should
>>> be present if gnu hash is to be supported. Basically, the dynamic
>>> linker should keep track of whether there's one of the two hashes
>>> that's supported by ALL loaded libs, and in that case, only ever use
>>> that hash, to avoid computing two different hashes.
>> The hash for a given algo is only computed once (if needed).
>> That's the reason for the computed table.
>> If all libs use the same hash algo, the other will never be computed.
> a lib can support both hash tables
> (so it's non-trivial to use the one that all libs support,
> eg instead of a single alg index a bit mask should be used)
>
See patch changes.
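To make the bit-mask idea concrete, here is a minimal sketch. It assumes a stripped-down struct dso that carries only the hashalgs mask and a next pointer (the real struct in dynlink.c has many more fields). common_algs() and pick_alg() are hypothetical helpers for illustration and are not part of the patch; preferring sysv when both tables exist simply mirrors what the patch does in find_sym.

```c
#include <stdint.h>
#include <stddef.h>

#define SYSV_HASH 0
#define GNU_HASH  1
#define NHASH     2

/* Toy stand-in for the linker's struct dso (assumption: only these fields). */
struct dso {
	uint32_t hashalgs;   /* bit i set => hash table i is present */
	struct dso *next;
};

/* Intersect the masks of all loaded libs: a bit survives only if
 * every lib in the list provides that hash table. */
static uint32_t common_algs(const struct dso *head)
{
	uint32_t mask = (1u << NHASH) - 1;
	for (; head; head = head->next)
		mask &= head->hashalgs;
	return mask;
}

/* Pick one algorithm from a mask, preferring sysv; -1 if the mask is empty. */
static int pick_alg(uint32_t mask)
{
	if (mask & (1u << SYSV_HASH)) return SYSV_HASH;
	if (mask & (1u << GNU_HASH)) return GNU_HASH;
	return -1;
}
```

If common_algs() returns a non-empty mask, a single hash is enough for the whole lookup; if it returns 0, the per-lib fallback in the patch is still needed.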
>> +#define SYSV_HASH_ALG_IDX 0
>> +#define GNU_HASH_ALG_IDX 1
>> +#define HASH_ALG_CNT 2
> if we go with this approach then use shorter names
> (these are only used locally)
>
> eg. SYSV_HASH, GNU_HASH, NHASH
Corrected.
>> + for (h1 &= (uint32_t)-2;; sym++) {
>> + h2 = *hashval++;
>> + if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
> there is still a ~1
>
> looking at it now probably writing out & 0xfffffffe
> is the cleanest
>
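For what it's worth, the three spellings clear the same bit when applied to a uint32_t. A small sketch, assuming a 32-bit int (the helper names are made up for illustration):

```c
#include <stdint.h>

/* Three spellings of "clear the low bit of a 32-bit hash value". */

/* ~1 is the int -2; it is converted to unsigned before the AND. */
static uint32_t clear_low_tilde(uint32_t h) { return h & ~1; }

/* (uint32_t)-2 wraps to 0xfffffffe by the unsigned conversion rules. */
static uint32_t clear_low_cast(uint32_t h) { return h & (uint32_t)-2; }

/* The literal states the mask directly, with no conversion to think about. */
static uint32_t clear_low_hex(uint32_t h) { return h & 0xfffffffe; }
```

The hex form is the only one that does not rely on the reader knowing the integer conversion rules, which is presumably why it reads as the cleanest.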
>> + static uint32_t precomp[HASH_ALG_CNT][3] = {
>> + {0x6b366be, 0x6b3afd, 0x595a4cc},
>> + {0xf9040207, 0xf4dc4ae, 0x1f4039c9},
>> + };
>> + uint32_t h[HASH_ALG_CNT];
>> + uint8_t computed[HASH_ALG_CNT] = {0, 0};
>> + uint8_t alg = dso->hashalg;
>> + if (alg == SYSV_HASH_ALG_IDX)
>> + h[alg] = sysv_hash(s);
>> + else
>> + h[alg] = gnu_hash(s);
>> +
>> + computed[alg] = 1;
>> +
>> + if (h[alg] == precomp[alg][0] && !strcmp(s, "dlopen")) rtld_used = 1;
>> + if (h[alg] == precomp[alg][1] && !strcmp(s, "dlsym")) rtld_used = 1;
>> + if (h[alg] == precomp[alg][2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
>> +
>> for (; dso; dso=dso->next) {
>> Sym *sym;
>> +
>> if (!dso->global) continue;
>> - sym = lookup(s, h, dso);
>> +
>> + alg = dso->hashalg;
>> + if (!computed[alg]) {
>> + if (alg == SYSV_HASH_ALG_IDX) {
>> + h[alg] = sysv_hash(s);
>> + sym = sysv_lookup(s, h[alg], dso);
>> + }
>> + else {
>> + h[alg] = gnu_hash(s);
>> + sym = gnu_lookup(s, h[alg], dso);
>> + }
>> + computed[alg] = 1;
>> + } else {
>> + if (alg == SYSV_HASH_ALG_IDX)
>> + sym = sysv_lookup(s, h[alg], dso);
>> + else
>> + sym = gnu_lookup(s, h[alg], dso);
>> + }
> instead of arrays i'd write
>
> alg = dso->hashalg;
> if (alg == SYSV_HASH) {
> if (sysv_ok) {
> ...
> sysv_ok = 1;
> }
> sym = sysv_lookup(s, sysv_h, dso);
> } else {
> }
I haven't changed this yet.
> since there are many ifs anyway
>
> the table approach is nicer when all data and functions
> are in a table:
>
> if (!ok[alg]) {
> h[alg] = hash[alg](s);
> ok[alg] = 1;
> }
> sym = lookup[alg](s, h[alg], dso);
>
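A sketch of what the fully table-driven variant could look like, limited to the hash half so it stays self-contained. gnu_hash is taken from the patch; the loop body of sysv_hash is not visible in the hunk, so the classic ELF hash loop is assumed here. struct hash_cache and get_hash() are hypothetical names; a lookup[] table of sysv_lookup/gnu_lookup would sit next to hash[] in the same way.

```c
#include <stdint.h>

#define SYSV_HASH 0
#define GNU_HASH  1
#define NHASH     2

/* Classic System V ELF hash (loop body assumed, see lead-in). */
static uint32_t sysv_hash(const char *s0)
{
	const unsigned char *s = (const void *)s0;
	uint_fast32_t h = 0;
	while (*s) {
		h = 16*h + *s++;
		h ^= h>>24 & 0xf0;
	}
	return h & 0xfffffff;
}

/* GNU hash as in the patch: h = h*33 + c, starting from 5381. */
static uint32_t gnu_hash(const char *s0)
{
	const unsigned char *s = (const void *)s0;
	uint_fast32_t h = 5381;
	for (; *s; s++)
		h = h*33 + *s;
	return h & 0xffffffff;
}

typedef uint32_t (*hash_fn)(const char *);
static const hash_fn hash[NHASH] = { sysv_hash, gnu_hash };

/* Per-lookup cache: each hash is computed at most once per symbol name. */
struct hash_cache {
	uint32_t h[NHASH];
	unsigned ok;         /* bit i set => h[i] is valid */
};

static uint32_t get_hash(struct hash_cache *c, int alg, const char *s)
{
	if (!(c->ok & 1u<<alg)) {
		c->h[alg] = hash[alg](s);
		c->ok |= 1u<<alg;
	}
	return c->h[alg];
}
```

With this shape the loop body in find_sym reduces to picking alg for the current dso and calling lookup[alg](s, get_hash(&c, alg, s), dso).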
>> + p->hashalg = SYSV_HASH_ALG_IDX;
>> p->strings = (void *)(p->base + dyn[DT_STRTAB]);
>> + for (; v[0]; v+=2) if (v[0] == DT_GNU_HASH) {
>> + p->hashtab = (void *)(p->base + v[1]);
>> + p->hashalg = GNU_HASH_ALG_IDX;
>> + }
> so it seems gnu hash is used whenever it's present
> i'm not sure if that's the right default..
>
> another possibility is to have a plain find_sym function
> which is simple and only supports sysv hash
> and whenever it encounters a lib that has no sysv hash in
> it a find_sym_gnu is called that does the hard work
> (so using gnu only libs is penalized, i don't know how
> common that case is though)
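A rough sketch of how that split could look, under heavy simplification: the toy struct dso holds a plain name list instead of real symbol and hash tables, toy_lookup() stands in for sysv_lookup()/gnu_lookup(), and the gnu_path_taken counter exists only to make the control flow observable. None of these names come from the patch.

```c
#include <stddef.h>
#include <string.h>

#define SYSV_HASH 0
#define GNU_HASH  1

/* Toy dso: the "symbol table" is just a NULL-terminated name list. */
struct dso {
	unsigned hashalgs;       /* bit i set => hash table i is present */
	const char **names;
	struct dso *next;
};

/* Stand-in for the real hash lookups: linear search by name. */
static const char *toy_lookup(const char *s, const struct dso *d)
{
	for (const char **n = d->names; *n; n++)
		if (!strcmp(s, *n)) return *n;
	return 0;
}

static int gnu_path_taken;   /* counts how often the slow path ran */

/* Slow path: continues the search from dso, handling either table type. */
static const char *find_sym_gnu(const struct dso *dso, const char *s)
{
	gnu_path_taken++;
	for (; dso; dso = dso->next) {
		/* a real version would choose sysv_lookup or gnu_lookup here */
		const char *sym = toy_lookup(s, dso);
		if (sym) return sym;
	}
	return 0;
}

/* Fast path: sysv only; hands over to the slow path at the first
 * lib that has no sysv hash table. */
static const char *find_sym(const struct dso *dso, const char *s)
{
	for (; dso; dso = dso->next) {
		if (!(dso->hashalgs & 1u<<SYSV_HASH))
			return find_sym_gnu(dso, s);
		const char *sym = toy_lookup(s, dso);
		if (sym) return sym;
	}
	return 0;
}
```

Symbols found in sysv-capable libs ahead of the first gnu-only lib never pay for the extra logic; everything else does, which is the penalty mentioned above.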
[-- Attachment #2: ldso-add-dladdr-and-gnu-hash-support.patch --]
[-- Type: text/x-patch, Size: 10514 bytes --]
From 359843b57223db412847f69702e19511dfeb435d Mon Sep 17 00:00:00 2001
From: Boris BREZILLON <b.brezillon@overkiz.com>
Date: Wed, 8 Aug 2012 15:49:29 +0200
Subject: [PATCH] ldso : add dladdr and gnu hash support
---
include/dlfcn.h | 15 +++
src/ldso/dynlink.c | 272 ++++++++++++++++++++++++++++++++++++++++++++++++----
2 files changed, 269 insertions(+), 18 deletions(-)
diff --git a/include/dlfcn.h b/include/dlfcn.h
index dea74c7..8c45822 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -18,6 +18,21 @@ char *dlerror(void);
void *dlopen(const char *, int);
void *dlsym(void *, const char *);
+#ifdef _GNU_SOURCE
+typedef struct {
+ const char *dli_fname; /* Pathname of shared object that
+ contains address */
+ void *dli_fbase; /* Address at which shared object
+ is loaded */
+ const char *dli_sname; /* Name of nearest symbol with address
+ lower than addr */
+ void *dli_saddr; /* Exact address of symbol named
+ in dli_sname */
+} Dl_info;
+
+int dladdr (void *addr, Dl_info *info);
+#endif
+
#ifdef __cplusplus
}
#endif
diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
index f55c6f1..4a236b2 100644
--- a/src/ldso/dynlink.c
+++ b/src/ldso/dynlink.c
@@ -1,3 +1,4 @@
+#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@@ -28,14 +29,20 @@ typedef Elf32_Phdr Phdr;
typedef Elf32_Sym Sym;
#define R_TYPE(x) ((x)&255)
#define R_SYM(x) ((x)>>8)
+#define ELF_ST_TYPE ELF32_ST_TYPE
#else
typedef Elf64_Ehdr Ehdr;
typedef Elf64_Phdr Phdr;
typedef Elf64_Sym Sym;
#define R_TYPE(x) ((x)&0xffffffff)
#define R_SYM(x) ((x)>>32)
+#define ELF_ST_TYPE ELF64_ST_TYPE
#endif
+#define SYSV_HASH 0
+#define GNU_HASH 1
+#define NHASH 2
+
struct debug {
int ver;
void *head;
@@ -52,7 +59,8 @@ struct dso {
int refcnt;
Sym *syms;
- uint32_t *hashtab;
+ uint32_t hashalgs;
+ uint32_t *hashtabs[NHASH];
char *strings;
unsigned char *map;
size_t map_len;
@@ -94,7 +102,7 @@ static void decode_vec(size_t *v, size_t *a, size_t cnt)
}
}
-static uint32_t hash(const char *s0)
+static uint32_t sysv_hash(const char *s0)
{
const unsigned char *s = (void *)s0;
uint_fast32_t h = 0;
@@ -105,11 +113,20 @@ static uint32_t hash(const char *s0)
return h & 0xfffffff;
}
-static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
+static uint32_t gnu_hash (const char *s0)
+{
+ const unsigned char *s = (void *)s0;
+ uint_fast32_t h = 5381;
+ for (; *s; s++)
+ h = h*33 + *s;
+ return h & 0xffffffff;
+}
+
+static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
{
size_t i;
Sym *syms = dso->syms;
- uint32_t *hashtab = dso->hashtab;
+ uint32_t *hashtab = dso->hashtabs[SYSV_HASH];
char *strings = dso->strings;
for (i=hashtab[2+h%hashtab[0]]; i; i=hashtab[2+hashtab[0]+i]) {
if (!strcmp(s, strings+syms[i].st_name))
@@ -118,20 +135,101 @@ static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
return 0;
}
+static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
+{
+ size_t i;
+ Sym *sym;
+ char *strings = dso->strings;
+ uint32_t *hashtab = dso->hashtabs[GNU_HASH];
+ uint32_t nbuckets = hashtab[0];
+ size_t *maskwords = (size_t *)(hashtab + 4);
+ uint32_t *buckets = hashtab + 4 + (hashtab[2]*(sizeof(size_t)/sizeof(uint32_t)));
+ uint32_t symndx = hashtab[1];
+ Sym *syms = dso->syms;
+ uint32_t shift2 = hashtab[3];
+ uint32_t h2 = h1 >> shift2;
+ uint32_t *hashvals = buckets + nbuckets;
+ uint32_t *hashval;
+ size_t c = sizeof(size_t) * 8;
+ size_t n = (h1/c) & (hashtab[2]-1);
+ size_t bitmask = ((size_t)1 << (h1%c)) | ((size_t)1 << (h2%c));
+
+ if ((maskwords[n] & bitmask) != bitmask)
+ return 0;
+
+ n = buckets[h1 % nbuckets];
+ if (!n)
+ return 0;
+
+ sym = syms + n;
+ hashval = hashvals + n - symndx;
+
+ for (h1 &= (uint32_t)-2;; sym++) {
+ h2 = *hashval++;
+ if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
+ return sym;
+
+ if (h2 & 1)
+ break;
+ }
+
+ return 0;
+}
+
#define OK_TYPES (1<<STT_NOTYPE | 1<<STT_OBJECT | 1<<STT_FUNC | 1<<STT_COMMON)
#define OK_BINDS (1<<STB_GLOBAL | 1<<STB_WEAK)
static void *find_sym(struct dso *dso, const char *s, int need_def)
{
- uint32_t h = hash(s);
void *def = 0;
- if (h==0x6b366be && !strcmp(s, "dlopen")) rtld_used = 1;
- if (h==0x6b3afd && !strcmp(s, "dlsym")) rtld_used = 1;
- if (h==0x595a4cc && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+ static uint32_t precomp[NHASH][3] = {
+ {0x6b366be, 0x6b3afd, 0x595a4cc},
+ {0xf9040207, 0xf4dc4ae, 0x1f4039c9},
+ };
+ uint32_t h[NHASH];
+ uint8_t ok = 0;
+ uint8_t algs = dso->hashalgs;
+ uint8_t alg;
+ if (algs & (1 << SYSV_HASH)) {
+ h[SYSV_HASH] = sysv_hash(s);
+ alg = SYSV_HASH;
+ } else {
+ h[GNU_HASH] = gnu_hash(s);
+ alg = GNU_HASH;
+ }
+
+ ok |= (1 << alg);
+
+ if (h[alg] == precomp[alg][0] && !strcmp(s, "dlopen")) rtld_used = 1;
+ if (h[alg] == precomp[alg][1] && !strcmp(s, "dlsym")) rtld_used = 1;
+ if (h[alg] == precomp[alg][2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+
for (; dso; dso=dso->next) {
Sym *sym;
+
if (!dso->global) continue;
- sym = lookup(s, h, dso);
+
+ algs = dso->hashalgs;
+ if (!(algs & ok)) {
+ if (algs & (1 << SYSV_HASH)) {
+ alg = SYSV_HASH;
+ h[alg] = sysv_hash(s);
+ sym = sysv_lookup(s, h[alg], dso);
+ }
+ else {
+ alg = GNU_HASH;
+ h[alg] = gnu_hash(s);
+ sym = gnu_lookup(s, h[alg], dso);
+ }
+
+ ok |= (1 << alg);
+ } else {
+ if ((algs & ok) & (1 << SYSV_HASH))
+ sym = sysv_lookup(s, h[SYSV_HASH], dso);
+ else
+ sym = gnu_lookup(s, h[GNU_HASH], dso);
+ }
+
if (sym && (!need_def || sym->st_shndx) && sym->st_value
&& (1<<(sym->st_info&0xf) & OK_TYPES)
&& (1<<(sym->st_info>>4) & OK_BINDS)) {
@@ -320,11 +418,28 @@ static int path_open(const char *name, const char *search, char *buf, size_t buf
static void decode_dyn(struct dso *p)
{
- size_t dyn[DYN_CNT] = {0};
- decode_vec(p->dynv, dyn, DYN_CNT);
- p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
- p->hashtab = (void *)(p->base + dyn[DT_HASH]);
- p->strings = (void *)(p->base + dyn[DT_STRTAB]);
+ size_t *v;
+ p->hashalgs = 0;
+ for (v = p->dynv; v[0]; v+=2) {
+ switch (v[0]) {
+ case DT_SYMTAB:
+ p->syms = (void *)(p->base + v[1]);
+ break;
+ case DT_HASH:
+ p->hashtabs[SYSV_HASH] = (void *)(p->base + v[1]);
+ p->hashalgs |= (1 << SYSV_HASH);
+ break;
+ case DT_STRTAB:
+ p->strings = (void *)(p->base + v[1]);
+ break;
+ case DT_GNU_HASH:
+ p->hashtabs[GNU_HASH] = (void *)(p->base + v[1]);
+ p->hashalgs |= (1 << GNU_HASH);
+ break;
+ default:
+ break;
+ }
+ }
}
static struct dso *load_library(const char *name)
@@ -784,8 +899,11 @@ end:
static void *do_dlsym(struct dso *p, const char *s, void *ra)
{
size_t i;
- uint32_t h;
Sym *sym;
+ uint32_t ok = 0;
+ uint8_t algs = p->hashalgs;
+ uint32_t h[NHASH];
+
if (p == RTLD_NEXT) {
for (p=head; p && (unsigned char *)ra-p->map>p->map_len; p=p->next);
if (!p) p=head;
@@ -798,12 +916,39 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
if (!res) goto failed;
return res;
}
- h = hash(s);
- sym = lookup(s, h, p);
+
+ if (algs & (1 << SYSV_HASH)) {
+ h[SYSV_HASH] = sysv_hash(s);
+ sym = sysv_lookup(s, h[SYSV_HASH], p);
+ ok |= 1 << SYSV_HASH;
+ } else {
+ h[GNU_HASH] = gnu_hash(s);
+ sym = gnu_lookup(s, h[GNU_HASH], p);
+ ok |= 1 << GNU_HASH;
+ }
+
+
if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
return p->base + sym->st_value;
if (p->deps) for (i=0; p->deps[i]; i++) {
- sym = lookup(s, h, p->deps[i]);
+ algs = p->deps[i]->hashalgs;
+ if (!(algs & ok)) {
+ if (algs & (1 << SYSV_HASH)) {
+ h[SYSV_HASH] = sysv_hash(s);
+ sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
+ ok |= 1 << SYSV_HASH;
+ } else {
+ h[GNU_HASH] = gnu_hash(s);
+ sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
+ ok |= 1 << GNU_HASH;
+ }
+ } else {
+ if ((algs & ok) & (1 << SYSV_HASH))
+ sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
+ else
+ sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
+ }
+
if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
return p->deps[i]->base + sym->st_value;
}
@@ -813,6 +958,97 @@ failed:
return 0;
}
+struct sym_search {
+ void *addr;
+ Dl_info *info;
+};
+
+static int find_closest_sym (struct dso *dso, Sym *sym, struct sym_search *search)
+{
+ void *symaddr = dso->base + sym->st_value;
+ char *strings = dso->strings;
+ Dl_info *info = search->info;
+ void *addr = search->addr;
+ void *prevaddr = info->dli_saddr;
+
+ if (sym->st_value == 0 && sym->st_shndx == SHN_UNDEF)
+ return 1;
+
+ if (ELF_ST_TYPE(sym->st_info) == STT_TLS)
+ return 1;
+
+ if (addr < symaddr)
+ return 1;
+
+ if (prevaddr && (addr - symaddr) > (addr - prevaddr))
+ return 1;
+
+ info->dli_saddr = symaddr;
+ info->dli_sname = strings + sym->st_name;
+
+ if (addr == symaddr)
+ return 0;
+
+ return 1;
+
+}
+
+static int do_dladdr (void *addr, Dl_info *info)
+{
+ struct sym_search search;
+ struct dso *p;
+ memset (info, 0, sizeof (*info));
+ search.info = info;
+ search.addr = addr;
+ for (p=head; p; p=p->next) {
+ if ((unsigned char *)addr >= p->map && (unsigned char *)addr < p->map + p->map_len) {
+ Sym *syms = p->syms;
+ uint32_t *hashtab;
+ size_t i;
+ info->dli_fname = p->name;
+ info->dli_fbase = p->base;
+ if (p->hashalgs & (1 << SYSV_HASH)) {
+ hashtab = p->hashtabs[SYSV_HASH];
+ for (i = 0; i < hashtab[1]; i++) {
+ if (!find_closest_sym (p, syms + i, &search))
+ return 1;
+ }
+ } else if(p->hashalgs & (1 << GNU_HASH)) {
+ uint32_t *buckets;
+ uint32_t nbuckets;
+ uint32_t *hashvals;
+ uint32_t symndx;
+ hashtab = p->hashtabs[GNU_HASH];
+ buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t)/sizeof(uint32_t)));
+ nbuckets = hashtab[0];
+ hashvals = buckets + nbuckets;
+ symndx = hashtab[1];
+ for (i = 0; i < nbuckets; ++i) {
+ uint32_t n = buckets[i];
+ Sym *sym = syms + n;
+ uint32_t *hashval = hashvals + (n ? n - symndx : 0);
+ if (!n) continue; /* empty bucket: no chain to walk */
+ do {
+ if (!find_closest_sym (p, sym++, &search))
+ return 1;
+ } while (!(*hashval++ & 1));
+ }
+ }
+ return 1;
+ }
+ }
+ return 0;
+}
+
+int dladdr (void *addr, Dl_info *info)
+{
+ int res;
+ pthread_rwlock_rdlock(&lock);
+ res = do_dladdr (addr, info);
+ pthread_rwlock_unlock(&lock);
+ return res;
+}
+
void *__dlsym(void *p, const char *s, void *ra)
{
void *res;
--
1.7.9.5