From: musl <b.brezillon.musl@gmail.com>
To: musl@lists.openwall.com
Subject: Re: ldso : dladdr support
Date: Mon, 20 Aug 2012 16:32:00 +0200 [thread overview]
Message-ID: <50324A60.7040206@gmail.com> (raw)
In-Reply-To: <503233A8.8000604@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 4089 bytes --]
I missed a bug in my previous patch :
in find_sym func precomptab was always set to sysv_precomp.
On 20/08/2012 14:55, musl wrote:
> Hi,
>
> This patch adds:
> * decode_vec2 implemented for high DT_x values.
> * find_closest_sym code integrated in do_dladdr function.
> * other fixes you asked in your previous mail.
>
> Regards,
>
> Boris.
>
>
> On 20/08/2012 04:06, Rich Felker wrote:
>> On Sun, Aug 19, 2012 at 06:42:30PM +0200, musl wrote:
>>> Hi,
>>>
>>> This patch fixes a bug in dladdr: sym var was not incremented across gnu hash chain iteration).
>>> I also reworked the dladdr implem to share more code between sysv and gnu hash.
>>> I still haven't found a better way to get the symbol table size. Do you?
>>>
>>> This patch uses the new decode_vec function, but as I told you in my
>>> previous mail, I'm not sure this the way to go.
>>> Could you tell me what you think?
>> Yeah, I'm not really happy with it either. Trying to think of
>> something better...
>>
>>> diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
>>> index f55c6f1..bf1ec6b 100644
>>> --- a/src/ldso/dynlink.c
>>> +++ b/src/ldso/dynlink.c
>>> @@ -1,3 +1,4 @@
>>> +#define _GNU_SOURCE
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <string.h>
>>> @@ -28,12 +29,14 @@ typedef Elf32_Phdr Phdr;
>>> typedef Elf32_Sym Sym;
>>> #define R_TYPE(x) ((x)&255)
>>> #define R_SYM(x) ((x)>>8)
>>> +#define ELF_ST_TYPE ELF32_ST_TYPE
>>> #else
>>> typedef Elf64_Ehdr Ehdr;
>>> typedef Elf64_Phdr Phdr;
>>> typedef Elf64_Sym Sym;
>>> #define R_TYPE(x) ((x)&0xffffffff)
>>> #define R_SYM(x) ((x)>>32)
>>> +#define ELF_ST_TYPE ELF64_ST_TYPE
>>> #endif
>> These definitions are actually the same. I would just
>> #define ST_TYPE(x) ((x)&15)
>>
>>> -static uint32_t hash(const char *s0)
>>> +static uint32_t sysv_hash(const char *s0)
>>> {
>>> const unsigned char *s = (void *)s0;
>>> uint_fast32_t h = 0;
>>> @@ -105,7 +117,16 @@ static uint32_t hash(const char *s0)
>>> return h & 0xfffffff;
>>> }
>>>
>>> -static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
>>> +static uint32_t gnu_hash (const char *s0)
>>> +{
>>> + const unsigned char *s = (void *)s0;
>>> + uint_fast32_t h = 5381;
>>> + for (; *s; s++)
>>> + h = h*33 + *s;
>>> + return h & 0xffffffff;
>>> +}
>> The final &0xffffffff is a no-op. Note that the one in sysv_hash is
>> not a no-op; sysv_hash's result is 28 bits, not 32.
>>
>> Re-reading this code also raised another issue: I'm not entirely
>> convinced that 0 is not a possible hash value, which may invalidate
>> what I said before about using h==0 to indicate "not yet computed". Of
>> course, it may not matter; if one in 4 billion symbol names get their
>> hashes repeatedly recomputed rather than being reused, it's not going
>> to make any difference to overall performance...
>>
>>> /* Only trust user/env if kernel says we're not suid/sgid */
>>> - if ((aux[0]&0x7800)!=0x7800 || aux[AT_UID]!=aux[AT_EUID]
>>> - || aux[AT_GID]!=aux[AT_EGID] || aux[AT_SECURE]) {
>>> + if ((found&0x1e0)!=0x1e0 || aux[5]!=aux[6]
>>> + || aux[7]!=aux[8] || aux[9]) {
>>> env_path = 0;
>>> env_preload = 0;
>> Looking at this, I agree that the new decode_vec idea is not a good
>> direction. It's obfuscating the code badly.
>>
>> For now, how about leaving the old decode_vec alone and just adding a
>> new one with a different name for getting to "high" entries. I wonder
>> if it would be possible, rather than using a list of wanted entries,
>> to use a base/count rather than always working zero-based like
>> decode_vec does. This would allow the resulting indices to still
>> actually mean something so we don't wind up with magic numbers all
>> over the code..
> The decode_vec uses the first entry in 'a' to store the tags found in the given vector.
> If cnt is bigger than sizeof(size_t) * 8, there is a shift overflow.
> It's fine in the current use as you only test tag values < 32 (or don't use decode_vec to test it : AT_SYSINFO_EHDR).
> Should I take care of those cases in decode_vec2 (add a separate 'found' table argument)?
>> Rich
[-- Attachment #2: dladdr-gnu-hash-v4-bis.patch --]
[-- Type: text/x-patch, Size: 7892 bytes --]
diff --git a/include/dlfcn.h b/include/dlfcn.h
index dea74c7..8524e0b 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -18,6 +18,17 @@ char *dlerror(void);
void *dlopen(const char *, int);
void *dlsym(void *, const char *);
+#ifdef _GNU_SOURCE
+typedef struct {
+ const char *dli_fname;
+ void *dli_fbase;
+ const char *dli_sname;
+ void *dli_saddr;
+} Dl_info;
+
+int dladdr (void *addr, Dl_info *info);
+#endif
+
#ifdef __cplusplus
}
#endif
diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
index 9692c6b..a7725ca 100644
--- a/src/ldso/dynlink.c
+++ b/src/ldso/dynlink.c
@@ -1,3 +1,4 @@
+#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
@@ -35,6 +36,7 @@ typedef Elf64_Sym Sym;
#define R_TYPE(x) ((x)&0xffffffff)
#define R_SYM(x) ((x)>>32)
#endif
+#define ST_TYPE(x) ((x)&15)
struct debug {
int ver;
@@ -53,6 +55,7 @@ struct dso {
int refcnt;
Sym *syms;
uint32_t *hashtab;
+ uint32_t *ghashtab;
char *strings;
unsigned char *map;
size_t map_len;
@@ -85,6 +88,7 @@ struct debug *_dl_debug_addr = &debug;
#define AUX_CNT 24
#define DYN_CNT 34
+#define DYN_ADDR_CNT 12
static void decode_vec(size_t *v, size_t *a, size_t cnt)
{
@@ -95,7 +99,16 @@ static void decode_vec(size_t *v, size_t *a, size_t cnt)
}
}
-static uint32_t hash(const char *s0)
+static void decode_vec2(size_t *v, size_t *a, size_t cnt, size_t offset)
+{
+ memset(a, 0, cnt*sizeof(size_t));
+ for (; v[0]; v+=2) if (v[0]<offset && v[0]>(offset-cnt)) {
+ a[0] |= 1ULL<<(offset-v[0]);
+ a[offset-v[0]] = v[1];
+ }
+}
+
+static uint32_t sysv_hash(const char *s0)
{
const unsigned char *s = (void *)s0;
uint_fast32_t h = 0;
@@ -106,7 +119,16 @@ static uint32_t hash(const char *s0)
return h & 0xfffffff;
}
-static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
+static uint32_t gnu_hash (const char *s0)
+{
+ const unsigned char *s = (void *)s0;
+ uint_fast32_t h = 5381;
+ for (; *s; s++)
+ h = h*33 + *s;
+ return h;
+}
+
+static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
{
size_t i;
Sym *syms = dso->syms;
@@ -119,20 +141,81 @@ static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
return 0;
}
+static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
+{
+ size_t i;
+ Sym *sym;
+ char *strings = dso->strings;
+ uint32_t *hashtab = dso->ghashtab;
+ uint32_t nbuckets = hashtab[0];
+ size_t *maskwords = (size_t *)(hashtab + 4);
+ uint32_t *buckets = hashtab + 4 + (hashtab[2]*(sizeof(size_t)/sizeof(uint32_t)));
+ uint32_t symndx = hashtab[1];
+ Sym *syms = dso->syms;
+ uint32_t shift2 = hashtab[3];
+ uint32_t h2 = h1 >> shift2;
+ uint32_t *hashvals = buckets + nbuckets;
+ uint32_t *hashval;
+ size_t c = sizeof(size_t) * 8;
+ size_t n = (h1/c) & (hashtab[2]-1);
+ size_t bitmask = (1 << (h1%c)) | (1 << (h2%c));
+
+ if ((maskwords[n] & bitmask) != bitmask)
+ return 0;
+
+ n = buckets[h1 % nbuckets];
+ if (!n)
+ return 0;
+
+ sym = syms + n;
+ hashval = hashvals + n - symndx;
+
+ for (h1 &= (uint32_t)-2;; sym++) {
+ h2 = *hashval++;
+ if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
+ return sym;
+
+ if (h2 & 1)
+ break;
+ }
+
+ return 0;
+}
+
#define OK_TYPES (1<<STT_NOTYPE | 1<<STT_OBJECT | 1<<STT_FUNC | 1<<STT_COMMON)
#define OK_BINDS (1<<STB_GLOBAL | 1<<STB_WEAK)
static void *find_sym(struct dso *dso, const char *s, int need_def)
{
- uint32_t h = hash(s);
+ uint32_t h = 0, gh = 0;
void *def = 0;
- if (h==0x6b366be && !strcmp(s, "dlopen")) rtld_used = 1;
- if (h==0x6b3afd && !strcmp(s, "dlsym")) rtld_used = 1;
- if (h==0x595a4cc && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+ static uint32_t precomp[2][3] = {
+ {0x6b366be, 0x6b3afd, 0x595a4cc},
+ {0xf9040207, 0xf4dc4ae, 0x1f4039c9},
+ };
+ uint32_t *precomptab;
+ if (dso->hashtab) {
+ h = sysv_hash(s);
+ precomptab = precomp[0];
+ } else {
+ gh = gnu_hash(s);
+ precomptab = precomp[1];
+ }
+ if (h == precomptab[0] && !strcmp(s, "dlopen")) rtld_used = 1;
+ if (h == precomptab[1] && !strcmp(s, "dlsym")) rtld_used = 1;
+ if (h == precomptab[2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
for (; dso; dso=dso->next) {
Sym *sym;
if (!dso->global) continue;
- sym = lookup(s, h, dso);
+ if (dso->hashtab && (h || !dso->ghashtab)) {
+ if (!h)
+ h = sysv_hash(s);
+ sym = sysv_lookup(s, h, dso);
+ } else {
+ if (!gh)
+ gh = gnu_hash(s);
+ sym = gnu_lookup(s, gh, dso);
+ }
if (sym && (!need_def || sym->st_shndx) && sym->st_value
&& (1<<(sym->st_info&0xf) & OK_TYPES)
&& (1<<(sym->st_info>>4) & OK_BINDS)) {
@@ -325,8 +408,12 @@ static void decode_dyn(struct dso *p)
size_t dyn[DYN_CNT] = {0};
decode_vec(p->dynv, dyn, DYN_CNT);
p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
- p->hashtab = (void *)(p->base + dyn[DT_HASH]);
p->strings = (void *)(p->base + dyn[DT_STRTAB]);
+ if (dyn[0]&(1<<DT_HASH))
+ p->hashtab = (void *)(p->base + dyn[DT_HASH]);
+ decode_vec2(p->dynv, dyn, DYN_ADDR_CNT, DT_ADDRRNGHI+1);
+ if (dyn[0]&(1<<(DT_ADDRTAGIDX(DT_GNU_HASH)+1)))
+ p->ghashtab = (void *)(p->base + dyn[DT_ADDRTAGIDX(DT_GNU_HASH)+1]);
}
static struct dso *load_library(const char *name)
@@ -788,7 +875,7 @@ end:
static void *do_dlsym(struct dso *p, const char *s, void *ra)
{
size_t i;
- uint32_t h;
+ uint32_t h = 0, gh = 0;
Sym *sym;
if (p == RTLD_NEXT) {
for (p=head; p && (unsigned char *)ra-p->map>p->map_len; p=p->next);
@@ -802,12 +889,25 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
if (!res) goto failed;
return res;
}
- h = hash(s);
- sym = lookup(s, h, p);
+ if (p->hashtab) {
+ h = sysv_hash(s);
+ sym = sysv_lookup(s, h, p);
+ } else {
+ gh = gnu_hash(s);
+ sym = gnu_lookup(s, gh, p);
+ }
if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
return p->base + sym->st_value;
if (p->deps) for (i=0; p->deps[i]; i++) {
- sym = lookup(s, h, p->deps[i]);
+ if (p->deps[i]->hashtab && (h || !p->deps[i]->ghashtab)) {
+ if (!h)
+ h = sysv_hash(s);
+ sym = sysv_lookup(s, h, p->deps[i]);
+ } else {
+ if (!gh)
+ gh = gnu_hash(s);
+ sym = gnu_lookup(s, h, p->deps[i]);
+ }
if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
return p->deps[i]->base + sym->st_value;
}
@@ -817,6 +917,69 @@ failed:
return 0;
}
+static int do_dladdr (void *addr, Dl_info *info)
+{
+ struct dso *p;
+ memset (info, 0, sizeof (*info));
+ for (p=head; p; p=p->next) {
+ if ((unsigned char *)addr >= p->map && (unsigned char *)addr < p->map + p->map_len) {
+ Sym *syms = p->syms;
+ uint32_t nsym = 0;
+ char *strings = p->strings;
+ size_t i;
+ info->dli_fname = p->name;
+ info->dli_fbase = p->base;
+ if (p->hashtab)
+ nsym = p->hashtab[1];
+ else {
+ uint32_t *buckets;
+ buckets = p->ghashtab + 4 + (p->ghashtab[2] * (sizeof(size_t)/sizeof(uint32_t)));
+ syms += p->ghashtab[1];
+ for (i = 0; i < p->ghashtab[0]; ++i) {
+ if (buckets[i] > nsym)
+ nsym = buckets[i];
+ }
+
+ if (nsym) {
+ nsym -= p->ghashtab[1];
+ uint32_t *hashval = buckets + p->ghashtab[0] + nsym;
+ do {
+ nsym++;
+ }while (!(*hashval++ & 1));
+ }
+
+ }
+
+ for (i = 0; i < nsym; i++) {
+ if ((syms[i].st_value != 0 || syms[i].st_shndx != SHN_UNDEF) &&
+ ST_TYPE(syms[i].st_info) != STT_TLS) {
+ void *symaddr = p->base + syms[i].st_value;
+
+ if (addr >= symaddr && (!info->dli_saddr || (addr-symaddr)<(addr-info->dli_saddr))) {
+ info->dli_saddr = symaddr;
+ info->dli_sname = strings + syms[i].st_name;
+
+ if (addr == symaddr)
+ break;
+ }
+ }
+ }
+
+ return 1;
+ }
+ }
+ return 0;
+}
+
+int dladdr (void *addr, Dl_info *info)
+{
+ int res;
+ pthread_rwlock_rdlock(&lock);
+ res = do_dladdr (addr, info);
+ pthread_rwlock_unlock(&lock);
+ return res;
+}
+
void *__dlsym(void *p, const char *s, void *ra)
{
void *res;
next prev parent reply other threads:[~2012-08-20 14:32 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-08-07 9:04 musl
2012-08-07 11:46 ` Szabolcs Nagy
2012-08-07 14:15 ` musl
2012-08-07 14:53 ` Szabolcs Nagy
2012-08-07 23:09 ` Rich Felker
2012-08-08 9:55 ` musl
2012-08-08 11:52 ` Szabolcs Nagy
2012-08-08 12:54 ` Rich Felker
2012-08-08 13:57 ` musl
2012-08-11 23:05 ` Rich Felker
2012-08-15 22:41 ` boris brezillon
2012-08-17 5:39 ` Rich Felker
2012-08-19 16:42 ` musl
2012-08-20 2:06 ` Rich Felker
2012-08-20 12:55 ` musl
2012-08-20 14:32 ` musl [this message]
2012-08-23 21:39 ` Rich Felker
2012-08-23 22:21 ` Rich Felker
2012-08-24 7:29 ` musl
2012-08-24 18:38 ` Rich Felker
2012-08-25 7:42 ` boris brezillon
2012-08-25 12:35 ` Rich Felker
2012-08-25 22:13 ` musl
2012-08-25 22:37 ` musl
2012-08-26 0:00 ` musl
2012-08-24 8:12 ` Szabolcs Nagy
2012-08-24 8:56 ` musl
2012-08-24 9:38 ` Szabolcs Nagy
2012-08-25 21:34 ` musl
2012-08-25 21:42 ` Rich Felker
2012-08-16 18:03 ` musl
2012-08-17 16:35 ` musl
2012-08-08 12:49 ` Rich Felker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50324A60.7040206@gmail.com \
--to=b.brezillon.musl@gmail.com \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).