mailing list of musl libc
From: musl <b.brezillon.musl@gmail.com>
To: musl@lists.openwall.com
Cc: Rich Felker <dalias@aerifal.cx>
Subject: Re: ldso : dladdr support
Date: Thu, 16 Aug 2012 20:03:59 +0200
Message-ID: <502D360F.3060606@gmail.com>
In-Reply-To: <20120811230536.GQ27715@brightrain.aerifal.cx>

[-- Attachment #1: Type: text/plain, Size: 5563 bytes --]

Hi,

Here is a new patch (and a diff from the previous one).

Could you tell me if the new decode_vec function is what you expected?

Regards,

Boris

On 12/08/2012 01:05, Rich Felker wrote:
> On Wed, Aug 08, 2012 at 03:57:15PM +0200, musl wrote:
>> Same as before except I use a bit mask to support multiple hash algorithms.
> Sorry for taking a while to get back to you. I haven't had as much
> time to work on musl the past couple weeks and some other topics (like
> mips dynamic linking) had priority, but I hope to have more again for
> a while now. Here's a quick review of some things that will hopefully
> turn into a discussion for improving/simplifying the code.
>
>> +#ifdef _GNU_SOURCE
>> +typedef struct {
>> +    const char *dli_fname;  /* Pathname of shared object that
>> +                               contains address */
>> +    void       *dli_fbase;  /* Address at which shared object
>> +                               is loaded */
>> +    const char *dli_sname;  /* Name of nearest symbol with address
>> +                               lower than addr */
>> +    void       *dli_saddr;  /* Exact address of symbol named
>> +                               in dli_sname */
>> +} Dl_info;
> musl policy is not to have commentary, especially copied from
> third-party sources, in the public headers. This is partly to
> strengthen the claim that all public headers are public domain
> (contain no copyrightable content) and partly just to avoid size creep
> and wasted parsing time by the compiler.
>
>>  static void decode_dyn(struct dso *p)
>>  {
>> -	size_t dyn[DYN_CNT] = {0};
>> -	decode_vec(p->dynv, dyn, DYN_CNT);
>> -	p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
>> -	p->hashtab = (void *)(p->base + dyn[DT_HASH]);
>> -	p->strings = (void *)(p->base + dyn[DT_STRTAB]);
>> +	size_t *v;
>> +	p->hashalgs = 0;
>> +	for (v = p->dynv; v[0]; v+=2) {
>> +		switch (v[0]) {
>> +		case DT_SYMTAB:
>> +			p->syms = (void *)(p->base + v[1]);
>> +			break;
>> +		case DT_HASH:
>> +			p->hashtabs[SYSV_HASH] = (void *)(p->base + v[1]);
>> +			p->hashalgs |= (1 << SYSV_HASH);
>> +			break;
>> +		case DT_STRTAB:
>> +			p->strings = (void *)(p->base + v[1]);
>> +			break;
>> +		case DT_GNU_HASH:
>> +			p->hashtabs[GNU_HASH] = (void *)(p->base + v[1]);
>> +			p->hashalgs |= (1 << GNU_HASH);
>> +			break;
>> +		default:
>> +			break;
>> +		}
>> +	}
>>  }
> This is rather ugly but I don't see a better way right off, since
> DT_GNU_HASH has that huge value... Maybe it would be nice to improve
> decode_vec to have a variant that takes a (static const) table of DT_x
> values and struct offsets to store them at when found..? This is just
> some rambling and I'm not sure it's better, but it might be smart if
> we're going to want to continue adding support for more
> non-original-sysv DT_* entries with huge values, so we don't have to
> write new switches for each one.
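A minimal sketch of the table-driven variant suggested above, for illustration only. The names demo_dso, tag_slot, demo_slots and decode_by_table are hypothetical and do not exist in musl; the DEMO_DT_* constants mirror the standard ELF tag numbers. It is not the decode_vec from the attached patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the few struct dso fields of interest. */
struct demo_dso {
	size_t symtab;
	size_t strtab;
	size_t hash;
	size_t gnu_hash;
};

/* One table entry: a dynamic tag and the offset where its value goes. */
struct tag_slot {
	size_t tag;
	size_t offset;
};

/* Tag numbers as defined by the ELF gABI and the GNU extension. */
#define DEMO_DT_HASH     4
#define DEMO_DT_STRTAB   5
#define DEMO_DT_SYMTAB   6
#define DEMO_DT_GNU_HASH 0x6ffffef5

static const struct tag_slot demo_slots[] = {
	{ DEMO_DT_SYMTAB,   offsetof(struct demo_dso, symtab)   },
	{ DEMO_DT_STRTAB,   offsetof(struct demo_dso, strtab)   },
	{ DEMO_DT_HASH,     offsetof(struct demo_dso, hash)     },
	{ DEMO_DT_GNU_HASH, offsetof(struct demo_dso, gnu_hash) },
};

/* Walk a zero-terminated (tag, value) vector and store every tag that
 * appears in the table; unknown tags are skipped. */
static void decode_by_table(const size_t *v, const struct tag_slot *slots,
                            size_t nslots, void *out)
{
	assert(v && slots && out);
	for (; v[0]; v += 2) {
		for (size_t i = 0; i < nslots; i++) {
			if (v[0] == slots[i].tag) {
				*(size_t *)((char *)out + slots[i].offset) = v[1];
				break;
			}
		}
	}
}
```

With this shape, supporting another large-valued DT_* entry would mean adding one table row rather than a new switch case. The cost is an inner loop per dynamic entry, which should be negligible for tables this small.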
>
> BTW, while I _want_ it to be safe, it's possible that early switches
> (early meaning prior to the comment in __dynlink that says libc is now
> fully functional) will actually fail to work/crash on some archs... So
> this needs consideration too.
>
>>  static struct dso *load_library(const char *name)
>> @@ -784,8 +899,11 @@ end:
>>  static void *do_dlsym(struct dso *p, const char *s, void *ra)
>>  {
>> [...]
>>  	if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
>>  		return p->base + sym->st_value;
>>  	if (p->deps) for (i=0; p->deps[i]; i++) {
>> -		sym = lookup(s, h, p->deps[i]);
>> +		algs = p->deps[i]->hashalgs;
>> +		if (!(algs & ok)) {
>> +			if (algs & SYSV_HASH) {
>> +				h[SYSV_HASH] = sysv_hash(s);
>> +				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
>> +				ok |= SYSV_HASH;
>> +			} else {
>> +				h[GNU_HASH] = gnu_hash(s);
>> +				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
>> +				ok |= GNU_HASH;
>> +			}
>> +		} else {
>> +			if (algs & SYSV_HASH)
>> +				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
>> +			else
>> +				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
>> +		}
> This looks like a lot of code duplication and extra unnecessary
> variables. The way I would do it is something like:
>
> if (p->deps[i]->hashtab && (h || !p->deps[i]->ghashtab)) {
> 	if (!h) h = hash(s);
> 	sym = sysv_lookup(s, h, p->deps[i]);
> }
>
> i.e. if there's a sysv hash table and we've already computed h (sysv
> hash) or if there's no gnu hash table, compute h if it wasn't already
> computed, and then attempt a lookup with it.
>
> I'm not sure I got the logic all right (this is really a 1-minute
> glance over the code right now amidst doing lots of other stuff too)
> but the ideas are:
>
> - no need for extra vars for bitmask. Whether the hash var for the
>   corresponding hash type is nonzero is sufficient to tell whether
>   it's been computed.
> - no need for extra vars/fields to store which hash types a dso has.
>   Just use the hashtab/ghashtab fields in the dso struct, and let them
>   be null if the corresponding hash table does not exist. (And don't
>   make them an array unless there's a real benefit in using an array;
>   I don't think there is any benefit unless you're aiming for
>   extensibility to support N hash types.)
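The two ideas in the list above can be sketched in isolation as follows. This is an illustrative sketch, not code from the patch: select_hash and the has_sysv/has_gnu flags are hypothetical names standing in for the hashtab/ghashtab null checks, and the hash functions are the standard SysV and GNU (djb2-style) ELF hashes.

```c
#include <assert.h>
#include <stdint.h>

/* Standard ELF SysV symbol hash. */
static uint32_t sysv_hash(const char *s0)
{
	const unsigned char *s = (const unsigned char *)s0;
	uint32_t h = 0, g;
	while (*s) {
		h = (h << 4) + *s++;
		g = h & 0xf0000000;
		if (g) h ^= g >> 24;
		h &= ~g;
	}
	return h;
}

/* GNU symbol hash: h = h*33 + c, starting from 5381. */
static uint32_t gnu_hash(const char *s0)
{
	const unsigned char *s = (const unsigned char *)s0;
	uint32_t h = 5381;
	for (; *s; s++)
		h = h * 33 + *s;
	return h;
}

enum { USE_SYSV, USE_GNU };

/* Decide which table to search in one object. *h and *gh are caches
 * shared across all objects; zero means "not computed yet". */
static int select_hash(int has_sysv, int has_gnu, const char *s,
                       uint32_t *h, uint32_t *gh)
{
	assert(s && h && gh);
	if (has_sysv && (*h || !has_gnu)) {
		if (!*h) *h = sysv_hash(s);
		return USE_SYSV;
	}
	if (!*gh) *gh = gnu_hash(s);
	return USE_GNU;
}
```

An object that has both tables uses the GNU table unless the SysV hash has already been computed for an earlier object, so each hash is computed at most once per lookup (apart from the rare name whose hash is zero).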
>
>> +static int do_dladdr (void *addr, Dl_info *info)
>> [...]
>> +			if (p->hashalgs & (1 << SYSV_HASH)) {
>> +				hashtab = p->hashtabs[SYSV_HASH];
>> +				for (i = 0; i < hashtab[1]; i++) {
> I'm not seeing why this function needs hash tables at all. It's not
> looking up symbols, just iterating over the entire symbol table, no?
> Please explain if I'm mistaken.
>
> Rich
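To make the question above concrete, here is a sketch of the search dladdr needs: a linear scan for the symbol with the highest address not above addr. The demo_sym type and nearest_sym function are hypothetical simplifications, not musl code. Note that a real ELF symbol table does not carry its own length, so the count n has to come from somewhere, for example the nchain word of DT_HASH, which is what the patch reads as hashtab[1].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol: resolved address and name only. */
struct demo_sym {
	uintptr_t value;
	const char *name;
};

/* Return the symbol with the highest address that is <= addr, or a
 * null pointer if there is none. Entries with value 0 are treated as
 * undefined and skipped. */
static const struct demo_sym *nearest_sym(const struct demo_sym *syms,
                                          size_t n, uintptr_t addr)
{
	const struct demo_sym *best = 0;
	assert(syms || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!syms[i].value || syms[i].value > addr)
			continue;
		if (!best || syms[i].value > best->value)
			best = &syms[i];
	}
	return best;
}
```

The scan itself never hashes a name; the hash table would only be consulted to learn how many symbols there are.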


[-- Attachment #2: dladdr-gnu-hash-v2.diff --]
[-- Type: text/x-patch, Size: 9819 bytes --]

diff --git a/include/dlfcn.h b/include/dlfcn.h
index 8c45822..8524e0b 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -20,14 +20,10 @@ void  *dlsym(void *, const char *);
 
 #ifdef _GNU_SOURCE
 typedef struct {
-    const char *dli_fname;  /* Pathname of shared object that
-                               contains address */
-    void       *dli_fbase;  /* Address at which shared object
-                               is loaded */
-    const char *dli_sname;  /* Name of nearest symbol with address
-                               lower than addr */
-    void       *dli_saddr;  /* Exact address of symbol named
-                               in dli_sname */
+	const char *dli_fname;
+	void *dli_fbase;
+	const char *dli_sname;
+	void *dli_saddr;
 } Dl_info;
 
 int dladdr (void *addr, Dl_info *info);
diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
index 4a236b2..d63da1c 100644
--- a/src/ldso/dynlink.c
+++ b/src/ldso/dynlink.c
@@ -59,8 +59,8 @@ struct dso {
 
 	int refcnt;
 	Sym *syms;
-	uint32_t hashalgs;
-	uint32_t *hashtabs[NHASH];
+	uint32_t *hashtab;
+	uint32_t *ghashtab;
 	char *strings;
 	unsigned char *map;
 	size_t map_len;
@@ -93,12 +93,23 @@ struct debug *_dl_debug_addr = &debug;
 #define AUX_CNT 24
 #define DYN_CNT 34
 
-static void decode_vec(size_t *v, size_t *a, size_t cnt)
+struct tag_range {
+	size_t start;
+	size_t size;
+};
+
+static void decode_vec(size_t *v, const struct tag_range *defs, size_t **storages, size_t defsize)
 {
-	memset(a, 0, cnt*sizeof(size_t));
-	for (; v[0]; v+=2) if (v[0]<cnt) {
-		a[0] |= 1ULL<<v[0];
-		a[v[0]] = v[1];
+	size_t i;
+	for (i = 0; i < defsize; ++i)
+		memset(storages[i], 0, defs[i].size*sizeof(size_t));
+	for (; v[0]; v+=2) {
+		for (i = 0; i < defsize; ++i) {
+			if (v[0]>defs[i].start && v[0]<=defs[i].start+defs[i].size) {
+				storages[i][0] |= 1ULL<<v[0];
+				storages[i][v[0] - defs[i].start] = v[1];
+			}
+		}
 	}
 }
 
@@ -126,7 +137,7 @@ static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
 {
 	size_t i;
 	Sym *syms = dso->syms;
-	uint32_t *hashtab = dso->hashtabs[SYSV_HASH];
+	uint32_t *hashtab = dso->hashtab;
 	char *strings = dso->strings;
 	for (i=hashtab[2+h%hashtab[0]]; i; i=hashtab[2+hashtab[0]+i]) {
 		if (!strcmp(s, strings+syms[i].st_name))
@@ -140,7 +151,7 @@ static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
 	size_t i;
 	Sym *sym;
 	char *strings = dso->strings;
-	uint32_t *hashtab = dso->hashtabs[GNU_HASH];
+	uint32_t *hashtab = dso->ghashtab;
 	uint32_t nbuckets = hashtab[0];
 	size_t *maskwords = (size_t *)(hashtab + 4);
 	uint32_t *buckets = hashtab + 4 + (hashtab[2]*(sizeof(size_t)/sizeof(uint32_t)));
@@ -182,52 +193,37 @@ static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
 static void *find_sym(struct dso *dso, const char *s, int need_def)
 {
 	void *def = 0;
-	static uint32_t precomp[NHASH][3] = {
+	static uint32_t precomp[2][3] = {
 		{0x6b366be, 0x6b3afd, 0x595a4cc},
 		{0xf9040207, 0xf4dc4ae, 0x1f4039c9},
 	};
-	uint32_t h[NHASH];
-	uint8_t ok = 0;
-	uint8_t algs = dso->hashalgs;
-	uint8_t alg;
-	if (algs & (1 << SYSV_HASH)) {
-		h[SYSV_HASH] = sysv_hash(s);
-		alg = SYSV_HASH;
+	uint32_t *precomptab;
+	uint32_t h = 0, gh = 0;
+	if (dso->hashtab) {
+		h = sysv_hash(s);
+		precomptab = precomp[0];
 	} else {
-		h[GNU_HASH] = gnu_hash(s);
-		alg = GNU_HASH;
+		gh = gnu_hash(s);
+		precomptab = precomp[1];
 	}
 
-	ok |= (1 << alg);
-
-	if (h[alg] == precomp[alg][0] && !strcmp(s, "dlopen")) rtld_used = 1;
-	if (h[alg] == precomp[alg][1] && !strcmp(s, "dlsym")) rtld_used = 1;
-	if (h[alg] == precomp[alg][2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+	if ((h ? h : gh) == precomptab[0] && !strcmp(s, "dlopen")) rtld_used = 1;
+	if ((h ? h : gh) == precomptab[1] && !strcmp(s, "dlsym")) rtld_used = 1;
+	if ((h ? h : gh) == precomptab[2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
 
 	for (; dso; dso=dso->next) {
 		Sym *sym;
 
 		if (!dso->global) continue;
 
-		algs = dso->hashalgs;
-		if (!(algs & ok)) {
-			if (algs & (1 << SYSV_HASH)) {
-				alg = SYSV_HASH;
-				h[alg] = sysv_hash(s);
-				sym = sysv_lookup(s, h[alg], dso);
-			}
-			else {
-				alg = GNU_HASH;
-				h[alg] = gnu_hash(s);
-				sym = gnu_lookup(s, h[alg], dso);
-			}
-
-			ok |= (1 << alg);
+		if (dso->hashtab && (h || !dso->ghashtab)) {
+			if (!h)
+				h = sysv_hash(s);
+			sym = sysv_lookup(s, h, dso);
 		} else {
-			if ((algs & ok) & (1 << SYSV_HASH))
-				sym = sysv_lookup(s, h[SYSV_HASH], dso);
-			else
-				sym = gnu_lookup(s, h[GNU_HASH], dso);
+			if (!gh)
+				gh = gnu_hash(s);
+			sym = gnu_lookup(s, gh, dso);
 		}
 
 		if (sym && (!need_def || sym->st_shndx) && sym->st_value
@@ -418,28 +414,18 @@ static int path_open(const char *name, const char *search, char *buf, size_t buf
 
 static void decode_dyn(struct dso *p)
 {
-	size_t *v;
-	p->hashalgs = 0;
-	for (v = p->dynv; v[0]; v+=2) {
-		switch (v[0]) {
-		case DT_SYMTAB:
-			p->syms = (void *)(p->base + v[1]);
-			break;
-		case DT_HASH:
-			p->hashtabs[SYSV_HASH] = (void *)(p->base + v[1]);
-			p->hashalgs |= (1 << SYSV_HASH);
-			break;
-		case DT_STRTAB:
-			p->strings = (void *)(p->base + v[1]);
-			break;
-		case DT_GNU_HASH:
-			p->hashtabs[GNU_HASH] = (void *)(p->base + v[1]);
-			p->hashalgs |= (1 << GNU_HASH);
-			break;
-		default:
-			break;
-		}
-	}
+	static const struct tag_range defs[2] = {{0, DYN_CNT}, {DT_ADDRRNGHI-DT_ADDRNUM, DT_ADDRNUM}};
+	size_t dynbase[DYN_CNT] = {0};
+	size_t dynaddr[DT_ADDRNUM+1] = {0};
+	size_t *storage[2] = {dynbase, dynaddr};
+	decode_vec (p->dynv, defs, storage, 2);
+	p->syms = (void *)(p->base + dynbase[DT_SYMTAB]);
+	p->strings = (void *)(p->base + dynbase[DT_STRTAB]);
+	if (dynbase[DT_HASH])
+		p->hashtab = (void *)(p->base + dynbase[DT_HASH]);
+	if (dynaddr[DT_ADDRNUM-DT_ADDRTAGIDX(DT_GNU_HASH)])
+		p->ghashtab = (void *)(p->base + dynaddr[DT_ADDRNUM-DT_ADDRTAGIDX(DT_GNU_HASH)]);
+
 }
 
 static struct dso *load_library(const char *name)
@@ -601,9 +587,11 @@ static void make_global(struct dso *p)
 static void reloc_all(struct dso *p)
 {
 	size_t dyn[DYN_CNT] = {0};
+	static const struct tag_range defs = {0, DYN_CNT};
+	size_t *storage = dyn;
 	for (; p; p=p->next) {
 		if (p->relocated) continue;
-		decode_vec(p->dynv, dyn, DYN_CNT);
+		decode_vec(p->dynv, &defs, &storage, 1);
 #ifdef NEED_ARCH_RELOCS
 		do_arch_relocs(p, head);
 #endif
@@ -636,9 +624,11 @@ static size_t find_dyn(Phdr *ph, size_t cnt, size_t stride)
 static void do_init_fini(struct dso *p)
 {
 	size_t dyn[DYN_CNT] = {0};
+	static const struct tag_range defs = {0, DYN_CNT};
+	size_t *storage = dyn;
 	for (; p; p=p->prev) {
 		if (p->constructed) return;
-		decode_vec(p->dynv, dyn, DYN_CNT);
+		decode_vec(p->dynv, &defs, &storage, 1);
 		if (dyn[0] & (1<<DT_FINI))
 			atexit((void (*)(void))(p->base + dyn[DT_FINI]));
 		if (dyn[0] & (1<<DT_INIT))
@@ -654,6 +644,8 @@ void _dl_debug_state(void)
 void *__dynlink(int argc, char **argv)
 {
 	size_t *auxv, aux[AUX_CNT] = {0};
+	static const struct tag_range defs = {0, AUX_CNT};
+	size_t *storage = aux;
 	size_t i;
 	Phdr *phdr;
 	Ehdr *ehdr;
@@ -671,7 +663,7 @@ void *__dynlink(int argc, char **argv)
 			env_preload = argv[i]+11;
 	auxv = (void *)(argv+i+1);
 
-	decode_vec(auxv, aux, AUX_CNT);
+	decode_vec(auxv, &defs, &storage, 1);
 
 	/* Only trust user/env if kernel says we're not suid/sgid */
 	if ((aux[0]&0x7800)!=0x7800 || aux[AT_UID]!=aux[AT_EUID]
@@ -900,9 +892,7 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
 {
 	size_t i;
 	Sym *sym;
-	uint32_t ok = 0;
-	uint8_t algs = p->hashalgs;
-	uint32_t h[NHASH];
+	uint32_t h = 0, gh = 0;
 
 	if (p == RTLD_NEXT) {
 		for (p=head; p && (unsigned char *)ra-p->map>p->map_len; p=p->next);
@@ -917,36 +907,26 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
 		return res;
 	}
 
-	if (algs & (1 << SYSV_HASH)) {
-		h[SYSV_HASH] = sysv_hash(s);
-		sym = sysv_lookup(s, h[SYSV_HASH], p);
-		ok |= 1 << SYSV_HASH;
+	if (p->hashtab) {
+		h = sysv_hash(s);
+		sym = sysv_lookup(s, h, p);
 	} else {
-		h[GNU_HASH] = gnu_hash(s);
-		sym = gnu_lookup(s, h[GNU_HASH], p);
-		ok |= 1 << GNU_HASH;
+		gh = gnu_hash(s);
+		sym = gnu_lookup(s, gh, p);
 	}
 
 
 	if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
 		return p->base + sym->st_value;
 	if (p->deps) for (i=0; p->deps[i]; i++) {
-		algs = p->deps[i]->hashalgs;
-		if (!(algs & ok)) {
-			if (algs & SYSV_HASH) {
-				h[SYSV_HASH] = sysv_hash(s);
-				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
-				ok |= SYSV_HASH;
-			} else {
-				h[GNU_HASH] = gnu_hash(s);
-				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
-				ok |= GNU_HASH;
-			}
+		if (p->deps[i]->hashtab && (h || !p->deps[i]->ghashtab)) {
+			if (!h)
+				h = sysv_hash(s);
+			sym = sysv_lookup(s, h, p->deps[i]);
 		} else {
-			if (algs & SYSV_HASH)
-				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
-			else
-				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
+			if (!gh)
+				gh = gnu_hash(s);
+			sym = gnu_lookup(s, gh, p->deps[i]);
 		}
 
 		if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
@@ -1007,18 +987,18 @@ static int do_dladdr (void *addr, Dl_info *info)
 			size_t i;
 			info->dli_fname = p->name;
 			info->dli_fbase = p->base;
-			if (p->hashalgs & (1 << SYSV_HASH)) {
-				hashtab = p->hashtabs[SYSV_HASH];
+			if (p->hashtab) {
+				hashtab = p->hashtab;
 				for (i = 0; i < hashtab[1]; i++) {
 					if (!find_closest_sym (p, syms + i, &search))
 						return 1;
 				}
-			} else if(p->hashalgs & (1 << GNU_HASH)) {
+			} else {
 				uint32_t *buckets;
 				uint32_t nbuckets;
 				uint32_t *hashvals;
 				uint32_t symndx;
-				hashtab = p->hashtabs[GNU_HASH];
+				hashtab = p->ghashtab;
 				buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t)/sizeof(uint32_t)));
 				nbuckets = hashtab[0];
 				hashvals = buckets + nbuckets;

[-- Attachment #3: dladdr-gnu-hash-v2.patch --]
[-- Type: text/x-patch, Size: 10873 bytes --]

diff --git a/include/dlfcn.h b/include/dlfcn.h
index dea74c7..8524e0b 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -18,6 +18,17 @@ char  *dlerror(void);
 void  *dlopen(const char *, int);
 void  *dlsym(void *, const char *);
 
+#ifdef _GNU_SOURCE
+typedef struct {
+	const char *dli_fname;
+	void *dli_fbase;
+	const char *dli_sname;
+	void *dli_saddr;
+} Dl_info;
+
+int dladdr (void *addr, Dl_info *info);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
index f55c6f1..d63da1c 100644
--- a/src/ldso/dynlink.c
+++ b/src/ldso/dynlink.c
@@ -1,3 +1,4 @@
+#define _GNU_SOURCE
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -28,14 +29,20 @@ typedef Elf32_Phdr Phdr;
 typedef Elf32_Sym Sym;
 #define R_TYPE(x) ((x)&255)
 #define R_SYM(x) ((x)>>8)
+#define ELF_ST_TYPE ELF32_ST_TYPE
 #else
 typedef Elf64_Ehdr Ehdr;
 typedef Elf64_Phdr Phdr;
 typedef Elf64_Sym Sym;
 #define R_TYPE(x) ((x)&0xffffffff)
 #define R_SYM(x) ((x)>>32)
+#define ELF_ST_TYPE ELF64_ST_TYPE
 #endif
 
+#define SYSV_HASH	0
+#define GNU_HASH	1
+#define NHASH		2
+
 struct debug {
 	int ver;
 	void *head;
@@ -53,6 +60,7 @@ struct dso {
 	int refcnt;
 	Sym *syms;
 	uint32_t *hashtab;
+	uint32_t *ghashtab;
 	char *strings;
 	unsigned char *map;
 	size_t map_len;
@@ -85,16 +93,27 @@ struct debug *_dl_debug_addr = &debug;
 #define AUX_CNT 24
 #define DYN_CNT 34
 
-static void decode_vec(size_t *v, size_t *a, size_t cnt)
+struct tag_range {
+	size_t start;
+	size_t size;
+};
+
+static void decode_vec(size_t *v, const struct tag_range *defs, size_t **storages, size_t defsize)
 {
-	memset(a, 0, cnt*sizeof(size_t));
-	for (; v[0]; v+=2) if (v[0]<cnt) {
-		a[0] |= 1ULL<<v[0];
-		a[v[0]] = v[1];
+	size_t i;
+	for (i = 0; i < defsize; ++i)
+		memset(storages[i], 0, defs[i].size*sizeof(size_t));
+	for (; v[0]; v+=2) {
+		for (i = 0; i < defsize; ++i) {
+			if (v[0]>defs[i].start && v[0]<=defs[i].start+defs[i].size) {
+				storages[i][0] |= 1ULL<<v[0];
+				storages[i][v[0] - defs[i].start] = v[1];
+			}
+		}
 	}
 }
 
-static uint32_t hash(const char *s0)
+static uint32_t sysv_hash(const char *s0)
 {
 	const unsigned char *s = (void *)s0;
 	uint_fast32_t h = 0;
@@ -105,7 +124,16 @@ static uint32_t hash(const char *s0)
 	return h & 0xfffffff;
 }
 
-static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
+static uint32_t gnu_hash (const char *s0)
+{
+	const unsigned char *s = (void *)s0;
+	uint_fast32_t h = 5381;
+	for (; *s; s++)
+		h = h*33 + *s;
+	return h & 0xffffffff;
+}
+
+static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
 {
 	size_t i;
 	Sym *syms = dso->syms;
@@ -118,20 +146,86 @@ static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
 	return 0;
 }
 
+static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
+{
+	size_t i;
+	Sym *sym;
+	char *strings = dso->strings;
+	uint32_t *hashtab = dso->ghashtab;
+	uint32_t nbuckets = hashtab[0];
+	size_t *maskwords = (size_t *)(hashtab + 4);
+	uint32_t *buckets = hashtab + 4 + (hashtab[2]*(sizeof(size_t)/sizeof(uint32_t)));
+	uint32_t symndx = hashtab[1];
+	Sym *syms = dso->syms;
+	uint32_t shift2 = hashtab[3];
+	uint32_t h2 = h1 >> shift2;
+	uint32_t *hashvals = buckets + nbuckets;
+	uint32_t *hashval;
+	size_t c = sizeof(size_t) * 8;
+	size_t n = (h1/c) & (hashtab[2]-1);
+	size_t bitmask = (1 << (h1%c)) | (1 << (h2%c));
+
+	if ((maskwords[n] & bitmask) != bitmask)
+		return 0;
+
+	n = buckets[h1 % nbuckets];
+	if (!n)
+		return 0;
+
+	sym = syms + n;
+	hashval = hashvals + n - symndx;
+
+	for (h1 &= (uint32_t)-2;; sym++) {
+		h2 = *hashval++;
+		if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
+			return sym;
+
+		if (h2 & 1)
+			break;
+	}
+
+	return 0;
+}
+
 #define OK_TYPES (1<<STT_NOTYPE | 1<<STT_OBJECT | 1<<STT_FUNC | 1<<STT_COMMON)
 #define OK_BINDS (1<<STB_GLOBAL | 1<<STB_WEAK)
 
 static void *find_sym(struct dso *dso, const char *s, int need_def)
 {
-	uint32_t h = hash(s);
 	void *def = 0;
-	if (h==0x6b366be && !strcmp(s, "dlopen")) rtld_used = 1;
-	if (h==0x6b3afd && !strcmp(s, "dlsym")) rtld_used = 1;
-	if (h==0x595a4cc && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+	static uint32_t precomp[2][3] = {
+		{0x6b366be, 0x6b3afd, 0x595a4cc},
+		{0xf9040207, 0xf4dc4ae, 0x1f4039c9},
+	};
+	uint32_t *precomptab;
+	uint32_t h = 0, gh = 0;
+	if (dso->hashtab) {
+		h = sysv_hash(s);
+		precomptab = precomp[0];
+	} else {
+		gh = gnu_hash(s);
+		precomptab = precomp[1];
+	}
+
+	if ((h ? h : gh) == precomptab[0] && !strcmp(s, "dlopen")) rtld_used = 1;
+	if ((h ? h : gh) == precomptab[1] && !strcmp(s, "dlsym")) rtld_used = 1;
+	if ((h ? h : gh) == precomptab[2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+
 	for (; dso; dso=dso->next) {
 		Sym *sym;
+
 		if (!dso->global) continue;
-		sym = lookup(s, h, dso);
+
+		if (dso->hashtab && (h || !dso->ghashtab)) {
+			if (!h)
+				h = sysv_hash(s);
+			sym = sysv_lookup(s, h, dso);
+		} else {
+			if (!gh)
+				gh = gnu_hash(s);
+			sym = gnu_lookup(s, gh, dso);
+		}
+
 		if (sym && (!need_def || sym->st_shndx) && sym->st_value
 		 && (1<<(sym->st_info&0xf) & OK_TYPES)
 		 && (1<<(sym->st_info>>4) & OK_BINDS)) {
@@ -320,11 +414,18 @@ static int path_open(const char *name, const char *search, char *buf, size_t buf
 
 static void decode_dyn(struct dso *p)
 {
-	size_t dyn[DYN_CNT] = {0};
-	decode_vec(p->dynv, dyn, DYN_CNT);
-	p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
-	p->hashtab = (void *)(p->base + dyn[DT_HASH]);
-	p->strings = (void *)(p->base + dyn[DT_STRTAB]);
+	static const struct tag_range defs[2] = {{0, DYN_CNT}, {DT_ADDRRNGHI-DT_ADDRNUM, DT_ADDRNUM}};
+	size_t dynbase[DYN_CNT] = {0};
+	size_t dynaddr[DT_ADDRNUM+1] = {0};
+	size_t *storage[2] = {dynbase, dynaddr};
+	decode_vec (p->dynv, defs, storage, 2);
+	p->syms = (void *)(p->base + dynbase[DT_SYMTAB]);
+	p->strings = (void *)(p->base + dynbase[DT_STRTAB]);
+	if (dynbase[DT_HASH])
+		p->hashtab = (void *)(p->base + dynbase[DT_HASH]);
+	if (dynaddr[DT_ADDRNUM-DT_ADDRTAGIDX(DT_GNU_HASH)])
+		p->ghashtab = (void *)(p->base + dynaddr[DT_ADDRNUM-DT_ADDRTAGIDX(DT_GNU_HASH)]);
+
 }
 
 static struct dso *load_library(const char *name)
@@ -486,9 +587,11 @@ static void make_global(struct dso *p)
 static void reloc_all(struct dso *p)
 {
 	size_t dyn[DYN_CNT] = {0};
+	static const struct tag_range defs = {0, DYN_CNT};
+	size_t *storage = dyn;
 	for (; p; p=p->next) {
 		if (p->relocated) continue;
-		decode_vec(p->dynv, dyn, DYN_CNT);
+		decode_vec(p->dynv, &defs, &storage, 1);
 #ifdef NEED_ARCH_RELOCS
 		do_arch_relocs(p, head);
 #endif
@@ -521,9 +624,11 @@ static size_t find_dyn(Phdr *ph, size_t cnt, size_t stride)
 static void do_init_fini(struct dso *p)
 {
 	size_t dyn[DYN_CNT] = {0};
+	static const struct tag_range defs = {0, DYN_CNT};
+	size_t *storage = dyn;
 	for (; p; p=p->prev) {
 		if (p->constructed) return;
-		decode_vec(p->dynv, dyn, DYN_CNT);
+		decode_vec(p->dynv, &defs, &storage, 1);
 		if (dyn[0] & (1<<DT_FINI))
 			atexit((void (*)(void))(p->base + dyn[DT_FINI]));
 		if (dyn[0] & (1<<DT_INIT))
@@ -539,6 +644,8 @@ void _dl_debug_state(void)
 void *__dynlink(int argc, char **argv)
 {
 	size_t *auxv, aux[AUX_CNT] = {0};
+	static const struct tag_range defs = {0, AUX_CNT};
+	size_t *storage = aux;
 	size_t i;
 	Phdr *phdr;
 	Ehdr *ehdr;
@@ -556,7 +663,7 @@ void *__dynlink(int argc, char **argv)
 			env_preload = argv[i]+11;
 	auxv = (void *)(argv+i+1);
 
-	decode_vec(auxv, aux, AUX_CNT);
+	decode_vec(auxv, &defs, &storage, 1);
 
 	/* Only trust user/env if kernel says we're not suid/sgid */
 	if ((aux[0]&0x7800)!=0x7800 || aux[AT_UID]!=aux[AT_EUID]
@@ -784,8 +891,9 @@ end:
 static void *do_dlsym(struct dso *p, const char *s, void *ra)
 {
 	size_t i;
-	uint32_t h;
 	Sym *sym;
+	uint32_t h = 0, gh = 0;
+
 	if (p == RTLD_NEXT) {
 		for (p=head; p && (unsigned char *)ra-p->map>p->map_len; p=p->next);
 		if (!p) p=head;
@@ -798,12 +906,29 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
 		if (!res) goto failed;
 		return res;
 	}
-	h = hash(s);
-	sym = lookup(s, h, p);
+
+	if (p->hashtab) {
+		h = sysv_hash(s);
+		sym = sysv_lookup(s, h, p);
+	} else {
+		gh = gnu_hash(s);
+		sym = gnu_lookup(s, gh, p);
+	}
+
+
 	if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
 		return p->base + sym->st_value;
 	if (p->deps) for (i=0; p->deps[i]; i++) {
-		sym = lookup(s, h, p->deps[i]);
+		if (p->deps[i]->hashtab && (h || !p->deps[i]->ghashtab)) {
+			if (!h)
+				h = sysv_hash(s);
+			sym = sysv_lookup(s, h, p->deps[i]);
+		} else {
+			if (!gh)
+				gh = gnu_hash(s);
+			sym = gnu_lookup(s, gh, p->deps[i]);
+		}
+
 		if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
 			return p->deps[i]->base + sym->st_value;
 	}
@@ -813,6 +938,97 @@ failed:
 	return 0;
 }
 
+struct sym_search {
+	void *addr;
+	Dl_info *info;
+};
+
+static int find_closest_sym (struct dso *dso, Sym *sym, struct sym_search *search)
+{
+	void *symaddr = dso->base + sym->st_value;
+	char *strings = dso->strings;
+	Dl_info *info = search->info;
+	void *addr = search->addr;
+	void *prevaddr = info->dli_saddr;
+
+	if (sym->st_value == 0 && sym->st_shndx == SHN_UNDEF)
+		return 1;
+
+	if (ELF_ST_TYPE(sym->st_info) == STT_TLS)
+		return 1;
+
+	if (addr < symaddr)
+		return 1;
+
+	if (prevaddr && (addr - symaddr) > (addr - prevaddr))
+		return 1;
+
+	info->dli_saddr = symaddr;
+	info->dli_sname = strings + sym->st_name;
+
+	if (addr == symaddr)
+		return 0;
+
+	return 1;
+
+}
+
+static int do_dladdr (void *addr, Dl_info *info)
+{
+	struct sym_search search;
+	struct dso *p;
+	memset (info, 0, sizeof (*info));
+	search.info = info;
+	search.addr = addr;
+	for (p=head; p; p=p->next) {
+		if ((unsigned char *)addr >= p->map && (unsigned char *)addr < p->map + p->map_len) {
+			Sym *syms = p->syms;
+			uint32_t *hashtab;
+			size_t i;
+			info->dli_fname = p->name;
+			info->dli_fbase = p->base;
+			if (p->hashtab) {
+				hashtab = p->hashtab;
+				for (i = 0; i < hashtab[1]; i++) {
+					if (!find_closest_sym (p, syms + i, &search))
+						return 1;
+				}
+			} else {
+				uint32_t *buckets;
+				uint32_t nbuckets;
+				uint32_t *hashvals;
+				uint32_t symndx;
+				hashtab = p->ghashtab;
+				buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t)/sizeof(uint32_t)));
+				nbuckets = hashtab[0];
+				hashvals = buckets + nbuckets;
+				symndx = hashtab[1];
+				for (i = 0; i < nbuckets; ++i) {
+					uint32_t n = buckets[i];
+					Sym *sym = syms + n;
+					uint32_t *hashval = hashvals + n - symndx;
+
+					do {
+						if (!find_closest_sym (p, sym, &search))
+							return 1;
+					}while (!(*hashval++ & 1));
+				}
+			}
+			return 1;
+		}
+	}
+	return 0;
+}
+
+int dladdr (void *addr, Dl_info *info)
+{
+	int res;
+	pthread_rwlock_rdlock(&lock);
+	res = do_dladdr (addr, info);
+	pthread_rwlock_unlock(&lock);
+	return res;
+}
+
 void *__dlsym(void *p, const char *s, void *ra)
 {
 	void *res;
