From mboxrd@z Thu Jan 1 00:00:00 1970
From: musl
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: ldso : dladdr support
Date: Wed, 08 Aug 2012 15:57:15 +0200
Message-ID: <5022703B.3090105@gmail.com>
References: <5020DA13.6080803@gmail.com> <20120807114627.GG30810@port70.net> <50212306.6070402@gmail.com> <20120807230933.GC27715@brightrain.aerifal.cx> <502237A1.1000805@gmail.com> <20120808115202.GL30810@port70.net>
Reply-To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------030800040608070504070108"
Cc: Szabolcs Nagy
To: musl@lists.openwall.com
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:14.0) Gecko/20120714 Thunderbird/14.0
In-Reply-To: <20120808115202.GL30810@port70.net>
Xref: news.gmane.org gmane.linux.lib.musl.general:1465

This is a multi-part message in MIME format.
--------------030800040608070504070108
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Same as before, except that I use a bit mask to support multiple hash algorithms.

On 08/08/2012 13:52, Szabolcs Nagy wrote:
> * musl [2012-08-08 11:55:45 +0200]:
>> Here is the new patch for dladdr + gnu hash support.
>>
>> On 08/08/2012 01:09, Rich Felker wrote:
>>> don't remember which) where I described some optimizations that should
>>> be present if gnu hash is to be supported. Basically, the dynamic
>>> linker should keep track of whether there's one of the two hashes
>>> that's supported by ALL loaded libs, and in that case, only ever use
>>> that hash, to avoid computing two different hashes.
>> The hash for a given algo is only computed once (if needed).
>> That's the reason for the computed table.
>> If all libs use the same hash algo, the other will never be computed.
> a lib can support both hash tables
> (so it's non-trivial to use the one that all libs support,
> eg instead of a single alg index a bit mask should be used)
>
See patch changes.
>> +#define SYSV_HASH_ALG_IDX 0
>> +#define GNU_HASH_ALG_IDX 1
>> +#define HASH_ALG_CNT 2
> if we go with this approach then use shorter names
> (these are only used locally)
>
> eg. SYSV_HASH, GNU_HASH, NHASH
Corrected.
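
The bit-mask scheme discussed above can be illustrated with a small standalone sketch. It is not the patch itself: `struct hash_cache` and `pick_alg` are hypothetical names introduced only for this illustration, and the hash functions are written from the usual SysV and GNU definitions. Each library advertises the tables it has as bits in a mask, and a hash is computed only when none of the hashes computed so far is usable for that library.

```c
#include <assert.h>
#include <stdint.h>

#define SYSV_HASH 0
#define GNU_HASH  1
#define NHASH     2

/* SysV ELF hash of a symbol name. */
static uint32_t sysv_hash(const char *s0)
{
	const unsigned char *s = (const void *)s0;
	uint32_t h = 0;
	while (*s) {
		h = 16*h + *s++;
		h ^= h>>24 & 0xf0;
	}
	return h & 0xfffffff;
}

/* GNU hash of a symbol name (h = h*33 + c, starting from 5381). */
static uint32_t gnu_hash(const char *s0)
{
	const unsigned char *s = (const void *)s0;
	uint32_t h = 5381;
	for (; *s; s++)
		h = h*33 + *s;
	return h;
}

/* Hypothetical helper type: the hashes computed so far for one name. */
struct hash_cache {
	uint32_t ok;        /* bit i set => h[i] holds a valid hash */
	uint32_t h[NHASH];
	int computed;       /* illustration only: number of hash computations */
};

/* Choose the algorithm to use for a library whose available tables are
 * described by the bit mask algs. An already computed hash is reused
 * when the library has the matching table; otherwise one new hash is
 * computed and recorded in the cache. */
static int pick_alg(struct hash_cache *c, const char *s, uint32_t algs)
{
	uint32_t usable = algs & c->ok;
	int alg;

	assert(algs != 0);  /* the library must have at least one table */
	if (usable)
		return (usable & (1u << SYSV_HASH)) ? SYSV_HASH : GNU_HASH;

	alg = (algs & (1u << SYSV_HASH)) ? SYSV_HASH : GNU_HASH;
	c->h[alg] = (alg == SYSV_HASH) ? sysv_hash(s) : gnu_hash(s);
	c->ok |= 1u << alg;
	c->computed++;
	return alg;
}
```

With a zero-initialized `struct hash_cache`, calling `pick_alg` for a SysV-only library and then for a library that has both tables computes a single hash; only a later GNU-only library triggers the second computation.
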
>> +	for (h1 &= (uint32_t)-2;; sym++) {
>> +		h2 = *hashval++;
>> +		if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
> there is still a ~1
>
> looking at it now probably writing out & 0xfffffffe
> is the cleanest
>
>> +	static uint32_t precomp[HASH_ALG_CNT][3] = {
>> +		{0x6b366be, 0x6b3afd, 0x595a4cc},
>> +		{0xf9040207, 0xf4dc4ae, 0x1f4039c9},
>> +	};
>> +	uint32_t h[HASH_ALG_CNT];
>> +	uint8_t computed[HASH_ALG_CNT] = {0, 0};
>> +	uint8_t alg = dso->hashalg;
>> +	if (alg == SYSV_HASH_ALG_IDX)
>> +		h[alg] = sysv_hash(s);
>> +	else
>> +		h[alg] = gnu_hash(s);
>> +
>> +	computed[alg] = 1;
>> +
>> +	if (h[alg] == precomp[alg][0] && !strcmp(s, "dlopen")) rtld_used = 1;
>> +	if (h[alg] == precomp[alg][1] && !strcmp(s, "dlsym")) rtld_used = 1;
>> +	if (h[alg] == precomp[alg][2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
>> +
>>  	for (; dso; dso=dso->next) {
>>  		Sym *sym;
>> +
>>  		if (!dso->global) continue;
>> -		sym = lookup(s, h, dso);
>> +
>> +		alg = dso->hashalg;
>> +		if (!computed[alg]) {
>> +			if (alg == SYSV_HASH_ALG_IDX) {
>> +				h[alg] = sysv_hash(s);
>> +				sym = sysv_lookup(s, h[alg], dso);
>> +			}
>> +			else {
>> +				h[alg] = gnu_hash(s);
>> +				sym = gnu_lookup(s, h[alg], dso);
>> +			}
>> +			computed[alg] = 1;
>> +		} else {
>> +			if (alg == SYSV_HASH_ALG_IDX)
>> +				sym = sysv_lookup(s, h[alg], dso);
>> +			else
>> +				sym = gnu_lookup(s, h[alg], dso);
>> +		}
> instead of arrays i'd write
>
> alg = dso->hashalg;
> if (alg == SYSV_HASH) {
> 	if (!sysv_ok) {
> 		...
> 		sysv_ok = 1;
> 	}
> 	sym = sysv_lookup(s, sysv_h, dso);
> } else {
> }
Haven't changed it yet.
> since there are many ifs anyway
>
> the table approach is nicer when all data and functions
> are in a table:
>
> if (!ok[alg]) {
> 	h[alg] = hash[alg](s);
> 	ok[alg] = 1;
> }
> sym = lookup[alg](s, h[alg], dso);
>
>> +	p->hashalg = SYSV_HASH_ALG_IDX;
>>  	p->strings = (void *)(p->base + dyn[DT_STRTAB]);
>> +	for (; v[0]; v+=2) if (v[0] == DT_GNU_HASH) {
>> +		p->hashtab = (void *)(p->base + v[1]);
>> +		p->hashalg = GNU_HASH_ALG_IDX;
>> +	}
> so it seems gnu hash is used whenever it's present
> i'm not sure if that's the right default..
>
> another possibility is to have a plain find_sym function
> which is simple and only supports sysv hash
> and whenever it encounters a lib that has no sysv hash in
> it a find_sym_gnu is called that does the hard work
> (so using gnu only libs is penalized, i don't know how
> common that case is though)

--------------030800040608070504070108
Content-Type: text/x-patch;
 name="ldso-add-dladdr-and-gnu-hash-support.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment;
 filename="ldso-add-dladdr-and-gnu-hash-support.patch"

>From 359843b57223db412847f69702e19511dfeb435d Mon Sep 17 00:00:00 2001
From: Boris BREZILLON
Date: Wed, 8 Aug 2012 15:49:29 +0200
Subject: [PATCH] ldso : add dladdr and gnu hash support

---
 include/dlfcn.h    |   15 +++
 src/ldso/dynlink.c |  272 ++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 269 insertions(+), 18 deletions(-)

diff --git a/include/dlfcn.h b/include/dlfcn.h
index dea74c7..8c45822 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -18,6 +18,21 @@ char *dlerror(void);
 void *dlopen(const char *, int);
 void *dlsym(void *, const char *);
 
+#ifdef _GNU_SOURCE
+typedef struct {
+	const char *dli_fname;  /* Pathname of shared object that
+	                           contains address */
+	void *dli_fbase;        /* Address at which shared object
+	                           is loaded */
+	const char *dli_sname;  /* Name of nearest symbol with address
+	                           lower than addr */
+	void *dli_saddr;        /* Exact address of symbol named
+	                           in dli_sname
*/
+} Dl_info;
+
+int dladdr (void *addr, Dl_info *info);
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
index f55c6f1..4a236b2 100644
--- a/src/ldso/dynlink.c
+++ b/src/ldso/dynlink.c
@@ -1,3 +1,4 @@
+#define _GNU_SOURCE
 #include
 #include
 #include
@@ -28,14 +29,20 @@ typedef Elf32_Phdr Phdr;
 typedef Elf32_Sym Sym;
 #define R_TYPE(x) ((x)&255)
 #define R_SYM(x) ((x)>>8)
+#define ELF_ST_TYPE ELF32_ST_TYPE
 #else
 typedef Elf64_Ehdr Ehdr;
 typedef Elf64_Phdr Phdr;
 typedef Elf64_Sym Sym;
 #define R_TYPE(x) ((x)&0xffffffff)
 #define R_SYM(x) ((x)>>32)
+#define ELF_ST_TYPE ELF64_ST_TYPE
 #endif
 
+#define SYSV_HASH 0
+#define GNU_HASH 1
+#define NHASH 2
+
 struct debug {
 	int ver;
 	void *head;
@@ -52,7 +59,8 @@ struct dso {
 
 	int refcnt;
 	Sym *syms;
-	uint32_t *hashtab;
+	uint32_t hashalgs;
+	uint32_t *hashtabs[NHASH];
 	char *strings;
 	unsigned char *map;
 	size_t map_len;
@@ -94,7 +102,7 @@ static void decode_vec(size_t *v, size_t *a, size_t cnt)
 	}
 }
 
-static uint32_t hash(const char *s0)
+static uint32_t sysv_hash(const char *s0)
 {
 	const unsigned char *s = (void *)s0;
 	uint_fast32_t h = 0;
@@ -105,11 +113,20 @@ static uint32_t hash(const char *s0)
 	return h & 0xfffffff;
 }
 
-static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
+static uint32_t gnu_hash (const char *s0)
+{
+	const unsigned char *s = (void *)s0;
+	uint_fast32_t h = 5381;
+	for (; *s; s++)
+		h = h*33 + *s;
+	return h & 0xffffffff;
+}
+
+static Sym *sysv_lookup(const char *s, uint32_t h, struct dso *dso)
 {
 	size_t i;
 	Sym *syms = dso->syms;
-	uint32_t *hashtab = dso->hashtab;
+	uint32_t *hashtab = dso->hashtabs[SYSV_HASH];
 	char *strings = dso->strings;
 	for (i=hashtab[2+h%hashtab[0]]; i; i=hashtab[2+hashtab[0]+i]) {
 		if (!strcmp(s, strings+syms[i].st_name))
@@ -118,20 +135,101 @@ static Sym *lookup(const char *s, uint32_t h, struct dso *dso)
 	return 0;
 }
 
+static Sym *gnu_lookup(const char *s, uint32_t h1, struct dso *dso)
+{
+	size_t i;
+	Sym *sym;
+	char *strings = dso->strings;
+
	uint32_t *hashtab = dso->hashtabs[GNU_HASH];
+	uint32_t nbuckets = hashtab[0];
+	size_t *maskwords = (size_t *)(hashtab + 4);
+	uint32_t *buckets = hashtab + 4 + (hashtab[2]*(sizeof(size_t)/sizeof(uint32_t)));
+	uint32_t symndx = hashtab[1];
+	Sym *syms = dso->syms;
+	uint32_t shift2 = hashtab[3];
+	uint32_t h2 = h1 >> shift2;
+	uint32_t *hashvals = buckets + nbuckets;
+	uint32_t *hashval;
+	size_t c = sizeof(size_t) * 8;
+	size_t n = (h1/c) & (hashtab[2]-1);
+	size_t bitmask = ((size_t)1 << (h1%c)) | ((size_t)1 << (h2%c));
+
+	if ((maskwords[n] & bitmask) != bitmask)
+		return 0;
+
+	n = buckets[h1 % nbuckets];
+	if (!n)
+		return 0;
+
+	sym = syms + n;
+	hashval = hashvals + n - symndx;
+
+	for (h1 &= (uint32_t)-2;; sym++) {
+		h2 = *hashval++;
+		if ((h1 == (h2 & ~1)) && !strcmp(s, strings + sym->st_name))
+			return sym;
+
+		if (h2 & 1)
+			break;
+	}
+
+	return 0;
+}
+
 #define OK_TYPES (1<hashalgs;
+	uint8_t alg;
+	if (algs & (1 << SYSV_HASH)) {
+		h[SYSV_HASH] = sysv_hash(s);
+		alg = SYSV_HASH;
+	} else {
+		h[GNU_HASH] = gnu_hash(s);
+		alg = GNU_HASH;
+	}
+
+	ok |= (1 << alg);
+
+	if (h[alg] == precomp[alg][0] && !strcmp(s, "dlopen")) rtld_used = 1;
+	if (h[alg] == precomp[alg][1] && !strcmp(s, "dlsym")) rtld_used = 1;
+	if (h[alg] == precomp[alg][2] && !strcmp(s, "__stack_chk_fail")) ssp_used = 1;
+
 	for (; dso; dso=dso->next) {
 		Sym *sym;
+
 		if (!dso->global) continue;
-		sym = lookup(s, h, dso);
+
+		algs = dso->hashalgs;
+		if (!(algs & ok)) {
+			if (algs & (1 << SYSV_HASH)) {
+				alg = SYSV_HASH;
+				h[alg] = sysv_hash(s);
+				sym = sysv_lookup(s, h[alg], dso);
+			}
+			else {
+				alg = GNU_HASH;
+				h[alg] = gnu_hash(s);
+				sym = gnu_lookup(s, h[alg], dso);
+			}
+
+			ok |= (1 << alg);
+		} else {
+			if ((algs & ok) & (1 << SYSV_HASH))
+				sym = sysv_lookup(s, h[SYSV_HASH], dso);
+			else
+				sym = gnu_lookup(s, h[GNU_HASH], dso);
+		}
+
 		if (sym && (!need_def || sym->st_shndx) && sym->st_value
 		 && (1<<(sym->st_info&0xf) & OK_TYPES)
 		 && (1<<(sym->st_info>>4) & OK_BINDS)) {
@@ -320,11 +418,28 @@ static int
path_open(const char *name, const char *search, char *buf, size_t buf
 
 static void decode_dyn(struct dso *p)
 {
-	size_t dyn[DYN_CNT] = {0};
-	decode_vec(p->dynv, dyn, DYN_CNT);
-	p->syms = (void *)(p->base + dyn[DT_SYMTAB]);
-	p->hashtab = (void *)(p->base + dyn[DT_HASH]);
-	p->strings = (void *)(p->base + dyn[DT_STRTAB]);
+	size_t *v;
+	p->hashalgs = 0;
+	for (v = p->dynv; v[0]; v+=2) {
+		switch (v[0]) {
+		case DT_SYMTAB:
+			p->syms = (void *)(p->base + v[1]);
+			break;
+		case DT_HASH:
+			p->hashtabs[SYSV_HASH] = (void *)(p->base + v[1]);
+			p->hashalgs |= (1 << SYSV_HASH);
+			break;
+		case DT_STRTAB:
+			p->strings = (void *)(p->base + v[1]);
+			break;
+		case DT_GNU_HASH:
+			p->hashtabs[GNU_HASH] = (void *)(p->base + v[1]);
+			p->hashalgs |= (1 << GNU_HASH);
+			break;
+		default:
+			break;
+		}
+	}
 }
 
 static struct dso *load_library(const char *name)
@@ -784,8 +899,11 @@ end:
 static void *do_dlsym(struct dso *p, const char *s, void *ra)
 {
 	size_t i;
-	uint32_t h;
 	Sym *sym;
+	uint32_t ok = 0;
+	uint8_t algs = p->hashalgs;
+	uint32_t h[NHASH];
+
 	if (p == RTLD_NEXT) {
 		for (p=head; p && (unsigned char *)ra-p->map>p->map_len; p=p->next);
 		if (!p) p=head;
@@ -798,12 +916,39 @@ static void *do_dlsym(struct dso *p, const char *s, void *ra)
 		if (!res) goto failed;
 		return res;
 	}
-	h = hash(s);
-	sym = lookup(s, h, p);
+
+	if (algs & (1 << SYSV_HASH)) {
+		h[SYSV_HASH] = sysv_hash(s);
+		sym = sysv_lookup(s, h[SYSV_HASH], p);
+		ok |= 1 << SYSV_HASH;
+	} else {
+		h[GNU_HASH] = gnu_hash(s);
+		sym = gnu_lookup(s, h[GNU_HASH], p);
+		ok |= 1 << GNU_HASH;
+	}
+
+
 	if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
 		return p->base + sym->st_value;
 	if (p->deps) for (i=0; p->deps[i]; i++) {
-		sym = lookup(s, h, p->deps[i]);
+		algs = p->deps[i]->hashalgs;
+		if (!(algs & ok)) {
+			if (algs & (1 << SYSV_HASH)) {
+				h[SYSV_HASH] = sysv_hash(s);
+				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
+				ok |= 1 << SYSV_HASH;
+			} else {
+				h[GNU_HASH] = gnu_hash(s);
+				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
+				ok
|= 1 << GNU_HASH;
+			}
+		} else {
+			if ((algs & ok) & (1 << SYSV_HASH))
+				sym = sysv_lookup(s, h[SYSV_HASH], p->deps[i]);
+			else
+				sym = gnu_lookup(s, h[GNU_HASH], p->deps[i]);
+		}
+
 		if (sym && sym->st_value && (1<<(sym->st_info&0xf) & OK_TYPES))
 			return p->deps[i]->base + sym->st_value;
 	}
@@ -813,6 +958,97 @@ failed:
 	return 0;
 }
 
+struct sym_search {
+	void *addr;
+	Dl_info *info;
+};
+
+static int find_closest_sym (struct dso *dso, Sym *sym, struct sym_search *search)
+{
+	void *symaddr = dso->base + sym->st_value;
+	char *strings = dso->strings;
+	Dl_info *info = search->info;
+	void *addr = search->addr;
+	void *prevaddr = info->dli_saddr;
+
+	if (sym->st_value == 0 && sym->st_shndx == SHN_UNDEF)
+		return 1;
+
+	if (ELF_ST_TYPE(sym->st_info) == STT_TLS)
+		return 1;
+
+	if (addr < symaddr)
+		return 1;
+
+	if (prevaddr && (addr - symaddr) > (addr - prevaddr))
+		return 1;
+
+	info->dli_saddr = symaddr;
+	info->dli_sname = strings + sym->st_name;
+
+	if (addr == symaddr)
+		return 0;
+
+	return 1;
+
+}
+
+static int do_dladdr (void *addr, Dl_info *info)
+{
+	struct sym_search search;
+	struct dso *p;
+	memset (info, 0, sizeof (*info));
+	search.info = info;
+	search.addr = addr;
+	for (p=head; p; p=p->next) {
+		if ((unsigned char *)addr >= p->map && (unsigned char *)addr < p->map + p->map_len) {
+			Sym *syms = p->syms;
+			uint32_t *hashtab;
+			size_t i;
+			info->dli_fname = p->name;
+			info->dli_fbase = p->base;
+			if (p->hashalgs & (1 << SYSV_HASH)) {
+				hashtab = p->hashtabs[SYSV_HASH];
+				for (i = 0; i < hashtab[1]; i++) {
+					if (!find_closest_sym (p, syms + i, &search))
+						return 1;
+				}
+			} else if(p->hashalgs & (1 << GNU_HASH)) {
+				uint32_t *buckets;
+				uint32_t nbuckets;
+				uint32_t *hashvals;
+				uint32_t symndx;
+				hashtab = p->hashtabs[GNU_HASH];
+				buckets = hashtab + 4 + (hashtab[2] * (sizeof(size_t)/sizeof(uint32_t)));
+				nbuckets = hashtab[0];
+				hashvals = buckets + nbuckets;
+				symndx = hashtab[1];
+				for (i = 0; i < nbuckets; ++i) {
+					uint32_t n = buckets[i];
+					Sym *sym = syms + n;
+
					uint32_t *hashval = hashvals + n - symndx;
+
+					do {
+						if (!find_closest_sym (p, sym, &search))
+							return 1;
+					}while (!(*hashval++ & 1));
+				}
+			}
+			return 1;
+		}
+	}
+	return 0;
+}
+
+int dladdr (void *addr, Dl_info *info)
+{
+	int res;
+	pthread_rwlock_rdlock(&lock);
+	res = do_dladdr (addr, info);
+	pthread_rwlock_unlock(&lock);
+	return res;
+}
+
 void *__dlsym(void *p, const char *s, void *ra)
 {
 	void *res;
-- 
1.7.9.5

--------------030800040608070504070108--
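
The selection rule used by find_closest_sym in the patch above (keep the defined symbol whose address is nearest to, and not above, the queried address, and stop early on an exact match) can be tried outside the dynamic linker with a simplified sketch. `struct demo_sym` and `closest_sym` are hypothetical names used only here. The flat array stands in for the ELF symbol table plus string table, and the TLS check from the patch is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat symbol record: a name and an absolute address.
 * A value of 0 plays the role of an undefined symbol. */
struct demo_sym {
	const char *name;
	uintptr_t value;
};

/* Return the symbol with the highest address that is not above addr,
 * or NULL if no defined symbol starts at or below addr. */
static const struct demo_sym *closest_sym(const struct demo_sym *syms,
                                          size_t n, uintptr_t addr)
{
	const struct demo_sym *best = NULL;
	size_t i;

	assert(syms != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!syms[i].value)
			continue;               /* undefined symbol */
		if (syms[i].value > addr)
			continue;               /* starts after addr */
		if (best && syms[i].value <= best->value)
			continue;               /* not closer than the current best */
		best = &syms[i];
		if (best->value == addr)
			break;                  /* exact match, nothing can be closer */
	}
	return best;
}
```

Comparing symbol addresses directly is equivalent to the distance comparison in the patch, since every candidate is already known to be at or below the queried address.
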