* Removing glibc from the musl .2 ABI @ 2019-07-11 23:58 A. Wilcox 2019-07-12 0:51 ` Khem Raj ` (2 more replies) 0 siblings, 3 replies; 18+ messages in thread From: A. Wilcox @ 2019-07-11 23:58 UTC (permalink / raw) To: musl

[-- Attachment #1.1: Type: text/plain, Size: 790 bytes --]

(Full disclosure: I am the principal author of gcompat.)

Hi,

Now that gcompat has matured, I was wondering if perhaps musl should
consider dropping the glibc ABI guarantees when the "2 ABI" lands.

This would make the LFS64 symbol mess completely moot.

It would also allow musl to "fix" a lot of dumb glibc decisions. I'm
thinking specifically here of things like ctermid(3), which musl could
actually implement correctly if it wasn't being held back by glibc
defining L_ctermid as 9.

I'm aware this is probably controversial, and it will probably be shot
down quickly, but I thought I would at least suggest this as an option.

Thank you for your consideration.

Best,
--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-11 23:58 Removing glibc from the musl .2 ABI A. Wilcox @ 2019-07-12 0:51 ` Khem Raj 2019-07-12 1:45 ` Rich Felker 2019-07-17 3:37 ` Rich Felker 2 siblings, 0 replies; 18+ messages in thread From: Khem Raj @ 2019-07-12 0:51 UTC (permalink / raw) To: musl

On Thu, Jul 11, 2019 at 4:59 PM A. Wilcox <awilfox@adelielinux.org> wrote:
>
> (Full disclosure: I am the principal author of gcompat.)
>
> Hi,
>
> Now that gcompat has matured, I was wondering if perhaps musl should
> consider dropping the glibc ABI guarantees when the "2 ABI" lands.
>
> This would make the LFS64 symbol mess completely moot.
>
> It would also allow musl to "fix" a lot of dumb glibc decisions. I'm
> thinking specifically here of things like ctermid(3), which musl could
> actually implement correctly if it wasn't being held back by glibc
> defining L_ctermid as 9.
>
> I'm aware this is probably controversial, and it will probably be shot
> down quickly, but I thought I would at least suggest this as an option.
>

I think it's too early to drop it, but we could provide a configure
option for dropping it and keep the defaults, since there are enough
pre-compiled apps which probably are not going to change anytime soon.

> Thank you for your consideration.
>
> Best,
> --arw
>
> --
> A. Wilcox (awilfox)
> Project Lead, Adélie Linux
> https://www.adelielinux.org
>

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-11 23:58 Removing glibc from the musl .2 ABI A. Wilcox 2019-07-12 0:51 ` Khem Raj @ 2019-07-12 1:45 ` Rich Felker 2019-07-12 1:47 ` Rich Felker 2019-07-17 3:37 ` Rich Felker 2 siblings, 1 reply; 18+ messages in thread From: Rich Felker @ 2019-07-12 1:45 UTC (permalink / raw) To: musl On Thu, Jul 11, 2019 at 06:58:38PM -0500, A. Wilcox wrote: > (Full disclosure: I am the principal author of gcompat.) > > Hi, > > Now that gcompat has matured, I was wondering if perhaps musl should > consider dropping the glibc ABI guarantees when the "2 ABI" lands. It's not decided that it will, or at least not in the near term. I think the other approach proposed to 64-bit time_t is a lot more appealing to most existing 32-bit users. I've had out-of-band feedback from one big user that they depend on ABI stability for the existing 32-bit arch+ABIs and hope there won't be a hard Y2038 EOL for them, and I myself would also rather prefer not to have to do an ABI switch. From the beginning, ABI stability was one of the big promises of musl. I realize we have "enough" time between now and 2038 for putting off an ABI switch (except for ppl making embedded stuff with really long lifetimes), so that users who care about ABI stability could stick with .1 "for now", but then we just push the problem back and they're unhappy in some moderately-distant future, and probably end up in a mess when they realize they need time_t's representing times a decade or two out sooner than they thought... > This would make the LFS64 symbol mess completely moot. Yes. Actually I'd like to move all of the ABI-compat symbols out of ld-reachable symbol table and make them ABI-compat only. But I'd also like to *improve* ABI-compat, e.g. making regexec from glibc libs safe on 64-bit (where their regoff_t was wrong), > It would also allow musl to "fix" a lot of dumb glibc decisions. 
I'm > thinking specifically here of things like ctermid(3), which musl could > actually implement correctly if it wasn't being held back by glibc > defining L_ctermid as 9. ctermid is something of a junk function anyway, but there are similar non-junk interfaces affected. Identifying and overhauling them all is probably a bigger project than I want to take on now, but I still think it's a promising direction at some point in the future. Ideally this might go hand in hand with making musl less Linux-centric, in the form of developing a types ABI that's uniform across archs and meant to be used natively on bare metal or non-Linux kernels. > I'm aware this is probably controversial, and it will probably be shot > down quickly, but I thought I would at least suggest this as an option. > > Thank you for your consideration. Thanks for the feedback. Rich ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-12 1:45 ` Rich Felker @ 2019-07-12 1:47 ` Rich Felker 0 siblings, 0 replies; 18+ messages in thread From: Rich Felker @ 2019-07-12 1:47 UTC (permalink / raw) To: musl

On Thu, Jul 11, 2019 at 09:45:27PM -0400, Rich Felker wrote:
> > This would make the LFS64 symbol mess completely moot.
>
> Yes. Actually I'd like to move all of the ABI-compat symbols out of
> ld-reachable symbol table and make them ABI-compat only. But I'd also
> like to *improve* ABI-compat, e.g. making regexec from glibc libs safe
> on 64-bit (where their regoff_t was wrong),

I forgot to finish this paragraph. To follow up, doing this stuff in
the dynamic linker would likely improve ABI-compat functionality,
making it possible to remap symbols just for binaries/libraries that
were detected as being glibc-linked. I'm actually not sure if this
will still be relevant by the time we get around to doing it, but it's
nice to have the option open.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-11 23:58 Removing glibc from the musl .2 ABI A. Wilcox 2019-07-12 0:51 ` Khem Raj 2019-07-12 1:45 ` Rich Felker @ 2019-07-17 3:37 ` Rich Felker 2019-07-17 13:13 ` A. Wilcox 2 siblings, 1 reply; 18+ messages in thread From: Rich Felker @ 2019-07-17 3:37 UTC (permalink / raw) To: musl

On Thu, Jul 11, 2019 at 06:58:38PM -0500, A. Wilcox wrote:
> (Full disclosure: I am the principal author of gcompat.)
>
> Hi,
>
> Now that gcompat has matured, I was wondering if perhaps musl should
> consider dropping the glibc ABI guarantees when the "2 ABI" lands.
>
> This would make the LFS64 symbol mess completely moot.

This is separate from the .2 ABI topic, but what would you think about
removing glibc ABI-compat from the current .1 ABI and replacing it
with enhanced gcompat? I was thinking ldso could load libgcompat
instead of returning a reference to itself for DT_NEEDED referencing
libc.so.6, and we could move all ABI-compat symbols into gcompat.

The reason I bring it up is that ripping out the LFS64
unwantedly-linkable stuff while keeping it as ABI-only is looking like
more of a pain than I expected.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-17 3:37 ` Rich Felker @ 2019-07-17 13:13 ` A. Wilcox 2019-07-17 15:11 ` Rich Felker 0 siblings, 1 reply; 18+ messages in thread From: A. Wilcox @ 2019-07-17 13:13 UTC (permalink / raw) To: musl [-- Attachment #1.1: Type: text/plain, Size: 1444 bytes --] On 07/16/19 22:37, Rich Felker wrote: > On Thu, Jul 11, 2019 at 06:58:38PM -0500, A. Wilcox wrote: >> (Full disclosure: I am the principal author of gcompat.) >> >> Hi, >> >> Now that gcompat has matured, I was wondering if perhaps musl should >> consider dropping the glibc ABI guarantees when the "2 ABI" lands. >> >> This would make the LFS64 symbol mess completely moot. > > This is separate from the .2 ABI topic, but what would you think about > removing glibc ABI-compat from the current .1 ABI and replacing it > with enhanced gcompat? I was thinking ldso could load libgcompat > instead of returning a reference to itself for DT_NEEDED referencing > libc.so.6, and we could move all ABI-compat symbols into gcompat. > > The reason I bring it up is that ripping out the LFS64 > unwantedly-linkable stuff while keeping it as ABI-only is looking like > more of a pain than I expected. > > Rich We would be more than happy to work with you on that. Would gcompat then become a runtime requirement for glibc apps on musl? What would musl do if gcompat isn't installed on a system? What about things like libm and libdl, which I've seen some apps force DT_NEEDED anyway when built against musl? Just trying to make sure the community has a clear view of what this looks like before we jump in. Best, --arw -- A. Wilcox (awilfox) Project Lead, Adélie Linux https://www.adelielinux.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-17 13:13 ` A. Wilcox @ 2019-07-17 15:11 ` Rich Felker 2019-07-17 18:10 ` A. Wilcox 0 siblings, 1 reply; 18+ messages in thread From: Rich Felker @ 2019-07-17 15:11 UTC (permalink / raw) To: musl

On Wed, Jul 17, 2019 at 08:13:44AM -0500, A. Wilcox wrote:
> On 07/16/19 22:37, Rich Felker wrote:
> > On Thu, Jul 11, 2019 at 06:58:38PM -0500, A. Wilcox wrote:
> >> (Full disclosure: I am the principal author of gcompat.)
> >>
> >> Hi,
> >>
> >> Now that gcompat has matured, I was wondering if perhaps musl should
> >> consider dropping the glibc ABI guarantees when the "2 ABI" lands.
> >>
> >> This would make the LFS64 symbol mess completely moot.
> >
> > This is separate from the .2 ABI topic, but what would you think about
> > removing glibc ABI-compat from the current .1 ABI and replacing it
> > with enhanced gcompat? I was thinking ldso could load libgcompat
> > instead of returning a reference to itself for DT_NEEDED referencing
> > libc.so.6, and we could move all ABI-compat symbols into gcompat.
> >
> > The reason I bring it up is that ripping out the LFS64
> > unwantedly-linkable stuff while keeping it as ABI-only is looking like
> > more of a pain than I expected.
>
> We would be more than happy to work with you on that.
>
> Would gcompat then become a runtime requirement for glibc apps on musl?
> What would musl do if gcompat isn't installed on a system?

It would just be a failed DT_NEEDED.

> What about
> things like libm and libdl, which I've seen some apps force DT_NEEDED
> anyway when built against musl?

These could still be ignored (mapped to internal libc) since any
program using them would also necessarily be using libc.so.6.

> Just trying to make sure the community has a clear view of what this
> looks like before we jump in.

Yes. This isn't a request to jump in, just looking at feasibility and
whether there'd be interest from your side. Being that ABI-compat
doesn't actually work very well without gcompat right now, though, I
think it might make sense. I'll continue to look at whether there are
other options, possibly just transitional, that might be good too.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
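The DT_NEEDED handling discussed in the message above can be sketched roughly as follows. This is a minimal illustration, not musl's ldso code: the names `needed_action`, `classify_needed`, `resolve_needed` and the file name `libgcompat.so.0` are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcome of inspecting one DT_NEEDED entry. */
enum needed_action {
	LOAD_NORMALLY,     /* search the library path as usual */
	USE_INTERNAL_LIBC, /* already satisfied by musl's libc, nothing to load */
	LOAD_GCOMPAT       /* load the compat library in place of libc.so.6 */
};

/* Assumed file name for the compat library; illustrative only. */
#define GCOMPAT_NAME "libgcompat.so.0"

enum needed_action classify_needed(const char *soname)
{
	assert(soname != NULL);
	/* A glibc-linked object names libc.so.6; redirect it to gcompat. */
	if (!strcmp(soname, "libc.so.6"))
		return LOAD_GCOMPAT;
	/* glibc's separate libm/libdl: an object needing these also needs
	 * libc.so.6, so they can stay mapped to the internal libc. */
	if (!strcmp(soname, "libm.so.6") || !strcmp(soname, "libdl.so.2"))
		return USE_INTERNAL_LIBC;
	return LOAD_NORMALLY;
}

/* Returns the name to actually load, or NULL when nothing needs loading. */
const char *resolve_needed(const char *soname)
{
	switch (classify_needed(soname)) {
	case LOAD_GCOMPAT:      return GCOMPAT_NAME;
	case USE_INTERNAL_LIBC: return NULL;
	default:                return soname;
	}
}
```

Under this scheme, a system without the compat library installed would simply fail to load the name returned for `libc.so.6`, which matches the "failed DT_NEEDED" behaviour described in the reply.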
* Re: Removing glibc from the musl .2 ABI 2019-07-17 15:11 ` Rich Felker @ 2019-07-17 18:10 ` A. Wilcox 2019-07-17 18:16 ` Rich Felker 0 siblings, 1 reply; 18+ messages in thread From: A. Wilcox @ 2019-07-17 18:10 UTC (permalink / raw) To: musl [-- Attachment #1.1: Type: text/plain, Size: 2631 bytes --] On 07/17/19 10:11, Rich Felker wrote: > On Wed, Jul 17, 2019 at 08:13:44AM -0500, A. Wilcox wrote: >> On 07/16/19 22:37, Rich Felker wrote: >>> On Thu, Jul 11, 2019 at 06:58:38PM -0500, A. Wilcox wrote: >>>> (Full disclosure: I am the principal author of gcompat.) >>>> >>>> Hi, >>>> >>>> Now that gcompat has matured, I was wondering if perhaps musl should >>>> consider dropping the glibc ABI guarantees when the "2 ABI" lands. >>>> >>>> This would make the LFS64 symbol mess completely moot. >>> >>> This is separate from the .2 ABI topic, but what would you think about >>> removing glibc ABI-compat from the current .1 ABI and replacing it >>> with enhanced gcompat? I was thinking ldso could load libgcompat >>> instead of returning a reference to itself for DT_NEEDED referencing >>> libc.so.6, and we could move all ABI-compat symbols into gcompat. >>> >>> The reason I bring it up is that ripping out the LFS64 >>> unwantedly-linkable stuff while keeping it as ABI-only is looking like >>> more of a pain than I expected. >> >> We would be more than happy to work with you on that. >> >> Would gcompat then become a runtime requirement for glibc apps on musl? >> What would musl do if gcompat isn't installed on a system? > > It would just be a failed DT_NEEDED. Okay, sounds reasonable. >> What about >> things like libm and libdl, which I've seen some apps force DT_NEEDED >> anyway when built against musl? > > These could still be ignored (mapped to internal libc) since any > program using them would also necessarily be using libc.so.6. Likewise. >> Just trying to make sure the community has a clear view of what this >> looks like before we jump in. > > Yes. 
This isn't a request to jump in, just looking at feasability and > whether there'd be interest from your side. Being that ABI-compat > doesn't actually work very well without gcompat right now, though, I > think it might make sense. I'll continue to look at whether there are > other options, possibly just transitional, that might be good too. I meant: I want a clear view of the boundaries between musl and gcompat, before we (Adélie / the gcompat team) jump in and start designing how we want to handle all the new symbols we may end up with :) We also were considering setting up a dedicated gcompat site so that the community could share apps that are known to work / fail, symbol presence, LSB missing symbols, etc. Would that be of interest from your side as well? Best, --arw -- A. Wilcox (awilfox) Project Lead, Adélie Linux https://www.adelielinux.org [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Removing glibc from the musl .2 ABI 2019-07-17 18:10 ` A. Wilcox @ 2019-07-17 18:16 ` Rich Felker 2019-07-22 15:52 ` Rich Felker 0 siblings, 1 reply; 18+ messages in thread From: Rich Felker @ 2019-07-17 18:16 UTC (permalink / raw) To: musl

On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> >> Just trying to make sure the community has a clear view of what this
> >> looks like before we jump in.
> >
> > Yes. This isn't a request to jump in, just looking at feasibility and
> > whether there'd be interest from your side. Being that ABI-compat
> > doesn't actually work very well without gcompat right now, though, I
> > think it might make sense. I'll continue to look at whether there are
> > other options, possibly just transitional, that might be good too.
>
> I meant: I want a clear view of the boundaries between musl and gcompat,
> before we (Adélie / the gcompat team) jump in and start designing how we
> want to handle all the new symbols we may end up with :)

If we go this route, I would think that gcompat could provide all
symbols which are not either public APIs (extensions you can
legitimately use in source) or musl-header-induced ABIs (for example
things like __ctype_get_mb_cur_max, which is used to define the
MB_CUR_MAX macro). This would include LFS64 as well as the "__xstat"
stuff, the other __ctype_* stuff, etc.

> We also were considering setting up a dedicated gcompat site so that the
> community could share apps that are known to work / fail, symbol
> presence, LSB missing symbols, etc. Would that be of interest from your
> side as well?

Definitely, regardless of whether we go ahead with the above or not.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
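To illustrate what a "header-induced ABI" means in the message above: when a public macro expands to a call of an internal-looking function, every program compiled against that header ends up referencing the function, so the library has to keep exporting it. The sketch below uses made-up names (`demo_get_mb_cur_max`, `DEMO_MB_CUR_MAX`, `demo_buffer_size`) and a fixed return value; it is not musl's header text.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a public header might contain (hypothetical names) --- */
size_t demo_get_mb_cur_max(void);
#define DEMO_MB_CUR_MAX (demo_get_mb_cur_max())

/* --- what the library would define --- */
size_t demo_get_mb_cur_max(void)
{
	/* Assumption for the sketch: a UTF-8 locale, at most 4 bytes per char. */
	return 4;
}

/* --- application code: only the macro appears in the source --- */
size_t demo_buffer_size(size_t nchars)
{
	/* Keep the multiplication below from overflowing in this sketch. */
	assert(nchars < (size_t)-1 / 8);
	/* After preprocessing this is a call to demo_get_mb_cur_max(), so the
	 * compiled object references that symbol even though the source never
	 * names it. */
	return nchars * DEMO_MB_CUR_MAX + 1;
}
```

A symbol of this kind would stay in libc under the proposed split, whereas symbols that only glibc's headers generate references to would be candidates for the compat library.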
* Re: Removing glibc from the musl .2 ABI 2019-07-17 18:16 ` Rich Felker @ 2019-07-22 15:52 ` Rich Felker 2019-07-24 15:17 ` Szabolcs Nagy 2019-07-24 16:33 ` James Y Knight 0 siblings, 2 replies; 18+ messages in thread From: Rich Felker @ 2019-07-22 15:52 UTC (permalink / raw) To: musl

On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote:
> On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> > >> Just trying to make sure the community has a clear view of what this
> > >> looks like before we jump in.
> > >
> > > Yes. This isn't a request to jump in, just looking at feasibility and
> > > whether there'd be interest from your side. Being that ABI-compat
> > > doesn't actually work very well without gcompat right now, though, I
> > > think it might make sense. I'll continue to look at whether there are
> > > other options, possibly just transitional, that might be good too.
> >
> > I meant: I want a clear view of the boundaries between musl and gcompat,
> > before we (Adélie / the gcompat team) jump in and start designing how we
> > want to handle all the new symbols we may end up with :)
>
> If we go this route, I would think that gcompat could provide all
> symbols which are not either public APIs (extensions you can
> legitimately use in source) or musl-header-induced ABIs (for example
> things like __ctype_get_mb_cur_max, which is used to define the
> MB_CUR_MAX macro). This would include LFS64 as well as the "__xstat"
> stuff, the other __ctype_* stuff, etc.

I think I'd like to go forward with this. Further work on time64 has
made it apparent to me that the current glibc ABI-compat we have
inside musl is fragile and is imposing unwanted constraints on musl,
which has long been one of the criteria for exclusion. In particular,
consider this situation:

Several structures that are part of public interfaces in musl were
created with extra space reserved for future extension. In some cases
the reserved space was added by musl; in other cases glibc had the
same. However, if we mandate glibc ABI-compat, *all* of this reserved
space is permanently unusable:

- If the reserved space is specific to musl, then reads from it may
  fault, and stores to it may clobber unrelated memory, if the
  structure was allocated by glibc-linked code.

- If the reserved space is present in both musl and glibc, we can't
  make use of it without risking that glibc makes some different use
  of it in the future, making calls from glibc-linked code dangerous.

This came up in the context of structs rusage and timex, but also
applies to stat, sched_param, sysinfo, statvfs, and perhaps others,
which might have reason for wanting extensibility in the future.

Right now, without the glibc ABI-compat constraint, getrusage, wait3,
and wait4 can avoid new time64 remappings entirely (by using the
reserved space we already have in rusage, which glibc doesn't have at
all). [clock_]adjtime[x] hit the second case -- glibc also has
reserved space in timex, but if they end up wanting to use it for
something else and we've put the 64-bit time there, we may be in
trouble.

I don't think the rusage and timex issues here are compelling by
themselves. It's not a big deal to make compat shims here, and I might
still end up doing it. But I think it's indicative that maintaining
glibc ABI-compat in musl is going to become increasingly problematic.

So, what I'd (tentatively; for discussion) like to do:

When ldso loads an application or shared library and detects that it's
glibc-linked (DT_NEEDED for libc.so.6), it both loads a gcompat
library instead *and* flags the dso as needing ABI-compat. The gcompat
library would be permanently RTLD_LOCAL, unable to be used for
resolving global symbols, since it would have to define symbols
conflicting with libc symbols names and with future directions of the
musl ABI.

Symbol lookups when relocating such a flagged dso would take place by
first processing gcompat (logically, adding it to the head of the dso
search list), then the normal symbol search order. The gcompat library
could also provide a replacement dlsym function, so that dlsym calls
from the glibc-linked DSO also follow this order, and a replacement
dlopen, so that dlopen of libc from the glibc-linked DSO would get the
gcompat module.

I'm not sure what mechanism gcompat would then use to make its own
references to the underlying real libc functions. This is something
we'd need to think about.

Before we decide to do it, please be aware that this would be a bit of
a burden on gcompat to do more than it's doing now. But it would also
make lots of cases work that fundamentally *can't* work now -- compat
with 32-bit code using the legacy 32-bit off_t functions, compat with
64-bit code using regexec, etc. -- anywhere the musl ABI currently
conflicts with the glibc ABI. Of course much of this is optional. The
new things that would be mandatory would mainly be moving over
existing glibc compat shims (like the __ctype and __xstat stuff) and
implementing converting wrappers where musl's use of reserved space
creates unsafety/incompatibility with the existing glibc code.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
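The "converting wrapper" idea from the message above can be sketched as follows. The struct layouts and function names here (`compat_usage`, `native_usage`, `native_get_usage`, `compat_get_usage`) are invented for illustration and do not reproduce the real rusage or timex definitions. The point is only that the wrapper lets the native function fill a full-size temporary and then copies back just the fields that exist in the caller's smaller layout, so nothing is written past the caller's allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout a glibc-linked caller allocates: no extra space. */
struct compat_usage {
	int32_t sec;     /* 32-bit seconds */
	int32_t usec;
	long    maxrss;
};

/* Hypothetical native layout: same prefix, plus formerly reserved space
 * that the library now uses for a 64-bit seconds value. */
struct native_usage {
	int32_t sec;
	int32_t usec;
	long    maxrss;
	int64_t sec64;   /* lives in what used to be reserved space */
};

static int32_t clamp32(int64_t v)
{
	if (v > INT32_MAX) return INT32_MAX;
	if (v < INT32_MIN) return INT32_MIN;
	return (int32_t)v;
}

/* Stand-in for the native function; it writes the full native layout. */
int native_get_usage(struct native_usage *u)
{
	u->sec64  = 5000000000LL;      /* a time beyond the 32-bit range */
	u->sec    = clamp32(u->sec64); /* legacy field, saturated */
	u->usec   = 250000;
	u->maxrss = 1024;
	return 0;
}

/* Converting wrapper for callers that allocated the smaller layout. */
int compat_get_usage(struct compat_usage *out)
{
	struct native_usage tmp;
	int r;

	assert(out != NULL);
	r = native_get_usage(&tmp);
	if (r) return r;
	/* Copy only what the caller's struct has room for. */
	out->sec    = tmp.sec;
	out->usec   = tmp.usec;
	out->maxrss = tmp.maxrss;
	return 0;
}
```

Passing a `struct compat_usage` directly to `native_get_usage` would be the unsafe case described in the first bullet of the message: the store to `sec64` would land outside the caller's object.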
* Re: Removing glibc from the musl .2 ABI 2019-07-22 15:52 ` Rich Felker @ 2019-07-24 15:17 ` Szabolcs Nagy 2019-07-24 16:02 ` Rich Felker 2019-07-24 16:33 ` James Y Knight 1 sibling, 1 reply; 18+ messages in thread From: Szabolcs Nagy @ 2019-07-24 15:17 UTC (permalink / raw) To: musl * Rich Felker <dalias@libc.org> [2019-07-22 11:52:59 -0400]: > So, what I'd (tentatively; for discussion) like to do: > > When ldso loads an application or shared library and detects that it's > glibc-linked (DT_NEEDED for libc.so.6), it both loads a gcompat > library instead *and* flags the dso as needing ABI-compat. The gcompat > library would be permanently RTLD_LOCAL, unable to be used for > resolving global symbols, since it would have to define symbols > conflicting with libc symbols names and with future directions of the > musl ABI. > > Symbol lookups when relocating such a flagged dso would take place by > first processing gcompat (logically, adding it to the head of the dso > search list), then the normal symbol search order. The gcompat library > could also provide a replacement dlsym function, so that dlsym calls > from the glibc-linked DSO also follow this order, and a replacement > dlopen, so that dlopen of libc from the glibc-linked DSO would get the > gcompat module. > > I'm not sure what mechanism gcompat would then use to make its own > references to the underlying real libc functions. This is something > we'd need to think about. i'm not sure how gcompat would implement dlsym, if it's on top of the musl dlsym, then that needs to be accessible already (e.g. by exposing a __musl_dlsym alias) and can be used to do lookups in libc.so. > > Before we decide to do it, please be aware that this would be a bit of > a burden on gcompat to do more than it's doing now. But it would also > make lots of cases work that fundamentally *can't* work now -- compat > with 32-bit code using the legacy 32-bit off_t functions, compat with > 64-bit code using regexec, etc. 
-- anywhere the musl ABI currently > conflicts with the glibc ABI. Of course much of this is optional. The > new things that would be mandatory would mainly be moving over > existing glibc compat shims (like the __ctype and __xstat stuff) and > implementing converting wrappers where musl's use of reserved space > creates unsafety/incompatibility with the existing glibc code. > > Rich ^ permalink raw reply [flat|nested] 18+ messages in thread
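The lookup order quoted in the message above (compat library first for flagged objects, then the normal search order) can be modelled with a small toy resolver. Everything here is a simplified assumption for illustration: the `struct dso` and `struct sym` types, the `needs_compat` flag and the `lookup` function are not musl's internal data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sym { const char *name; int value; };

struct dso {
	const char *name;
	const struct sym *syms;
	size_t nsyms;
	int needs_compat; /* set when DT_NEEDED named libc.so.6 */
};

static const struct sym *find_in(const struct dso *d, const char *name)
{
	for (size_t i = 0; i < d->nsyms; i++)
		if (!strcmp(d->syms[i].name, name))
			return &d->syms[i];
	return NULL;
}

/* Resolve `name` on behalf of `requester`. The compat object is consulted
 * only for flagged requesters and is never part of the global list, which
 * models it being permanently local. */
const struct sym *lookup(const struct dso *requester,
                         const struct dso *compat,
                         const struct dso *const *global, size_t nglobal,
                         const char *name)
{
	assert(requester != NULL && name != NULL);
	if (requester->needs_compat && compat) {
		const struct sym *s = find_in(compat, name);
		if (s) return s;
	}
	for (size_t i = 0; i < nglobal; i++) {
		const struct sym *s = find_in(global[i], name);
		if (s) return s;
	}
	return NULL;
}
```

In this model a natively linked object never sees the compat definitions, while a flagged object gets the compat version of any name defined in both places and falls through to libc for everything else.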
* Re: Removing glibc from the musl .2 ABI 2019-07-24 15:17 ` Szabolcs Nagy @ 2019-07-24 16:02 ` Rich Felker 0 siblings, 0 replies; 18+ messages in thread From: Rich Felker @ 2019-07-24 16:02 UTC (permalink / raw) To: musl

On Wed, Jul 24, 2019 at 05:17:35PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2019-07-22 11:52:59 -0400]:
> > So, what I'd (tentatively; for discussion) like to do:
> >
> > When ldso loads an application or shared library and detects that it's
> > glibc-linked (DT_NEEDED for libc.so.6), it both loads a gcompat
> > library instead *and* flags the dso as needing ABI-compat. The gcompat
> > library would be permanently RTLD_LOCAL, unable to be used for
> > resolving global symbols, since it would have to define symbols
> > conflicting with libc symbols names and with future directions of the
> > musl ABI.
> >
> > Symbol lookups when relocating such a flagged dso would take place by
> > first processing gcompat (logically, adding it to the head of the dso
> > search list), then the normal symbol search order. The gcompat library
> > could also provide a replacement dlsym function, so that dlsym calls
> > from the glibc-linked DSO also follow this order, and a replacement
> > dlopen, so that dlopen of libc from the glibc-linked DSO would get the
> > gcompat module.
> >
> > I'm not sure what mechanism gcompat would then use to make its own
> > references to the underlying real libc functions. This is something
> > we'd need to think about.
>
> i'm not sure how gcompat would implement dlsym, if it's
> on top of the musl dlsym, then that needs to be accessible
> already (e.g. by exposing a __musl_dlsym alias) and can be
> used to do lookups in libc.so.

The same applies to any interface it would have to wrap due to
mismatched ABI. I can think of a couple potential ways:

1. Simply by referencing the symbol name directly. libgcompat would
not be a glibc dso, so ldso would not use it to resolve its own
symbols, and would end up finding them in libc. The only concern here
is whether the compiler or linker might do some kind of early binding
that makes the references circular and non-symbolic. I don't think the
linker can (can in the sense of "has standing to") do this, but
perhaps the compiler could optimize a call from a function named dlsym
to a function named dlsym to be local rather than going through the
GOT/PLT, since if it were interposed it would never be called to begin
with. However this wouldn't work if there were other aliases for the
same function, so it's probably not something the compiler could do.
gcompat could explicitly preclude it via something like:

    void *__gcompat_dlsym(void *dso, const char *restrict name)
    {
        ...
        return dlsym(...);
    }
    weak_alias(__gcompat_dlsym, dlsym);

This works because the alias imposes a mandate that the definitions be
the same, and __gcompat_dlsym would be exported (thus not able to be
optimized out) and would necessarily have to honor interposition of
dlsym.

2. By having ldso remap symbol references *from* gcompat, so that
gcompat could refer to __libc_foo and have the reference get remapped
to plain foo.

I feel like option 2 is a nastier hack (especially on the musl side)
and not needed, since option 1 seems to work.

Rich

^ permalink raw reply [flat|nested] 18+ messages in thread
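The weak-alias pattern used in option 1 above can be tried out in isolation. The sketch below is an assumption-laden stand-in, not gcompat code: the macro is modelled on the `weak_alias` helper named in the message and needs GCC or Clang on an ELF target, and the functions `demo_real_len`, `demo_compat_len` and `demo_len` are hypothetical. Distinct names are used so the example runs as an ordinary program; in the scenario from the message, the wrapper's inner call and the alias would both use the name dlsym, and the inner call would reach libc only because the compat library is not in its own lookup scope.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Modelled on the weak_alias helper mentioned above; relies on the
 * GNU C alias and weak attributes (GCC/Clang, ELF targets). */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical stand-in for the "real" function being wrapped. */
size_t demo_real_len(const char *s)
{
	return strlen(s);
}

/* The wrapper, exported under its own strong name ... */
size_t demo_compat_len(const char *s)
{
	assert(s != NULL);
	return demo_real_len(s);
}

/* ... and also under a second, weak name bound to the same code. */
weak_alias(demo_compat_len, demo_len);
```

Both `demo_compat_len` and `demo_len` end up as entries in the symbol table that refer to one function body, which is the property the message relies on: the wrapper stays exported under its strong name while the conventional name remains a separate, weak symbol.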
* Re: Removing glibc from the musl .2 ABI 2019-07-22 15:52 ` Rich Felker 2019-07-24 15:17 ` Szabolcs Nagy @ 2019-07-24 16:33 ` James Y Knight 2019-07-24 17:36 ` Szabolcs Nagy 2019-07-24 21:29 ` Rich Felker 1 sibling, 2 replies; 18+ messages in thread From: James Y Knight @ 2019-07-24 16:33 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 5889 bytes --] One thing I've not seen mentioned yet: if this is done, then anyone (whether intentionally or inadvertently) who links any glibc-compiled .o or .a files into a musl binary/shared-lib will be broken. Up until now, with musl's mostly-glibc-compatible ABI, you could link the two object files together, and generally expect it to work. When compatibility is instead done with magic in the dynamic loader, that obviously can only ever work with a shared-object boundary. I don't know if anyone actually uses musl in a context where this is likely to be a problem, but it at least seems worth discussing (and loudly documenting as a warning to users not to do this if implemented). On Mon, Jul 22, 2019 at 8:53 AM Rich Felker <dalias@libc.org> wrote: > On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote: > > On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote: > > > >> Just trying to make sure the community has a clear view of what this > > > >> looks like before we jump in. > > > > > > > > Yes. This isn't a request to jump in, just looking at feasability and > > > > whether there'd be interest from your side. Being that ABI-compat > > > > doesn't actually work very well without gcompat right now, though, I > > > > think it might make sense. I'll continue to look at whether there are > > > > other options, possibly just transitional, that might be good too. 
> > > I meant: I want a clear view of the boundaries between musl and
> > > gcompat, before we (Adélie / the gcompat team) jump in and start
> > > designing how we want to handle all the new symbols we may end up
> > > with :)
> >
> > If we go this route, I would think that gcompat could provide all
> > symbols which are not either public APIs (extensions you can
> > legitimately use in source) or musl-header-induced ABIs (for example
> > things like __ctype_get_mb_cur_max, which is used to define the
> > MB_CUR_MAX macro). This would include LFS64 as well as the "__xstat"
> > stuff, the other __ctype_* stuff, etc.
>
> I think I'd like to go forward with this. Further work on time64 has
> made it apparent to me that the current glibc ABI-compat we have
> inside musl is fragile and is imposing unwanted constraints on musl,
> which has long been one of the criteria for exclusion. In particular,
> consider this situation:
>
> Several structures that are part of public interfaces in musl were
> created with extra space reserved for future extension. In some cases
> the reserved space was added by musl; in other cases glibc had the
> same. However, if we mandate glibc ABI-compat, *all* of this reserved
> space is permanently unusable:
>
> - If the reserved space is specific to musl, then reads from it may
>   fault, and stores to it may clobber unrelated memory, if the
>   structure was allocated by glibc-linked code.
>
> - If the reserved space is present in both musl and glibc, we can't
>   make use of it without risking that glibc makes some different use
>   of it in the future, making calls from glibc-linked code dangerous.
>
> This came up in the context of structs rusage and timex, but also
> applies to stat, sched_param, sysinfo, statvfs, and perhaps others,
> which might have reason for wanting extensibility in the future.
>
> Right now, without the glibc ABI-compat constraint, getrusage, wait3,
> and wait4 can avoid new time64 remappings entirely (by using the
> reserved space we already have in rusage, which glibc doesn't have at
> all). [clock_]adjtime[x] hit the second case -- glibc also has
> reserved space in timex, but if they end up wanting to use it for
> something else and we've put the 64-bit time there, we may be in
> trouble.
>
> I don't think the rusage and timex issues here are compelling by
> themselves. It's not a big deal to make compat shims here, and I might
> still end up doing it. But I think it's indicative that maintaining
> glibc ABI-compat in musl is going to become increasingly problematic.
>
> So, what I'd (tentatively; for discussion) like to do:
>
> When ldso loads an application or shared library and detects that it's
> glibc-linked (DT_NEEDED for libc.so.6), it both loads a gcompat
> library instead *and* flags the dso as needing ABI-compat. The gcompat
> library would be permanently RTLD_LOCAL, unable to be used for
> resolving global symbols, since it would have to define symbols
> conflicting with libc symbol names and with future directions of the
> musl ABI.
>
> Symbol lookups when relocating such a flagged dso would take place by
> first processing gcompat (logically, adding it to the head of the dso
> search list), then the normal symbol search order. The gcompat library
> could also provide a replacement dlsym function, so that dlsym calls
> from the glibc-linked DSO also follow this order, and a replacement
> dlopen, so that dlopen of libc from the glibc-linked DSO would get the
> gcompat module.
>
> I'm not sure what mechanism gcompat would then use to make its own
> references to the underlying real libc functions. This is something
> we'd need to think about.
>
> Before we decide to do it, please be aware that this would be a bit of
> a burden on gcompat to do more than it's doing now. But it would also
> make lots of cases work that fundamentally *can't* work now -- compat
> with 32-bit code using the legacy 32-bit off_t functions, compat with
> 64-bit code using regexec, etc. -- anywhere the musl ABI currently
> conflicts with the glibc ABI. Of course much of this is optional. The
> new things that would be mandatory would mainly be moving over
> existing glibc compat shims (like the __ctype and __xstat stuff) and
> implementing converting wrappers where musl's use of reserved space
> creates unsafety/incompatibility with the existing glibc code.
>
> Rich
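The reserved-space hazard described in the quoted proposal can be modelled in a few lines. This is only a sketch under stated assumptions: the two structures below are hypothetical stand-ins, not the real rusage or timex layouts of musl or glibc, and the field names and the size of the reserved area are invented for illustration. It shows the first bullet's case, where the library believes the structure has trailing reserved space and the caller allocated a smaller object.

```python
import ctypes


class CallerStruct(ctypes.Structure):
    """Hypothetical layout used by code built against headers that have
    no trailing reserved space."""
    _fields_ = [
        ("sec", ctypes.c_long),
        ("usec", ctypes.c_long),
    ]


class LibcStruct(ctypes.Structure):
    """Hypothetical layout used by a libc that stores new data in its
    own trailing reserved space."""
    _fields_ = [
        ("sec", ctypes.c_long),
        ("usec", ctypes.c_long),
        ("reserved", ctypes.c_long * 4),  # invented size, for illustration
    ]


def overrun_bytes(caller_type, libc_type):
    """Number of bytes the libc would read or write beyond the object
    the caller actually allocated (0 if the caller's object is big enough)."""
    return max(0, ctypes.sizeof(libc_type) - ctypes.sizeof(caller_type))


if __name__ == "__main__":
    # Non-zero: a store into `reserved` lands outside the caller's object.
    print(overrun_bytes(CallerStruct, LibcStruct))
    # Zero: a caller that used the larger layout is safe.
    print(overrun_bytes(LibcStruct, LibcStruct))
```

A converting wrapper of the kind mentioned at the end of the proposal would, in this model, copy only the fields of the smaller layout back to the caller.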
* Re: Removing glibc from the musl .2 ABI
  2019-07-24 16:33 ` James Y Knight
@ 2019-07-24 17:36 ` Szabolcs Nagy
  2019-07-24 21:31 ` Rich Felker
  2019-07-24 21:29 ` Rich Felker
  1 sibling, 1 reply; 18+ messages in thread
From: Szabolcs Nagy @ 2019-07-24 17:36 UTC (permalink / raw)
To: musl

* James Y Knight <jyknight@google.com> [2019-07-24 09:33:05 -0700]:
> One thing I've not seen mentioned yet: if this is done, then anyone
> (whether intentionally or inadvertently) who links any glibc-compiled .o or
> .a files into a musl binary/shared-lib will be broken.
>
> Up until now, with musl's mostly-glibc-compatible ABI, you could link the
> two object files together, and generally expect it to work. When
> compatibility is instead done with magic in the dynamic loader, that
> obviously can only ever work with a shared-object boundary.
>
> I don't know if anyone actually uses musl in a context where this is likely
> to be a problem, but it at least seems worth discussing (and loudly
> documenting as a warning to users not to do this if implemented).

is it common that binary only .o or .a is distributed?

binary only shared libs with glibc dependency are fairly
common (plugins, userspace driver code etc). i think the
abi compat was mainly intended to support that.

> On Mon, Jul 22, 2019 at 8:53 AM Rich Felker <dalias@libc.org> wrote:
>
> > On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote:
> > > On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> > > > >> Just trying to make sure the community has a clear view of what this
> > > > >> looks like before we jump in.
> > > > >
> > > > > Yes. This isn't a request to jump in, just looking at feasibility and
> > > > > whether there'd be interest from your side. Being that ABI-compat
> > > > > doesn't actually work very well without gcompat right now, though, I
> > > > > think it might make sense.
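The lookup order proposed earlier in the thread (gcompat first for a flagged DSO, then the normal search order, with gcompat never visible to anyone else) can be sketched as a toy model. Everything here is an assumption made for illustration: the DSO names, the export sets, and the `resolve` helper are hypothetical and do not describe musl's actual dynamic linker code.

```python
def resolve(symbol, requester, exports, global_order, compat_flagged):
    """Return the name of the DSO that would satisfy `symbol` for
    `requester`, or None if the lookup fails.

    gcompat is modelled as permanently RTLD_LOCAL: it is never part of
    `global_order` and is only consulted for DSOs flagged as glibc-linked.
    """
    order = list(global_order)
    if requester in compat_flagged:
        order.insert(0, "gcompat")  # logically the head of the search list
    for dso in order:
        if symbol in exports.get(dso, set()):
            return dso
    return None


# Hypothetical export sets; which symbols gcompat would really carry is
# exactly what the thread is still discussing.
exports = {
    "gcompat": {"__xstat", "regexec"},
    "libc": {"stat", "regexec", "printf"},
}
global_order = ["app", "libc"]
compat_flagged = {"libvendor.so"}  # hypothetical glibc-linked plugin

if __name__ == "__main__":
    for sym, who in [("regexec", "libvendor.so"), ("regexec", "app"),
                     ("__xstat", "app"), ("printf", "libvendor.so")]:
        print(who, sym, resolve(sym, who, exports, global_order, compat_flagged))
```

In this model the per-DSO flag is what makes the scheme depend on a shared-object boundary: a glibc-built .o linked statically into "app" has no separate identity for the loader to flag.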
* Re: Removing glibc from the musl .2 ABI
  2019-07-24 17:36 ` Szabolcs Nagy
@ 2019-07-24 21:31 ` Rich Felker
  0 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2019-07-24 21:31 UTC (permalink / raw)
To: musl

On Wed, Jul 24, 2019 at 07:36:00PM +0200, Szabolcs Nagy wrote:
> * James Y Knight <jyknight@google.com> [2019-07-24 09:33:05 -0700]:
> > One thing I've not seen mentioned yet: if this is done, then anyone
> > (whether intentionally or inadvertently) who links any glibc-compiled .o or
> > .a files into a musl binary/shared-lib will be broken.
> >
> > Up until now, with musl's mostly-glibc-compatible ABI, you could link the
> > two object files together, and generally expect it to work. When
> > compatibility is instead done with magic in the dynamic loader, that
> > obviously can only ever work with a shared-object boundary.
> >
> > I don't know if anyone actually uses musl in a context where this is likely
> > to be a problem, but it at least seems worth discussing (and loudly
> > documenting as a warning to users not to do this if implemented).
>
> is it common that binary only .o or .a is distributed?
>
> binary only shared libs with glibc dependency are fairly
> common (plugins, userspace driver code etc). i think the
> abi compat was mainly intended to support that.

It may be common with proprietary middleware or userspace-driver stuff
for hardware devices, where presumably the idea of shipping a static
lib rather than a shared one is that you don't ship usable copies of
the middleware vendor's library to your customers along with your
product.

Rich
* Re: Removing glibc from the musl .2 ABI
  2019-07-24 16:33 ` James Y Knight
  2019-07-24 17:36 ` Szabolcs Nagy
@ 2019-07-24 21:29 ` Rich Felker
  2019-07-25 16:42 ` James Y Knight
  1 sibling, 1 reply; 18+ messages in thread
From: Rich Felker @ 2019-07-24 21:29 UTC (permalink / raw)
To: musl

On Wed, Jul 24, 2019 at 09:33:05AM -0700, James Y Knight wrote:
> One thing I've not seen mentioned yet: if this is done, then anyone
> (whether intentionally or inadvertently) who links any glibc-compiled .o or
> .a files into a musl binary/shared-lib will be broken.

If it referenced glibc symbols that have been moved out of musl, it
would just fail to link (at ld time or ldso time, depending on program
binary/shared lib). The only way it would be silently broken is with
symbols where glibc and musl share the same symbol name but with
different ABI (like regexec on 64-bit, which is already possible now,
or the non-64bit-off_t functions on 32-bit archs, or lots of stuff on
mips and powerpc where there's minimal or no ABI-compat).

For the time64 stuff, my thought is to try to use redirected-symbol
names that don't match whatever names glibc will be using, so that
there's no risk of the link accidentally succeeding. I think it makes
sense in general to try to have ABI match when we add symbols that
will also exist in glibc, on the archs that have ABI-compat.

> Up until now, with musl's mostly-glibc-compatible ABI, you could link the
> two object files together, and generally expect it to work. When
> compatibility is instead done with magic in the dynamic loader, that
> obviously can only ever work with a shared-object boundary.
>
> I don't know if anyone actually uses musl in a context where this is likely
> to be a problem, but it at least seems worth discussing (and loudly
> documenting as a warning to users not to do this if implemented).

My thought, for the things where it matters, is that it's an
improvement to fail. If you really want it to work (e.g. if you have a
binary-only static library you need to use), you can probably use
objcopy or similar to remap the symbols to shims.

Does my above analysis sound reasonable to you?

Rich

> On Mon, Jul 22, 2019 at 8:53 AM Rich Felker <dalias@libc.org> wrote:
>
> > On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote:
> > > On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> > > > >> Just trying to make sure the community has a clear view of what this
> > > > >> looks like before we jump in.
> > > > >
> > > > > Yes. This isn't a request to jump in, just looking at feasibility and
> > > > > whether there'd be interest from your side. Being that ABI-compat
> > > > > doesn't actually work very well without gcompat right now, though, I
> > > > > think it might make sense. I'll continue to look at whether there are
> > > > > other options, possibly just transitional, that might be good too.
> > > >
> > > > I meant: I want a clear view of the boundaries between musl and
> > > > gcompat, before we (Adélie / the gcompat team) jump in and start
> > > > designing how we want to handle all the new symbols we may end up
> > > > with :)
> > >
> > > If we go this route, I would think that gcompat could provide all
> > > symbols which are not either public APIs (extensions you can
> > > legitimately use in source) or musl-header-induced ABIs (for example
> > > things like __ctype_get_mb_cur_max, which is used to define the
> > > MB_CUR_MAX macro). This would include LFS64 as well as the "__xstat"
> > > stuff, the other __ctype_* stuff, etc.
> >
> > I think I'd like to go forward with this. Further work on time64 has
> > made it apparent to me that the current glibc ABI-compat we have
> > inside musl is fragile and is imposing unwanted constraints on musl,
> > which has long been one of the criteria for exclusion.
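The distinction Rich draws between a link failure and silent breakage can be written down as a small classifier. This is a toy model, not a description of any real linker: the ABI tags are made-up labels, and the two time64-style symbol names are deliberately hypothetical, since the thread does not say which names either libc will use.

```python
def link_outcome(symbol, expected_abi, libc_exports):
    """Classify what happens when an object file needs `symbol` and was
    compiled expecting `expected_abi`.

    `libc_exports` maps each exported symbol name to an ABI tag.
    """
    if symbol not in libc_exports:
        return "link error"        # loud: fails at ld time or ldso time
    if libc_exports[symbol] != expected_abi:
        return "silent breakage"   # links, but caller and callee disagree
    return "ok"


# Hypothetical export table for a libc that has moved glibc-only symbols
# out and uses its own redirected name for a time64 function.
libc_exports = {
    "printf": "common",
    "regexec": "musl-layout",
    "__hypothetical_musl_time64_fn": "time64",
}

if __name__ == "__main__":
    print(link_outcome("printf", "common", libc_exports))
    print(link_outcome("regexec", "glibc-layout", libc_exports))
    print(link_outcome("__xstat", "common", libc_exports))
    print(link_outcome("__hypothetical_glibc_time64_fn", "time64", libc_exports))
```

In this model, choosing redirected names that differ from glibc's turns what could have been a same-name, different-ABI match into a plain missing symbol, which is the "improvement to fail" argued for above.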
* Re: Removing glibc from the musl .2 ABI
  2019-07-24 21:29 ` Rich Felker
@ 2019-07-25 16:42 ` James Y Knight
  2019-07-25 20:03 ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: James Y Knight @ 2019-07-25 16:42 UTC (permalink / raw)
To: musl

On Wed, Jul 24, 2019 at 2:29 PM Rich Felker <dalias@libc.org> wrote:

> On Wed, Jul 24, 2019 at 09:33:05AM -0700, James Y Knight wrote:
> > One thing I've not seen mentioned yet: if this is done, then anyone
> > (whether intentionally or inadvertently) who links any glibc-compiled
> > .o or .a files into a musl binary/shared-lib will be broken.
>
> If it referenced glibc symbols that have been moved out of musl, it
> would just fail to link (at ld time or ldso time, depending on program
> binary/shared lib). The only way it would be silently broken is with
> symbols where glibc and musl share the same symbol name but with
> different ABI (like regexec on 64-bit, which is already possible now,
> or the non-64bit-off_t functions on 32-bit archs, or lots of stuff on
> mips and powerpc where there's minimal or no ABI-compat).
>
> For the time64 stuff, my thought is to try to use redirected-symbol
> names that don't match whatever names glibc will be using, so that
> there's no risk of the link accidentally succeeding. I think it makes
> sense in general to try to have ABI match when we add symbols that
> will also exist in glibc, on the archs that have ABI-compat.
>
> > Up until now, with musl's mostly-glibc-compatible ABI, you could link the
> > two object files together, and generally expect it to work. When
> > compatibility is instead done with magic in the dynamic loader, that
> > obviously can only ever work with a shared-object boundary.
> >
> > I don't know if anyone actually uses musl in a context where this is
> > likely to be a problem, but it at least seems worth discussing (and
> > loudly documenting as a warning to users not to do this if implemented).
>
> My thought, for the things where it matters, is that it's an
> improvement to fail. If you really want it to work (e.g. if you have a
> binary-only static library you need to use), you can probably use
> objcopy or similar to remap the symbols to shims.
>
> Does my above analysis sound reasonable to you?

I had understood from your previous emails that musl would start dropping
glibc-abi-compatibility (potentially in general, not just for the
64-bit-time transition) of existing "undecorated" functions, and then
restore compatibility only in a shadowed version of that same function name
in libgcompat.so.

But yes -- just dropping symbols and triggering a link error seems totally
fine. My worry was mainly that there would be mysterious runtime bugs,
especially if a given function's ABI had previously been compatible, and
now becomes incompatible.

And again, I don't think it's a non-starter to make such a change, only
that if that is to happen, it should happen with deliberation and notice to
users.

> Rich
>
> > On Mon, Jul 22, 2019 at 8:53 AM Rich Felker <dalias@libc.org> wrote:
> >
> > > On Wed, Jul 17, 2019 at 02:16:51PM -0400, Rich Felker wrote:
> > > > On Wed, Jul 17, 2019 at 01:10:19PM -0500, A. Wilcox wrote:
> > > > > >> Just trying to make sure the community has a clear view of what this
> > > > > >> looks like before we jump in.
> > > > > >
> > > > > > Yes. This isn't a request to jump in, just looking at feasibility and
> > > > > > whether there'd be interest from your side. Being that ABI-compat
> > > > > > doesn't actually work very well without gcompat right now, though, I
> > > > > > think it might make sense.
* Re: Removing glibc from the musl .2 ABI
  2019-07-25 16:42 ` James Y Knight
@ 2019-07-25 20:03 ` Rich Felker
  0 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2019-07-25 20:03 UTC (permalink / raw)
To: musl

On Thu, Jul 25, 2019 at 09:42:23AM -0700, James Y Knight wrote:
> On Wed, Jul 24, 2019 at 2:29 PM Rich Felker <dalias@libc.org> wrote:
>
> > On Wed, Jul 24, 2019 at 09:33:05AM -0700, James Y Knight wrote:
> > > One thing I've not seen mentioned yet: if this is done, then anyone
> > > (whether intentionally or inadvertently) who links any glibc-compiled
> > > .o or .a files into a musl binary/shared-lib will be broken.
> >
> > If it referenced glibc symbols that have been moved out of musl, it
> > would just fail to link (at ld time or ldso time, depending on program
> > binary/shared lib). The only way it would be silently broken is with
> > symbols where glibc and musl share the same symbol name but with
> > different ABI (like regexec on 64-bit, which is already possible now,
> > or the non-64bit-off_t functions on 32-bit archs, or lots of stuff on
> > mips and powerpc where there's minimal or no ABI-compat).
> >
> > For the time64 stuff, my thought is to try to use redirected-symbol
> > names that don't match whatever names glibc will be using, so that
> > there's no risk of the link accidentally succeeding. I think it makes
> > sense in general to try to have ABI match when we add symbols that
> > will also exist in glibc, on the archs that have ABI-compat.
> >
> > > Up until now, with musl's mostly-glibc-compatible ABI, you could link the
> > > two object files together, and generally expect it to work. When
> > > compatibility is instead done with magic in the dynamic loader, that
> > > obviously can only ever work with a shared-object boundary.
> > >
> > > I don't know if anyone actually uses musl in a context where this is
> > > likely to be a problem, but it at least seems worth discussing (and
> > > loudly documenting as a warning to users not to do this if implemented).
> >
> > My thought, for the things where it matters, is that it's an
> > improvement to fail. If you really want it to work (e.g. if you have a
> > binary-only static library you need to use), you can probably use
> > objcopy or similar to remap the symbols to shims.
> >
> > Does my above analysis sound reasonable to you?
>
> I had understood from your previous emails that musl would start dropping
> glibc-abi-compatibility (potentially in general, not just for the
> 64-bit-time transition) of existing "undecorated" functions, and then
> restore compatibility only in a shadowed version of that same function name
> in libgcompat.so.

Unless I misunderstand what you're saying, that's impossible without
also dropping musl-ABI compatibility. So no, it wouldn't happen.

Rich
end of thread, other threads:[~2019-07-25 20:03 UTC | newest]

Thread overview: 18+ messages
2019-07-11 23:58 Removing glibc from the musl .2 ABI A. Wilcox
2019-07-12  0:51 ` Khem Raj
2019-07-12  1:45 ` Rich Felker
2019-07-12  1:47 ` Rich Felker
2019-07-17  3:37 ` Rich Felker
2019-07-17 13:13 ` A. Wilcox
2019-07-17 15:11 ` Rich Felker
2019-07-17 18:10 ` A. Wilcox
2019-07-17 18:16 ` Rich Felker
2019-07-22 15:52 ` Rich Felker
2019-07-24 15:17 ` Szabolcs Nagy
2019-07-24 16:02 ` Rich Felker
2019-07-24 16:33 ` James Y Knight
2019-07-24 17:36 ` Szabolcs Nagy
2019-07-24 21:31 ` Rich Felker
2019-07-24 21:29 ` Rich Felker
2019-07-25 16:42 ` James Y Knight
2019-07-25 20:03 ` Rich Felker