* time_t progress/findings
From: Rich Felker @ 2019-07-18 15:41 UTC
To: musl

I've started on a sketch of the work needed for moving 32-bit archs to
64-bit time_t.

First, one "good" thing: the sysvipc structs with times in them are
only used with ioctl-esque command numbers, so rather than defining
new functions, we can just define new command numbers. Of course that
means we have to pick numbers, which is never fun. This is the same
situation as sockopts and ioctls except that we define them rather
than the kernel doing so.

Now, for the proposed form for the legacy ABI functions, I'll show a
few examples:

time32_t __time32(time32_t *p)
{
	time_t t = time(0);
	if (t < INT32_MIN || t > INT32_MAX) {
		errno = EOVERFLOW;
		return -1;
	}
	if (p) *p = t;
	return t;
}

struct tm *__gmtime32_r(time32_t *t, struct tm *tm)
{
	return gmtime_r(&(time_t){*t}, tm);
}

double __difftime32(time32_t t1, time32_t t2)
{
	return difftime(t1, t2);
}

The naming is done such that, at the source level, the standard names
are all the "real" functions that support 64-bit time_t. Public
headers (also included internally, of course) would remap these names
to the time64 symbol names (time->__time64, etc.) while private
headers would remap the time32 names above to the ABI-compat symbol
names (__time32->time). Note in particular that the names with 32 in
them are purely source-level, not present in any symbols, so they
could be renamed freely if the scheme is deemed ugly or anything.

Not only is this approach fairly clean; if we ever do have cause to do
a hard ABI break (".2 ABI"), just disabling/removing the above compat
functions and the symbol name redirections gives it (with no
redirections or tricks).

Rich
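The symbol remapping described in this message can be illustrated with a minimal sketch. It assumes an ELF target and the GCC/Clang `__asm__` label extension; the names `demo_time`, `demo_time64`, and `demo_time32` are hypothetical stand-ins, not musl's actual identifiers or macros, and the fixed return value exists only to make the overflow path visible.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical "real" implementation, exported under a time64-style
 * symbol name. It returns a fixed value larger than INT32_MAX. */
int64_t demo_time64(void)
{
	return INT64_C(4102444800);
}

/* What a public header might do: the source-level name demo_time is
 * bound to the symbol demo_time64 via an asm label (GCC/Clang
 * extension; assumes no symbol prefix, as on ELF). */
int64_t demo_time(void) __asm__("demo_time64");

/* Legacy-ABI compat wrapper in the style shown in the message:
 * narrow the 64-bit result and report overflow instead of wrapping. */
int32_t demo_time32(int32_t *p)
{
	int64_t t = demo_time();
	if (t < INT32_MIN || t > INT32_MAX) {
		errno = EOVERFLOW;
		return -1;
	}
	if (p) *p = (int32_t)t;
	return (int32_t)t;
}
```

Callers written against the source-level name `demo_time` end up linked to `demo_time64`, while the wrapper shows how an old 32-bit entry point could sit on top of it.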
* Re: time_t progress/findings
From: Rich Felker @ 2019-07-18 16:37 UTC
To: musl

Second bit of progress here: stat.

First change can be done before any actual time64 work is done or even
decided upon: changing all the stat-family functions to use fstatat
(possibly with AT_EMPTY_PATH) as their backend, and changing fstatat
to do proper fallbacks if SYS_fstatat is missing. Now there's a single
point of potential stat conversion rather than 4 functions.

Next, add an internal, arch-provided kstat type and make fstatat
translate from this to the public stat type. This eliminates the need
for all the mips*/syscall_arch.h hacks.

Third, add use of SYS_statx when it's available, and translate from it
to the public stat type. Only fall back to SYS_fstatat if SYS_statx is
missing.

At this point we have the kernel giving us 64-bit timespecs for stat,
even if we can't use them. The last step is just changing over the
public types for 32-bit archs (we get to define struct stat entirely,
since it's not filled in by the kernel at all anymore at this point).

The best part of all this is that none of the steps until the last
depend on choices of 64-bit time_t action to take, and all of them are
beneficial changes even without 64-bit time_t.

Final note: some attention may be needed to how O_PATH file
descriptors are handled in the fallback code paths, since this is
presently a delicate issue for fstat() due to old kernel bugs. I think
modern fstatat+AT_EMPTY_PATH and statx+AT_EMPTY_PATH get it right
though.

Rich
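The first step in this message, funnelling the stat family through a single fstatat backend, might look roughly like the sketch below. It assumes Linux and uses the public `fstatat` function rather than raw syscalls; the `demo_*` wrappers are hypothetical and omit the fallback handling for a missing SYS_fstatat that the message calls for.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/stat.h>

/* Linux value, supplied only if the headers did not expose it. */
#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000
#endif

/* Single backend: any struct translation would live only here. */
int demo_fstatat(int fd, const char *path, struct stat *st, int flags)
{
	return fstatat(fd, path, st, flags);
}

/* stat: resolve relative to the current directory, follow symlinks. */
int demo_stat(const char *path, struct stat *st)
{
	return demo_fstatat(AT_FDCWD, path, st, 0);
}

/* lstat: same, but report on the symlink itself. */
int demo_lstat(const char *path, struct stat *st)
{
	return demo_fstatat(AT_FDCWD, path, st, AT_SYMLINK_NOFOLLOW);
}

/* fstat: empty path plus AT_EMPTY_PATH means "the descriptor itself". */
int demo_fstat(int fd, struct stat *st)
{
	return demo_fstatat(fd, "", st, AT_EMPTY_PATH);
}
```

With this shape, only `demo_fstatat` would need to change when a different kernel interface or a different public struct layout is adopted.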
* Re: time_t progress/findings
From: Rich Felker @ 2019-07-18 20:52 UTC
To: musl

On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> Second bit of progress here: stat. First change can be done before any
> actual time64 work is done or even decided upon: changing all the
> stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> their backend, and changing fstatat to do proper fallbacks if
> SYS_fstatat is missing. Now there's a single point of potential stat
> conversion rather than 4 functions.
>
> Next, add an internal, arch-provided kstat type and make fstatat
> translate from this to the public stat type. This eliminates the need
> for all the mips*/syscall_arch.h hacks.

This step admits a few questions about how to do it best, inspired in
part by a related question:

What should the new time64 stat structures look like?

There are at least three possible goals:

1. Make them as clean and uniform as possible, same for all archs.

2. Avoid increasing the size at all cost so as to maximize
   memory-safety of mismatched interfaces between libc consumers
   defined in terms of struct stat.

3. Make the start of the new struct match the old struct to minimize
   behavioral errors under mismatched interfaces between libc
   consumers defined in terms of struct stat.

Choice 2 is pretty much out because I think it's impossible on at
least one arch, and would impose really ugly constraints (making
timespec 24-byte, relying on non-64bit-alignment) on others. In many
ways choice 3 is actually more appealing, because when third-party
libraries *do* use stat in public interfaces, it's usually understood
that the same party both allocates and fills it in, and shares the
contents with the other party.

There are actually 2 subvariants of choice 3: either keep exposing the
32-bit time in the old locations so that mismatched consumers just
work, or fill it in with something like INT_MIN (year~=1902) so that
breakage is caught quickly.

Now, back to kstat and the above-quoted text. If we go with option 3,
we don't actually need a kstat struct. The existing stat syscalls just
write into the beginning of the buffer, and then we copy the result to
the time64 timespecs at the end that make up the new public interface.
This results in the smallest code, and the least amount of new
per-arch definitions. But it doesn't clean up the existing mips
stat-translation hell (currently buried in mips*/syscall_arch.h), and
it imposes assumptions about the relationship between kernel types and
public libc types.

On the other hand, if we make archs define a struct kstat and always
translate everything, the code is a bit larger, but we:

- don't impose any particular choice 1/2/3 above.
- make it easy to clean up the mips brokenness.
- facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
  stat has nothing to do with the legacy kernel stat structs.

So I'm leaning strongly towards just always doing the translation,
even though I'm also leaning towards choice 3 above that won't require
it. If nothing else, it allows me to do the prep work that will set
the stage for time64 transition now, without having to finalize the
decisions about how time64 will look.

Rich
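The "always translate" approach from this message can be sketched as a field-by-field copy from a kernel-side struct into the public one. Everything here is hypothetical: `struct demo_kstat`, `struct demo_stat`, and `demo_kstat_to_stat` are made-up names, and the field layouts do not correspond to any real arch.

```c
#include <stdint.h>

/* Hypothetical kernel-side layout with 32-bit times. */
struct demo_kstat {
	uint64_t st_dev;
	uint64_t st_ino;
	uint32_t st_mode;
	uint32_t st_nlink;
	int64_t  st_size;
	int32_t  st_mtime_sec;
	int32_t  st_mtime_nsec;
};

/* Hypothetical 64-bit timespec used by the public struct. */
struct demo_timespec64 {
	int64_t tv_sec;
	int64_t tv_nsec;
};

/* Hypothetical public layout, free to differ from the kernel's. */
struct demo_stat {
	uint64_t st_dev;
	uint64_t st_ino;
	uint32_t st_mode;
	uint32_t st_nlink;
	int64_t  st_size;
	struct demo_timespec64 st_mtim;
};

/* Translate every field explicitly, so the public struct makes no
 * assumption about the kernel struct's layout. The 32-bit seconds
 * value is sign-extended by the ordinary integer conversion. */
void demo_kstat_to_stat(const struct demo_kstat *k, struct demo_stat *st)
{
	*st = (struct demo_stat){
		.st_dev   = k->st_dev,
		.st_ino   = k->st_ino,
		.st_mode  = k->st_mode,
		.st_nlink = k->st_nlink,
		.st_size  = k->st_size,
		.st_mtim  = { .tv_sec = k->st_mtime_sec,
		              .tv_nsec = k->st_mtime_nsec },
	};
}
```

Because the copy is explicit, the public struct could follow any of the three goals listed in the message without touching the kernel-facing definition.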
* Re: time_t progress/findings
From: Rich Felker @ 2019-07-20 4:48 UTC
To: musl

On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> [...]
> 3. Make the start of the new struct match the old struct to minimize
>    behavioral errors under mismatched interfaces between libc
>    consumers defined in terms of struct stat.
> [...]
> So I'm leaning strongly towards just always doing the translation,
> even though I'm also leaning towards choice 3 above that won't require
> it. If nothing else, it allows me to do the prep work that will set
> the stage for time64 transition now, without having finalize the
> decisions about how time64 will look.

Another data point in favor of choice 3: libc actually has some
functions of its own that pass stat structures to callbacks: ftw and
nftw. With choice 3, these don't need any change; a legacy binary
calling them will get back stat structures it can read (with some
extra 64-bit timespecs afterwards that it's not aware of). With any
other choice, these functions would need painful replacements, and
just wrapping them is not easy because they lack a context argument to
pass through.

Since similar usage is likely common in third-party library code, I
think this is a really strong argument in favor of choice 3. FWIW the
existing glibc proposal looks like option 1, and they weren't aware of
this problem until I reported it just now.

Rich
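The ftw/nftw argument rests on the new struct sharing its leading layout with the old one. A minimal sketch of that property follows, using hypothetical structs (`demo_old_stat`, `demo_new_stat`) and a stand-in walker rather than the real nftw. The `memcpy` of the prefix is only there to keep the single-file demo free of aliasing questions; across a real ABI boundary the legacy callback would simply read through the pointer it is given.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical legacy layout, as an old binary was compiled to see it. */
struct demo_old_stat {
	uint64_t st_ino;
	int64_t  st_size;
	int32_t  st_mtime32;
	int32_t  st_mtime_nsec32;
};

/* Hypothetical new layout: identical prefix, 64-bit times appended. */
struct demo_new_stat {
	uint64_t st_ino;
	int64_t  st_size;
	int32_t  st_mtime32;
	int32_t  st_mtime_nsec32;
	int64_t  st_mtime64;
	int64_t  st_mtime_nsec64;
};

_Static_assert(offsetof(struct demo_new_stat, st_size) ==
               offsetof(struct demo_old_stat, st_size),
               "legacy fields must keep their offsets");
_Static_assert(offsetof(struct demo_new_stat, st_mtime64) >=
               sizeof(struct demo_old_stat),
               "new fields must follow the legacy prefix");

/* Legacy callback: only knows about the old struct. */
int64_t demo_legacy_callback(const struct demo_old_stat *st)
{
	return st->st_size + st->st_mtime32;
}

/* Stand-in for a tree walker: fills the new struct (including the
 * legacy 32-bit time) and hands its prefix to a legacy callback. */
int64_t demo_walk_one(int64_t (*cb)(const struct demo_old_stat *))
{
	struct demo_new_stat ns = {
		.st_ino = 1, .st_size = 1000,
		.st_mtime32 = 234, .st_mtime64 = 234,
	};
	struct demo_old_stat legacy_view;
	memcpy(&legacy_view, &ns, sizeof legacy_view);
	return cb(&legacy_view);
}
```

This models the subvariant in which the old 32-bit time fields stay populated; the INT_MIN variant would store a sentinel in `st_mtime32` instead.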
* Re: time_t progress/findings
From: Rich Felker @ 2019-07-20 21:46 UTC
To: musl

On Sat, Jul 20, 2019 at 12:48:40AM -0400, Rich Felker wrote:
> On Thu, Jul 18, 2019 at 04:52:37PM -0400, Rich Felker wrote:
> > [...]
> > 2. Avoid increasing the size at all cost so as to maximize
> >    memory-safety of mismatched interfaces between libc consumers
> >    defined in terms of struct stat.
> >
> > 3. Make the start of the new struct match the old struct to minimize
> >    behavioral errors under mismatched interfaces between libc
> >    consumers defined in terms of struct stat.
> > [...]
>
> Another data point in favor of choice 3: libc actually has some
> functions of its own that pass stat structures to callbacks: ftw and
> nftw.
> [...]
> Since similar usage is likely common in third-party library code, I
> think this is a really strong argument in favor of choice 3. FWIW the
> existing glibc proposal looks like option 1, and they weren't aware of
> this problem until I reported it just now.

Related find: for struct rusage, we can satisfy both (2) and (3)
simultaneously. This means no new structures or symbols are needed for
getrusage, wait3, and wait4. The existing 32-bit musl structs left 16
slots for extensibility at the end, so we can just put the new 64-bit
time fields there, and still fill in the 32-bit ones too for legacy
callers.

This is basically what I was already planning for utmp, except that
it's less interesting for utmp because the functions are stubs.

Rich
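The rusage idea, placing 64-bit times in the reserved tail while still filling the legacy 32-bit fields, can be modelled as below. This is a sketch under stated assumptions: the structs use fixed-width integers to imitate a 32-bit arch, and all names (`demo_rusage_old`, `demo_rusage_new`, `demo_fill_times`) are hypothetical, not musl's definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_timeval32 { int32_t tv_sec, tv_usec; };
struct demo_timeval64 { int64_t tv_sec, tv_usec; };

/* Hypothetical legacy layout: 32-bit times, counters, 16 spare slots. */
struct demo_rusage_old {
	struct demo_timeval32 ru_utime, ru_stime;
	int32_t ru_counters[14];
	int32_t reserved[16];
};

/* Hypothetical new layout: same prefix, with the 64-bit times carved
 * out of the first 8 reserved slots (2 * 16 bytes = 32 bytes). */
struct demo_rusage_new {
	struct demo_timeval32 ru_utime32, ru_stime32;
	int32_t ru_counters[14];
	struct demo_timeval64 ru_utime, ru_stime;
	int32_t reserved[8];
};

_Static_assert(sizeof(struct demo_rusage_new) ==
               sizeof(struct demo_rusage_old),
               "size must not grow (goal 2)");
_Static_assert(offsetof(struct demo_rusage_new, ru_counters) ==
               offsetof(struct demo_rusage_old, ru_counters),
               "legacy prefix must match (goal 3)");

/* Fill both views: full-width values for new callers, narrowed
 * copies in the old locations for legacy callers. */
void demo_fill_times(struct demo_rusage_new *ru,
                     int64_t user_sec, int64_t sys_sec)
{
	ru->ru_utime = (struct demo_timeval64){ user_sec, 0 };
	ru->ru_stime = (struct demo_timeval64){ sys_sec, 0 };
	ru->ru_utime32 = (struct demo_timeval32){ (int32_t)user_sec, 0 };
	ru->ru_stime32 = (struct demo_timeval32){ (int32_t)sys_sec, 0 };
}
```

The two static assertions encode goals (2) and (3) from the thread directly, so a layout change that violated either would fail to compile in this model.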