mailing list of musl libc
* time_t progress/findings
From: Rich Felker @ 2019-07-18 15:41 UTC
  To: musl

I've started on a sketch of the work needed for moving 32-bit archs to
64-bit time_t. First, one "good" thing: the sysvipc structs with times
in them are only used with ioctl-esque command numbers, so rather than
defining new functions, we can just define new command numbers. Of
course that means we have to pick numbers, which is never fun. This is
the same situation as sockopts and ioctls except that we define them
rather than the kernel doing so.
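
To make the command-number idea concrete, here is a minimal sketch of one
way it could look: a flag bit marks commands whose struct argument uses
the time64 layout, and the library strips it before talking to the kernel.
The flag value and every DEMO_/demo_ name are assumptions made up for
illustration, not numbers that have actually been picked.

```c
/* Hypothetical values: only the idea of a distinguishing bit matters. */
#define DEMO_IPC_STAT   2
#define DEMO_IPC_TIME64 0x100   /* assumed flag marking the time64 layout */

/* What newly compiled code would pass once headers remap the command. */
#define DEMO_IPC_STAT64 (DEMO_IPC_STAT | DEMO_IPC_TIME64)

/* Returns 1 if the caller's struct uses the time64 layout, and strips
 * the flag so that *cmd holds the legacy command number. */
static int demo_split_cmd(int *cmd)
{
	int is_time64 = (*cmd & DEMO_IPC_TIME64) != 0;
	*cmd &= ~DEMO_IPC_TIME64;
	return is_time64;
}
```

Old binaries keep passing the old number and get the old struct layout;
no new function symbols are involved either way.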

Now, for the proposed form for the legacy ABI functions, I'll show a
few examples:

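time32_t __time32(time32_t *p)
{
	time_t t = time(0);
	/* Fail rather than silently truncate a time that doesn't fit. */
	if (t < INT32_MIN || t > INT32_MAX) {
		errno = EOVERFLOW;
		return -1;
	}
	/* time() permits a null argument, so the compat version must too. */
	if (p) *p = t;
	return t;
}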

struct tm *__gmtime32_r(time32_t *t, struct tm *tm)
{
	return gmtime_r(&(time_t){*t}, tm);
}

double __difftime32(time32_t t1, time32_t t2)
{
	return difftime(t1, t2);
}

The naming is done such that, at the source level, the standard names
are all the "real" functions that support 64-bit time_t. Public
headers (also included internally, of course) would remap these names
to the time64 symbol names (time->__time64, etc.) while private
headers would remap the time32 names above to the ABI-compat symbol
names (__time32->time). Note in particular that the names with 32 in
them are purely source-level, not present in any symbols, so they
could be renamed freely if the scheme is deemed ugly or anything.
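
A minimal sketch of the two redirections, assuming GCC or Clang asm
labels on an ELF target. All demo_ names are hypothetical stand-ins
(demo_now plays the role of time, demo_now32 the role of __time32), and
this is only an illustration of the naming scheme, not the actual header
contents.

```c
#include <stdint.h>

/* "Public header": the standard source-level name binds to the
 * time64 symbol (compare time -> __time64). */
long long demo_now(void) __asm__("demo_now64");

/* "Private header": the source-level compat name binds to the legacy
 * ABI symbol (compare __time32 -> time). */
int32_t demo_now32(void) __asm__("demo_now");

/* The real function, written under the standard name; the emitted
 * symbol is demo_now64. */
long long demo_now(void)
{
	return 4102444800LL;   /* 2100-01-01 00:00:00 UTC, past 32-bit range */
}

/* The compat function; the emitted symbol is the legacy demo_now. */
int32_t demo_now32(void)
{
	long long t = demo_now();
	if (t < INT32_MIN || t > INT32_MAX) return -1;
	return (int32_t)t;
}
```

Old binaries that reference the symbol demo_now land in the compat
function, while anything compiled against the new header references
demo_now64.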

Not only is this approach fairly clean; if we ever do have cause to do
a hard ABI break (".2 ABI"), just disabling/removing the above compat
functions and the symbol name redirections gives us that ABI, with no
redirections or tricks left over.

Rich


* Re: time_t progress/findings
From: Rich Felker @ 2019-07-18 16:37 UTC
  To: musl

Second bit of progress here: stat. First change can be done before any
actual time64 work is done or even decided upon: changing all the
stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
their backend, and changing fstatat to do proper fallbacks if
SYS_fstatat is missing. Now there's a single point of potential stat
conversion rather than 4 functions.

Next, add an internal, arch-provided kstat type and make fstatat
translate from this to the public stat type. This eliminates the need
for all the mips*/syscall_arch.h hacks.
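
As a rough sketch of what the translation step amounts to: the kernel
fills an arch-specific struct, and one function copies it field by field
into the public type. Both struct layouts and all demo_ names below are
invented for illustration; real kstat definitions would differ per arch
and carry many more fields.

```c
#include <stdint.h>

/* Hypothetical kernel-side layout for one arch, with 32-bit times. */
struct demo_kstat {
	uint64_t st_dev, st_ino;
	uint32_t st_mode, st_nlink;
	int64_t  st_size;
	int32_t  st_mtime_sec, st_mtime_nsec;
};

struct demo_timespec { int64_t tv_sec; long tv_nsec; };

/* Hypothetical public layout, defined by libc alone. */
struct demo_stat {
	uint64_t st_dev, st_ino;
	uint32_t st_mode, st_nlink;
	int64_t  st_size;
	struct demo_timespec st_mtim;
};

/* Field-by-field copy: nothing here depends on the two layouts
 * agreeing in order, size, or padding. */
static void demo_kstat_to_stat(const struct demo_kstat *k, struct demo_stat *st)
{
	*st = (struct demo_stat){
		.st_dev = k->st_dev,
		.st_ino = k->st_ino,
		.st_mode = k->st_mode,
		.st_nlink = k->st_nlink,
		.st_size = k->st_size,
		.st_mtim = { .tv_sec = k->st_mtime_sec, .tv_nsec = k->st_mtime_nsec },
	};
}
```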

Third, add use of SYS_statx when it's available, and translate from it
to the public stat type. Only fall back to SYS_fstatat if SYS_statx is
missing.
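
The fallback order can be sketched as a chain of backends tried in turn,
moving on only when one reports ENOSYS. The backends below are stubs
standing in for the real syscalls, and every demo_ name is hypothetical;
this illustrates the control flow only.

```c
#include <errno.h>

struct demo_stat { long long size; long long mtime_sec; };

/* A backend returns 0 on success or a negated errno value, in the
 * style of raw Linux syscalls. */
typedef int (*demo_backend)(const char *path, struct demo_stat *st);

/* Try each backend in order; only ENOSYS ("syscall not present")
 * moves on to the next one. Any other result is final. */
static int demo_stat_chain(const demo_backend *backends, int n,
                           const char *path, struct demo_stat *st)
{
	for (int i = 0; i < n; i++) {
		int r = backends[i](path, st);
		if (r != -ENOSYS) return r;
	}
	return -ENOSYS;
}

/* Stub: pretends to be a kernel without the newest syscall. */
static int demo_statx_missing(const char *path, struct demo_stat *st)
{
	(void)path; (void)st;
	return -ENOSYS;
}

/* Stub: pretends to be the older syscall, which succeeds. */
static int demo_fstatat_ok(const char *path, struct demo_stat *st)
{
	(void)path;
	st->size = 123;
	st->mtime_sec = 1563464265;
	return 0;
}
```

With the chain { demo_statx_missing, demo_fstatat_ok }, the caller gets
the second backend's result without knowing which one answered.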

At this point we have the kernel giving us 64-bit timespecs for stat,
even if we can't use them. The last step is just changing over the
public types for 32-bit archs (we get to define struct stat entirely,
since it's not filled in by the kernel at all anymore at this point).

The best part of all this is that none of the steps until the last
depend on choices of 64-bit time_t action to take, and all of them are
beneficial changes even without 64-bit time_t.

Final note: some attention may be needed to how O_PATH file
descriptors are handled in the fallback code paths, since this is
presently a delicate issue for fstat() due to old kernel bugs. I think
modern fstatat+AT_EMPTY_PATH and statx+AT_EMPTY_PATH get it right
though.

Rich


* Re: time_t progress/findings
From: Rich Felker @ 2019-07-18 20:52 UTC
  To: musl

On Thu, Jul 18, 2019 at 12:37:45PM -0400, Rich Felker wrote:
> Second bit of progress here: stat. First change can be done before any
> actual time64 work is done or even decided upon: changing all the
> stat-family functions to use fstatat (possibly with AT_EMPTY_PATH) as
> their backend, and changing fstatat to do proper fallbacks if
> SYS_fstatat is missing. Now there's a single point of potential stat
> conversion rather than 4 functions.
> 
> Next, add an internal, arch-provided kstat type and make fstatat
> translate from this to the public stat type. This eliminates the need
> for all the mips*/syscall_arch.h hacks.

This step admits a few questions about how to do it best, inspired in
part by a related question:

What should the new time64 stat structures look like?

There are at least three possible goals:

1. Make them as clean and uniform as possible, same for all archs.

2. Avoid increasing the size at all cost so as to maximize
   memory-safety of mismatched interfaces between libc consumers
   defined in terms of struct stat.

3. Make the start of the new struct match the old struct to minimize
   behavioral errors under mismatched interfaces between libc
   consumers defined in terms of struct stat.

Choice 2 is pretty much out because I think it's impossible on at
least one arch, and would impose really ugly constraints (making
timespec 12-byte, relying on non-64bit-alignment) on others. In many
ways choice 3 is actually more appealing, because when third-party
libraries *do* use stat in public interfaces, it's usually understood
that the same party both allocates and fills it in, and shares the
contents with the other party.

There are actually 2 subvariants of choice 3: either keep exposing the
32-bit time in the old locations so that mismatched consumers just
work, or fill it in with something like INT_MIN (year~=1902) so that
breakage is caught quickly.
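
The two subvariants can be sketched with a toy layout: the legacy fields
keep their old offsets and the 64-bit timespec is appended. The structs,
the policy enum, and all demo_ names are assumptions for illustration.
In particular, poisoning the legacy field when the time does not fit in
32 bits under the "expose" policy is my own choice here, not something
decided above.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy legacy layout. */
struct demo_old_stat { uint32_t st_mode; int32_t st_size; int32_t st_mtime32; };

struct demo_timespec64 { int64_t tv_sec; int64_t tv_nsec; };

/* Toy choice-3 layout: old fields first, new timespec at the end. */
struct demo_new_stat {
	uint32_t st_mode; int32_t st_size; int32_t legacy_mtime32;
	struct demo_timespec64 st_mtim;
};

_Static_assert(offsetof(struct demo_new_stat, legacy_mtime32) ==
               offsetof(struct demo_old_stat, st_mtime32),
               "legacy field must stay at its old offset");

enum demo_legacy_policy { DEMO_EXPOSE, DEMO_POISON };

static void demo_set_mtime(struct demo_new_stat *st, int64_t sec,
                           enum demo_legacy_policy policy)
{
	st->st_mtim.tv_sec = sec;
	st->st_mtim.tv_nsec = 0;
	if (policy == DEMO_EXPOSE && sec >= INT32_MIN && sec <= INT32_MAX)
		st->legacy_mtime32 = (int32_t)sec;  /* mismatched readers just work */
	else
		st->legacy_mtime32 = INT32_MIN;     /* mismatched readers fail loudly */
}
```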

Now, back to kstat and the above-quoted text. If we go with option 3,
we don't actually need a kstat struct. The existing stat syscalls just
write into the beginning of the buffer, and then we copy the result to
the time64 timespecs at the end that make up the new public interface.
This results in the smallest code, and the least amount of new
per-arch definitions. But it doesn't clean up the existing mips
stat-translation hell (currently buried in mips*/syscall_arch.h), and
it imposes assumptions about the relationship between kernel types and
public libc types.

On the other hand, if we make archs define a struct kstat and always
translate everything, the code is a bit larger, but we:

- don't impose any particular choice 1/2/3 above.
- make it easy to clean up the mips brokenness.
- facilitate future musl archs/ABIs (e.g. a ".2 ABI") where userspace
  stat has nothing to do with the legacy kernel stat structs.

So I'm leaning strongly towards just always doing the translation,
even though I'm also leaning towards choice 3 above, which wouldn't
require it. If nothing else, it allows me to do the prep work that will
set the stage for the time64 transition now, without having to finalize
the decisions about how time64 will look.

Rich


* Re: time_t progress/findings
From: Rich Felker @ 2019-07-20  4:48 UTC
  To: musl

Another data point in favor of choice 3: libc actually has some
functions of its own that pass stat structures to callbacks: ftw and
nftw. With choice 3, these don't need any change; a legacy binary
calling them will get back stat structures it can read (with some
extra 64-bit timespecs afterwards that it's not aware of). With any
other choice, these functions would need painful replacements, and
just wrapping them is not easy because they lack a context argument to
pass through.
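
A toy model of why the callback case works under choice 3: the walker
builds the new, larger struct, and a callback that only knows the old
layout reads the prefix it understands. The structs and all demo_ names
are hypothetical. The callback takes a void pointer and copies out the
old-sized prefix so that the mismatch can be modeled in one translation
unit without aliasing problems.

```c
#include <stdint.h>
#include <string.h>

/* Layout the legacy binary was compiled against. */
struct demo_old_stat { int32_t st_size; int32_t st_mtime32; };

/* Choice-3 layout: identical prefix, 64-bit time appended. */
struct demo_new_stat { int32_t st_size; int32_t st_mtime32; int64_t st_mtime64; };

typedef int (*demo_walk_fn)(const char *path, const void *st);

static int32_t demo_seen_size;   /* records what the callback observed */

/* Legacy callback: interprets only the old-sized prefix. */
static int demo_legacy_cb(const char *path, const void *st)
{
	struct demo_old_stat old;
	(void)path;
	memcpy(&old, st, sizeof old);
	demo_seen_size = old.st_size;
	return 0;
}

/* Stand-in for the tree walker: hands the new struct to the callback.
 * There is no context argument, so a wrapper could not remember which
 * user callback to forward a converted struct to. */
static int demo_walk_one(const char *path, demo_walk_fn fn)
{
	struct demo_new_stat st = {
		.st_size = 4096, .st_mtime32 = 1563598120, .st_mtime64 = 1563598120,
	};
	return fn(path, &st);
}
```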

Since similar usage is likely common in third-party library code, I
think this is a really strong argument in favor of choice 3. FWIW the
existing glibc proposal looks like option 1, and they weren't aware of
this problem until I reported it just now.

Rich


* Re: time_t progress/findings
From: Rich Felker @ 2019-07-20 21:46 UTC
  To: musl

Related find: for struct rusage, we can satisfy both (2) and (3)
simultaneously. This means no new structures or symbols are needed for
getrusage, wait3, and wait4. The existing 32-bit musl structs left 16
slots for extensibility at the end, so we can just put the new 64-bit
time fields there, and still fill in the 32-bit ones too for legacy
callers.
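
A simplified sketch of the rusage trick, with a hypothetical layout
reduced to the time fields plus the 16 reserved slots. The real struct
has many more members between them, and all demo_ names are made up.
The point is only that the struct size stays the same while both sets
of fields get filled.

```c
#include <stdint.h>

struct demo_timeval32 { int32_t tv_sec; int32_t tv_usec; };
struct demo_timeval64 { int64_t tv_sec; int64_t tv_usec; };

/* Toy legacy layout: 32-bit times, then 16 reserved 32-bit slots. */
struct demo_rusage_old {
	struct demo_timeval32 ru_utime32, ru_stime32;
	int32_t reserved[16];
};

/* Toy new layout: the 64-bit times consume 8 of the reserved slots. */
struct demo_rusage_new {
	struct demo_timeval32 ru_utime32, ru_stime32;
	struct demo_timeval64 ru_utime, ru_stime;
	int32_t reserved[8];
};

_Static_assert(sizeof(struct demo_rusage_new) == sizeof(struct demo_rusage_old),
               "size must not change, satisfying goal 2");

/* Fill both the new fields and the legacy ones from microsecond totals
 * (assumed small enough here that the 32-bit seconds cannot overflow). */
static void demo_fill_rusage(struct demo_rusage_new *ru,
                             int64_t user_usec, int64_t sys_usec)
{
	ru->ru_utime = (struct demo_timeval64){ user_usec / 1000000, user_usec % 1000000 };
	ru->ru_stime = (struct demo_timeval64){ sys_usec / 1000000, sys_usec % 1000000 };
	ru->ru_utime32 = (struct demo_timeval32){ (int32_t)ru->ru_utime.tv_sec, (int32_t)ru->ru_utime.tv_usec };
	ru->ru_stime32 = (struct demo_timeval32){ (int32_t)ru->ru_stime.tv_sec, (int32_t)ru->ru_stime.tv_usec };
}
```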

This is basically what I was already planning for utmp, except that
it's less interesting for utmp because the functions are stubs.

Rich

