mailing list of musl libc
* vdso clock_gettime and time64
@ 2019-07-31  5:13 Rich Felker
From: Rich Felker @ 2019-07-31  5:13 UTC
  To: musl

One looming thing that folks probably aren't going to like about
switching to 64-bit time_t is losing the vdso clock_gettime on old
kernels. Instead of a function call in userspace, you get *two*
syscalls, the first (time64) one failing, every time you call
clock_gettime. Of course the problem goes away immediately if you have
a time64-capable kernel providing the time64 vdso function.
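
A minimal sketch of the two-syscall fallback described above. Everything here is a stand-in chosen for illustration: the struct names, the function-pointer hooks, and the simulated "old kernel" functions are assumptions, not musl code. The hooks return 0 or a negative errno, as raw syscalls do.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's 32-bit and 64-bit timespec ABIs. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

/* Hypothetical raw-syscall hooks: return 0 or a negative errno. */
typedef int (*gettime64_fn)(int clk, struct ts64 *ts);
typedef int (*gettime32_fn)(int clk, struct ts32 *ts);

static int syscalls_made;

/* Simulated old kernel: the time64 syscall does not exist. */
static int old_kernel_gettime64(int clk, struct ts64 *ts)
{
	(void)clk; (void)ts;
	syscalls_made++;
	return -ENOSYS;
}

/* Simulated old kernel: the legacy syscall works. */
static int old_kernel_gettime32(int clk, struct ts32 *ts)
{
	(void)clk;
	syscalls_made++;
	ts->tv_sec = 1564550000; /* arbitrary timestamp in mid-2019 */
	ts->tv_nsec = 0;
	return 0;
}

/* Try time64 first; only on ENOSYS fall back to time32 and widen. */
static int fallback_clock_gettime(gettime64_fn sys64, gettime32_fn sys32,
                                  int clk, struct ts64 *out)
{
	int r = sys64(clk, out);
	if (r != -ENOSYS) return r;

	struct ts32 t32;
	r = sys32(clk, &t32);
	if (r) return r;
	out->tv_sec = t32.tv_sec;
	out->tv_nsec = t32.tv_nsec;
	return 0;
}
```

Because nothing is remembered between calls, every call on the simulated old kernel pays for both syscalls.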

Is this a problem, and if so, what can be done about it?

Obviously it's possible to grab the legacy time32 vdso symbol and wrap
it with a function to translate. Aside from being more code and
complexity, the problem with this is that it precludes the ability to
checkpoint/resume long-lived processes from an old kernel to a new one
with time64, which might become a real need in some environments where
people realize they've screwed up at the last minute as Y2038 is
approaching.

What might make sense is checking that the tv_sec obtained from the
legacy time32 vdso function is non-negative, and disabling it
permanently if the check fails, reverting to syscalls. This would be
safe for any process that makes at least one call to clock_gettime
after migration but before ~2106, when the wrapped 32-bit tv_sec
becomes non-negative again.
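
A rough sketch of that check, under stated assumptions: the legacy vdso function and the syscall path are passed in as hypothetical function pointers, the fake implementations exist only so the example runs anywhere, and the flag name is invented. A real version would probably apply the sign check only to CLOCK_REALTIME.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 32-bit and 64-bit timespec layouts. */
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };

typedef int (*vdso32_fn)(int clk, struct ts32 *ts); /* legacy vdso entry */
typedef int (*slow_fn)(int clk, struct ts64 *ts);   /* syscall path */

/* Once set, the legacy vdso is never consulted again. */
static atomic_int vdso32_disabled;

static int checked_clock_gettime(vdso32_fn vdso, slow_fn slow,
                                 int clk, struct ts64 *out)
{
	if (vdso && !atomic_load_explicit(&vdso32_disabled, memory_order_relaxed)) {
		struct ts32 t32;
		if (vdso(clk, &t32) == 0) {
			if (t32.tv_sec >= 0) {
				out->tv_sec = t32.tv_sec;
				out->tv_nsec = t32.tv_nsec;
				return 0;
			}
			/* Negative tv_sec: assume the 32-bit value wrapped. */
			atomic_store_explicit(&vdso32_disabled, 1, memory_order_relaxed);
		}
	}
	return slow(clk, out);
}

/* Fakes for demonstration only. */
static int32_t fake_vdso_sec = 1564550000;
static int slow_calls;

static int fake_vdso32(int clk, struct ts32 *ts)
{
	(void)clk;
	ts->tv_sec = fake_vdso_sec;
	ts->tv_nsec = 0;
	return 0;
}

static int fake_slow(int clk, struct ts64 *ts)
{
	(void)clk;
	slow_calls++;
	ts->tv_sec = INT64_C(2147483653); /* a time just past the 2038 wrap */
	ts->tv_nsec = 0;
	return 0;
}
```

The flag only ever goes from 0 to 1, so a relaxed store is enough here: a thread that misses the update simply makes one more vdso call and trips the same check.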

Alternatively we could figure the burden is on someone performing such
checkpoint/resume to figure out how to patch process images to disable
the no-longer-usable vdso, and that musl has no role in making it
work. (Note that this is something of a position advocating for tools
poking at internals, which I don't like...)

The cleanest course of action is of course just not using the 32-bit
vdso at all, and accepting that clock_gettime will be slower until you
get a Y2038-safe kernel.

Thoughts?

Rich



* Re: vdso clock_gettime and time64
From: Florian Weimer @ 2019-07-31  8:30 UTC
  To: Rich Felker; +Cc: musl

* Rich Felker:

> One looming thing that folks probably aren't going to like about
> switching to 64-bit time_t is losing the vdso clock_gettime on old
> kernels. Instead of a function call in userspace, you get *two*
> syscalls, the first (time64) one failing, every time you call
> clock_gettime. Of course the problem goes away immediately if you have
> a time64-capable kernel providing the time64 vdso function.
>
> Is this a problem, and if so, what can be done about it?

Some users notice fairly quickly if the vDSO fast path is gone and file
bug reports.  (This can happen for various reasons, e.g. buggy kernels
detecting CPU cycle counter drift when there is actually none.)  I don't
know to what extent this matters to legacy architectures.

Thanks,
Florian



* Re: vdso clock_gettime and time64
From: Rich Felker @ 2019-07-31 15:07 UTC
  To: musl

On Wed, Jul 31, 2019 at 10:30:26AM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > One looming thing that folks probably aren't going to like about
> > switching to 64-bit time_t is losing the vdso clock_gettime on old
> > kernels. Instead of a function call in userspace, you get *two*
> > syscalls, the first (time64) one failing, every time you call
> > clock_gettime. Of course the problem goes away immediately if you have
> > a time64-capable kernel providing the time64 vdso function.
> >
> > Is this a problem, and if so, what can be done about it?
> 
> Some users notice fairly quickly if the vDSO fast path is gone and file
> bug reports.  (This can happen for various reasons, e.g. buggy kernels
> detecting CPU cycle counter drift when there is actually none.)  I don't
> know to what extent this matters to legacy architectures.

These are good points. A lot of these archs actually don't even have
vdso clock_gettime (only mips, arm, and i386 seem to).

I wonder if it would make sense to support use of 32-bit vdso for now,
possibly with logic to drop it if it ever returns a negative tv_sec,
and consider removing it after the last kernel without time64 is
EOL'd, so that it's gone well before 2038.

Thinking about it more, I'm actually concerned about how vdso can
possibly work at all with checkpoint/resume functionality. The code in
the vdso has to match the running kernel (which will update the data
it reads), but the suspended task could be in the middle of vdso code,
and even if not it's already bound the function entry point addresses
and knowledge of which ones exist. We probably need the answer to this
to know if there's even a meaningful problem to solve here. (And if
this somehow isn't a question with a known answer already, someone's
going to have a really bad day...)

Rich



* Re: vdso clock_gettime and time64
From: Szabolcs Nagy @ 2019-07-31 16:47 UTC
  To: musl

* Rich Felker <dalias@libc.org> [2019-07-31 11:07:25 -0400]:
> Thinking about it more, I'm actually concerned about how vdso can
> possibly work at all with checkpoint/resume functionality. The code in
> the vdso has to match the running kernel (which will update the data
> it reads), but the suspended task could be in the middle of vdso code,
> and even if not it's already bound the function entry point addresses
> and knowledge of which ones exist. We probably need the answer to this
> to know if there's even a meaningful problem to solve here. (And if
> this somehow isn't a question with a known answer already, someone's
> going to have a really bad day...)

even if the vdso is the same, criu has issues because
the time is different at restore, so there is a time
namespace proposal to address this

https://marc.info/?l=linux-api&m=156443756221829&w=2
https://www.linuxplumbersconf.org/event/2/contributions/202/attachments/25/28/LPC2018__Time_Namespace_4.pdf

if the kernel version is different then i doubt criu
can be reliable anyway.



* Re: vdso clock_gettime and time64
From: Florian Weimer @ 2019-07-31 17:11 UTC
  To: Rich Felker; +Cc: musl

* Rich Felker:

> These are good points. A lot of these archs actually don't even have
> vdso clock_gettime (only mips, arm, and i386 seem to).
>
> I wonder if it would make sense to support use of 32-bit vdso for now,
> possibly with logic to drop it if it ever returns a negative tv_sec,
> and consider removing it after the last kernel without time64 is
> EOL'd, so that it's gone well before 2038.

In glibc, we perform vDSO lookup early.  I will push for a solution that
does a probing system call during startup if it cannot find the *_time64
vDSO entry, to determine if it should use the real *_time64 system call
or the 32-bit system call (or vDSO).  That should help to keep the
complexity at bay, at the cost of increased startup time, though that
cost will shrink as the time64 interfaces are completed.
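
A sketch of what such a startup decision could look like. This is an illustration only, not glibc code: the enum, the function names, and the probe hooks are all hypothetical, and the probes are simulated so the example runs on any system.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical result of the one-time startup decision. */
enum time_backend { USE_VDSO_TIME64, USE_SYSCALL_TIME64, USE_TIME32 };

/* Hypothetical probe hook: issues one time64 syscall and
 * returns 0 or a negative errno. */
typedef int (*probe_fn)(void);

static int probes_made;

/* Simulated kernels for demonstration. */
static int probe_missing(void) { probes_made++; return -ENOSYS; }
static int probe_present(void) { probes_made++; return 0; }

/* If the time64 vdso symbol was found, no probe is needed.
 * Otherwise a single probing syscall picks between the time64
 * syscall and the 32-bit path (syscall or vdso). */
static enum time_backend choose_backend(const void *vdso_time64_sym,
                                        probe_fn probe_time64_syscall)
{
	if (vdso_time64_sym) return USE_VDSO_TIME64;
	if (probe_time64_syscall() != -ENOSYS) return USE_SYSCALL_TIME64;
	return USE_TIME32;
}
```

The result would be computed once and then kept for the lifetime of the process, which is what makes the approach stateful.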

I do not think resuming a process on a kernel with a different system
call set is supportable.

Thanks,
Florian



* Re: vdso clock_gettime and time64
From: Rich Felker @ 2019-07-31 17:45 UTC
  To: musl

On Wed, Jul 31, 2019 at 07:11:40PM +0200, Florian Weimer wrote:
> * Rich Felker:
> 
> > These are good points. A lot of these archs actually don't even have
> > vdso clock_gettime (only mips, arm, and i386 seem to).
> >
> > I wonder if it would make sense to support use of 32-bit vdso for now,
> > possibly with logic to drop it if it ever returns a negative tv_sec,
> > and consider removing it after the last kernel without time64 is
> > EOL'd, so that it's gone well before 2038.
> 
> In glibc, we perform vDSO lookup early.  I will push for a solution that
> does a probing system call during startup if it cannot find the *_time64
> vDSO entry, to determine if it should use the real *_time64 system call
> or the 32-bit system call (or vDSO).  That should help to keep the
> complexity at bay, at the cost of increased startup time, though that
> cost will shrink as the time64 interfaces are completed.
> 
> I do not think resuming a process on a kernel with a different system
> call set is supportable.

Not using vdso, it's definitely supportable; musl's fallbacks for
unsupported syscalls are entirely stateless. Doing it statefully
without data race UB all over the place is painful.

For vdso clock_gettime now, we do the lookup on the first call and
store the result with a relaxed atomic. It wouldn't be a big deal to
do it at startup instead, conditional on clock_gettime being linked
(with a weak init symbol), if that helps.
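
The first-call pattern can be sketched roughly as follows, using C11 atomics. This is not musl's actual source: all names are invented for illustration, and the vdso lookup and syscall path are replaced with fakes so the example is self-contained.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct ts64 { int64_t tv_sec; int64_t tv_nsec; };
typedef int (*cgt_fn)(int clk, struct ts64 *ts);

static int lookups;

/* Fake vdso entry point, for demonstration only. */
static int fake_vdso_cgt(int clk, struct ts64 *ts)
{
	(void)clk;
	ts->tv_sec = 1564550000;
	ts->tv_nsec = 0;
	return 0;
}

/* Fake syscall path, used if the lookup finds nothing. */
static int fake_syscall_cgt(int clk, struct ts64 *ts)
{
	return fake_vdso_cgt(clk, ts);
}

/* Hypothetical vdso symbol lookup; may return NULL. */
static cgt_fn lookup_vdso(void)
{
	lookups++;
	return fake_vdso_cgt;
}

static int cgt_init(int clk, struct ts64 *ts);

/* Starts out pointing at the initializer, so the first call resolves it. */
static _Atomic(cgt_fn) cgt_impl = cgt_init;

static int cgt_init(int clk, struct ts64 *ts)
{
	cgt_fn f = lookup_vdso();
	if (!f) f = fake_syscall_cgt;
	/* Relaxed is enough: racing threads all store the same value. */
	atomic_store_explicit(&cgt_impl, f, memory_order_relaxed);
	return f(clk, ts);
}

static int my_clock_gettime(int clk, struct ts64 *ts)
{
	cgt_fn f = atomic_load_explicit(&cgt_impl, memory_order_relaxed);
	return f(clk, ts);
}
```

Two threads racing on the first call may both run the lookup, but they store the same pointer, so the result does not depend on which store wins.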

Note that a changed vdso is orthogonal to a different syscall set. You
can be resuming on a kernel with the same syscall set, but where the
vdso changed due to bugfixes or different hardware or whatever.

Rich



* Re: vdso clock_gettime and time64
From: Szabolcs Nagy @ 2019-08-01  9:16 UTC
  To: musl

* Szabolcs Nagy <nsz@port70.net> [2019-07-31 18:47:55 +0200]:
> * Rich Felker <dalias@libc.org> [2019-07-31 11:07:25 -0400]:
> > Thinking about it more, I'm actually concerned about how vdso can
> > possibly work at all with checkpoint/resume functionality. The code in
> > the vdso has to match the running kernel (which will update the data
> > it reads), but the suspended task could be in the middle of vdso code,
> > and even if not it's already bound the function entry point addresses
> > and knowledge of which ones exist. We probably need the answer to this
> > to know if there's even a meaningful problem to solve here. (And if
> > this somehow isn't a question with a known answer already, someone's
> > going to have a really bad day...)
> 
> even if the vdso is the same, criu has issues because
> the time is different at restore, so there is a time
> namespace proposal to address this
> 
> https://marc.info/?l=linux-api&m=156443756221829&w=2
> https://www.linuxplumbersconf.org/event/2/contributions/202/attachments/25/28/LPC2018__Time_Namespace_4.pdf
> 
> if the kernel version is different then i doubt criu
> can be reliable anyway.

hm vdso is mentioned in

https://criu.org/What_can_change_after_C/R

