mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: Arnd Bergmann <arnd@kernel.org>
Cc: musl@lists.openwall.com
Subject: Re: [musl] Question about musl's time() implementation in time.c
Date: Tue, 14 Jun 2022 19:28:27 -0400
Message-ID: <20220614232826.GJ7074@brightrain.aerifal.cx>
In-Reply-To: <CAK8P3a0jk736rPueff--Uor=tHmicHZgoikrAsjp0DHxmkaiWg@mail.gmail.com>

On Tue, Jun 14, 2022 at 11:11:32PM +0200, Arnd Bergmann wrote:
> On Tue, Jun 14, 2022 at 10:49 PM Rich Felker <dalias@libc.org> wrote:
> > On Tue, Jun 14, 2022 at 10:37:25PM +0200, Arnd Bergmann wrote:
> > > On Tue, Jun 14, 2022 at 7:00 PM Rich Felker <dalias@libc.org> wrote:
> > > > On Tue, Jun 14, 2022 at 06:50:40PM +0200, Arnd Bergmann wrote:
> > > > > The coarse time can be up to one timer tick behind, so reading
> > > > > CLOCK_REALTIME first can give you the exact second with a small
> > > > > nanosecond value, while the utime will still set the previous
> > > > > value.
> > > > >
> > > > > Can you change the test case to check if the later time is less than
> > > > > clock_getres(CLOCK_REALTIME_COARSE, ...) behind?
> > > >
> > > > This seems like a bug that the kernel uses the wrong clock for setting
> > > > file timestamps. It can result in seeing events out-of-order (exactly
> > > > as described in this thread). This should really be fixed or at least
> > > > made switchable so users who care can fix it.
> > >
> > > I can't find any reference to what the correct clock is here,
> > > are you sure that this is specified at all? The decision to use the coarse
> > > time in the kernel is definitely intentional, as reading the hardware
> > > clocksource can be expensive (depending on the hardware), and
> > > changing the behavior would likely break applications that rely on
> > > it being the coarse clock.
> >
> > POSIX specifies operations that set the file timestamps in terms of
> > the system (CLOCK_REALTIME) clock, not a weird implementation-defined
> > alternate clock.
> >
> > Maybe you're right that getting the correct clock is costly on some
> > archs, but it's almost surely not on any arch that admits vdso
> > clock_gettime. And "race that causes applications to see wrong
> > ordering of filesystem operations with respect to other activity for
> > the sake of performance" does not seem like a good idea.
> 
> The thing is that a lot of file systems would still behave the same way
> because they round times down to a filesystem specific resolution,
> often one microsecond or one second, while the kernel time accounting
> is in nanoseconds. There have been discussions about an interface
> to find out what the actual resolution on a given mount point is (similar
> to clock_getres), but that never made it in. The guarantees that you
> get from file systems at the moment are:

It's normal that they may be rounded down to the filesystem timestamp
granularity. I thought what was going on here was worse.
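
To make the "rounded down to the filesystem granularity" case concrete,
here is a minimal sketch of that rounding. The helper name ts_round_down
and the gran_ns parameter are hypothetical, and it only handles
granularities of at most one second that divide a second evenly; it is
an illustration, not what any particular filesystem does internally.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: round ts down to a multiple of gran_ns
 * nanoseconds. Assumes 0 < gran_ns <= 1e9 and that gran_ns divides
 * one second evenly (e.g. 1000 for microseconds, 1000000000 for
 * full seconds). Coarser granularities are not handled here. */
static struct timespec ts_round_down(struct timespec ts, long gran_ns)
{
    assert(gran_ns > 0 && gran_ns <= 1000000000L);
    /* Drop the sub-granularity remainder; never rounds up. */
    ts.tv_nsec -= ts.tv_nsec % gran_ns;
    return ts;
}
```

Since the result is never later than the input, a timestamp rounded
this way cannot appear newer than the clock reading it came from.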

> - the timestamp is always rounded down, not up, so a newly
>   created file never gets a timestamp that is newer than either
>   CLOCK_REALTIME or CLOCK_REALTIME_COARSE as
>   reported by a subsequent clock_gettime()/gettimeofday()/time().
> 
> - the in-memory timestamp is the same that you read back
>   after umount/mount, and gets adjusted for both resolution
>   and range of the on-disk representation.
> 
> - any file system that supports timestamps (some always
>   report tv_sec=0) sets the timestamps to at most three
>   seconds before the current time as read by an earlier
>   time() syscall.
> 
> Making it use CLOCK_REALTIME instead of
> CLOCK_REALTIME_COARSE would improve the third
> guarantee so it could be within two seconds (or one second
> on file systems with full-second resolution like ext3), but would
> break the first rule by making it report timestamps that can
> be either before or after the time reported by the time() syscall.

OK, the time syscall doing the wrong thing (using a different clock
that's not correctly ordered with respect to CLOCK_REALTIME) seems to
be the worst problem here -- if I'm understanding it right.
The filesystem issue might be a non-issue if it's truly equivalent to
just having coarser fs timestamp granularity, which is allowed.
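
For anyone who wants to observe the ordering issue, a rough sketch of
the check suggested earlier in the thread (is the later coarse reading
less than clock_getres(CLOCK_REALTIME_COARSE) behind?) could look like
the following. The function names are hypothetical, and
CLOCK_REALTIME_COARSE is Linux-specific, so this assumes a Linux
system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for CLOCK_REALTIME_COARSE (Linux-specific) */
#endif
#include <assert.h>
#include <time.h>

/* a - b in nanoseconds. */
static long long ts_diff_ns(struct timespec a, struct timespec b)
{
    return (long long)(a.tv_sec - b.tv_sec) * 1000000000LL
         + (a.tv_nsec - b.tv_nsec);
}

/* 1 if the later reading is behind the earlier one by less than
 * res_ns nanoseconds (or not behind at all), else 0. */
static int lag_within(struct timespec earlier, struct timespec later,
                      long long res_ns)
{
    return ts_diff_ns(earlier, later) < res_ns;
}

/* Hypothetical probe: read CLOCK_REALTIME first, then the coarse
 * clock, and store how far the coarse reading is behind in *lag_ns
 * (negative means it is ahead). Returns 1 if the lag is below the
 * coarse clock's reported resolution, 0 if not, -1 on error. */
static int probe_coarse_lag(long long *lag_ns)
{
    struct timespec fine, coarse, res;

    assert(lag_ns != NULL);
    if (clock_gettime(CLOCK_REALTIME, &fine) ||
        clock_gettime(CLOCK_REALTIME_COARSE, &coarse) ||
        clock_getres(CLOCK_REALTIME_COARSE, &res))
        return -1;

    *lag_ns = ts_diff_ns(fine, coarse);
    return lag_within(fine, coarse,
                      (long long)res.tv_sec * 1000000000LL + res.tv_nsec);
}
```

A positive lag_ns means the second (coarse) reading is earlier than the
first (precise) one, which is the kind of apparent reordering discussed
above. A test could call probe_coarse_lag in a loop and record the
largest lag seen.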

Rich

