From mboxrd@z Thu Jan 1 00:00:00 1970
Reply-To: musl@lists.openwall.com
MIME-Version: 1.0
References: <20220607163053.GD7074@brightrain.aerifal.cx>
In-Reply-To: <20220607163053.GD7074@brightrain.aerifal.cx>
From: Zev Levy Stevenson <zevlevys@gmail.com>
Date: Fri, 10 Jun 2022 11:52:50 +0300
To: Rich Felker <dalias@libc.org>
Cc: Arnd Bergmann, musl@lists.openwall.com
Content-Type: multipart/alternative; boundary="000000000000de394605e1141018"
Subject: Re: [musl] Question about musl's time() implementation in time.c

--000000000000de394605e1141018
Content-Type: text/plain; charset="UTF-8"

Thank you for the responses; those reasons make sense to me. We are using
a very customized toolchain, but the kernel itself is standard.

We looked into it a bit further and were able to reproduce the issue with
a clean musl-gcc toolchain for x86_64 (version 1.2.2) on a Linux kernel
that we took from a standard Ubuntu distribution.

Specifically, tests in the libc-test suite
(https://wiki.musl-libc.org/libc-test.html) that use the time() function
sometimes fail, e.g. src/functional/utime.c, which fails in roughly 3-4 of
every 1,000 runs. This can be reduced to this type of code failing:

t = time(0);
if (futimens(fd, ((struct timespec[2]){{.tv_nsec=UTIME_NOW},{.tv_nsec=UTIME_OMIT}})) != 0) return 1;
if (fstat(fd, &st) != 0) return 1;
if (st.st_atim.tv_sec < t) printf("time inconsistency\n");

When replacing the call to time(0) with a raw call to the Linux time()
syscall, the issue seems to disappear. On the other hand, using the
clock_gettime syscall results in the same issue.
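
For anyone who wants to try this outside libc-test, the reduction above can be wrapped into a small self-contained harness. This is only a sketch under assumptions: the function names, the scratch path under /tmp and the time_source_fn type are made up for illustration and are not part of libc-test, and the raw clock_gettime variant assumes a 64-bit target such as x86_64.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* A time source returns the current time in whole seconds. */
typedef time_t (*time_source_fn)(void);

/* libc time(); in musl this is backed by clock_gettime. */
time_t libc_time(void)
{
	return time(0);
}

/* Raw legacy time syscall; not every architecture defines SYS_time. */
time_t raw_time_syscall(void)
{
#ifdef SYS_time
	return (time_t)syscall(SYS_time, (void *)0);
#else
	return (time_t)-1;
#endif
}

/*
 * Raw clock_gettime syscall, bypassing libc and the vdso. Assumes a 64-bit
 * target such as x86_64, where struct timespec matches the kernel layout.
 */
time_t raw_clock_gettime_syscall(void)
{
	struct timespec ts;

	if (syscall(SYS_clock_gettime, CLOCK_REALTIME, &ts) != 0)
		return (time_t)-1;
	return ts.tv_sec;
}

/*
 * Repeats the reduced check and returns how many times the atime set by
 * futimens(UTIME_NOW) was older than the preceding reading from `now`,
 * or -1 if a call fails.
 */
int count_time_inconsistencies(int fd, int iterations, time_source_fn now)
{
	int mismatches = 0;

	for (int i = 0; i < iterations; i++) {
		const struct timespec times[2] = {
			{ .tv_nsec = UTIME_NOW },   /* atime: set to current time */
			{ .tv_nsec = UTIME_OMIT },  /* mtime: leave unchanged */
		};
		struct stat st;
		time_t t = now();

		if (futimens(fd, times) != 0) return -1;
		if (fstat(fd, &st) != 0) return -1;
		if (st.st_atim.tv_sec < t) mismatches++;
	}
	return mismatches;
}

/* Creates a scratch file, runs the check with one time source, cleans up. */
int run_reduced_test(int iterations, time_source_fn now)
{
	char path[] = "/tmp/utime-check-XXXXXX";
	int fd = mkstemp(path);
	int mismatches;

	if (fd < 0) return -1;
	mismatches = count_time_inconsistencies(fd, iterations, now);
	close(fd);
	unlink(path);
	return mismatches;
}
```

Calling run_reduced_test() from a small main() with each of the three time sources and a large iteration count, then printing the returned counts, should show whether the difference between the time sources is reproducible on a given machine.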

Perhaps this is an issue with the Linux implementation of these syscalls /
vdso functions, in which case further research may be required. Or maybe
consistency between different methods of measuring the system time doesn't
have to be guaranteed, in which case the tests should probably be modified
to allow for small inaccuracies such as the one described above.

On Tue, Jun 7, 2022, 19:30 Rich Felker <dalias@libc.org> wrote:

> On Tue, Jun 07, 2022 at 04:29:28PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 7, 2022 at 2:25 PM Zev Levy Stevenson <zevlevys@gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > While running the libc-test test suite on a customized clang+musl
> > > build, I had trouble with some of the tests because of issues with
> > > time accuracy.
> > > I can go into detail if needed, but the problem seemed to boil down
> > > to the time() function in musl (in src/time/time.c) using a
> > > clock_gettime syscall (without vdso) instead of using the Linux
> > > time syscall that we expected it to use. Some other libc
> > > implementations use this syscall, and indeed after switching the
> > > syscall used in time() the tests passed, seemingly because the
> > > accuracy of the clocks used matched up.
> > > My main question is why musl's implementation doesn't use the time
> > > syscall, I'd be happy to hear if there was a special reason for
> > > this.
> >
> > The time() syscall on 32-bit architectures returns a 32-bit integer,
> > which overflows in y2038, only
> > clock_gettime() has the required range.
>
> This is indeed a good reason it can't be changed. Historically, though
> it was just a matter of avoiding code duplication. Due to the desire
> to support vdso and, clock_gettime requires a good deal of logic to
> find and use the vdso function and perform fallbacks if it's not
> available, or if the newer syscalls are not available.
> If time() did not use clock_gettime as its backend, but instead used
> separate kernel interfaces, this logic would need to be duplicated in
> time() too. It would also impose weird incentives for new archs to
> provide a time() syscall or vdso function, or would impose a
> requirement that we *also* duplicate the vdso logic to consume the
> vdso clock_gettime in time() if there's no vdso time().
>
> If you've created an alternate kernel/syscall implementation where
> clock_gettime behaves badly and a legacy time syscall (or vdso
> function?) behaves well, that really doesn't seem like a good
> implementation choice. Especially if they produce mismatching output.
> I guess you could make a stretch argument that the implementation
> behaves as if someone is constantly changing the clock, but short of
> that, I think it's even nonconforming (there's a single realtime clock
> and they're both supposed to return times in terms of it). Are there
> reasons you're trying to do things that way?
>
> Rich

--000000000000de394605e1141018--