mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: lolzery wowzery <wowzeryest@gmail.com>
Cc: musl@lists.openwall.com, Duncan Bellamy <dunk@denkimushi.com>,
	info@bnoordhuis.nl, tony.ambardar@gmail.co
Subject: Re: [musl] [PATCH 1/2] V3 resubmitting old statx patch with changes
Date: Sun, 28 Apr 2024 12:13:56 -0400
Message-ID: <20240428161356.GB10433@brightrain.aerifal.cx>
In-Reply-To: <CADeg5Su1L+FWOPJOw98RnUb-DxhEqMw-s0ULiJcg+W9ZsE+0kw@mail.gmail.com>

On Sat, Apr 27, 2024 at 10:29:35PM -0400, lolzery wowzery wrote:
> Hi,
> 
> Update: you've given me a lot to think about in my suggestions and
> proposals to musl and, because you took time to respond to me and
> explain things clearly, I feel much more compelled to put good effort
> into making sure my changes are solid and indisputably beneficial (not
> as in more=better kind of way but less=more but keeping in mind your
> responses about musl's design.)
> 
> > statx is already in musl 1.2.5 (commit b817541f1cfd). If there are
> > other problems than the ones I fixed when merging it and the
> > stx_attributes field, please report them here rather than spending
> > time trying to make a comprehensive changeset for a lot of things that
> > might or might not actually be wrong.
> 
> I must have dyslexia or something because it turned out my efforts were a
> wild goose chase for xstat, not statx! I'm so embarrassed, ha ha.
> 
> > > symbol
> > > weakness problems,
> >
> > This sounds unlikely. It's more likely that you misunderstand how/why
> > they're used. But I'm happy to look at your findings.
> 
> There are some symbol weakness inconsistencies with glibc I found in musl

There is nothing about weakness that is a public interface. It does
nothing at all in dynamic linking, and in static linking, it does not
declare an intent that the application can override/redefine the
function, only that libc not conflict with the application's namespace
*if the libc-provided function is not being used at all* because the
application is using a different namespace profile, like plain C
instead of POSIX or POSIX instead of POSIX+extensions. If you review
this I think you'll find they're all correct.

The other place they're used is for controlling link dependencies in
static linking and avoiding pulling in code that's not used/needed
because other functionality was not linked.
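
As a rough illustration of the link-dependency use, here is a minimal
sketch assuming GCC or Clang on an ELF target. The names demo_noop,
demo_flush_all and demo_exit_path are hypothetical and are not musl
symbols. A weak alias to a no-op keeps the caller linkable on its own; a
strong definition of demo_flush_all in another object file, if that file
gets linked, replaces the alias.

```c
/* No-op used when the optional functionality is not linked in. */
static int demo_noop(void)
{
	return 0;
}

/* Weak alias (GCC/Clang extension on ELF): a strong definition of
   demo_flush_all in another object file takes precedence at link time. */
__attribute__((weak, alias("demo_noop"))) int demo_flush_all(void);

/* The caller never needs to know whether the real code was linked. */
static int demo_exit_path(void)
{
	return demo_flush_all();
}
```

With only this file in the link, demo_exit_path() ends up calling the
no-op, so the optional code is not pulled in.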

> but trying to track them all down by hand would be insane. I will get together
> a tool to difference musl's and glibc symbols and list all changes to you one
> of these days.
> 
> For reference, symbol weakness only affects what happens when linking two
> libraries with same symbol names, which is used to override libc methods
> for various necessary purposes.

> Glibc is the golden standard software is
> built to link to,

No, just no. This is not a premise that is acceptable here. musl is
not a glibc clone/drop-in replacement. musl and glibc are *different
systems*. musl implements certain standards (C, POSIX, IEEE754) and
selected nonstandard extensions based on certain criteria. We do not
do anything "because glibc does it and glibc is the 'golden
standard'".

> so, if anything, this will only help some software
> work in musl.

Facilitating hacks that involve UB and poking at implementation
internals is generally not a goal of musl.

> > Can you clarify what you mean? There are some places where correctly
> > atomic fallback is impossible and the fallback is best-effort only.
> > This is generally only for missing O_CLOEXEC type functionality. If
> > there are others, please report.
> 
> My original thinking was that my proposed solution of trying both the full
> syscall and the fallback and seeing which work to handle seccomp would
> introduce a race condition where the file is created between the two calls.

I don't see how that matters. If the fallback is correct, either is
correct for at least one moment in time and there is no distinction
except an ordering that's not synchronized and thereby arbitrary.


> > It's never undefined. That's not how this works.
> >
> > If you're making out-of-tree bare-metal ports and don't want the
> > overhead of having to add new syscalls like this, you can do something
> > like define the SYS_* macros for them such that syscall_arch.h can
> > statically catch that they're nonexistent (e.g. by giving them values
> > in some high range) and directly return -ENOSYS; then the code would
> > collapse down as you want with the ENOSYS path being always-taken.
> >
> > Note that there is no way to emulate the nonzero flags for renameat2
> > without race conditions. Since this is not standard/mandatory
> > functionality, the right thing to do is just return an error for
> > "unsupported flags", not try to emulate them.
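
The static-ENOSYS idea quoted above could look roughly like the sketch
below. Everything prefixed demo_ or DEMO_ is hypothetical, including the
syscall numbers and the stand-in for the real syscall entry; this is not
musl's syscall_arch.h.

```c
#include <errno.h>

/* Hypothetical convention: numbers at or above this value mark
   syscalls that the port does not provide. */
#define DEMO_SYS_MISSING_BASE 0x40000000L
#define DEMO_SYS_renameat2 (DEMO_SYS_MISSING_BASE + 1) /* not provided */
#define DEMO_SYS_renameat  38L /* provided; the number is arbitrary here */

/* Hypothetical stand-in for the real syscall entry point. */
static long demo_raw_syscall(long nr)
{
	(void)nr;
	return 0;
}

static inline long demo_syscall(long nr)
{
	/* With a constant nr the compiler can fold this test, so a
	   caller's ENOSYS fallback becomes the only remaining path. */
	if (nr >= DEMO_SYS_MISSING_BASE)
		return -ENOSYS;
	return demo_raw_syscall(nr);
}
```

Under these assumptions, demo_syscall(DEMO_SYS_renameat2) yields -ENOSYS
without ever reaching the syscall entry.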
> 
> Got it and thanks for explaining. Now that I've fixed my eyes and am
> looking at statx, I see five main things to be fixed:
> 1. Remove the `#ifndef SYS_fstatat` because it's nonsensical

No, riscv32 does not have (and no new 32-bit archs will have)
SYS_fstatat because they require SYS_statx for time64 support (they
don't have a 32-bit native kernel stat structure).

The conditional is not strictly necessary; it just optimizes out dead code.
On such archs, it's known that if SYS_statx fails, fstatat() will also
fail for the same reason, because it makes the same syscall.
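
A minimal sketch of that dead-code argument, with hypothetical names
throughout: DEMO_SYS_statx and its value, demo_kernel, and demo_statx are
all made up for illustration. DEMO_SYS_fstatat is deliberately left
undefined to model an arch that only has statx. This is not the actual
musl source.

```c
#include <errno.h>

/* Hypothetical syscall number; real values come from the arch headers.
   DEMO_SYS_fstatat is intentionally not defined. */
#define DEMO_SYS_statx 291L

/* Hypothetical stand-in for the kernel: pretend the call is unavailable. */
static long demo_kernel(long nr)
{
	(void)nr;
	return -ENOSYS;
}

static long demo_statx(void)
{
	long ret = demo_kernel(DEMO_SYS_statx);
#ifndef DEMO_SYS_fstatat
	/* No legacy syscall exists, so a fallback would just repeat the
	   same statx syscall and fail the same way. Return directly. */
	return ret;
#else
	if (ret != -ENOSYS)
		return ret;
	return demo_kernel(DEMO_SYS_fstatat);
#endif
}
```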

> 2. Add comments to explain things

This is possibly okay, if they're explaining reasons for doing things
and not just translating C into a natural-language description of what
it's doing.

> 3. Correctly validate the flags and EINVAL if unsupported by fallback

fstatat() does that already. There isn't really any way to do the
validation in userspace without blocking access to newly-added kernel
functionality.

> 4. Zero all extraneous fields like __pad1 for future proofing.

This is probably a good idea, but it should be done via
zero-initializing the whole structure before filling it, not referring
to those fields by name. The names are not a public or even
libc-internal-private interface, but placeholders, and shouldn't be
used.
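
Roughly what I mean, as a sketch with a hypothetical structure. struct
demo_statx is a stand-in and not the real struct statx layout, and the
mask value is invented.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a structure with reserved fields. */
struct demo_statx {
	uint32_t mask;
	uint32_t mode;
	uint64_t size;
	uint64_t reserved[4]; /* placeholder; the name is not an interface */
};

/* Clear the whole object first, then fill only the known fields.
   The reserved members are zeroed without being referred to by name. */
static void demo_fill(struct demo_statx *out, uint32_t mode, uint64_t size)
{
	memset(out, 0, sizeof *out);
	out->mask = 0x3; /* hypothetical "mode and size are valid" bits */
	out->mode = mode;
	out->size = size;
}
```

This keeps working unchanged if placeholder fields are later renamed or
given real meaning.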

> 5. The stx_rdev_major and stx_rdev_minor fields were not correctly filled in

Another thing I missed on the initial review. Thanks for catching it.

> Please do not make these 5 changes yourself yet as I might find more and
> I have some great comments I want to add to explain why things are.

If you'd like to submit the fixes, please do them as individual
changes with commit messages that explain what was wrong and what
specifically is being fixed, not a big combined "fix statx fallback"
patch.

> I also discovered and would like to do a very minor cleanup on
> src/stat/fstatat.c, which has a duplicate copy of the struct statx for
> no reason.

That's because patch 2 in this series as submitted was wrong and
never fixed, so I refrained from applying it, with the intent to
revisit after release. So that would be fine to do now.

> Additionally, I am working on a proper fallback and implementation for fstatx
> and lstatx, which are the open-fd and symlink equivalents of statx. It will

statx already supports those usages with proper flags
(AT_SYMLINK_NOFOLLOW, AT_EMPTY_PATH). Unless there's precedent for
functions by those names that wrap it with the necessary flags, I
don't see a motivation for adding them rather than just writing
application code to use the flags with statx().
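
For illustration, application-side helpers along these lines would cover
both cases. This is a sketch assuming Linux and a libc that declares
statx() under _GNU_SOURCE; the helper names stat_link_itself and
stat_open_fd are hypothetical, not existing functions.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>

/* lstat-like: report on the symlink itself rather than its target. */
static int stat_link_itself(const char *path, struct statx *stx)
{
	return statx(AT_FDCWD, path, AT_SYMLINK_NOFOLLOW,
	             STATX_BASIC_STATS, stx);
}

/* fstat-like: report on the file an open descriptor refers to. */
static int stat_open_fd(int fd, struct statx *stx)
{
	return statx(fd, "", AT_EMPTY_PATH, STATX_BASIC_STATS, stx);
}
```

Both return 0 on success and -1 with errno set on failure, same as
statx() itself.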

> > These functions return information about a file, in the buffer
> > pointed to by statbuf.  No permissions are required on the file
> > itself, but—in the case of stat(), fstatat(), and lstat()—execute
> > (search) permission is required on all of the directories in
> >  pathname that lead to the file.
> 
> I swear I've spent the last 5 hours digging into the nitty gritty
> depths of these
> little-documented methods to ensure musl will have proper fallbacks.
> 
> QUESTION: The glibc wrapper for statx explicitly sets the errno to ENOSYS
> upon successful execution of its fallback to fstat64 (yes you read that right.)
> I swore I misread something or this was a bug until quadruple checking
> that this is works-as-intended over the next 2 hours. Will check around
> glibc more tomorrow but what should musl's policy be about this (and
> perhaps other works-as-intended POSIX violations around glibc?) I'm
> personally strongly leaning towards do-what-glibc-does for compatibility

The value of errno on success is not meaningful; it's valid for it to
take on any nonzero value as a consequence of a function call that
succeeds.

Our policy in general if offering interfaces modelled off glibc
(normally only happens when these are Linux-specific interfaces) is
that only properties which an application can reasonably expect to
rely on need to be matched. Things like the value of errno after
success would not fall under that. Even though POSIX does not govern
nonstandard interfaces like this, the principle that errno is not
meaningful in this case still applies.
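
A small sketch of what that means for callers, using hypothetical
functions. demo_with_fallback simulates a call whose first attempt fails
and whose fallback succeeds; it is not how any real libc function is
written.

```c
#include <errno.h>

/* Hypothetical: the primary attempt fails and leaves errno set, then
   the fallback succeeds. The function as a whole reports success. */
static int demo_with_fallback(int *out)
{
	errno = ENOSYS; /* left over from the simulated failed attempt */
	*out = 7;       /* result produced by the fallback */
	return 0;
}

/* Correct caller: errno is consulted only when the return value
   indicates failure. */
static int demo_caller(int *out)
{
	if (demo_with_fallback(out) < 0)
		return -errno;
	return 0;
}
```

A caller that instead tested errno after the successful call would wrongly
see a failure here.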

Rich
