Re: Re: Data structures defined by both linux and musl

mailing list of musl libc

From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: Re: Data structures defined by both linux and musl
Date: Fri, 18 Jan 2019 13:55:00 -0500
Message-ID: <20190118185500.GP23599@brightrain.aerifal.cx>
In-Reply-To: <CAK8P3a1-CO0WzodX-UDZSJuJTXA95_t1G38KiF4hMrxeC=F-kA@mail.gmail.com>

On Fri, Jan 18, 2019 at 06:06:01PM +0100, Arnd Bergmann wrote:
> > On Thu, Dec 20, 2018 at 11:33:59AM +0100, Szabolcs Nagy wrote:
> > > * Rich Felker <dalias@...c.org> [2018-12-19 19:30:44 -0500]:
> > > > On Tue, Dec 18, 2018 at 08:41:53PM +0100, Arnd Bergmann wrote:
> > > > ".1" ABIs, this translation would mostly be the identity
> > > > transformation, but on archs where we're already doing some hacks to
> > > > fix up kernel ABI bugs (sysvipc on big endian, mips stat structure,
> > > > x32 stuff, etc.) the hacks could be replaced by use of this
> > > > translation infrastructure.
> > >
> > > lesson of ilp32 was that libc cannot generally translate between
> > > a user and kernel abi (otherwise it could be done in userspace).
> > >
> > > the problematic cases are when user talks to the kernel directly
> > > using libc types in a way that the libc cannot do the translation.
> > >
> > > interfaces where the libc does not know the type, just an opaque
> > > pointer: ioctl, fcntl, getsockopt, setsockopt, raw syscall
> >
> > Ultimately all of these *can* be translated just by enumerating all
> > the broken interfaces and special-casing them. It's not pretty,
> > though. What would probably happen (Arnd, do you know?) would be
> > redefining the ioctl numbers etc. to "time64" versions of the
> > interfaces, and for interfaces which are actually "important" to have
> > work on old kernels, including translations to/from the corresponding
> > old ioctl. Depending on the scope, that might be all or nearly all of
> > them.
> 
> We've done it for most of them by now. In a lot of cases we
> got lucky because the ioctl command code changes with
> sizeof(time_t), so all we had to do in the kernel was to interpret
> those ioctl commands for 32-bit and 64-bit time_t.
> 
> In other cases, we have redefined the ioctl command codes
> in the header with some clever (hopefully not too clever) trick:
> 
> #if __BITS_PER_LONG == 64
> #define LPSETTIMEOUT LPSETTIMEOUT_OLD
> #else
> #define LPSETTIMEOUT (sizeof(time_t) > sizeof(__kernel_long_t) ? \
>     LPSETTIMEOUT_NEW : LPSETTIMEOUT_OLD)
> #endif
> 
> This way, we guarantee that we can still detect the data type
> expected by an application calling LPSETTIMEOUT.
> The same approach is used for setsockopt and some other
> interfaces.
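
For readers following along, here is a minimal sketch of why the command code can change with sizeof(time_t): the generic Linux ioctl encoding stores the argument size in the command number. The DEMO_ macro and both structs are hypothetical stand-ins, not real kernel definitions, and some architectures use a different bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical copy of the generic _IOW layout:
 * nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31. */
#define DEMO_IOC_WRITE 1u
#define DEMO_IOW(type, nr, argtype) \
    ((DEMO_IOC_WRITE << 30) | ((uint32_t)sizeof(argtype) << 16) | \
     ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Hypothetical argument structs: one with 32-bit time fields, one with 64-bit. */
struct demo_arg32 { int32_t sec; int32_t usec; };
struct demo_arg64 { int64_t sec; int64_t usec; };

static_assert(sizeof(struct demo_arg32) == 8, "two 32-bit fields");
static_assert(sizeof(struct demo_arg64) == 16, "two 64-bit fields");

/* Same type and nr, different argument size, hence different command codes. */
static uint32_t demo_cmd32(void) { return DEMO_IOW('x', 1, struct demo_arg32); }
static uint32_t demo_cmd64(void) { return DEMO_IOW('x', 1, struct demo_arg64); }

/* Recover the size field the kernel would see in the command. */
static unsigned demo_cmd_size(uint32_t cmd) { return (cmd >> 16) & 0x3fffu; }
```

Since demo_cmd32() and demo_cmd64() differ only in the size field, a kernel that knows both values can tell which layout the caller was built against.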

Unless I'm misunderstanding something, this still leaves new programs
using 64-bit time_t unable to make the ioctls on old kernels that lack
the updated ioctl command. There's probably some significant subset of
important commands that ioctl.c needs to be able to intercept and
emulate.
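
A rough sketch of the kind of interception I mean, under loud assumptions: the command numbers are made up, demo_kernel_ioctl is a stand-in for the real syscall that behaves like an old kernel, and I'm assuming the old kernel rejects the unknown command with ENOTTY (some drivers may return something else).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command numbers; real ones would come from kernel headers. */
enum { DEMO_SETTIMEOUT_OLD = 1, DEMO_SETTIMEOUT_NEW = 2 };

/* Last value the stand-in "kernel" accepted, kept for inspection. */
static int32_t demo_last_value;

/* Stand-in for the raw syscall: an old kernel that only knows the
 * 32-bit command and rejects everything else with -ENOTTY. */
static int demo_kernel_ioctl(int cmd, const void *arg)
{
    assert(arg != NULL);
    if (cmd != DEMO_SETTIMEOUT_OLD)
        return -ENOTTY;
    demo_last_value = *(const int32_t *)arg;
    return 0;
}

/* Libc-side wrapper: try the 64-bit command first, and on an old kernel
 * fall back to the 32-bit command with a range-checked conversion. */
static int demo_ioctl_settimeout(int64_t secs)
{
    int r = demo_kernel_ioctl(DEMO_SETTIMEOUT_NEW, &secs);
    if (r != -ENOTTY)
        return r;
    if (secs < INT32_MIN || secs > INT32_MAX)
        return -EOVERFLOW;
    int32_t old = (int32_t)secs;
    return demo_kernel_ioctl(DEMO_SETTIMEOUT_OLD, &old);
}
```

A value that doesn't fit the old layout has to fail rather than be silently truncated, which is why the fallback checks the range before converting.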

> In other cases (in particular when we never pass absolute
> CLOCK_REALTIME data), we changed the type inside
> of a structure from time_t to 'long' or 'unsigned long', in
> order to keep the ABI unchanged. The disadvantage here
> is that it requires user space to use updated kernel headers,
> which is a problem for applications that ship with a copy of
> the kernel header.

I think this is reasonable. It's not reasonable for kernel structures
to have standard userspace types like time_t in them (except
fixed-size ones like uint32_t, but kernel has __u32 for that anyway)
and shipping copies of such headers was likewise a bug that should be
corrected. It may be a moderate pain for distro people to fix this
until the affected upstreams do, though.

> I think for fcntl we were lucky that nothing passes a time_t.

Indeed.

> > > direct communication channel to the kernel that may expose the
> > > abi incompatibility: netlink, sysfs, procfs
> >
> > Netlink is the worst here since it's "hidden" behind normal read/write
> > calls where the data is abstract bytes. If there's anything that needs
> > to be fixed at the netlink layer it probably just requires redefining
> > part of the _API_ to use fixed-width types rather than time_t or such.
> 
> I don't remember seeing any such case with netlink. Generally
> speaking, netlink already has to use fixed-width types in order
> to support compat mode, but there may be a couple of exceptions
> where the kernel requires nasty hacks here.

OK, that sounds good.

> The same is true
> for read/write based chardev interfaces such as /dev/input/eventX,
> which we had to redefine to use a structure based on 'unsigned long'

Ugh. How does this work with a 32-bit userspace running on a 64-bit
kernel?! These should never have used long, only u32 or u64. Is it
fixable? Or is there some reasonable way for userspace to detect which
protocol the kernel is using?

> instead of 'time_t' and require the use of CLOCK_MONOTONIC to
> avoid the overflow.

Well, avoid it for devices that don't go more than 136 years without
reboot... :)
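
The 136 figure is just how long an unsigned 32-bit seconds counter lasts before wrapping. A quick sketch of the arithmetic, with demo_wrap_years as a hypothetical helper and a year taken as 365.25 days:

```c
#include <assert.h>
#include <stdint.h>

/* Whole years until a seconds counter with the given number of value
 * bits wraps, assuming 365.25-day years. */
static uint64_t demo_wrap_years(unsigned bits)
{
    assert(bits > 0 && bits < 64);
    const uint64_t secs_per_year = 31557600; /* 365.25 * 86400 */
    return ((uint64_t)1 << bits) / secs_per_year;
}
```

demo_wrap_years(32) gives 136 for an unsigned 32-bit counter, and demo_wrap_years(31) gives 68 for the positive range of a signed one.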

> > > time_t may not be affected by these, but it shows that translation
> > > is fragile in general, i wonder if we can ensure correct behaviour
> > > in all cases. there is also the problem of linux headers which may
> > > use and redefine libc types and user code may need to use those.
> >
> > Redefining libc types is already broken, and the kernel headers that
> > do it can't be used from userspace when libc headers are included.
> > This issue is independent of type sizes/layouts matching.
> >
> > I don't think any kernel headers _use_ libc types either. They
> > generally use their own stuff.
> 
> 'struct timespec' is a notable exception here, but probably not
> the only one. At the moment, both libc and kernel define this
> structure (and timeval, itimerval, itimerspec, ...), and in my
> work on the kernel interfaces I assumed that the libc version
> is the one that will prevail, while the kernel version should get
> removed.

Yes, I think any type defined by userspace standards/interface
definitions inherently belongs to userspace implementation, and kernel
headers should not touch it.

Rich
