mailing list of musl libc
 help / color / mirror / code / Atom feed
* Discussion of partly-invasive changes needed for x32 port
@ 2014-01-20  7:41 Rich Felker
  2014-01-20  8:43 ` Luca Barbato
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Rich Felker @ 2014-01-20  7:41 UTC (permalink / raw)
  To: musl

The x32 port is close to being at a point where it can be integrated,
but unfortunately (due in large part to laziness and lack of foresight
on the part of the kernel folks) there are several points of major
ugliness which we need to work out solutions to. There are pretty much
all "bikeshed" topics, so in a way I hate spending time on them, but
the decisions we make are going to have an impact on how well we can
keep the code from getting hideously ugly.

The basic issues:

1. System calls take 64-bit arguments, whereas the existing syscall
framework in musl (and on the kernel side for all other archs/abis)
uses long arguments.

The current internal syscall macros cast all arguments to long; this
allows accepting either pointers or integer types smaller than long
without the caller having to make any casts. However this does not
work for x32, where some syscalls actually need 64-bit arguments (e.g.
lseek). The following macro, which replaces the long cast on x32,
solves the problem:

#define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long long) (X)

On other archs that don't define their own __scc, __scc(X) would just
be defined as (long)(X), preserving the current behavior.

For normal syscalls that go via the inline functions, these inline
functions are defined in syscall_arch.h (arch-specific) where it's
easy to change them to use long long instead of long for their
argument types. However, for cancellable syscalls, they go through
__syscall_cp in src/thread/cancel_impl.c. The proposal so far is to
expose these internals from the internal syscall.h and syscall_arch.h
to cancel_impl.c. I'm not opposed to this, but if we go this way, the
type name should be self-documenting (e.g. klong, klong_t,
syscall_long_t, or something similar) rather than the current proposal
(__sca) that's non-obvious unless you go digging for it. Alernatively,
src/internal/syscall.h could pull in register_t from alltypes.h
(having internal headers use alltypes.h directly is not something we
do yet, but it could be done) and we could simply always use
register_t for these arguments.

2. The kernel interprets the tv_nsec field of struct timespec as a
kernel long (64-bit) rather than a userspace long (32-bit). glibc
handles this completely wrong (bug #16437), making it easy for them.
Doing it correctly is much harder; for all syscalls that take timespec
arguments, musl will have to copy the timespec to a temp structure
with the padding correctly sign-extended before passing it to the
kernel. The affected functions include at least: sigtimedwait, ppoll,
nanosleep, clock_settime, mq_timedsend, mq_timedreceive, pselect,
futimesns, utimesnsat, and everything using timed futex waits
(thankfully these all go through __timedwait in musl).

Adding the workaround code at every syscall point is ugly, and
possibly error-prone. An alternate possible solution is hacking up the
syscall macros in syscall_arch.h to detect "const struct timespec *"
arguments and auto-wrap them with compound literals that would fix-up
the padding. However this probably requires either C11 or GNU C
extensions. On the positive side, it would only affect x32; no changes
at all would be made to the source files or other archs'
syscall_arch.h logic.

An alternate version of this approach is to make a dedicated KTS()
(kernel timespec) macro that takes a struct timespec pointer and
returns a pointer to either the original argument (on sane archs) or a
compound literal copy of it with the sign-extension performed (on
x32). Then all callers would have to use this macro. But I don't like
this from a standpoint that it obfuscates the general code for the
sake of a single broken arch.

3. A number of other structures in the public headers differ from
their userspace C type layout on other archs because they're setup to
match the 64-bit kernel structs. This requires either changing some
member types to match the 64-bit kernel ones (this is usually a bad
idea, and sometimes even non-conforming like with tv_nsec) or adding
adjacent padding members for the unused "high" part of the 64-bit
value. In many cases, this requires either moving structures from the
shared headers to bits/*, or coming up with clever ways to avoid doing
so.

Most of the issues are in the sysv ipc headers:

- ipc.h: There's actually no need for bits/ipc.h now; the structure is
  identical on all archs. But it would be different on x32 unless the
  type of the padding members is changed to be something that's 64-bit
  on x32 but retains its size on all other archs (right now it's
  long). I'd like to remove this bits header.

- sem.h: The time_t members are currently misdeclared as long; this
  needs to be fixed regardless of anything else. Changing all the
  unused/padding to be time_t-sized would work everywhere, including
  x32, but it's semantically ugly, especially if we ever add targets
  where time_t is 64-bit but the 64-bit-padded fields aren't needed
  elsewhere.

- shm.h: The structs shminfo and shm_info are made up of syscall-long
  sized members, at least from a kernel standpoint. These should
  probably be userspace-long plus adjacent padding on the userspace
  side. Either way it looks like we'd need to move them to bits, which
  would be mildly ugly. But it might be possible to write the generic
  header in a way that computes the padding sizes based on sizes of
  other types, or just gets the padding from a macro in the bits
  header; this might even make it possible to unify the definition of
  shmid_ds into the shared header and nearly eliminate the bits
  headers.

-  msg.h: Similar to shm.h, extra padding members are needed.

All of these issues raise the question of whether it would be
beneficial to have available "syscall_long_t" or similar in the public
headers. This would make it very possible to avoid moving anything to
bits (and perhaps even helo move some things back) with some mild to
moderately ugly tricks for computed padding. But the cons are that
exposing the type invites abuse (application code trying to use it)
and the actual struct layouts perhaps become less explicit...

OK, this email is already long enough, so I'll wrap it up for now.
There still are some other x32 issues to discuss I think, but we can
follow up on those later.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20  7:41 Discussion of partly-invasive changes needed for x32 port Rich Felker
@ 2014-01-20  8:43 ` Luca Barbato
  2014-01-20  9:25   ` Jens Gustedt
  2014-01-21  1:47 ` John Spencer
  2014-02-02  6:50 ` Rich Felker
  2 siblings, 1 reply; 12+ messages in thread
From: Luca Barbato @ 2014-01-20  8:43 UTC (permalink / raw)
  To: musl

On 20/01/14 08:41, Rich Felker wrote:
> Adding the workaround code at every syscall point is ugly, and
> possibly error-prone. An alternate possible solution is hacking up the
> syscall macros in syscall_arch.h to detect "const struct timespec *"
> arguments and auto-wrap them with compound literals that would fix-up
> the padding. However this probably requires either C11 or GNU C
> extensions. On the positive side, it would only affect x32; no changes
> at all would be made to the source files or other archs'
> syscall_arch.h logic.

Can a compiler supporting x32 be safely expected to support C11 ? If yes
sounds the best route.

lu


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20  8:43 ` Luca Barbato
@ 2014-01-20  9:25   ` Jens Gustedt
  2014-01-20 10:12     ` John Spencer
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Gustedt @ 2014-01-20  9:25 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1903 bytes --]

Hi,

Am Montag, den 20.01.2014, 09:43 +0100 schrieb Luca Barbato:
> On 20/01/14 08:41, Rich Felker wrote:
> > Adding the workaround code at every syscall point is ugly, and
> > possibly error-prone. An alternate possible solution is hacking up the
> > syscall macros in syscall_arch.h to detect "const struct timespec *"
> > arguments and auto-wrap them with compound literals that would fix-up
> > the padding. However this probably requires either C11 or GNU C
> > extensions. On the positive side, it would only affect x32; no changes
> > at all would be made to the source files or other archs'
> > syscall_arch.h logic.
> 
> Can a compiler supporting x32 be safely expected to support C11 ? If yes
> sounds the best route.

I have difficulties in understanding why C11 pops up, here. Compound
literals are in C since C99. gcc and clang support them since
ages.

Generally they provide a nice "static cast" that doesn't convert
blindly as the "C cast" and usually provide with better diagnostics in
cases things go wrong. E.g the macro that Rich mentioned early to cast
to long could use "(long const){ (X) }" instead of "(long)(X)", such
that conversion from pointer expressions wouldn't pass unnoticed. In
my opion macros shouldn't hide casts. A cast is a marker for the
compiler "you must know what your doing" which might not be the case,
if it is hidden like that.

Padding tricks may perhaps be played a bit more nicely with C11, but
they can be done with a bit more pain in previous version of the
standard, I think.


Jens

-- 
:: INRIA Nancy Grand Est :: http://www.loria.fr/~gustedt/   ::
:: AlGorille ::::::::::::::: office Nancy : +33 383593090   ::
:: ICube :::::::::::::: office Strasbourg : +33 368854536   ::
:: ::::::::::::::::::::::::::: gsm France : +33 651400183   ::
:: :::::::::::::::::::: gsm international : +49 15737185122 ::




[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20  9:25   ` Jens Gustedt
@ 2014-01-20 10:12     ` John Spencer
  2014-01-20 10:55       ` Jens Gustedt
  0 siblings, 1 reply; 12+ messages in thread
From: John Spencer @ 2014-01-20 10:12 UTC (permalink / raw)
  To: musl

Jens Gustedt wrote:
> Hi,
> 
> Am Montag, den 20.01.2014, 09:43 +0100 schrieb Luca Barbato:
>> On 20/01/14 08:41, Rich Felker wrote:
>>> Adding the workaround code at every syscall point is ugly, and
>>> possibly error-prone. An alternate possible solution is hacking up the
>>> syscall macros in syscall_arch.h to detect "const struct timespec *"
>>> arguments and auto-wrap them with compound literals that would fix-up
>>> the padding. However this probably requires either C11 or GNU C
>>> extensions. On the positive side, it would only affect x32; no changes
>>> at all would be made to the source files or other archs'
>>> syscall_arch.h logic.
>> Can a compiler supporting x32 be safely expected to support C11 ? If yes
>> sounds the best route.
> 
> I have difficulties in understanding why C11 pops up, here. Compound
> literals are in C since C99. gcc and clang support them since
> ages.
> 

what rich had in mind was usage of _Generic in the x32 __scc 
(syscall-cast) macro:

instead of

#define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long 
long) (X)

something along the lines of
_Generic((X), struct timespec*: &(struct timespec){.sec = (X).sec, .nsec 
= (X).nsec}, default: sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : 
(long long) (X))

however the only available compiler implementing _Generic is Clang.
not even GCC 4.8.2 supports it. however i've seen some pseudo _Generic 
implementation on your blog [1], imo we could use the bits described 
there, namely
"
     __typeof__(EXP) gives the type of the expression EXP
     __builtin_types_compatible_p(T1, T2) is true if the two types are 
compatible
     __builtin_choose_expr(CNTRL, EXP1, EXP2) choose between the two 
expressions at compile time.
"

to implement the above macro, as x32 requires a very recent GCC (>= 4.7) 
anyway, i see no big problem in using those gcc helpers in lieu of _Generic.


[1] 
http://gustedt.wordpress.com/2012/01/02/emulating-c11-compiler-features-with-gcc-_generic/


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20 10:12     ` John Spencer
@ 2014-01-20 10:55       ` Jens Gustedt
  2014-01-20 23:57         ` Rich Felker
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Gustedt @ 2014-01-20 10:55 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 3212 bytes --]

Am Montag, den 20.01.2014, 11:12 +0100 schrieb John Spencer:
> what rich had in mind was usage of _Generic in the x32 __scc 
> (syscall-cast) macro:
> 
> instead of
> 
> #define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long 
> long) (X)
> 
> something along the lines of
> _Generic((X), struct timespec*: &(struct timespec){.sec = (X).sec, .nsec 
> = (X).nsec}, default: sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : 
> (long long) (X))

ah, I see how _Generic could pop in an make some wrappers for
architecture specific code nicer

In the concrete case I don't understand too well. The return type of
that generic expression would be of a fundamentally different type,
once a pointer to struct, otherwise an integer. Also the two struct
types should be different types, no? One the userland type and the
other the kernel type?

But then something like

&(struct kernel_timespec){.sec = (X)->sec, .nsec = (X)->nsec}

should suffice to capture the transition well.

This would certainly benefit of a wrapper function
kernel_timespec_copy, because it evaluates X twice :


kernel_timespec_copy(&(struct kernel_timespec){0}, (X))


If you want that for optimization reasons you could try to capture if
the two types agree and avoid the extra copy. Something along the
line of

_Generic((X),
        struct kernel_timespec*: (X),
         default: kernel_timespec_copy(&(struct kernel_timespec*){0}, (X))
        )

But I am not convinced that this can't be just achieved by doing some
#ifdef programming


> however the only available compiler implementing _Generic is Clang.
> not even GCC 4.8.2 supports it. however i've seen some pseudo _Generic 
> implementation on your blog [1], imo we could use the bits described 
> there, namely
> "
>      __typeof__(EXP) gives the type of the expression EXP
>      __builtin_types_compatible_p(T1, T2) is true if the two types are 
> compatible
>      __builtin_choose_expr(CNTRL, EXP1, EXP2) choose between the two 
> expressions at compile time.
> "
> 
> to implement the above macro, as x32 requires a very recent GCC (>= 4.7) 
> anyway, i see no big problem in using those gcc helpers in lieu of _Generic.

These features are supported since long in gcc and clang, I think, so
yes from that side there certainly would be not much of an obstacle.

But I wouldn't think that using such a feature just in such an
isolated spot (very restricted part of code and very specific
architecture) would be worth it.

Some cooked down version of _Generic could perhaps be beneficial for
the whole of musl, there are some places that implement type
genericity in some handcrafted way. But these things need macro magic
and should not be re-implemented at each point where they are
needed. So probably much too intrusive at this point in time.

Jens


-- 
:: INRIA Nancy Grand Est :: http://www.loria.fr/~gustedt/   ::
:: AlGorille ::::::::::::::: office Nancy : +33 383593090   ::
:: ICube :::::::::::::: office Strasbourg : +33 368854536   ::
:: ::::::::::::::::::::::::::: gsm France : +33 651400183   ::
:: :::::::::::::::::::: gsm international : +49 15737185122 ::




[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20 10:55       ` Jens Gustedt
@ 2014-01-20 23:57         ` Rich Felker
  2014-01-21  8:29           ` Jens Gustedt
  0 siblings, 1 reply; 12+ messages in thread
From: Rich Felker @ 2014-01-20 23:57 UTC (permalink / raw)
  To: musl

On Mon, Jan 20, 2014 at 11:55:39AM +0100, Jens Gustedt wrote:
> Am Montag, den 20.01.2014, 11:12 +0100 schrieb John Spencer:
> > what rich had in mind was usage of _Generic in the x32 __scc 
> > (syscall-cast) macro:
> > 
> > instead of
> > 
> > #define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long 
> > long) (X)
> > 
> > something along the lines of
> > _Generic((X), struct timespec*: &(struct timespec){.sec = (X).sec, .nsec 
> > = (X).nsec}, default: sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : 
> > (long long) (X))
> 
> ah, I see how _Generic could pop in an make some wrappers for
> architecture specific code nicer
> 
> In the concrete case I don't understand too well. The return type of
> that generic expression would be of a fundamentally different type,
> once a pointer to struct, otherwise an integer. Also the two struct
> types should be different types, no? One the userland type and the
> other the kernel type?

Both the pointer to struct and the integer case would be converted to
integers as in the the above "current" version of the __scc macro.

> But then something like
> 
> &(struct kernel_timespec){.sec = (X)->sec, .nsec = (X)->nsec}
> 
> should suffice to capture the transition well.
> 
> This would certainly benefit of a wrapper function
> kernel_timespec_copy, because it evaluates X twice :
> 
> 
> kernel_timespec_copy(&(struct kernel_timespec){0}, (X))

Evaluating X twice is not really a big deal, because it's never going
to be an expression that would have side effects. But I agree having
the (inline) wrapper function there is a nicer approach, and the
compiler should generate equally efficient code.

> If you want that for optimization reasons you could try to capture if
> the two types agree and avoid the extra copy. Something along the
> line of
> 
> _Generic((X),
>         struct kernel_timespec*: (X),
>          default: kernel_timespec_copy(&(struct kernel_timespec*){0}, (X))
>         )
> 
> But I am not convinced that this can't be just achieved by doing some
> #ifdef programming

Where? In every file that passes timespecs to syscalls? The whole
point is to follow my basic principle that workarounds for buggy
platforms (which the kernel's x32 support is) belong in places where
nobody has to see them unless they're working with the buggy platform
in question. This is a principle I first proposed and promoted while
working on MPlayer, and later applied as a critique of frameworks like
NSPR, APR, etc. The classic example of this principle goes like this:
For a cross-platform application or library, instead of having #ifdef
WIN32 all over the place and windows-specific code interleaved all
over the general, non-platform-specific code, one should provide
replacement headers or library code that fix the behavior of the
windows headers or runtime to conform to the standards they're
ignoring, and then write the code shared between platforms to portable
C/POSIX. BTW in some ways gnulib also follows this principle, except
that it assumes by default the target system is broken and needs fixes
which are inherently non-portabke. If you instead assume by default
that the target is non-broken, and only apply fixes to a known finite
set of broken platforms, you have much less chance of breaking things
and also keep the ugly hacks out of the way of developers not
specifically dealing with porting to the broken system(s).

That's the motivation here.

> > however the only available compiler implementing _Generic is Clang.
> > not even GCC 4.8.2 supports it. however i've seen some pseudo _Generic 
> > implementation on your blog [1], imo we could use the bits described 
> > there, namely
> > "
> >      __typeof__(EXP) gives the type of the expression EXP
> >      __builtin_types_compatible_p(T1, T2) is true if the two types are 
> > compatible
> >      __builtin_choose_expr(CNTRL, EXP1, EXP2) choose between the two 
> > expressions at compile time.
> > "
> > 
> > to implement the above macro, as x32 requires a very recent GCC (>= 4.7) 
> > anyway, i see no big problem in using those gcc helpers in lieu of _Generic.
> 
> These features are supported since long in gcc and clang, I think, so
> yes from that side there certainly would be not much of an obstacle.
> 
> But I wouldn't think that using such a feature just in such an
> isolated spot (very restricted part of code and very specific
> architecture) would be worth it.

Yes, I think this is reasonable.

> Some cooked down version of _Generic could perhaps be beneficial for
> the whole of musl, there are some places that implement type
> genericity in some handcrafted way. But these things need macro magic
> and should not be re-implemented at each point where they are
> needed. So probably much too intrusive at this point in time.

Right now the only magic type-generic macros in musl are in tgmath.h
and internal/syscall.h. So I think we're okay without any more general
_Generic framework for now.

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20  7:41 Discussion of partly-invasive changes needed for x32 port Rich Felker
  2014-01-20  8:43 ` Luca Barbato
@ 2014-01-21  1:47 ` John Spencer
  2014-01-21  2:18   ` Rich Felker
  2014-02-02  6:50 ` Rich Felker
  2 siblings, 1 reply; 12+ messages in thread
From: John Spencer @ 2014-01-21  1:47 UTC (permalink / raw)
  To: musl

Rich Felker wrote:
> 1. System calls take 64-bit arguments, whereas the existing syscall
> framework in musl (and on the kernel side for all other archs/abis)
> uses long arguments.
> 
> The current internal syscall macros cast all arguments to long; this
> allows accepting either pointers or integer types smaller than long
> without the caller having to make any casts. However this does not
> work for x32, where some syscalls actually need 64-bit arguments (e.g.
> lseek). The following macro, which replaces the long cast on x32,
> solves the problem:
> 
> #define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long long) (X)

when i understood your last mail correctly, it's ok to go forward with a
GNUC enhanced version of this macro that can deal with timespec ?
that would solve 1 + 2.

 > For normal syscalls that go via the inline functions, these inline
 > functions are defined in syscall_arch.h (arch-specific) where it's
 > easy to change them to use long long instead of long for their
 > argument types. However, for cancellable syscalls, they go through
 > __syscall_cp in src/thread/cancel_impl.c. The proposal so far is to
 > expose these internals from the internal syscall.h and syscall_arch.h
 > to cancel_impl.c. I'm not opposed to this, but if we go this way, the
 > type name should be self-documenting (e.g. klong, klong_t,
 > syscall_long_t, or something similar) rather than the current proposal
 > (__sca) that's non-obvious unless you go digging for it. Alernatively,
 > src/internal/syscall.h could pull in register_t from alltypes.h
 > (having internal headers use alltypes.h directly is not something we
 > do yet, but it could be done) and we could simply always use
 > register_t for these arguments.

afaik mips n32 uses long arguments for syscalls, so using register_t 
would not work there.
so it seems our choices are either _Klong in alltypes.h or in a special 
bits header (bits/k_long.h), so we dont have to include syscall_arch.h 
in other bits headers and expose other unrelated internals (point 3).

> 3. A number of other structures in the public headers differ from
> their userspace C type layout on other archs because they're setup to
> match the 64-bit kernel structs. This requires either changing some
> member types to match the 64-bit kernel ones (this is usually a bad
> idea, and sometimes even non-conforming like with tv_nsec) or adding
> adjacent padding members for the unused "high" part of the 64-bit
> value. In many cases, this requires either moving structures from the
> shared headers to bits/*, or coming up with clever ways to avoid doing
> so.

i've seen this in linux/sysinfo.h today:
         __kernel_ulong_t totalhigh;     /* Total high memory size */
         __kernel_ulong_t freehigh;      /* Available high memory size */
         __u32 mem_unit;                 /* Memory unit size in bytes */
         char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];   /* 
Padding: libc5 uses this.. */

so the padding evaluates to either char _f[0] where __kernel_ulong_t is 
64bit, or _f[8] for 32bit.

seems like a workable solution, which doesn't even require bitfield hacks.
but we need again the kernel_long to work with...
it seems there's no way around it.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-21  1:47 ` John Spencer
@ 2014-01-21  2:18   ` Rich Felker
  0 siblings, 0 replies; 12+ messages in thread
From: Rich Felker @ 2014-01-21  2:18 UTC (permalink / raw)
  To: musl

On Tue, Jan 21, 2014 at 02:47:24AM +0100, John Spencer wrote:
> Rich Felker wrote:
> >1. System calls take 64-bit arguments, whereas the existing syscall
> >framework in musl (and on the kernel side for all other archs/abis)
> >uses long arguments.
> >
> >The current internal syscall macros cast all arguments to long; this
> >allows accepting either pointers or integer types smaller than long
> >without the caller having to make any casts. However this does not
> >work for x32, where some syscalls actually need 64-bit arguments (e.g.
> >lseek). The following macro, which replaces the long cast on x32,
> >solves the problem:
> >
> >#define __scc(X) sizeof(1?(X):0ULL) < 8 ? (unsigned long) (X) : (long long) (X)
> 
> when i understood your last mail correctly, it's ok to go forward with a
> GNUC enhanced version of this macro that can deal with timespec ?
> that would solve 1 + 2.

I think so. Keep in mind the timespec one needs to apply only to const
timespec pointers (non-const ones are results where the kernel needs
to write the result into the caller's structure). I think we should
also use Jens' idea to avoid multiple-evaluation of the argument. And
the result should be cast to unsigned long, so that the final result
of the __scc macro always has integer type.

> > to cancel_impl.c. I'm not opposed to this, but if we go this way, the
> > type name should be self-documenting (e.g. klong, klong_t,
> > syscall_long_t, or something similar) rather than the current proposal
> > (__sca) that's non-obvious unless you go digging for it. Alernatively,
> > src/internal/syscall.h could pull in register_t from alltypes.h
> > (having internal headers use alltypes.h directly is not something we
> > do yet, but it could be done) and we could simply always use
> > register_t for these arguments.
> 
> afaik mips n32 uses long arguments for syscalls, so using register_t
> would not work there.

There's no reason __syscall_cp has to use the same calling convention
as the kernel, as long as the argument type is sufficiently wide to
convey all possible arguments. So register_t would work even when it's
64-bit and the kernel uses 32-bit. However I tend to think using the
right type is probably cleaner since getting register_t requires at
least one hack anyway.

> so it seems our choices are either _Klong in alltypes.h or in a
> special bits header (bits/k_long.h), so we dont have to include
> syscall_arch.h in other bits headers and expose other unrelated
> internals (point 3).

Well this possibility is unrelated to items 1/2, so I think you're
getting ahead of yourself. In any case, bits/k_long.h would not be the
right place (bits/* are _never_ included directly and don't have
multiple-include guards; they're included only from the associated
public header they go with).

> >3. A number of other structures in the public headers differ from
> >their userspace C type layout on other archs because they're setup to
> >match the 64-bit kernel structs. This requires either changing some
> >member types to match the 64-bit kernel ones (this is usually a bad
> >idea, and sometimes even non-conforming like with tv_nsec) or adding
> >adjacent padding members for the unused "high" part of the 64-bit
> >value. In many cases, this requires either moving structures from the
> >shared headers to bits/*, or coming up with clever ways to avoid doing
> >so.
> 
> i've seen this in linux/sysinfo.h today:
>         __kernel_ulong_t totalhigh;     /* Total high memory size */
>         __kernel_ulong_t freehigh;      /* Available high memory size */
>         __u32 mem_unit;                 /* Memory unit size in bytes */
>         char _f[20-2*sizeof(__kernel_ulong_t)-sizeof(__u32)];   /*
> Padding: libc5 uses this.. */
> 
> so the padding evaluates to either char _f[0] where __kernel_ulong_t
> is 64bit, or _f[8] for 32bit.
> 
> seems like a workable solution, which doesn't even require bitfield hacks.
> but we need again the kernel_long to work with...
> it seems there's no way around it.

This is a rather ugly legacy interface that's really the least of our
concerns. And the x32 modification of it breaks the documented API
(where the members are longs); for example, correct programs may be
passing them to printf with %lu... As usual, the kernel/glibc folks
did not think before they did this stuff...

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20 23:57         ` Rich Felker
@ 2014-01-21  8:29           ` Jens Gustedt
  2014-01-21 11:33             ` Szabolcs Nagy
  0 siblings, 1 reply; 12+ messages in thread
From: Jens Gustedt @ 2014-01-21  8:29 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 3110 bytes --]

Hello,

Am Montag, den 20.01.2014, 18:57 -0500 schrieb Rich Felker:
> Where? In every file that passes timespecs to syscalls? The whole
> point is to follow my basic principle that workarounds for buggy
> platforms (which the kernel's x32 support is) belong in places where
> nobody has to see them unless they're working with the buggy platform
> in question. This is a principle I first proposed and promoted while
> working on MPlayer, and later applied as a critique of frameworks like
> NSPR, APR, etc. The classic example of this principle goes like this:
> For a cross-platform application or library, instead of having #ifdef
> WIN32 all over the place and windows-specific code interleaved all
> over the general, non-platform-specific code, one should provide
> replacement headers or library code that fix the behavior of the
> windows headers or runtime to conform to the standards they're
> ignoring, and then write the code shared between platforms to portable
> C/POSIX. BTW in some ways gnulib also follows this principle, except
> that it assumes by default the target system is broken and needs fixes
> which are inherently non-portabke. If you instead assume by default
> that the target is non-broken, and only apply fixes to a known finite
> set of broken platforms, you have much less chance of breaking things
> and also keep the ugly hacks out of the way of developers not
> specifically dealing with porting to the broken system(s).
> 
> That's the motivation here.

sure, system dependent #ifdef are a plague, you are absolutely right

but feature dependent ones are not as bad, I think, there is a big
difference in maintainability between

#if WIN32 > 47
typedef unsigned long total_t;
#endif

and

#if ULONG_MAX == ULLONG_MAX
typedef unsigned long total_t;
#else
typedef unsigned long long total_t;
#endif

_Generic macros are not so much different from that. At the point of
declaration you'd have to understand and maintain the different
branches that such a macro could take. They have the advantage of
being type oriented, instead of being value oriented. But the
disadvantage of spreading the decisions to everywhere in the code,
potentially leading to longer compilation times. (Some of the macros I
have bring gcc to its knees :)

> Right now the only magic type-generic macros in musl are in tgmath.h
> and internal/syscall.h.

I would have thought that there already are more, but you certainly
know the code much better than I do.

> So I think we're okay without any more general
> _Generic framework for now.

Once you are starting to implement C11 library features, you are going
to need more, I think. Implementing C11 atomics without it would be a
real challenge.

Jens

-- 
:: INRIA Nancy Grand Est :: http://www.loria.fr/~gustedt/   ::
:: AlGorille ::::::::::::::: office Nancy : +33 383593090   ::
:: ICube :::::::::::::: office Strasbourg : +33 368854536   ::
:: ::::::::::::::::::::::::::: gsm France : +33 651400183   ::
:: :::::::::::::::::::: gsm international : +49 15737185122 ::



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-21  8:29           ` Jens Gustedt
@ 2014-01-21 11:33             ` Szabolcs Nagy
  2014-01-21 13:59               ` Jens Gustedt
  0 siblings, 1 reply; 12+ messages in thread
From: Szabolcs Nagy @ 2014-01-21 11:33 UTC (permalink / raw)
  To: musl

* Jens Gustedt <jens.gustedt@inria.fr> [2014-01-21 09:29:01 +0100]:
> Once you are starting to implement C11 library features, you are going
> to need more, I think. Implementing C11 atomics without it would be a
> real challenge.

what is the issue there?

i still don't get what the standard mean by "generic function" in the
c11 atomics specs

i thought the compiler will do the hard work (providing __builtin_foo
for every atomic foo) and libc just macro wraps those

it is not clear if the atomic functions are reserved for external
linkage (currently the text seems to require external linkage for them
but i don't know how could that work) or if you can take the address
of them

and the standard does not seem to permit the use of atomic functions
in signal handlers which would be useful


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-21 11:33             ` Szabolcs Nagy
@ 2014-01-21 13:59               ` Jens Gustedt
  0 siblings, 0 replies; 12+ messages in thread
From: Jens Gustedt @ 2014-01-21 13:59 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2717 bytes --]

Am Dienstag, den 21.01.2014, 12:33 +0100 schrieb Szabolcs Nagy:
> * Jens Gustedt <jens.gustedt@inria.fr> [2014-01-21 09:29:01 +0100]:
> > Once you are starting to implement C11 library features, you are going
> > to need more, I think. Implementing C11 atomics without it would be a
> > real challenge.
> 
> what is the issue there?

atomic operations are not only defined for the basic types, but any
type may be subject to an _Atomic qualifier or type-specifier.

> i still don't get what the standard mean by "generic function" in the
> c11 atomics specs

I read "generic function" as "function-like type generic macro".

> i thought the compiler will do the hard work (providing __builtin_foo
> for every atomic foo) and libc just macro wraps those

For some basic types, yes, but possibly not for all scalar types
(_Complex double or pointers e.g)

For the aggregate types you'd have to cook up something on the fly
which will usually not be lock-free but use an atomic_flag as a
spinlock.

For a full implementation you will need to distinguish types: a list of
types that have lock free builtins, pointers (most probably map them
to the functions for void*) and other types. Games with sizeof will
usually not suffice for that.

> it is not clear if the atomic functions are reserved for external
> linkage (currently the text seems to require external linkage for them
> but i don't know how could that work)

I think that for all functions mentioned in the standard, this is an
"as-if" rule.  The user of these functions should not make any
assumptions how a feature is implemented (external function, builtin,
macro), and in particular any of the standard functions *may* be
implemented as a macro.

> or if you can take the address of them

There I can't imagine how that would work for "type generic functions" :)

> and the standard does not seem to permit the use of atomic functions
> in signal handlers which would be useful

In addition to the functions, for the arithmetic types the usual
operators are defined for all atomic derived types. Compared to the
"generic functions" they have memory_order fixed to
memory_order_seq_cst, IRC.

The standard allows to use all of these operators in signal handlers,
provided that the atomic type is lock-free, for which in turn there
are feature test macros.


Jens

-- 
:: INRIA Nancy Grand Est :: http://www.loria.fr/~gustedt/   ::
:: AlGorille ::::::::::::::: office Nancy : +33 383593090   ::
:: ICube :::::::::::::: office Strasbourg : +33 368854536   ::
:: ::::::::::::::::::::::::::: gsm France : +33 651400183   ::
:: :::::::::::::::::::: gsm international : +49 15737185122 ::




[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Discussion of partly-invasive changes needed for x32 port
  2014-01-20  7:41 Discussion of partly-invasive changes needed for x32 port Rich Felker
  2014-01-20  8:43 ` Luca Barbato
  2014-01-21  1:47 ` John Spencer
@ 2014-02-02  6:50 ` Rich Felker
  2 siblings, 0 replies; 12+ messages in thread
From: Rich Felker @ 2014-02-02  6:50 UTC (permalink / raw)
  To: musl

Here's an update after further consideration and discussion:

On Mon, Jan 20, 2014 at 02:41:31AM -0500, Rich Felker wrote:
> 1. System calls take 64-bit arguments, whereas the existing syscall
> framework in musl (and on the kernel side for all other archs/abis)
> uses long arguments.

The only open question here is where the type used for syscall
arguments, especially in syscall_cp, comes from: whether it's defined
in syscall_arch.h with a default fallback in internal/syscall.h, or a
public type from the installed headers.

Despite a public type being convenient for use in other headers (see
the ipc structures, etc.) I'm leaning towards putting the syscall
argument type in the private internal syscall headers. This at least
avoids nasty cross-dependency of the resolution of two separate
issues, and makes it so we can choose to provide or not provide a
public type independent of how the internals are implemented.

> 2. The kernel interprets the tv_nsec field of struct timespec as a
> kernel long (64-bit) rather than a userspace long (32-bit). glibc
> handles this completely wrong (bug #16437), making it easy for them.
> Doing it correctly is much harder; for all syscalls that take timespec
> arguments, musl will have to copy the timespec to a temp structure
> with the padding correctly sign-extended before passing it to the
> kernel. The affected functions include at least: sigtimedwait, ppoll,
> nanosleep, clock_settime, mq_timedsend, mq_timedreceive, pselect,
> futimesns, utimesnsat, and everything using timed futex waits
> (thankfully these all go through __timedwait in musl).
> 
> Adding the workaround code at every syscall point is ugly, and
> possibly error-prone. An alternate possible solution is hacking up the
> syscall macros in syscall_arch.h to detect "const struct timespec *"
> arguments and auto-wrap them with compound literals that would fix-up
> the padding. However this probably requires either C11 or GNU C
> extensions. On the positive side, it would only affect x32; no changes
> at all would be made to the source files or other archs'
> syscall_arch.h logic.

I've since determined this to be inadequate, since at least some
syscalls (futimens and utimensat) take pointers to an array of two
timespecs or other such ugliness.

The new direction I'd like to consider it adding special-case code in
syscall_arch.h that catches all the syscalls which take timespec
arguments and converts them. This is much less fragile and it could
also be used for (hypothetical, at this point, I think) interfaces
where a single timespec is used as both input and output to a syscall.

> 3. A number of other structures in the public headers differ from
> their userspace C type layout on other archs because they're setup to
> match the 64-bit kernel structs. This requires either changing some
> member types to match the 64-bit kernel ones (this is usually a bad
> idea, and sometimes even non-conforming like with tv_nsec) or adding
> adjacent padding members for the unused "high" part of the 64-bit
> value. In many cases, this requires either moving structures from the
> shared headers to bits/*, or coming up with clever ways to avoid doing
> so.
> 
> Most of the issues are in the sysv ipc headers:
> 
> [...]
> 
> All of these issues raise the question of whether it would be
> beneficial to have available "syscall_long_t" or similar in the public
> headers. This would make it very possible to avoid moving anything to
> bits (and perhaps even helo move some things back) with some mild to
> moderately ugly tricks for computed padding. But the cons are that
> exposing the type invites abuse (application code trying to use it)
> and the actual struct layouts perhaps become less explicit...

My leaning at this point is to just move whatever needs to be in bits
to bits and be done with it. While these few changes are rather ugly,
they're isolated to sysvipc headers, which are independent of anything
else and already pretty ugly anyway. I think spreading more of them
into bits is a lesser offense than exposing an internal type and
putting weird (nonstandard, internal) types, macros, #ifdefs, or
similar into the public non-bits headers where it would become
non-explicit what the actual structure layout is on the arch that's in
use. (How frustrating is it in glibc's headers when you see an odd
__-prefixed type used somewhere and you have to go hunting for the
actual definition elsewhere?) Of course there were some places where
just using time_t for a reserved/padding fields would get the job done
without moving anything to bits; I'm moderately ok with the approach
for now, with the understanding that it might have to change again
later.

So, if the above directions are non-controversial, I think the
remaining path for integrating x32 is:

1. Make changes to internal/syscall.h to use __scc macro, possibly
defined by syscall_arch.h, instead of (long) casts, and to expose
syscall_arg_t.

2. Making cancel_*.c use syscall_arg_t instead of long.

3. Moving to bits any sysvipc structures/types that will have to
differ on x32 but that haven't differed on archs supported so far.

4. Designing and implementing the timespec rewriting code in
syscall_arch.h for x32 (see comments on quoted item 2 above); this is
basically a switch(n) in the inline __syscallN functions to copy the
appropriate timespec arg to a local kernel timespec structure, and it
will get completely optimized out as long as n is constant.

5. Merge all of the x32 files, which should at this point be possible
without touching any non-x32 files.

6. Activate x32 support (e.g. adding it to configure).

Rich


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-02-02  6:50 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-20  7:41 Discussion of partly-invasive changes needed for x32 port Rich Felker
2014-01-20  8:43 ` Luca Barbato
2014-01-20  9:25   ` Jens Gustedt
2014-01-20 10:12     ` John Spencer
2014-01-20 10:55       ` Jens Gustedt
2014-01-20 23:57         ` Rich Felker
2014-01-21  8:29           ` Jens Gustedt
2014-01-21 11:33             ` Szabolcs Nagy
2014-01-21 13:59               ` Jens Gustedt
2014-01-21  1:47 ` John Spencer
2014-01-21  2:18   ` Rich Felker
2014-02-02  6:50 ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).