From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/6997 Path: news.gmane.org!not-for-mail From: "arnd@arndb.de" Newsgroups: gmane.linux.lib.musl.general,gmane.comp.lib.glibc.alpha,gmane.linux.kernel,gmane.linux.ports.arm.kernel Subject: Re: Re: [PATCHv3 00/24] ILP32 support in ARM64 Date: Wed, 11 Feb 2015 22:02:55 +0100 (CET) Message-ID: <1383502854.512344.1423688575473.JavaMail.open-xchange@oxbaltgw09.schlund.de> References: <20141002155217.GH32147@e104818-lin.cambridge.arm.com> <20150210181302.GA23886@brightrain.aerifal.cx> <20150211173919.GF9058@e104818-lin.cambridge.arm.com> <20150211190537.GK32724@port70.net> <359577916.509062.1423684206521.JavaMail.open-xchange@oxbaltgw09.schlund.de> <20150211201251.GK23507@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_512343_1289594152.1423688575382" X-Trace: ger.gmane.org 1423688656 12480 80.91.229.3 (11 Feb 2015 21:04:16 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Wed, 11 Feb 2015 21:04:16 +0000 (UTC) Cc: "libc-alpha@sourceware.org" , "pinskia@gmail.com" , Marcus Shawcroft , "linux-kernel@vger.kernel.org" , Szabolcs Nagy , Andrew Pinski , "linux-arm-kernel@lists.infradead.org" , musl@lists.openwall.com To: Rich Felker Original-X-From: musl-return-7010-gllmg-musl=m.gmane.org@lists.openwall.com Wed Feb 11 22:04:09 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1YLeS8-0003qv-2T for gllmg-musl@m.gmane.org; Wed, 11 Feb 2015 22:04:08 +0100 Original-Received: (qmail 11380 invoked by uid 550); 11 Feb 2015 21:04:06 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 10148 invoked from network); 11 Feb 2015 21:03:40 -0000 In-Reply-To: <20150211201251.GK23507@brightrain.aerifal.cx> X-Priority: 3 Importance: Medium X-Mailer: Open-Xchange Mailer v7.6.0-Rev36 X-Originating-Client: com.openexchange.ox.gui.dhtml X-Provags-ID: V03:K0:5r9Igww/C1HgIV5PPB8mOUKMetlCFJep4c9oHRB41KGdkH3Lzag Qh17uqlZkUjG03XEpcKUzq/cHGf25CGQg97QBpoYTvY+eyfQFrECueW7ln7MXTMq2icm+Ob F6/FKD9wc3oAfCfd71/MONmd9OJtc3anQ7CKaZFJzn6cGvY68uGNFmNB4zd5QMxlY69nsYz DuokHXB/UoStuLrCeZbPQ== X-UI-Out-Filterresults: notjunk:1; Xref: news.gmane.org gmane.linux.lib.musl.general:6997 gmane.comp.lib.glibc.alpha:49169 gmane.linux.kernel:1886762 gmane.linux.ports.arm.kernel:392823 Archived-At: ------=_Part_512343_1289594152.1423688575382 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Rich Felker hat am 11. Februar 2015 um 21:12 geschrieben: > On Wed, Feb 11, 2015 at 08:50:06PM +0100, arnd@arndb.de wrote: > > > > At least for AArch64 ILP32 we are still free to change the user/kernel > > > > ABI, so we could add wrappers for the affected syscalls to fix this up. > > > yes, afaik on x32 the 64bit kernel expects 64bit layout, > > > arm64 can fix this > > > > We have to fix it on all 32-bit architectures when we move to 64-bit time_t. > > > > I think ideally you'd want a user space definition like > > > > typedef long long time_t; > > struct timespec { > > time_t tv_sec; > > long long tv_nsec; > > }; > > > > which is the only way to avoid passing uninitialized tv_nsec into the kernel > > from arbitrary user space doing ioctl. This is of course against POSIX and > > C99. Changing POSIX to allow it is probably easier than the C standard, > > but we have a couple of years before we need to make this the default. > > I don't see why you want it to be long long. There is no harm in > passing uninitialized padding to the kernel; the kernel just needs to > do the right thing and ignore it (or avoid reading it to begin with). This would however mean having three different implementations in the kernel rather than just two: Every driver that can pass a timespec with this model needs to handle the native 64-bit case (64/64), the legacy 32-bit case (32/32) and the y2038-safe case (64/32). Most code can already handle the first two, and none today handles the third. If you want to make the handling explicitly incompatible with native 64-bit mode, you get a lot of untested code in obscure places that are never tested properly, while using the normal behavior in the kernel at least gives us the same bugs that we already have on native 64-bit systems. In some cases, there may also be a measurable performance penalty in interpreting a user space data structure manually over copying it (including the timespec values) in one chunk. An alternative would be to change the native 64-bit case to ignore the upper half of tv_nsec and always just copy the low bits. This should work fine almost all of the time, but I fear that there might be corner cases where existing 64-bit user space depends on passing large or negative tv_nsec values into the kernel. > The other direction, passing uninitialized data from the kernel to > userspace, would be dangerous. But it doesn't happen as long as the > userspace padding is positioned (in an endian-dependent manner) where > the high bits of the kernel type would lie. It could happen if you > used a separate conversion wrapper that ony wrote 32 bits, but if you > wanted to take that approach you'd just need the wrapper to also write > the padding field manually. Going from kernel to user space should not be an issue as long as we always just write two 64-bit words, and this will zero-fill the upper half. > > In the kernel headers, the current plan is to provide interfaces taking > > structures > > > > typedef long long __kernel_time64_t; > > struct __kernel_timespec64_t { > > __kernel_time64_t tv_sec; > > long long tv_nsec; > > }; > > > > at least for ioctls, to avoid the ambiguity with libc headers specifying > > something else. > > This seems hideous from an application standpoint. Application > programmers don't want to know, and shouldn't need to know, these > silly implementation details that make no sense except as historical > baggage. They should just be able to use "struct timespec" everywhere > and have it work. The kernel does not even know how timespec is defined by libc, and we have to at least be able to handle the common cases of timespec being 32/32 and 64/64 (or 64/32 plus explicit padding). For system calls, we can rely on libc calling the syscalls that match the definition (or convert the structure as necessary), while for ioctl the command number is chosen by the application and has to match the structure definition provided in the same header. In a lot of cases, the ioctl command number is defined (correctly) using the _IOR/_IOW macros that take the size of the structure into account, but then you also have cases where you get indirect pointers and the size of data structure passed by the ioctl command is independent of the size of timespec or time_t. This is not just limited to time_t, we have a lot of data types for which we define __kernel_*_t types for this purpose, to deal with ioctls that need a specific layout independent of what libc uses. Arnd ------=_Part_512343_1289594152.1423688575382 MIME-Version: 1.0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit
Rich Felker <dalias@libc.org> hat am 11. Februar 2015 um 21:12 geschrieben:
> On Wed, Feb 11, 2015 at 08:50:06PM +0100, arnd@arndb.de wrote:
> > > > At least for AArch64 ILP32 we are still free to change the user/kernel
> > > > ABI, so we could add wrappers for the affected syscalls to fix this up.
> > > yes, afaik on x32 the 64bit kernel expects 64bit layout,
> > > arm64 can fix this
> >
> > We have to fix it on all 32-bit architectures when we move to 64-bit time_t.
> >
> > I think ideally you'd want a user space definition like
> >
> > typedef long long time_t;
> > struct timespec {
> > time_t tv_sec;
> > long long tv_nsec;
> > };
> >
> > which is the only way to avoid passing uninitialized tv_nsec into the kernel
> > from arbitrary user space doing ioctl. This is of course against POSIX and
> > C99. Changing POSIX to allow it is probably easier than the C standard,
> > but we have a couple of years before we need to make this the default.
>
> I don't see why you want it to be long long. There is no harm in
> passing uninitialized padding to the kernel; the kernel just needs to
> do the right thing and ignore it (or avoid reading it to begin with).
 
This would however mean having three different implementations
in the kernel rather than just two: Every driver that can pass a timespec
with this model needs to handle the native 64-bit case (64/64), the legacy
32-bit case (32/32) and the y2038-safe case (64/32). Most code can
already handle the first two, and none today handles the third. If you
want to make the handling explicitly incompatible with native 64-bit
mode, you get a lot of untested code in obscure places that are never
tested properly, while using the normal behavior in the kernel at least
gives us the same bugs that we already have on native 64-bit systems.
 
In some cases, there may also be a measurable performance penalty
in interpreting a user space data structure manually over copying
it (including the timespec values) in one chunk.
 
An alternative would be to change the native 64-bit case to ignore the upper
half of tv_nsec and always just copy the low bits. This should work
fine almost all of the time, but I fear that there might be corner cases
where existing 64-bit user space depends on passing large or negative
tv_nsec values into the kernel.
 
> The other direction, passing uninitialized data from the kernel to
> userspace, would be dangerous. But it doesn't happen as long as the
> userspace padding is positioned (in an endian-dependent manner) where
> the high bits of the kernel type would lie. It could happen if you
> used a separate conversion wrapper that ony wrote 32 bits, but if you
> wanted to take that approach you'd just need the wrapper to also write
> the padding field manually.
 
Going from kernel to user space should not be an issue as long as we
always just write two 64-bit words, and this will zero-fill the upper half.

> > In the kernel headers, the current plan is to provide interfaces taking
> > structures
> >
> > typedef long long __kernel_time64_t;
> > struct __kernel_timespec64_t {
> > __kernel_time64_t tv_sec;
> > long long tv_nsec;
> > };
> >
> > at least for ioctls, to avoid the ambiguity with libc headers specifying
> > something else.
>
> This seems hideous from an application standpoint. Application
> programmers don't want to know, and shouldn't need to know, these
> silly implementation details that make no sense except as historical
> baggage. They should just be able to use "struct timespec" everywhere
> and have it work.

The kernel does not even know how timespec is defined by libc, and we have
to at least be able to handle the common cases of timespec being 32/32
and 64/64 (or 64/32 plus explicit padding). For system calls, we can rely
on libc calling the syscalls that match the definition (or convert the
structure as necessary), while for ioctl the command number is chosen
by the application and has to match the structure definition provided in
the same header.
 
In a lot of cases, the ioctl command number is defined (correctly) using the
_IOR/_IOW macros that take the size of the structure into account, but then
you also have cases where you get indirect pointers and the size of data structure
passed by the ioctl command is independent of the size of timespec or time_t.
 
This is not just limited to time_t, we have a lot of data types for which we define
__kernel_*_t types for this purpose, to deal with ioctls that need a specific
layout independent of what libc uses.
 
      Arnd
------=_Part_512343_1289594152.1423688575382--