From: Catalin Marinas <catalin.marinas@arm.com>
To: Rich Felker <dalias@libc.org>
Cc: "libc-alpha@sourceware.org" <libc-alpha@sourceware.org>,
"arnd@arndb.de" <arnd@arndb.de>,
"pinskia@gmail.com" <pinskia@gmail.com>,
"musl@lists.openwall.com" <musl@lists.openwall.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andrew Pinski <apinski@cavium.com>,
Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCHv3 00/24] ILP32 support in ARM64
Date: Fri, 13 Feb 2015 17:33:46 +0000 [thread overview]
Message-ID: <20150213173345.GA26217@e104818-lin.cambridge.arm.com> (raw)
In-Reply-To: <20150213163013.GE23507@brightrain.aerifal.cx>
On Fri, Feb 13, 2015 at 11:30:13AM -0500, Rich Felker wrote:
> On Fri, Feb 13, 2015 at 01:33:56PM +0000, Catalin Marinas wrote:
> > On Thu, Feb 12, 2015 at 07:59:24PM +0100, Arnd Bergmann wrote:
> > > Catalin Marinas <catalin.marinas@arm.com> hat am 12. Februar 2015 um 19:17
> > > geschrieben:
> > > > The solution (for new ports) could be similar to the other such
> > > > solutions in the compat layer. A kernel internal structure which is
> > > > binary-compatible with the ILP32 user one (as exported by the kernel):
> > > >
> > > > struct ilp32_timespec_kernel_internal_only {
> > > > __kernel_time_t tv_sec; /* seconds */
> > > > int tv_nsec; /* nanoseconds */
> > > > };
> > > >
> > > > and a syscall wrapper which converts between ilp32_timespec and timespec
> > > > (take compat_sys_clock_settime as an example).
> > >
> > > We then have to to this on all architectures, and not call it ilp32_timespec,
> > > but call it something else.
> > >
> > > I would much prefer to only have two versions of each syscall that takes a
> > > timespec rather than three versions, or having a version that behaves
> > > differently based on the type of program calling it. On native 32-bit
> > > systems, we should have the native syscall taking the 16-byte structure
> > > (using long long __kernel_time64_t)
> >
> > Can this also be 12 bytes in general if tv_nsec stays as 32-bit? The
> > size of such structure would be 16 bytes on ARM but I guess this depends
> > on long long the alignment requirements on specific architectures.
>
> The only archs with modern relevance I'm aware of where 64-bit types
> are not aligned are i386 and, by a regretable but hard-to-fix mistake,
> or1k. I don't have much opinion on whether the 64-bit-time_t timespec
> should be 12 bytes or 16 bytes on such archs. From my perspective it's
> a new ABI anyway so I'd like to be able to fix the 64-bit alignment
> issue at the same time, in which case the question would go away, but
> I'm sure others (glibc) will prefer a more transitional approach with
> symbol versioning or feature test macros or something.
The good thing about 16-byte timespec64 with appropriate (endianness
aware) struct padding is that the kernel can write tv_nsec to user as a
64-bit value (long on a 64-bit kernel). It's only the reading from user
that the 32-bit needs to be sign-extended into the kernel structure.
> > > In the kernel, it comes down to a function like
> > >
> > > int get_user_timespec64(struct timespec64 *ts, struct __kernel_timespec64 __user
> > > *uts, bool task_32bit)
> > > {
> > > struct __kernel_timespec64 input;
> > >
> > > if (copy_from_user(&input, uts, sizeof(input))
> > > return -EFAULT;
> > >
> > > ts->tv_sec = input.tv_sec;
> > > if (task_32bit)
> > > ts->tv_nsec = (int)input.tv_nsec;
> > > else
> > > ts->tv_nsec = input.tv_nsec;
> > >
> > > return 0;
> > > }
> >
> > The only drawback is that native 64-bit and new 32-bit have the same
> > handling path, potentially slowing down the former (it may not be
> > noticeable).
>
> Offhand, I would not consider a single predictable branch on syscall
> entry or return to be noticable relative to general syscall overhead.
It's not just a check+branch but accessing some TIF flag which requires
reading the current_thread_info()->flags and testing it. It is probably
lost in the noise, unless you do such calls in loop where you may notice
a slight variation (it depends on the branch predictor as well; on some
architecture we may be able to make use of unlikely(task_32bit)).
> > > The data structure definition is a little bit fragile, as it depends on
> > > user space not using the __BIT_ENDIAN symbol in a conflicting way. So
> > > far we have managed to keep that outside of general purpose headers, but
> > > it should at least blow up in an obvious way if it does, rather than
> > > breaking silently.
> > >
> > > I still think it's more practical to keep the zeroing in user space though.
> > > In that case, we keep defining __kernel_timespec64 with a 'typedef long
> > > long __kernel_snseconds_t', and it's up to the libc to either use
> > > __kernel_timespec64 as its timespec, or to define a C11-compliant
> > > timespec itself and zero out the bits before passing the data to the kernel.
> >
> > The problem with doing this in user space is syscall(2). If we don't
> > allow it, then it's fine to do the padding in libc.
>
> It's already the case that callers have to tiptoe around syscall(2)
> usage on a per-arch basis for silly things like the convention for
> passing 64-bit arguments on 32-bit archs, different arg orders to work
> around 64-bit alignment and issues with too many args, and various
> legacy issues. So I think manual use of syscall(2) is a less-critical
> issue, though of course from a libc perspective I would very much like
> for the kernel to handle it right.
I think there is another problem with sign-extending tv_nsec in libc.
The prototype for functions like clock_settime(2) take a const struct
timespec *. There isn't anything to prevent such structure being in a
read-only section, even though it is unlikely. So libc would have to
duplicate the structure rather than just sign-extending tv_nsec in
place.
BTW, I'll be offline for a week (holiday) and I won't be able to follow
up on this thread.
--
Catalin
next prev parent reply other threads:[~2015-02-13 17:33 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20141002155217.GH32147@e104818-lin.cambridge.arm.com>
2015-02-10 18:13 ` Rich Felker
2015-02-11 17:39 ` Catalin Marinas
2015-02-11 19:05 ` Szabolcs Nagy
2015-02-11 19:22 ` [musl] " H.J. Lu
2015-02-11 19:50 ` arnd
2015-02-11 20:12 ` Rich Felker
2015-02-11 20:47 ` Jens Gustedt
2015-02-11 21:02 ` arnd
2015-02-11 21:09 ` arnd
2015-02-11 21:37 ` [musl] " Rich Felker
2015-02-16 17:20 ` Arnd Bergmann
2015-02-16 17:51 ` [musl] " Rich Felker
2015-02-16 19:38 ` Arnd Bergmann
2015-02-12 8:12 ` Szabolcs Nagy
2015-02-12 17:07 ` Catalin Marinas
2015-02-11 19:21 ` Rich Felker
2015-02-12 18:17 ` Catalin Marinas
2015-02-12 18:59 ` arnd
2015-02-13 13:33 ` Catalin Marinas
2015-02-13 16:30 ` Rich Felker
2015-02-13 17:33 ` Catalin Marinas [this message]
2015-02-13 18:37 ` Rich Felker
2015-02-16 14:40 ` Arnd Bergmann
2015-02-16 15:38 ` Rich Felker
2015-02-16 16:54 ` Arnd Bergmann
2015-02-11 18:33 ` H.J. Lu
2015-02-11 19:02 ` Rich Felker
2015-02-11 19:16 ` H.J. Lu
2015-02-11 19:25 ` Rich Felker
2015-02-11 19:34 ` H.J. Lu
2015-02-11 19:47 ` Rich Felker
2015-02-11 19:57 ` H.J. Lu
2015-02-11 20:15 ` Andy Lutomirski
2015-02-12 15:50 ` Catalin Marinas
2015-02-12 16:13 ` Rich Felker
2015-02-12 16:30 ` H.J. Lu
2015-02-12 17:00 ` Rich Felker
2015-02-11 21:41 ` Joseph Myers
2015-02-11 19:04 ` Josiah Worcester
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150213173345.GA26217@e104818-lin.cambridge.arm.com \
--to=catalin.marinas@arm.com \
--cc=Marcus.Shawcroft@arm.com \
--cc=apinski@cavium.com \
--cc=arnd@arndb.de \
--cc=dalias@libc.org \
--cc=libc-alpha@sourceware.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=musl@lists.openwall.com \
--cc=pinskia@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).