From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH] inline llsc atomics when compiling for sh4a
Date: Sun, 17 May 2015 22:34:02 -0400 [thread overview]
Message-ID: <20150518023402.GS17573@brightrain.aerifal.cx> (raw)
In-Reply-To: <20150517185516.GA32020@duality.lan>
On Sun, May 17, 2015 at 01:55:16PM -0500, Bobby Bingham wrote:
> If we're building for sh4a, the compiler is already free to use
> instructions only available on sh4a, so we can do the same and inline the
> llsc atomics. If we're building for an older processor, we still do the
> same runtime atomics selection as before.
Thanks! I think it's ok for commit as-is, but based on re-reading this
code I have some ideas for improving it that are orthogonal to this
change. See comments inline:
> ---
> arch/sh/atomic.h | 83 +++++++++++++++++++++++++++++++
> arch/sh/src/atomic.c | 135 +++++++++++++++++----------------------------------
> 2 files changed, 128 insertions(+), 90 deletions(-)
>
> diff --git a/arch/sh/atomic.h b/arch/sh/atomic.h
> index a1d22e4..f2e6dac 100644
> --- a/arch/sh/atomic.h
> +++ b/arch/sh/atomic.h
> @@ -22,6 +22,88 @@ static inline int a_ctz_64(uint64_t x)
> return a_ctz_l(y);
> }
>
> +#define LLSC_CLOBBERS "r0", "t", "memory"
> +#define LLSC_START(mem) "synco\n" \
> + "0: movli.l @" mem ", r0\n"
> +#define LLSC_END(mem) \
> + "1: movco.l r0, @" mem "\n" \
> + " bf 0b\n" \
> + " synco\n"
> +
> +static inline int __sh_cas_llsc(volatile int *p, int t, int s)
> +{
> + int old;
> + __asm__ __volatile__(
> + LLSC_START("%1")
> + " mov r0, %0\n"
> + " cmp/eq %0, %2\n"
> + " bf 1f\n"
> + " mov %3, r0\n"
> + LLSC_END("%1")
> + : "=&r"(old) : "r"(p), "r"(t), "r"(s) : LLSC_CLOBBERS);
> + return old;
> +}
The mov from r0 to %0 seems unnecessary here. Presumably it's because
you didn't have a constraint to force old to come out via r0. Could
you do the following?
static inline int __sh_cas_llsc(volatile int *p, int t, int s)
{
	register int old __asm__("r0");
	__asm__ __volatile__(
		LLSC_START("%1")
		"	cmp/eq r0, %2\n"
		"	bf 1f\n"
		"	mov %3, r0\n"
		LLSC_END("%1")
		: "=&r"(old) : "r"(p), "r"(t), "r"(s) : "t", "memory");
	return old;
}
Note that I had to drop LLSC_CLOBBERS because "r0" was in it and you
can't clobber an output register.
I've actually always wondered about the value of having the LLSC_*
macros. I usually prefer for the whole asm to be written out
explicitly and readable unless there's a compelling reason to wrap it
up in macros. Then it would look like:
static inline int __sh_cas_llsc(volatile int *p, int t, int s)
{
	register int old __asm__("r0");
	__asm__ __volatile__(
		"	synco\n"
		"0:	movli.l @%1, r0\n"
		"	cmp/eq r0, %2\n"
		"	bf 1f\n"
		"	mov %3, r0\n"
		"1:	movco.l r0, @%1\n"
		"	bf 0b\n"
		"	synco\n"
		: "=&r"(old) : "r"(p), "r"(t), "r"(s) : "t", "memory");
	return old;
}
and similar for other functions. Part of the motivation for not hiding
the outer logic in macros is that it might make it possible to
fold/simplify some special cases like I did above for CAS.
Another idea is letting the compiler simplify, with something like the
following, which could actually be used cross-platform for all
llsc-type archs:
static inline int __sh_cas_llsc(volatile int *p, int t, int s)
{
	int old;
	do old = llsc_start(p);
	while (old == t && !llsc_end(p, s));
	return old;
}
Here llsc_start and llsc_end would be inline functions using asm with
appropriate constraints. Unfortunately I don't see a way to model
using the value of the truth flag "t" as the output of the asm for
llsc_end, though. I suspect this would be a problem on a number of
other archs too; the asm would have to waste an instruction (or
several) converting the flag to an integer. Unless there's a solution
to that problem, it makes an approach like this less appealing.
> int __sh_cas(volatile int *p, int t, int s)
> {
> + if (__hwcap & CPU_HAS_LLSC) return __sh_cas_llsc(p, t, s);
> +
> int old;
> - if (__hwcap & CPU_HAS_LLSC) {
> - __asm__ __volatile__(
> - LLSC_START("%1")
> - " mov r0, %0\n"
> - " cmp/eq %0, %2\n"
> - " bf 1f\n"
> - " mov %3, r0\n"
> - LLSC_END("%1")
> - : "=&r"(old) : "r"(p), "r"(t), "r"(s) : LLSC_CLOBBERS);
> - } else {
> - __asm__ __volatile__(
> - GUSA_START_EVEN("%1", "%0")
> - " cmp/eq %0, %2\n"
> - " bf 1f\n"
> - GUSA_END("%1", "%3")
> - : "=&r"(old) : "r"(p), "r"(t), "r"(s) : GUSA_CLOBBERS, "t");
> - }
> + __asm__ __volatile__(
> + GUSA_START_EVEN("%1", "%0")
> + " cmp/eq %0, %2\n"
> + " bf 1f\n"
> + GUSA_END("%1", "%3")
> + : "=&r"(old) : "r"(p), "r"(t), "r"(s) : GUSA_CLOBBERS, "t");
> return old;
> }
For the GUSA stuff, do you really need the ODD/EVEN macros? I think
you could just add appropriate .align inside that would cause the
assembler to insert a nop if necessary.
Rich