mailing list of musl libc
* musl sh2 support
From: Rich Felker @ 2015-04-27 21:36 UTC
  To: musl; +Cc: yuri.nunami, sumpei.kawasaki

Recently nsz and I have been looking at the state of the sh port and
noticed that the gusa soft atomics, which Bobby Bingham (original port
author) and I assumed would be sufficient for anything pre-sh4a,
actually don't work on pre-sh3 targets. This is explained on the GCC
bug-tracker threads here:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50457

but the TL;DR is that gusa works by setting an invalid stack pointer
as a sentinel to the kernel whereas sh1/sh2 exception-handling
requires a valid stack pointer. This issue may also affect __unmapself
which runs momentarily (roughly 1-2 cycles in userspace) without a
valid stack pointer. For non-SMP configurations I suspect it should
suffice for __unmapself to just set the stack pointer to point at some
global data for the kernel to use momentarily during exceptions.
Alternatively the first thread to call __unmapself could transform
into a reaper that never exits but unmaps future detached exiting
threads; this could even be a decent default C-only implementation of
__unmapself for archs/ABIs that can't handle threads unmapping their
own stacks.

Anyway, back to atomics. GCC introduced a new soft-tcb atomic model
that works like the old gusa but stores a flag (for the kernel to
inspect) indicating that an atomic sequence is in progress at a fixed
offset from the thread-pointer register, GBR. This offset has to be
aligned to 4 and in the range 0 to 1020. I can't find any
documentation on a default/ABI-accepted location for this flag,
though. The offsets that would be possible for musl to use immediately
are 0 and 4. These offsets are used by glibc to store the DTV pointer
and a pointer to the full thread structure; on musl they're unused but
kept to maintain the same TLS ABI used by the toolchain. So we could
use either of these, but the ABI would not be compatible with glibc,
which might be irrelevant since glibc will probably never support
sh1/sh2.
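
The constraint on the offset is easy to state in code. A minimal check, assuming only the two rules given above (a multiple of 4, between 0 and 1020 inclusive); the function name is made up for illustration and is not part of GCC or musl:

```c
#include <stdbool.h>

/* Hypothetical helper: returns true if off could serve as the
   GBR-relative offset of the soft-tcb flag, i.e. it is a multiple
   of 4 in the range 0..1020. */
static bool soft_tcb_offset_ok(long off)
{
    return off >= 0 && off <= 1020 && off % 4 == 0;
}
```

With this check, soft_tcb_offset_ok(0) and soft_tcb_offset_ok(4) hold, while a misaligned value such as 6 or an out-of-range value such as 1024 is rejected.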

The other option is to use offset 8 by putting a TLS (.tdata section)
object in crt1.o to reserve the very first slot of application-owned
TLS for soft-tcb atomic use. Actual application TLS would then begin
at offset 12.
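
To make the offset-8 option concrete, here is a small C model of the words at the start of the thread-pointer area as described above. The struct and its field names are made up for illustration; they are not musl's or glibc's definitions, and the 32-bit slot size is an assumption based on sh having 32-bit pointers:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the first words addressed relative to GBR.
   Names are illustrative only. */
struct tp_area_model {
    uint32_t dtv_slot;        /* offset 0: DTV pointer on glibc, unused on musl */
    uint32_t thread_ptr_slot; /* offset 4: thread structure pointer on glibc, unused on musl */
    uint32_t atomic_flag;     /* offset 8: slot reserved via a .tdata object in crt1.o */
    uint32_t first_app_tls;   /* offset 12: where application TLS would then begin */
};

_Static_assert(offsetof(struct tp_area_model, atomic_flag) == 8,
               "flag expected at offset 8");
_Static_assert(offsetof(struct tp_area_model, first_app_tls) == 12,
               "application TLS expected at offset 12");

/* Offset of the flag word in this model. */
static size_t atomic_flag_offset(void)
{
    return offsetof(struct tp_area_model, atomic_flag);
}
```

The static assertions only document the layout being proposed; they say nothing about how the kernel would actually locate or interpret the flag.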

Offset -8 or -12 would be even better (sticking the flag at the end of
struct __pthread), but the GBR-relative addressing modes used don't
seem to support negative offsets.

In addition to the question of what to do with atomics, there's a
question of whether we need full runtime selection for the atomic
method at all. I've been told (but I'm not clear whether it's right)
that sh1/sh2(/sh2a?) have a different kernel syscall ABI, and since
they're nommu, it wouldn't be possible (or at least not efficiently)
to run normal dynamic-linked ELF binaries (where syscall ABI wouldn't
matter as long as you have the right libc.so installed on the system
you're running on) for sh3+ on sh1/2. So it might make sense to treat
sh1/sh2 as a separate arch for musl's purposes. But if this arch will
possibly have SMP implementations (e.g. running on sh4a or new tech)
then soft-tcb atomics will not suffice and it might need its own
method of runtime-atomic-selection to get a working atomic cas.
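
For illustration, runtime selection could look roughly like the sketch below: a function pointer is set once at startup and every cas goes through it. Everything here is hypothetical (the names, and especially the selection criterion, which just takes a CPU count as a parameter). The uniprocessor variant is plain C standing in for a kernel-assisted software sequence and is not atomic by itself; the SMP variant uses the GCC/Clang __atomic_compare_exchange_n builtin as a stand-in for whatever the hardware would really provide:

```c
/* Sketch only: none of these names are musl's. */
typedef int (*cas_fn)(volatile int *p, int expected, int desired);

/* Stand-in for a uniprocessor software sequence (gusa/soft-tcb style).
   On real hardware the kernel would restart the sequence if it were
   interrupted; plain C cannot express that, so this is NOT atomic. */
static int cas_uniprocessor_model(volatile int *p, int expected, int desired)
{
    int old = *p;
    if (old == expected)
        *p = desired;
    return old;
}

/* Stand-in for a cas that is safe on SMP, using a compiler builtin.
   On failure the builtin stores the observed value into expected, so
   expected holds the old value in both the success and failure case. */
static int cas_smp_model(volatile int *p, int expected, int desired)
{
    __atomic_compare_exchange_n(p, &expected, desired, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
}

/* Default to the variant that is safe everywhere in this model. */
static cas_fn cas_impl = cas_smp_model;

/* Hypothetical startup hook: pick the method once. How the CPU count
   (or any other criterion) would be detected is left open. */
static void select_cas(int cpu_count)
{
    cas_impl = cpu_count > 1 ? cas_smp_model : cas_uniprocessor_model;
}

/* Returns the value previously stored at *p. */
static int a_cas_sketch(volatile int *p, int expected, int desired)
{
    return cas_impl(p, expected, desired);
}
```

The indirect call costs something on every atomic operation, which is part of why it matters whether runtime selection is needed at all.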

Ideas?

Rich



* Re: musl sh2 support
From: Isaac Dunham @ 2015-04-28  0:59 UTC
  To: musl; +Cc: yuri.nunami, sumpei.kawasaki

On Mon, Apr 27, 2015 at 05:36:03PM -0400, Rich Felker wrote:
> Recently nsz and I have been looking at the state of the sh port and
> noticed that the gusa soft atomics, which Bobby Bingham (original port
> author) and I assumed would be sufficient for anything pre-sh4a,
> actually don't work on pre-sh3 targets. This is explained on the GCC
> bug-tracker threads here:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50457
> 
> but the TL;DR is that gusa works by setting an invalid stack pointer
> as a sentinel to the kernel whereas sh1/sh2 exception-handling
> requires a valid stack pointer. This issue may also affect __unmapself
> which runs momentarily (roughly 1-2 cycles in userspace) without a
> valid stack pointer. For non-SMP configurations I suspect it should
> suffice for __unmapself to just set the stack pointer to point at some
> global data for the kernel to use momentarily during exceptions.
> Alternatively the first thread to call __unmapself could transform
> into a reaper that never exits but unmaps future detached exiting
> threads; this could even be a decent default C-only implementation of
> __unmapself for archs/ABIs that can't handle threads unmapping their
> own stacks.
 
Heads up: Rob Landley's current work project involves bringing up the
software for a new sh2-compatible chip, the J2 (with BSD-licensed VHDL).
The latest post on his blog refers to SMP support being "nearly" ready
(could be done by now, or might not be).

> In addition to the question of what to do with atomics, there's a
> question of whether we need full runtime selection for the atomic
> method at all. I've been told (but I'm not clear whether it's right)
> that sh1/sh2(/sh2a?) have a different kernel syscall ABI, and since
> they're nommu, it wouldn't be possible (or at least not efficiently)
> to run normal dynamic-linked ELF binaries (where syscall ABI wouldn't
> matter as long as you have the right libc.so installed on the system
> you're running on) for sh3+ on sh1/2. So it might make sense to treat
> sh1/sh2 as a separate arch for musl's purposes. But if this arch will
> possibly have SMP implementations (e.g. running on sh4a or new tech)
> then soft-tcb atomics will not suffice and it might need its own
> method of runtime-atomic-selection to get a working atomic cas.

If the J2 gets done, you will have SMP and sh2.

HTH,
Isaac Dunham



* Re: musl sh2 support
From: Rich Felker @ 2015-04-28  1:08 UTC
  To: Isaac Dunham; +Cc: musl, yuri.nunami, shumpei.kawasaki

On Mon, Apr 27, 2015 at 05:59:42PM -0700, Isaac Dunham wrote:
> On Mon, Apr 27, 2015 at 05:36:03PM -0400, Rich Felker wrote:
> > Recently nsz and I have been looking at the state of the sh port and
> > noticed that the gusa soft atomics, which Bobby Bingham (original port
> > author) and I assumed would be sufficient for anything pre-sh4a,
> > actually don't work on pre-sh3 targets. This is explained on the GCC
> > bug-tracker threads here:
> > 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=50457
> > 
> > but the TL;DR is that gusa works by setting an invalid stack pointer
> > as a sentinel to the kernel whereas sh1/sh2 exception-handling
> > requires a valid stack pointer. This issue may also affect __unmapself
> > which runs momentarily (roughly 1-2 cycles in userspace) without a
> > valid stack pointer. For non-SMP configurations I suspect it should
> > suffice for __unmapself to just set the stack pointer to point at some
> > global data for the kernel to use momentarily during exceptions.
> > Alternatively the first thread to call __unmapself could transform
> > into a reaper that never exits but unmaps future detached exiting
> > threads; this could even be a decent default C-only implementation of
> > __unmapself for archs/ABIs that can't handle threads unmapping their
> > own stacks.
>  
> Heads up: Rob Landley's current work project involves bringing up the
> software for a new sh2-compatible chip, the J2 (with BSD-licensed VHDL).
> The latest post on his blog refers to SMP support being "nearly" ready
> (could be done by now, or might not be).

Yes. I actually CC'd a couple of the people working on this. I don't
know all the details of their project though.

> > In addition to the question of what to do with atomics, there's a
> > question of whether we need full runtime selection for the atomic
> > method at all. I've been told (but I'm not clear whether it's right)
> > that sh1/sh2(/sh2a?) have a different kernel syscall ABI, and since
> > they're nommu, it wouldn't be possible (or at least not efficiently)
> > to run normal dynamic-linked ELF binaries (where syscall ABI wouldn't
> > matter as long as you have the right libc.so installed on the system
> > you're running on) for sh3+ on sh1/2. So it might make sense to treat
> > sh1/sh2 as a separate arch for musl's purposes. But if this arch will
> > possibly have SMP implementations (e.g. running on sh4a or new tech)
> > then soft-tcb atomics will not suffice and it might need its own
> > method of runtime-atomic-selection to get a working atomic cas.
> 
> If the J2 gets done, you will have SMP and sh2.

In that case we need to figure out how to make atomics work in a way
that's SMP-compatible. Based on some things Rob said, I'm a bit
concerned that, with the current design, it might not be possible
without a machine-wide lock...

Rich

