Eliminating preference for avoiding thread pointer? Cost on MIPS?

mailing list of musl libc

* Eliminating preference for avoiding thread pointer? Cost on MIPS?
@ 2015-05-16  3:55 Rich Felker
  2015-05-16  6:19 ` Rich Felker
  2015-05-16 16:33 ` Isaac Dunham
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2015-05-16  3:55 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 3203 bytes --]

Traditionally, musl has gone to pretty great lengths to avoid
depending on the thread pointer. The original reason was that it was
not always initialized, and when it was, the init was lazy. This
resulted in a lot of cruft, where we would have lots of constructs of
the form:

	bar = some_predicate ? __pthread_self()->foo : global_foo

or similar. Because these predicates depend(ed) on globals, they
were/are rather expensive in position-independent code on most archs.
Now that the thread pointer is always initialized at startup (since
1.1.0) and assumed to have succeeded (since 1.1.9; musl now performs
HCF if it fails), this seems to be an unnecessary cost. Not only does
it cost cycles; it also has a complexity cost in terms of code to
maintain the state of the predicates (e.g. the atomics for locale
state) and in terms of libc-internal assumptions. So I'd like to just
use the thread pointer directly wherever it makes sense, and take
advantage of the fact that we have it.
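
To make the contrast concrete, here is a minimal sketch of the caller-side
pattern and its simplification. All names (struct thread, self(),
some_predicate, global_foo) are hypothetical stand-ins, not musl's actual
definitions, and the "thread pointer" is modeled by an ordinary global.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are musl's real definitions. */
struct thread { int foo; };

static struct thread main_thread = { .foo = 1 };
static struct thread *current = &main_thread; /* models the thread pointer */
static int some_predicate;                    /* e.g. "threads are in use" */
static int global_foo = 1;

static struct thread *self(void)
{
	assert(current != NULL); /* assumed valid from startup onward */
	return current;
}

/* Old style: every caller tests a global before trusting the thread pointer. */
static int get_foo_guarded(void)
{
	return some_predicate ? self()->foo : global_foo;
}

/* New style: the thread pointer is always valid, so callers just use it. */
static int get_foo_direct(void)
{
	return self()->foo;
}
```

The guarded form needs the predicate and the global copy to be kept in sync
with the per-thread value; the direct form has neither.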

Unfortunately, there's one arch where thread-pointer access may be
prohibitively costly: old MIPS. On the MIPS o32 ABI, the thread
pointer is accessed via the "rdhwr $3,$29" instruction, which was only
introduced in MIPS32rev2. MIPS-I, MIPS-II, and possibly the original
MIPS32 lack it, and while Linux has a "fast path" trap to emulate it,
I'm not clear on how "fast" it is.

First, I'd like to find out how slow this trap is. If it's something
like 150 cycles, that's ugly but probably acceptable. If it's more
like 1000 cycles, that's a big problem. If anyone can run the attached
test program on real MIPS-I or MIPS-II hardware and give me the
results, please do! Compile it once with -O3 -DDO_RDHWR and once with
just -O3 and send the (one-line) output of both to the list. It
doesn't matter what libc your MIPS system is using -- any should be
fine, but you might need to link with -lrt on glibc or uclibc.
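
For anyone interpreting the two output lines, a rough sketch of the
arithmetic follows. The function names are made up for illustration, and the
CPU clock rate is an input you have to supply yourself; the test program
does not report it.

```c
#include <assert.h>

/* Extra nanoseconds per loop iteration attributable to rdhwr, given the
 * elapsed seconds printed by the -DDO_RDHWR build and the baseline build. */
static double extra_ns_per_iter(double rdhwr_s, double baseline_s,
                                unsigned long iters)
{
	assert(iters > 0);
	return (rdhwr_s - baseline_s) * 1e9 / (double)iters;
}

/* Convert nanoseconds to cycles for a core clocked at cpu_mhz
 * (an assumed input, not something the test program measures). */
static double ns_to_cycles(double ns, double cpu_mhz)
{
	return ns * cpu_mhz / 1000.0;
}
```

With made-up timings of 0.166751 s and 0.016751 s over the program's 1000000
iterations, this gives about 150 ns per iteration, or about 30 cycles at an
assumed 200 MHz.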

Now, depending on the results, we have 2 options:

1. If rdhwr emulation on old MIPS is not horribly slow, just do the
   unconditional thread-pointer usage with no MIPS-specific changes.

2. If introducing rdhwr all over the place on old MIPS would be a
   serious performance regression, we take advantage of the fact that
   we're not using compiler-generated TLS access (which would emit
   rdhwr instructions) in musl. We control the definition of
   __pthread_self(), which musl uses internally to get the thread
   pointer (adjusted to point to the pthread structure), so when
   compiling code that might run on old MIPS (according to -march
   settings and the resulting predefined macros), we can define
   __pthread_self() to an expression or function that first checks a
   global to see if the process is multi-threaded, and if not, just reads
   the thread pointer from a global instead of using rdhwr. Basically,
   this would be keeping the same way we're doing things now, but
   tucking it away as an old-MIPS-specific hack and encapsulating it
   in __pthread_self() rather than having it in every caller.
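
Option 2 could look roughly like the sketch below. This is an illustration
under stated assumptions, not musl code: threads_started, main_thread_ptr,
slow_hw_read() and pthread_self_sketch() are hypothetical names, the rdhwr
read is modeled by a plain function that counts its calls, and the
compile-time gating on -march (for example via GCC's __mips_isa_rev
predefined macro) is left out.

```c
#include <assert.h>
#include <stddef.h>

struct thread_desc { int tid; };  /* hypothetical, not musl's struct pthread */

static int threads_started;                  /* hypothetical: set once a second thread exists */
static struct thread_desc *main_thread_ptr;  /* hypothetical: saved at startup */

/* Stand-in for the trapping "rdhwr $3,$29" read plus the adjustment to the
 * thread structure; it only counts calls so the slow path can be observed. */
static struct thread_desc *hw_tp;
static unsigned hw_reads;

static struct thread_desc *slow_hw_read(void)
{
	hw_reads++;
	return hw_tp;
}

/* The fallback lives in the accessor, not in every caller. */
static struct thread_desc *pthread_self_sketch(void)
{
	if (!threads_started) {
		assert(main_thread_ptr != NULL);
		return main_thread_ptr;   /* single-threaded: no trap */
	}
	return slow_hw_read();            /* multi-threaded: pay for rdhwr */
}
```

Callers would use the accessor unconditionally; only this one definition
would differ between old MIPS and everything else.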

So I think, whatever the performance results end up being, we have an
acceptable path forward to use the (possibly virtual) thread pointer
unconditionally throughout musl.

Rich

[-- Attachment #2: mips_rdhwr.c --]
[-- Type: text/plain, Size: 559 bytes --]

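#include <time.h>
#include <stdio.h>

int main(void)
{
	struct timespec t0, t;
	unsigned i, x=0;
	clock_gettime(CLOCK_REALTIME, &t0);
	for (i=0; i<1000000; i++) {
		/* Pin tp to $3, the register the raw opcode below writes. */
		register void *tp __asm__("$3");
#ifdef DO_RDHWR
		/* 0x7c03e83b encodes "rdhwr $3,$29", emitted as a raw word
		 * rather than a mnemonic. */
		__asm__ __volatile__(".word 0x7c03e83b" : "=r"(tp));
#else
		/* Baseline: a plain register move, to time the bare loop. */
		__asm__ __volatile__("move %0,$0" : "=r"(tp));
#endif
		/* Accumulate so the value read is used and printed. */
		x += (unsigned)tp;
	}
	clock_gettime(CLOCK_REALTIME, &t);
	/* Elapsed time, borrowing a second if the nanoseconds went negative. */
	t.tv_sec -= t0.tv_sec;
	if ((t.tv_nsec -= t0.tv_nsec) < 0) {
		t.tv_nsec += 1000000000;
		t.tv_sec--;
	}
	printf("%u %lld.%.9ld\n", x, (long long)t.tv_sec, t.tv_nsec);
	return 0;
}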


* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-16  3:55 Eliminating preference for avoiding thread pointer? Cost on MIPS? Rich Felker
@ 2015-05-16  6:19 ` Rich Felker
  2015-05-16 16:33 ` Isaac Dunham
  1 sibling, 0 replies; 7+ messages in thread
From: Rich Felker @ 2015-05-16  6:19 UTC (permalink / raw)
  To: musl

On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
> Traditionally, musl has gone to pretty great lengths to avoid
> depending on the thread pointer. The original reason was that it was
> not always initialized, and when it was, the init was lazy. This
> resulted in a lot of cruft, where we would have lots of constructs of
> the form:
> 
> 	bar = some_predicate ? __pthread_self()->foo : global_foo
> 
> or similar. Being that these predicates depend(ed) on globals, they
> were/are rather expensive in position-independent code on most archs.
> Now that the thread pointer is always initialized at startup (since
> 1.1.0) and assumed to have succeeded (since 1.1.9; musl now performs
> HCF if it fails), this seems to be an unnecessary cost. Not only does
> it cost cycles; it also has a complexity cost in terms of code to
> maintain the state of the predicates (e.g. the atomics for locale
> state) and in terms of libc-internal assumptions. So I'd like to just
> use the thread pointer directly wherever it makes sense, and take
> advantage of the fact that we have it.
> 
> [...]
> 
> So I think, whatever the performance results end up being, we have an
> acceptable path forward to use the (possibly virtual) thread pointer
> unconditionally throughout musl.

Just committed the first stage of this work:

http://git.musl-libc.org/cgit/musl/commit/?id=68630b55c0c7219fe9df70dc28ffbf9efc8021d8

If there are old-MIPS performance regressions we'll fix them at the
arch level rather than keeping this cruft all over the source.

Rich



* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-16  3:55 Eliminating preference for avoiding thread pointer? Cost on MIPS? Rich Felker
  2015-05-16  6:19 ` Rich Felker
@ 2015-05-16 16:33 ` Isaac Dunham
  2015-05-16 16:48   ` Rich Felker
  1 sibling, 1 reply; 7+ messages in thread
From: Isaac Dunham @ 2015-05-16 16:33 UTC (permalink / raw)
  To: musl

On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
> Traditionally, musl has gone to pretty great lengths to avoid
> depending on the thread pointer. The original reason was that it was
> not always initialized, and when it was, the init was lazy. This
> resulted in a lot of cruft, where we would have lots of constructs of
> the form:
> 
> 	bar = some_predicate ? __pthread_self()->foo : global_foo
> 
> or similar. Being that these predicates depend(ed) on globals, they
> were/are rather expensive in position-independent code on most archs.
> Now that the thread pointer is always initialized at startup (since
> 1.1.0) and assumed to have succeeded (since 1.1.9; musl now performs
> HCF if it fails), this seems to be an unnecessary cost. Not only does
> it cost cycles; it also has a complexity cost in terms of code to
> maintain the state of the predicates (e.g. the atomics for locale
> state) and in terms of libc-internal assumptions. So I'd like to just
> use the thread pointer directly wherever it makes sense, and take
> advantage of the fact that we have it.
> 
> Unfortunately, there's one arch where thread-pointer access may be
> prohibitively costly: old MIPS. On the MIPS o32 ABI, the thread
> pointer is accessed via the "rdhwr $3,$29" instruction, which was only
> introduced in MIPS32rev2. MIPS-I, MIPS-II, and possibly the original
> MIPS32 lack it, and while Linux has a "fast path" trap to emulate it,
> I'm not clear on how "fast" it is.
> 
> First, I'd like to find out how slow this trap is. If it's something
> like 150 cycles, that's ugly but probably acceptable. If it's more
> like 1000 cycles, that's a big problem. If anyone can run the attached
> test program on real MIPS-I or MIPS-II hardware and give me the
> results, please do! Compile it once with -O3 -DDO_RDHWR and once with
> just -O3 and send the (one-line) output of both to the list. It
> doesn't matter what libc your MIPS system is using -- any should be
> fine, but you might need to link with -lrt on glibc or uclibc.

dd-wrt micro on a WRT54Gv8.0:
\u@\h:\w\$ cat /proc/version
Linux version 2.4.37 (root@dd-wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010
\u@\h:\w\$ wget http://192.168.2.114:8080/def-bin
Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
\u@\h:\w\$ echo *
def-bin
\u@\h:\w\$ chmod +x def-bin
\u@\h:\w\$ ./def-bin
0 0.016751000
\u@\h:\w\$ wget http://192.168.2.114:8080/rd-bin
Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
\u@\h:\w\$ chmod +x rd-bin
\u@\h:\w\$ ./rd-bin
Illegal instruction

def-bin is without -DDO_RDHWR, rd-bin is with.
Both compiled static with musl 1.1.6 (because that's the latest musl-cross
toolchain) and stripped.

free reports 448 kb of 5736 kb free. (In other words, there's a reason it's
that stripped down.)

Thanks,
Isaac Dunham



* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-16 16:33 ` Isaac Dunham
@ 2015-05-16 16:48   ` Rich Felker
  2015-05-18 19:35     ` Andre McCurdy
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2015-05-16 16:48 UTC (permalink / raw)
  To: musl

On Sat, May 16, 2015 at 09:33:20AM -0700, Isaac Dunham wrote:
> On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
> > [...]
> 
> dd-wrt micro on a WRT54Gv8.0:
> \u@\h:\w\$ cat /proc/version
> Linux version 2.4.37 (root@dd-wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010
> \u@\h:\w\$ wget http://192.168.2.114:8080/def-bin
> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
> \u@\h:\w\$ echo *
> def-bin
> \u@\h:\w\$ chmod +x def-bin
> \u@\h:\w\$ ./def-bin
> 0 0.016751000
> \u@\h:\w\$ wget http://192.168.2.114:8080/rd-bin
> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
> \u@\h:\w\$ chmod +x rd-bin
> \u@\h:\w\$ ./rd-bin
> Illegal instruction
> 
> def-bin is without -DDO_RDHWR, rd-bin is with.
> Both compiled static with musl 1.1.6 (because that's the latest musl-cross
> toolchain) and stripped.
> 
> free reports 448 kb of 5736 kb free. (In other words, there's a reason it's
> that stripped down.)

Bleh, it looks like they intentionally broke their kernel to save a
few bytes... I don't think it's possible to support such
configurations, at least not reasonably.

Rich



* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-16 16:48   ` Rich Felker
@ 2015-05-18 19:35     ` Andre McCurdy
  2015-05-18 20:16       ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Andre McCurdy @ 2015-05-18 19:35 UTC (permalink / raw)
  To: musl

On Sat, May 16, 2015 at 9:48 AM, Rich Felker <dalias@libc.org> wrote:
> On Sat, May 16, 2015 at 09:33:20AM -0700, Isaac Dunham wrote:
>> On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
>> > [...]
>>
>> dd-wrt micro on a WRT54Gv8.0:
>> \u@\h:\w\$ cat /proc/version
>> Linux version 2.4.37 (root@dd-wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010

It looks like rdhwr emulation was first added in linux 2.6.15, so
2.4.37 is likely too old to run this test?

>> \u@\h:\w\$ wget http://192.168.2.114:8080/def-bin
>> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
>> \u@\h:\w\$ echo *
>> def-bin
>> \u@\h:\w\$ chmod +x def-bin
>> \u@\h:\w\$ ./def-bin
>> 0 0.016751000
>> \u@\h:\w\$ wget http://192.168.2.114:8080/rd-bin
>> Connecting to 192.168.2.114:8080 (192.168.2.114:8080)
>> \u@\h:\w\$ chmod +x rd-bin
>> \u@\h:\w\$ ./rd-bin
>> Illegal instruction
>>
>> def-bin is without -DDO_RDHWR, rd-bin is with.
>> Both compiled static with musl 1.1.6 (because that's the latest musl-cross
>> toolchain) and stripped.
>>
>> free reports 448 kb of 5736 kb free. (In other words, there's a reason it's
>> that stripped down.)
>
> Bleh, it looks like they intentionally broke their kernel to save a
> few bytes... I don't think it's possible to support such
> configurations, at least not reasonably.
> Rich



* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-18 19:35     ` Andre McCurdy
@ 2015-05-18 20:16       ` Rich Felker
  2015-05-18 20:20         ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2015-05-18 20:16 UTC (permalink / raw)
  To: musl

On Mon, May 18, 2015 at 12:35:55PM -0700, Andre McCurdy wrote:
> On Sat, May 16, 2015 at 9:48 AM, Rich Felker <dalias@libc.org> wrote:
> > On Sat, May 16, 2015 at 09:33:20AM -0700, Isaac Dunham wrote:
> >> On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
> >> > [...]
> >>
> >> dd-wrt micro on a WRT54Gv8.0:
> >> \u@\h:\w\$ cat /proc/version
> >> Linux version 2.4.37 (root@dd-wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010
> 
> It looks like rdhwr emulation was first added in linux 2.6.15, so
> 2.4.37 is likely too old to run this test?

Ah yes, that would explain it. Linux 2.4 is pre-NPTL and really
doesn't have any of the stuff needed to support threads. I could look
and see if LinuxThreads might have had any practical way to do TLS for
2.4 though; this may give us a fallback for accessing TLS quickly on
MIPS-I and MIPS-II.

Rich



* Re: Eliminating preference for avoiding thread pointer? Cost on MIPS?
  2015-05-18 20:16       ` Rich Felker
@ 2015-05-18 20:20         ` Rich Felker
  0 siblings, 0 replies; 7+ messages in thread
From: Rich Felker @ 2015-05-18 20:20 UTC (permalink / raw)
  To: musl

On Mon, May 18, 2015 at 04:16:20PM -0400, Rich Felker wrote:
> On Mon, May 18, 2015 at 12:35:55PM -0700, Andre McCurdy wrote:
> > On Sat, May 16, 2015 at 9:48 AM, Rich Felker <dalias@libc.org> wrote:
> > > On Sat, May 16, 2015 at 09:33:20AM -0700, Isaac Dunham wrote:
> > >> On Fri, May 15, 2015 at 11:55:44PM -0400, Rich Felker wrote:
> > >> > [...]
> > >>
> > >> dd-wrt micro on a WRT54Gv8.0:
> > >> \u@\h:\w\$ cat /proc/version
> > >> Linux version 2.4.37 (root@dd-wrt) (gcc version 3.4.6 (OpenWrt-2.0)) #13303 Thu Aug 12 04:47:54 CEST 2010
> > 
> > It looks like rdhwr emulation was first added in linux 2.6.15, so
> > 2.4.37 is likely too old to run this test?
> 
> Ah yes, that would explain it. Linux 2.4 is pre-NPTL and really
> doesn't have any of the stuff needed to support threads. I could look
> and see if LinuxThreads might have had any practical way to do TLS for
> 2.4 though; this may give us a fallback for accessing TLS quickly on
> MIPS-I and MIPS-II.

And nope -- later LinuxThreads used rdhwr; earlier did their hideous
hack of using the high bits of the stack pointer as a thread id and
means of locating the thread structure.
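
For readers who haven't seen that trick, a minimal sketch of the idea
follows. The details are assumptions for illustration: STACK_REGION,
struct thread_desc and desc_from_sp() are made-up names, and the layout
(every stack in its own region of a fixed power-of-two size, aligned to
that size, with the descriptor at the very top) is one plausible
arrangement rather than a record of what LinuxThreads actually shipped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical fixed, power-of-two size (and alignment) of each stack region. */
#define STACK_REGION (2u * 1024 * 1024)

struct thread_desc { int tid; };  /* hypothetical descriptor */

/* Any stack address inside a region identifies that region through its
 * high bits; round up to the region's top and step back one descriptor. */
static struct thread_desc *desc_from_sp(uintptr_t sp)
{
	assert((STACK_REGION & (STACK_REGION - 1)) == 0);
	uintptr_t top = (sp | (STACK_REGION - 1)) + 1;
	return (struct thread_desc *)top - 1;
}
```

No thread-pointer register is needed, but every stack is locked to the same
fixed size and alignment.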

Rich

