From mboxrd@z Thu Jan  1 00:00:00 1970
From: Linus Torvalds <torvalds@osdl.org>
To: Lucio De Re <lucio@proxima.alt.za>
Cc: 9fans@cse.psu.edu, dbailey27@ameritech.net
Subject: Re: [9fans] Re: Threads: Sewing badges of honor onto a Kernel
In-Reply-To: <20040227103130.E22848@cackle.proxima.alt.za>
Message-ID: <Pine.LNX.4.58.0402270132180.2563@ppc970.osdl.org>
References: <e0bca78fbd4995d1a912c40426757360@yourdomain.dom>
 <Pine.LNX.4.58.0402270002560.2563@ppc970.osdl.org> <20040227101110.E24932@cackle.proxima.alt.za>
 <64FBCAEA-68FD-11D8-B851-000A95B984D8@mightycheese.com>
 <20040227103130.E22848@cackle.proxima.alt.za>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Date: Fri, 27 Feb 2004 01:46:27 -0800
Topicbox-Message-UUID: fedfe8fe-eacc-11e9-9e20-41e7f4b1d025



On Fri, 27 Feb 2004, Lucio De Re wrote:
>
> On Fri, Feb 27, 2004 at 12:17:33AM -0800, Rob Pike wrote:
> >
> > On Feb 27, 2004, at 12:11 AM, Lucio De Re wrote:
> >
> > > Of course, I may be talking out of turn, but I really don't see
> > > how threads can have private space if the stack isn't private.
> >
> > well, perhaps the stack isn't the only place to do it, but it's
> > certainly an easy one, and one that makes the syscall interface
> > to fork easy to implement in a threaded environment: longjmp
> > to the private stack, fork, adjust, longjmp back.
> >
> But I can't think of even one possible alternative.  After all, the
> stack is the only storage being duplicated (ignoring registers) so
> where does one keep pointers to the private space?

Think it through. You should _not_ duplicate the stack (because that
wreaks havoc with your TLB and normal usage), so what do you have left?

Once you've eliminated the impossible, what you have left, however
improbable, is the truth.

Registers.

Why are you ignoring registers? That's what you _should_ use.

For example, inside the Linux kernel, we tend to use the stack POINTER as
the thread-local state. When we allocate a new context of execution, we
allocate (depending on architecture) 8kB of memory, and it's aligned so
that if the architecture doesn't have any other registers free, we can get
at the "thread_info" structure by just doing bit masking on the stack
pointer.
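
A minimal user-space sketch of that masking trick, not the actual kernel
code: THREAD_SIZE, alloc_context() and info_from_sp() are names made up
for illustration, and the struct is a one-field stand-in. It assumes an
8kB block aligned to its own size, with the structure at the base of the
block and the stack growing down from the top.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u   /* assumed: 8kB, must be a power of two */

/* Hypothetical stand-in for the per-thread structure. */
struct thread_info {
    int id;
};

/* Allocate one THREAD_SIZE-aligned block. The thread_info sits at the
 * base; a real stack would start at the top and grow downwards. */
static struct thread_info *alloc_context(int id)
{
    struct thread_info *ti = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (ti != NULL)
        ti->id = id;
    return ti;
}

/* Recover the thread_info from ANY address inside the block by
 * clearing the low bits of that address. */
static struct thread_info *info_from_sp(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~((uintptr_t)THREAD_SIZE - 1);
    return (struct thread_info *)base;
}
```

Any pointer into the block, such as a pretend stack pointer 64 bytes
below the top, maps back to the same structure with one AND and no
memory access.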

That ends up being quite powerful - and it's cheap too, exactly because it
is a register, and thus fast to access. The stack itself is by no means
private - other threads can access the stack.

In fact, we used to find the whole "struct task_struct" (which is the thing
that defines a context of execution in Linux) that way, by putting it at the
base of that allocation, but it ends up doing nasty things to caches when
important global data structures are all aligned on powers-of-two
boundaries, so we ended up getting rid of that.

In user space, that doesn't tend to work too well, because the stack isn't
as well bounded as in the kernel, but most architectures either have lots
of registers (and then one is just used for the thread-local pointer) or
even an architected register that user space can read but not write.

One of the most problematic architectures is the x86, which doesn't have
lots of general-purpose registers (so using one of them to point to TLS
would be bad), and doesn't have any nice architected register either.
There we ended up using a segment register, however much I hate them.

We could have just made a trivial system call ("get the thread-local
pointer" from the kernel stack), but obviously entering the kernel on
every thread-local access costs far more than reading a register.

In short: there is absolutely no reason to make the stack be private. The
only thing you need for thread-local-storage is literally just one
register, to indirect through. And it can be a fairly strange one at that,
i.e. it doesn't need to be able to hold a full 32-bit value.
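
A toy model of that indirection, single-threaded and purely
illustrative: a 16-bit variable stands in for the register, and a small
table stands in for whatever maps its value to the base of the thread's
private block. All names here (tls_block, descriptor_table,
tls_selector, install_thread, switch_to, current_tls) are hypothetical,
and no real segment-register or kernel interface is used.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_THREADS 4

/* Hypothetical per-thread private data. */
struct tls_block {
    int thread_id;
    int last_error;
};

/* Models a descriptor table: selector -> base of a thread's block. */
static struct tls_block *descriptor_table[MAX_THREADS];

/* Models the one register. It is only 16 bits wide, too narrow to
 * hold a pointer, yet enough to reach the private block. */
static uint16_t tls_selector;

/* Register a thread's private block under a selector value. */
static void install_thread(uint16_t sel, struct tls_block *blk)
{
    assert(sel < MAX_THREADS);
    descriptor_table[sel] = blk;
}

/* A "context switch": the only per-thread state that changes is the
 * value in the register. */
static void switch_to(uint16_t sel)
{
    assert(sel < MAX_THREADS);
    tls_selector = sel;
}

/* Every thread-local access is one indirection through the register. */
static struct tls_block *current_tls(void)
{
    return descriptor_table[tls_selector];
}
```

The blocks themselves can live anywhere, including on a stack that
other threads can see; what makes them thread-local is only which one
the register currently selects.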

			Linus