From mboxrd@z Thu Jan 1 00:00:00 1970
From: Linus Torvalds
To: Lucio De Re
Cc: 9fans@cse.psu.edu, dbailey27@ameritech.net
Subject: Re: [9fans] Re: Threads: Sewing badges of honor onto a Kernel
In-Reply-To: <20040227103130.E22848@cackle.proxima.alt.za>
Message-ID:
References: <20040227101110.E24932@cackle.proxima.alt.za> <64FBCAEA-68FD-11D8-B851-000A95B984D8@mightycheese.com> <20040227103130.E22848@cackle.proxima.alt.za>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Date: Fri, 27 Feb 2004 01:46:27 -0800
Topicbox-Message-UUID: fedfe8fe-eacc-11e9-9e20-41e7f4b1d025

On Fri, 27 Feb 2004, Lucio De Re wrote:
>
> On Fri, Feb 27, 2004 at 12:17:33AM -0800, Rob Pike wrote:
> >
> > On Feb 27, 2004, at 12:11 AM, Lucio De Re wrote:
> >
> > > Of course, I may be talking out of turn, but I really don't see
> > > how threads can have private space if the stack isn't private.
> >
> > well, perhaps the stack isn't the only place to do it, but it's
> > certainly an easy one, and one that makes the syscall interface
> > to fork easy to implement in a threaded environment: longjmp
> > to the private stack, fork, adjust, longjmp back.
> >
> But I can't think of even one possible alternative. After all, the
> stack is the only storage being duplicated (ignoring registers) so
> where does one keep pointers to the private space?

Think it through.

You should _not_ duplicate the stack (because that wreaks havoc with
your TLB and normal usage), so what do you have left? Once you've
eliminated the impossible, what you have left, however improbable, is
the truth.

Registers.

Why are you ignoring registers? That's what you _should_ use.

For example, inside the Linux kernel, we tend to use the stack POINTER
as the thread-local state. When we allocate a new context of execution,
we allocate (depending on architecture) 8kB of memory, and it's aligned
so that if the architecture doesn't have any other registers free, we
can get at the "thread_info" structure by just doing bit masking on the
stack pointer.

That ends up being quite powerful - and it's cheap too, exactly because
it is a register, and thus fast to access. The stack itself is by no
means private - other threads can access the stack.

In fact, we used to put the whole "struct task_struct" (which is the
thing that defines a context of execution in Linux) that way, but it
ends up doing nasty things to caches when important global data
structures are all aligned on powers-of-two boundaries, so we ended up
getting rid of that.

In user space, that doesn't tend to work too well, because the stack
isn't as well bounded as in the kernel, but most architectures either
have lots of registers (and then one is just used for the thread-local
pointer) or even an architected register that user space can read but
not write.

One of the most problematic architectures is the x86, which doesn't
have lots of general-purpose registers (so using one of them to point
to TLS would be bad), and doesn't have any nice architected register
either. There we ended up using a segment register, however much I hate
them. We could have just made a trivial system call ("get the
thread-local pointer" from the kernel stack), but obviously there are
performance issues here.

In short: there is absolutely no reason to make the stack be private.
The only thing you need for thread-local-storage is literally just one
register, to indirect through. And it can be a fairly strange one at
that, ie it doesn't need to be able to hold a full 32-bit value.

		Linus
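
The bit-masking trick described in the message above can be sketched in
user-space C. This is only an illustration under stated assumptions, not
the kernel's actual code: the names thread_info_sketch, alloc_context,
info_from_sp and simulated_sp are hypothetical, the 8kB size is the
figure from the message, and a real kernel reads the hardware stack
pointer rather than a simulated one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed block size: 8kB as in the message; must be a power of two. */
#define STACK_SIZE 8192u

/* Hypothetical stand-in for the per-thread structure. */
struct thread_info_sketch {
    int thread_id;
};

/* Allocate one STACK_SIZE-aligned block. The info structure sits at the
 * base of the block; the rest of the block plays the role of the stack. */
static struct thread_info_sketch *alloc_context(int id)
{
    struct thread_info_sketch *ti = aligned_alloc(STACK_SIZE, STACK_SIZE);
    if (ti != NULL)
        ti->thread_id = id;
    return ti;
}

/* Recover the info structure from any address inside the block by
 * clearing the low bits of the (simulated) stack pointer. */
static struct thread_info_sketch *info_from_sp(const void *sp)
{
    uintptr_t base = (uintptr_t)sp & ~((uintptr_t)STACK_SIZE - 1);
    return (struct thread_info_sketch *)base;
}

/* Simulated stack pointer: the stack grows down from the top of the
 * block, so `depth` bytes of stack are in use. */
static void *simulated_sp(struct thread_info_sketch *ti, size_t depth)
{
    /* depth 0 would point one past the block and mask to the wrong base */
    assert(depth >= 1 && depth <= STACK_SIZE - sizeof *ti);
    return (char *)ti + STACK_SIZE - depth;
}
```

Calling info_from_sp(simulated_sp(ti, 16)) returns ti again for any
context allocated with alloc_context, which is the whole point: the only
per-thread state needed is one register value, and the block it points
into does not have to be private to the thread.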