From mboxrd@z Thu Jan  1 00:00:00 1970
From: brantley@coraid.com (Brantley Coile)
Date: Tue, 15 Nov 2005 08:30:43 -0500
Subject: [TUHS] Redoing "V6on286" or porting V7...?
In-Reply-To: <20051115034024.GJ6574@gsmx07.alcatel.com.au>
Message-ID:

> On 2005-Nov-14 19:08:52 -0800, Greg Haerr wrote:
>>> One
>>> crucial difference is that Unix has the implicit assumption that the
>>> stack is in the data space - which is not true on the 286.  This
>>> difference is fairly critical to Unix and makes it impossible to
>>> accurately reproduce the traditional Unix memory protection.
>>
>>I don't understand this.  If SS is set to DS, in any 16 bit mode,
>>then doesn't this accomplish the accurate reproduction?  I realize
>>that a 32-bit mode would be required for limit checking.
>
> You can make SS and DS the same but this means that there's nothing
> stopping the stack growing down into the heap or vice versa.  This
> makes the stack accessible from the data space but gives no protection
> (note that I was referring to reproducing Unix protection).

sorry this message is so long.  i wish it was better written, but at
least it's early in the day, and i've had my diet co-cola (as we say
here in the south).  this mess is why i made the suggestion just to
use a full 64k for the data segments.  but there is a way to get
protection of a sort.

Peter's right.  setting ss==ds will work but it leaves the data
segment unprotected.  in protect mode, data segments can be configured
to be valid above a limit or below.  for stacks you can make them
valid above a limit and move the limit down as the stack grows.  (by
above i mean addresses with larger values and by down i mean addresses
with smaller values.)

but one can't easily use this when implementing C.  when the processor
does a stack operation, a push, pop, call, return and so on, it uses
the stack segment.  when the processor fetches or stores an operand or
a result, it uses the data segment.
even if the data segment and stack segment had the same base address,
you couldn't use the grow-down feature to protect the data from the
stack growing into it.

the problem is local variables and call by reference.  local variables
live on the stack, but if you tried to access them using the data
segment selector, you would get a protection violation because you're
doing a data fetch above the data limit.  since it's a data fetch it's
using the data segment.  you might think that the compiler could know
where the variable is and use an instruction prefix to override the
data segment and use the stack segment instead.  that would work for
the simple case, but what do you do when you take the address of the
variable?  you lose the information that it's a local variable on the
stack and don't know which segment to use when dereferencing the
pointer.

when the protect mode was designed at intel, Pascal was all the rage
in schools of higher learning.  C had yet to become the ubiquitous
notation.  since Pascal didn't allow addresses to be taken of
arbitrary variables, for years i just assumed the intel design was an
artifact of designing hardware to run Pascal.  (intel indeed includes
a feature that is only useful for Algol-like languages that have
nested procedures.  look at the definition of the `enter'
instruction.)

but i was wrong.  Pascal (and Modula, and Oberon) allows procedure
parameters to be either call by value or call by reference.  a call by
reference parameter is a kind of pointer.  there is no way to know if
you've been called with a pointer to a global variable, a variable
allocated on the heap or a local variable, without including more
information than just the address.

so, what were they thinking?  i've no idea.  all other segmented
architectures include the segment selector in the high bits of the
virtual address, as does the pdp-11, as did Multics.  in fact, what
intel calls a page directory is called a segment table in many other
systems.
but of course the word segment was already taken.

in the mid 1980's we used intel compilers with different memory
models.  the small model did much as i'm suggesting.  just give'm 64k
and have at it.  there was also a medium model with 64k data and more
than one code segment.  large model allowed more than 64k of data, but
a single array was limited to 64k.  there was also a huge model.  as
you went from small to huge, the code generated by the compiler would
include more and more load segment selector instructions.  while this
looks bad, it's really quite a bit worse than it looks.  when one
loads a selector, a 16-bit value, into the segment register it causes
the processor to load a 64-bit segment descriptor.  and it does this
for every variable access in large and huge models.  yuck!

all that is why i suggest just giving the process 64k and be done with
it.  it's 6th edition after all.

but last night i thought of a couple of reasons to turn on protect
mode.  first, there is a way to use it to limit the stack growth and
stop it from growing into the data.  you must decide how large a stack
you want to allow.  then put the base of your static data just above
the stack and have the stack grow down to the bottom of your data
segment.  you use protect mode to limit the segment to 64k-N bytes, so
the stack can't just wrap around the data segment.  if the stack grows
down and wraps around to the top of the data segment, it will touch
the area that isn't allowed in the segment.  there are some issues
with the value of N, but i won't go into that.  like so:

	+----------+
	|   heap   |
	+----------+
	| data/bss |
	+----------+
	|  stack   |
	+----------+

this has the disadvantage of having to set the stack limit when you
link the load module, but it will keep one from crashing into the
heap.  i don't think this is worth the effort.

the second reason to turn on the segment protection is to make use of
more of the memory in the system.  you can have more than the six to
eight processes.
again, i don't think it's worth it, even though it's hard to `waste'
that other 512M.

anyway, again i'm sorry for the long message.  if anyone has a better
answer as to WHY intel designed the segments the way they did in
protect mode, if you know what language they were thinking of, please
let me know.

 bc 1011 1100
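
for anyone who wants to poke at the limit checks described above, here
is a minimal sketch in C.  it is a toy model, not real descriptor
handling: the names `seg', `seg_ok', `push_word' and
`pushes_before_fault' and the example limit values are made up for
illustration.  it assumes an expand-up segment allows offsets 0
through the limit, an expand-down segment allows offsets above the
limit up to 0xFFFF, and offsets wrap at 16 bits.

```c
#include <stdint.h>

/* Hypothetical toy model of a 286 data/stack segment limit check.
 * Only the direction and the 16-bit limit are modeled; base address,
 * access rights and privilege levels are ignored. */
enum seg_kind { EXPAND_UP, EXPAND_DOWN };

struct seg {
    enum seg_kind kind;
    uint16_t limit;
};

/* Return 1 if a byte at offset `off` is inside the segment, else 0.
 * Expand-up:   valid offsets are 0 .. limit.
 * Expand-down: valid offsets are limit+1 .. 0xFFFF. */
int seg_ok(const struct seg *s, uint16_t off)
{
    if (s->kind == EXPAND_UP)
        return off <= s->limit;
    return off > s->limit;
}

/* Model a word push: decrement sp by 2 with 16-bit wraparound, then
 * check both bytes of the word.  Returns 1 and updates *sp on success,
 * returns 0 (a "fault") and leaves *sp alone otherwise. */
int push_word(const struct seg *s, uint16_t *sp)
{
    uint16_t next = (uint16_t)(*sp - 2u);

    if (!seg_ok(s, next) || !seg_ok(s, (uint16_t)(next + 1u)))
        return 0;
    *sp = next;
    return 1;
}

/* Count how many word pushes succeed starting from `sp` before the
 * model faults.  Capped at one full trip around the 64k offset space
 * so a segment that covers everything cannot loop forever. */
int pushes_before_fault(const struct seg *s, uint16_t sp)
{
    int count = 0;

    while (count < 0x8000 && push_word(s, &sp))
        count++;
    return count;
}
```

with the stack-at-the-bottom layout and, say, N = 0x1000, the combined
segment is expand-up with limit 0xEFFF.  starting from sp = 4,
`pushes_before_fault' gives 2: the third push wraps sp to 0xFFFE,
which is past the limit.  the local variable problem shows up in the
same model: with a data segment {EXPAND_UP, 0x7FFF} and a stack
segment {EXPAND_DOWN, 0x7FFF}, offset 0xFFF0 passes `seg_ok' for the
stack segment and fails it for the data segment.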