The Unix Heritage Society mailing list
* [TUHS] Introduction
@ 2008-06-04 11:57 Jose R. Valverde
  2008-06-04 15:11 ` Oliver Lehmann
                   ` (3 more replies)
  0 siblings, 4 replies; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-04 11:57 UTC (permalink / raw)



Dear Oliver

	Astounding work!

What reference source are you using for the reconstruction process?
I bet you are having a look at the source code for Plexis sys3 in the 
TUHS archives, and comparing with the stock sysIII from SCO, right?

FWIW, judging from the SCO and Plexis source trees, the code layout was
arranged by CPU. I'd bet the WEGA authors had access to SysIII sources,
and if they went to the pains of getting them, they might as well
have got both -or perhaps Plexis, which should have been easier
to get- to use as their codebase. So a comparison of both source 
trees should yield useful insights from the differences between PDP11,
VAX and Z8000.

BTW, as I remember practice in the '80s, it was not uncommon to write 
source in C and then tweak the compiler-generated assembler by hand to gain
some extra efficiency or apply fixes. It is also possible that the authors
resorted to tricks (like casting an int parameter to char) to force the
compiler to generate the code they wanted. You should also watch out for
external or global symbols. It is also possible that the system was compiled
with a different (maybe earlier) version of the compiler than the one that
was later shipped. If you can't get stock code to render the same asm, then
I'd bet on the last explanation (different compiler versions).

Other than that, you are doing astounding work!

BTW, there are other Z8000 UNIXes floating around. Maybe one of them will
shed some extra light.

> So my goal now is to get the kernel sources right so I can make the
> necessary changes to get TCP/IP running in the kernel. As you might
> guess, this is not as easy as it sounds. The sources for some objects
> of the kernel survived over time, but many are missing. I have now
> been sitting here for a month disassembling the original kernel objects
> and writing the disassembled code back in C. I started this with, let's
> say, next-to-zero ASM knowledge and I'm making good progress. Not much
> is left, but from time to time the C files do not compile to
> exactly the same object that is in the kernel. Sometimes other
> temporary registers are used for operations, or I can't get to the same
> code no matter what I try, and so on. I'm trying to get 100%
> the same object to be 100% sure I have the same code the object was built
> with. The compiler on that system should be the same, but of course I
> can't guarantee that.

				

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural




* [TUHS] Introduction
  2008-06-04 11:57 [TUHS] Introduction Jose R. Valverde
@ 2008-06-04 15:11 ` Oliver Lehmann
  2008-06-04 15:16   ` Oliver Lehmann
  2008-06-05 15:07 ` Jose R. Valverde
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-04 15:11 UTC (permalink / raw)


Jose R. Valverde wrote:

> Dear Oliver
> 
> 	Astounding work!

Thanks :)

While I'm puzzling together the HTML document with my open questions
(not so many questions, but not so easy - at least for me - I guess) to
present them here (yeah, fear! ;) just some answers...

> 
> What reference source are you using for the reconstruction process?
> I bet you are having a look at the source code for Plexis sys3 in the 
> TUHS archives, and comparing with the stock sysIII from SCO, right?

I have the Plexis sources and used plain SYSIII sources - yes. Sometimes
I also used V7 sources, because during the recovery process it turned
out that the V7 source sometimes matched the WEGA implementation more
closely than the SYSIII source did.


> FWIW from the source trees from SCO and Plexis, the code layout was
> arranged by CPU. I'd bet the WEGA authors had access to SysIII sources,
> and if they'd gone through the pains to get it, then they might as
> well have got both -or perhaps Plexis, which should have been easier
> to get- to use as their codebase. So a comparison of both source 
> trees should yield useful insights from the differences between PDP11,
> VAX and Z8000.

The biggest problem here is the memory segmentation. Plexis has - as far
as I understand the code - no support for switching between segmented and
non-segmented operation. It only supports one memory segment, like the
other SYSIII implementations did (please correct me if I'm wrong - I'm far
from being a professional here).
ZEUS introduced a flag in the user structure specifying whether the user
runs a program in segmented or non-segmented mode, which depended on which
C compiler was used (or which ASM, PLZ/ASM, PLZ/SYS directive was set). In
segmented mode all 128 memory segments were used; in non-segmented mode
only one of those 128 segments (as far as I can remember it was always
segment 63) was used.

In Plexis I didn't see such logic. This creates problems when it comes to
memory access in the kernel, or to things like file execution, where the
special s.out header differs between segmented and non-segmented
programs.

The WEGA developers themselves (I had contact with their kernel developer)
just took ZEUS and wrote "some" machine-specific kernel parts anew. They
never had access to the ZEUS sources. Later, after WEGA was in place, they
got access to original V7 sources and modified/added stuff in WEGA. All
sources the WEGA guys used are available to me because I got access to the
development floppies (which also contain firmware sources and so on... ;))
It can be clearly determined which objects are still original ZEUS objects
and which were rewritten by the WEGA guys:

a) the libraries LIB1 and LIB2 store the file modification times
of the included objects. The original ZEUS objects are all dated '83
or '84; all WEGA implementations are dated later, beginning with '86 for
some ASM-based objects and '88 and '89 for C-based objects.
b) the original ZEUS libraries all contain an SCCS ident string
like
char sys4wstr[] = "@[$]sys4.c		Rev : 4.2 	09/26/83 22:15:02";
whereas the WEGA sources contain no such "what string".
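
A minimal sketch of how criterion b) could be checked mechanically, assuming
the marker really is the four bytes "@[$]" exactly as quoted above (this has
not been verified against the real objects). The function name
find_whatstring is hypothetical; the buffer is treated as raw object-file
bytes, so embedded NULs are allowed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Marker copied from the ident string quoted in the mail (an assumption). */
#define WHAT_MARKER "@[$]"

/*
 * Return a pointer to the first ident marker inside buf, or NULL if the
 * buffer contains none. memcmp is used instead of strstr because object
 * files contain NUL bytes.
 */
const char *find_whatstring(const char *buf, size_t len)
{
    size_t mlen = strlen(WHAT_MARKER);
    size_t i;

    assert(buf != NULL);
    if (len < mlen)
        return NULL;
    for (i = 0; i + mlen <= len; i++) {
        if (memcmp(buf + i, WHAT_MARKER, mlen) == 0)
            return buf + i;
    }
    return NULL;
}
```

An object for which this returns non-NULL would count as a likely ZEUS
original; one without a marker would count as a likely WEGA rewrite.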


> It is also possible that the system was compiled with a 
> different (may  be earlier) version of the compiler that was later shipped. 

This might be the case - but I hope not...

> BTW, there are other Z8000 UNIX floating around. Maybe one of them will
> shed some extra light.

I'm very interested in hearing more about this. I only knew ZEUS; even
Plexis came to my attention quite late.

  Greetings, Oliver

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-04 15:11 ` Oliver Lehmann
@ 2008-06-04 15:16   ` Oliver Lehmann
  0 siblings, 0 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-04 15:16 UTC (permalink / raw)


Oliver Lehmann wrote:

> I have the Plexis sources and used plain SYSIII sources - yes. Sometimes
> I also used V7 sources, because during the recovery process it turned
> out that the V7 source sometimes matched the WEGA implementation more
> closely than the SYSIII source did.

One thing I forgot to add - ZEUS also had 2 kernel objects for which no
SYSIII or V7 equivalent existed. One was called break.o, and one
was called lock.o. While lock.o implements functions for file locking
at read/write/read+write granularity, break.o implements functions whose
purpose I can't see ... ;)
I got break.o completely rewritten into break.c because the logic itself
was not so hard to read. 
http://cvs.laladev.org/index.html/WEGA/src/uts/sys/break.c?rev=1.2

lock.o, on the other hand, I have not rewritten into C because I don't
even understand the ASM listing, with its handling of the struct
locklist[] and so on... I'll skip that object for now.


-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-04 11:57 [TUHS] Introduction Jose R. Valverde
  2008-06-04 15:11 ` Oliver Lehmann
@ 2008-06-05 15:07 ` Jose R. Valverde
  2008-06-05 17:59   ` Oliver Lehmann
  2008-06-05 15:17 ` Jose R. Valverde
  2008-06-06  9:58 ` Jose R. Valverde
  3 siblings, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-05 15:07 UTC (permalink / raw)



> so i created the following C code out of it:
> 
> u.u_count = (-u.u_segmts[NUSEGS-1].sg_limit+0x100)<<8;
> ...
> 
> done by having 256² - 256*x. This was great. With that information I wrote
> in C:
> 
> u.u_count = (256-u.u_segmts[NUSEGS-1].sg_limit)<<8;


What happens if you use instead

	u.u_count = (~(-u.u_segmts[NUSEGS-1].sg_limit))<<8;

That should mean the same, would avoid using a hard coded value and the
compiler may optimize it to the same assembly.

I hope you understand that any advice will probably be faulty as we can
not check the code generated by our suggestions as you do. As long as you
don't mind that, it's OK.

				j
-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural




* [TUHS] Introduction
  2008-06-04 11:57 [TUHS] Introduction Jose R. Valverde
  2008-06-04 15:11 ` Oliver Lehmann
  2008-06-05 15:07 ` Jose R. Valverde
@ 2008-06-05 15:17 ` Jose R. Valverde
  2008-06-05 17:45   ` Oliver Lehmann
  2008-06-23 14:18   ` Jose R. Valverde
  2008-06-06  9:58 ` Jose R. Valverde
  3 siblings, 2 replies; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-05 15:17 UTC (permalink / raw)



break() might be used to allocate memory. There was a break() routine used
for low level memory allocation. The ancient code or even the MINIX code
may help you understand it. Look for break or brk.

lock().. are you sure it is for file locking? If so, it may have been
mimicked from XENIX file locking mechanisms. Otherwise it might implement
a low level lock to avoid CPU contention as the machine you describe needs
to coordinate work among more than one CPU.

				j

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural




* [TUHS] Introduction
  2008-06-05 15:17 ` Jose R. Valverde
@ 2008-06-05 17:45   ` Oliver Lehmann
  2008-06-23 14:18   ` Jose R. Valverde
  1 sibling, 0 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-05 17:45 UTC (permalink / raw)


Jose R. Valverde wrote:

> lock().. are you sure it is for file locking? If so, it may have been
> mimicked from XENIX file locking mechanisms. Otherwise it might implement
> a low level lock to avoid CPU contention as the machine you describe needs
> to coordinate work among more than one CPU.

I'm sure. I've the man-page for lkdata() and unlk()

       #include <sys/lockblk.h>

       long lkdata (fildes, flag, lkblk);
       int fildes, flag;
       struct lockblk *lkblk;

       long unlk (fildes, flag, lkblk);
       int fildes, flag;
       struct lockblk *lkblk;

In East Germany English was not taught (or only very rarely), so many
things - even in the world of computers - were kept in German, and so
were the man pages.

I can post the man-page link, but your German probably isn't that good ;)
	http://pofo.de/cgi-bin/man.cgi?query=lkdata


-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-05 15:07 ` Jose R. Valverde
@ 2008-06-05 17:59   ` Oliver Lehmann
  0 siblings, 0 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-05 17:59 UTC (permalink / raw)



Jose R. Valverde wrote:

> > so i created the following C code out of it:
> > 
> > u.u_count = (-u.u_segmts[NUSEGS-1].sg_limit+0x100)<<8;
> > ...
> > 
> > done by having 256² - 256*x. This was great. With that information I wrote
> > in C:
> > 
> > u.u_count = (256-u.u_segmts[NUSEGS-1].sg_limit)<<8;
> 
> 
> What happens if you use instead
> 
> 	u.u_count = (~(-u.u_segmts[NUSEGS-1].sg_limit))<<8;
> 
> That should mean the same, would avoid using a hard coded value and the
> compiler may optimize it to the same assembly.

This gets compiled+optimized to:

        ldb     rl2,_u+1060
        neg     r2
        com     r2
        ldb     rh2,rl2
        clrb    rl2
        ld      _u+48,r2

I think 256 is ok for me as it a) works, and b) I'm using a defined
variable (CPAS - clicks per address space) instead of the hardcoded 256
and instead of <<8 I'm using "ctob()" which is defined as 

/* clicks to bytes */
# define ctob(x)        ((x)<<8)

so I guess this is OK.
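
A small check of the variants discussed above, assuming 16-bit two's
complement words as on the Z8000 and an 8-bit sg_limit (it is loaded with
ldb). The function names are hypothetical; CPAS and ctob() are taken from the
mail. Note that in two's complement ~(-x) equals x - 1, so the neg/com form
does not produce the same value as 256 - x.

```c
#include <assert.h>
#include <stdint.h>

#define CPAS    256          /* clicks per address space, as named in the mail */
#define ctob(x) ((x) << 8)   /* clicks to bytes, as quoted above */

/* (CPAS - sg_limit) << 8, truncated to one 16-bit word like r2. */
uint16_t count_from_limit(uint8_t sg_limit)
{
    return (uint16_t)ctob(CPAS - sg_limit);
}

/* (~(-sg_limit)) << 8, i.e. the neg/com sequence, truncated to 16 bits. */
uint16_t count_neg_com(uint8_t sg_limit)
{
    uint16_t negated = (uint16_t)(-(int)sg_limit);   /* neg r2 */
    uint16_t v = (uint16_t)~negated;                 /* com r2 */

    /* Two's complement identity: ~(-x) == x - 1 (mod 2^16). */
    assert(v == (uint16_t)(sg_limit - 1));
    return (uint16_t)(v << 8);
}
```

For sg_limit = 0x3F the first form gives 0xC100 and the second 0x3E00, which
suggests the explicit 256 (or CPAS) is needed for the intended value.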



The other (unsolved) problem is considerably more complicated for me. I tried
several different things:

u.u_dirp.l = (long)((saddr_t *)uap->linkname)->l & 0x7f00ffffL;
u.u_dirp.l = ((long)uap->linkname&0x7F00FFFFL);
u.u_dirp.l = (long)((int)uap->linkname&0x7F00);

(the types are all defined in param.h, which I linked to on the webpage) I've
tried to figure out what happens there.

Any value which is in the long-word register will be ANDed with

    7F       00      FF       FF
01111111 00000000 11111111 11111111

This means that the low word of the long-word register (mask FFFF) is
taken unmodified. In the high word (mask 7F00) the top bit is
removed, and the low byte is removed as well. A colleague of
mine thought this could have something to do with memory addressing - maybe
to get an address from a memory segment. I didn't understand it that
well. But we didn't find a way it could be written "differently", and
the optimizer creates the ANDing - again with a temporary register.
I also tried to put the 0x7f00ffff in front of the variable just to be
sure this is not what's triggering the copy, but without success. Maybe
I'm too focused on the ANDing with 7f00 of the first 16-bit register word
of the 32-bit longword - who knows?
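
A minimal sketch of what the mask keeps, assuming the usual Z8000 long
segmented address layout (segment number in bits 30..24, offset in bits
15..0, everything else ignored). The names SEG_MASK, OFF_MASK and the three
helpers are hypothetical and only illustrate the bit arithmetic; 32512 in the
listing is 0x7F00 in decimal.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_MASK 0x7F000000UL   /* bits 30..24: segment number (assumed layout) */
#define OFF_MASK 0x0000FFFFUL   /* bits 15..0: offset inside the segment */

/* Clear every bit that is neither segment number nor offset. */
uint32_t clean_saddr(uint32_t a)
{
    assert((SEG_MASK | OFF_MASK) == 0x7F00FFFFUL);
    assert(32512 == 0x7F00);    /* the immediate used by "and r4,#32512" */
    return (uint32_t)(a & (SEG_MASK | OFF_MASK));
}

/* Segment number, 0..127. */
unsigned saddr_segment(uint32_t a)
{
    return (unsigned)((a & SEG_MASK) >> 24);
}

/* Offset within the segment, 0..65535. */
unsigned saddr_offset(uint32_t a)
{
    return (unsigned)(a & OFF_MASK);
}
```

With a = 0xBF12ABCD the cleaned value is 0x3F00ABCD, i.e. segment 63 and
offset 0xABCD; only the high word changes, which matches the single
"and r4,#32512" in the listing.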

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-04 11:57 [TUHS] Introduction Jose R. Valverde
                   ` (2 preceding siblings ...)
  2008-06-05 15:17 ` Jose R. Valverde
@ 2008-06-06  9:58 ` Jose R. Valverde
  3 siblings, 0 replies; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-06  9:58 UTC (permalink / raw)


> Hi,
> 
> Does having a license for Solaris 8 source allow you to also have
> System V source?

It probably depends on what your Solaris 8 source license says. I can't
remember offhand the terms of the Solaris Foundation Source Program which
provided it under non-disclosure terms, but it wouldn't surprise me if
they stated that you only got access to Solaris 8, not to ancestor SV
code.

Your license should tell (mine is at home now) but I suspect it will be
highly restricted. 

				j





* [TUHS] Introduction
  2008-06-05 15:17 ` Jose R. Valverde
  2008-06-05 17:45   ` Oliver Lehmann
@ 2008-06-23 14:18   ` Jose R. Valverde
  2008-06-23 16:11     ` Oliver Lehmann
  1 sibling, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-23 14:18 UTC (permalink / raw)



> The (right now) remaining question is here:
> 
> 	http://pofo.de/P8000/problems.php


My guess on this:

> I've some functions where the asm code looks as follows:
> 
> 0530 3582  0004     584         ldl     rr2,rr8(#4)
> 0534 9424                       ldl     rr4,rr2
> 0536 0704  7f00     585         and     r4,#32512
> 04d2 5d04  8000*    586         ldl     _u+78,rr4
> 04d6 004e*
> 
> This means, an unsigned long value stored in rr8 at position 4 gets loaded into rr2, then into rr4 and then ANDed with 7F00FFFF (r4 are the first 2 bytes of rr4). After the operation is done, the result gets loaded into the address the external reference _u is stored + 78 bytes. The C code I tried to produce out of this information is:

Maybe there is an additional cast being done. In prf.c you have a
similar AND:

                s=(char *)(*(long *)adx & 0x7F00FFFF);

As you can see there is a double indirection. My guess is that the
AND is done to clear some segmentation information, say to ensure the
data segment of the program, possibly as a
security measure against a user process providing a pointer crafted
to point to an invalid address. The raw -unsafe- code would have
looked like
		s = (char *) *adx;

So, the address pointed to by adx, which is a char * is first cast 
into long *, then ANDed to clear those bits, then assigned. That would
mean that char* would then be restricted in this system to fit within
that 0x7F00FFFF mask. If that is so, then the original code in sys2.c 
for link()

		u.u_dirp.l = (caddr_t) ((long) uap->linkname);

was recoded to ensure that the (void) int* it got from uap was cleaned
before actual use:

	u.u_dirp.l = (caddr_t) (*((long *)(uap->linkname &0x7F00FFFF)))

uap->linkname is a re-interpretation (as per the struct cast) of the
data stored in u.u_ap, but u.u_ap is an (int*), a generic pointer that
might point to anything (a char* as expected or anything else). Then,
this would explain why you see other register usage in other similar
situations like in rdwr() after assignment of uap->cbuf (another char*)

Could you try that or some such? It would be used then whenever a
char * is to be retrieved through a generic int (void) pointer.

					j
-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural

* [TUHS] Introduction
  2008-06-23 14:18   ` Jose R. Valverde
@ 2008-06-23 16:11     ` Oliver Lehmann
  2008-06-25  9:40       ` Jose R. Valverde
  2008-06-25 10:25       ` Jose R. Valverde
  0 siblings, 2 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-23 16:11 UTC (permalink / raw)


Hi Jose,

Jose R. Valverde wrote:

> 
> 	u.u_dirp.l = (caddr_t) (*((long *)(uap->linkname &0x7F00FFFF)))
> 

leads to:

"sys2.c":305: operands of "&" have incompatible types 
Error in file sys2.c: Error.  No assembly.


I've changed it to:
	u.u_dirp.l = (caddr_t) (*((long *)((long)uap->linkname &0x7F00FFFF)));

and this produces:

        ldl     rr2,rr8(#4)
        and     r2,#32512
        ldl     rr4,@rr2
        ldl     _u+78,rr4

not exactly the wanted code :(

Greetings, Oliver

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-23 16:11     ` Oliver Lehmann
@ 2008-06-25  9:40       ` Jose R. Valverde
  2008-06-25 10:25       ` Jose R. Valverde
  1 sibling, 0 replies; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-25  9:40 UTC (permalink / raw)



Oliver:

Right, it seems that I mistransliterated the code in a hurry or confusion. 

I notice that in prf.c *adx is a pointer to be assigned to s, whereas in
sys2.c uap->linkname is the pointer itself that is assigned to u.u_dirp.l
So my initial transliteration was wrong as I was assigning *uap->linkname
instead of uap->linkname.

Reviewing the assembler you submitted, I notice it looks almost like what you
wanted except for @rr2 instead of rr2, so the extra * I wrongly added in
front of the parenthesized & expression, together with the misplaced
parenthesis (which I also got wrong), may be the reason you did not get what
you wanted.
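
A minimal sketch of the distinction, using a 32-bit integer as a stand-in
for a segmented pointer so it runs on any host. The type saddr32, the struct
name link_args and both function names are hypothetical. The first function
follows the prf.c pattern (load through the pointer, then mask, which is
where the @rr2 comes from); the second masks the field value itself, which is
what link() seems to need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADDR_MASK 0x7F00FFFFUL

typedef uint32_t saddr32;   /* hypothetical stand-in for a 32-bit segmented pointer */

/* prf.c pattern: adx points at a cell holding the address; indirect, then mask. */
saddr32 mask_pointed_to(const saddr32 *adx)
{
    assert(adx != NULL);
    return (saddr32)(*adx & ADDR_MASK);
}

/* Argument block shaped like the one quoted for link(). */
struct link_args {
    saddr32 target;
    saddr32 linkname;
};

/* link() pattern: the field already is the address; mask it without indirecting. */
saddr32 mask_field(const struct link_args *uap)
{
    assert(uap != NULL);
    return (saddr32)(uap->linkname & ADDR_MASK);
}
```

The second form never reads the memory that linkname points to, so no extra
load through the masked value should appear in the generated code.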

If on prf.c you have in printf
	register unsigned int *adx;
	char *s;

	adx = &x1;
...
	s = (char *) * adx;
and was recoded on WEGA in printfv
	register unsigned *adx;
	register char *s;

	adx = x1;
...
	s = (char *)(*(long *)adx & 0x7F00FFFF);

then maybe the right code in sys2.c would be a change from
	caddr_t u.u_dirp;		/* char *, from param.h user.h */
        register struct a {
                char    *target;
                char    *linkname;
        } *uap;
...
	u.u_dirp.l = (caddr_t) uap->linkname;
to
	caddr_t u.u_dirp.l;		/* char *, from param.h user.h */ 
	register struct a {
                char    *target;
                char    *linkname;
        } *uap;
...
	u.u_dirp.l = (caddr_t) ((long *) uap->linkname & 0x7F00FFFF);

Note also the difference in parenthesis usage with what you said you had
tried on http://pofo.de/P8000/problems.php
	u.u_dirp.l = (caddr_t)(((long)uap->linkname) & 0x7F00FFFF);

I fear that I was too tired when I wrote my previous posting and made too
many mistakes.

Anyway, the first step should be to check what prf.c generates as assembler 
at these & lines when compiled. If it matches the sample code you mention you
have in other places then it means the same device was used to generate it
(which I would guess is the case) and then it should be a matter of thinking
clearly of what is being assigned. I do believe the surviving trace in prf.c
is the key to understanding the problem assembly code.

				j

On Mon, 23 Jun 2008 18:11:01 +0200 Oliver Lehmann <lehmann at ans-netz.de>
wrote:
> Hi Jose,
> 
> Jose R. Valverde wrote:
> 
> > 
> > 	u.u_dirp.l = (caddr_t) (*((long *)(uap->linkname &0x7F00FFFF)))
> > 
> 
> leads to:
> 
> "sys2.c":305: operands of "&" have incompatible types 
> Error in file sys2.c: Error.  No assembly.
> 
> 
> I've changed it to:
> 	u.u_dirp.l = (caddr_t) (*((long *)((long)uap->linkname &0x7F00FFFF)));
> 
> and this produces:
> 
>         ldl     rr2,rr8(#4)
>         and     r2,#32512
>         ldl     rr4,@rr2
>         ldl     _u+78,rr4
> 
> not exactly the wanted code :(
> 
> Greetings, Oliver
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural

* [TUHS] Introduction
  2008-06-23 16:11     ` Oliver Lehmann
  2008-06-25  9:40       ` Jose R. Valverde
@ 2008-06-25 10:25       ` Jose R. Valverde
  2008-06-26 14:52         ` Oliver Lehmann
  1 sibling, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-25 10:25 UTC (permalink / raw)



Oliver,

	BTW, I am thinking more clearly now and realize I initially confused 
the uap struct in lock() with u_uap, although what is actually assigned is
uap->linkname to u.u_dirp.

	When seeing the type definitions in param.h I also notice that it
defines caddr_t (the type of the u.u_dirp.l side of the saddr_t union) as

typedef char 		*caddr_t;	/* pointer to kernel things */

that leads me to consider that the &0x7F00FFFF may be an additional 
security check to ensure that the pointer falls within valid memory 
space, in which case it would match the memory map.

I notice that nsseg in mch.s may return %7F00 in some cases and is used
in machdep.c as stseg = nsseg(u_state->s_sp); so it seems the stack uses
segment 0x7F00. Then maybe the & is shorthand to make sure the address
pointed to by the ANDed pointer falls within the stack. It would probably
imply user programs have a maximum stack size of 65536 bytes as well.

That may explain why some pointers are ANDed and others not. I haven't
had a thorough look, but if the &0x7F00FFFF usage is consistent, then
that is an explanation that may guide source reconstruction.

Does this look sensible?

					j

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural

* [TUHS] Introduction
  2008-06-25 10:25       ` Jose R. Valverde
@ 2008-06-26 14:52         ` Oliver Lehmann
  2008-06-27 12:24           ` Jose R. Valverde
  0 siblings, 1 reply; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-26 14:52 UTC (permalink / raw)


Hi Jose,

first - thanks for taking the time helping me here on this issue.

		s=(char *)(*(long *)adx & 0x7F00FFFF);

in prf.c compiles to:

        ldl     rr2,|_stkseg+~L1+8|(fp)
        ldl     rr4,@rr2
        and     r4,#32512
        ldl     |_stkseg+~L1+12|(fp),rr4

There are some places in the WEGA kernel where this ANDing is done the
way I need it, but I have no source which compiles to the form I need.
I've searched the whole kernel for it.
I guess the WEGA developer who replaced some ZEUS objects with his
own implementation didn't find out the real syntax either, and so ended up
with basically the same syntax I have now in the sources for the original
ZEUS objects. So some sources (mostly those with German comments ;)
contain 7F00FFFF ANDings which are compatible because they are
not the original ZEUS objects where this copying is made. And for
ZEUS I have zero sources...

So the only thing in sys2.c's link() you wanted me to change in your
previous mail was:

	u.u_dirp.l = (caddr_t) ((long *) uap->linkname & 0x7F00FFFF);

right? Tried this, and got:

"sys2.c":305: operands of "&" have incompatible types 
Error in file sys2.c: Error.  No assembly.


> I notice that nsseg in mch.s may return %7F00 in some cases and is used
> in machdep.c as stseg = nsseg(u_state->s_sp); so it seems the stack uses
> segment 0x7F00. Then maybe the & is shorthand to make sure the address
> pointed to by the ANDed pointer falls within the stack. It would probably
> imply user programs have a maximum stack size of 65536 bytes as well.
> 
> That may explain why some pointers are ANDed and others not. I haven't
> had a thorough look, but if the &0x7F00FFFF usage is consistent, then
> that is an explanation that may guide source reconstruction.

A memory segment is 64 Kbytes in size. The hardware is a bit special here.
The CPU can access the memory in a segmented and a non-segmented mode. For
this purpose there are 3(!) MMUs. A special MMU control logic is
implemented which can handle 3 states:
1: segmented OS (CPU works in system mode)
	The segments Code, Data and Stack are managed by MMU1. MMU2 and
	MMU3 are not active
2: user process not segmented (CPU works in normal mode, segment number 63)
	The segments Code, Data and Stack are managed by MMU1. MMU2 and
	MMU3 are not active. This is done by a special break register
3: user process segmented (CPU works in normal mode)
	MMU2 and MMU3 are used to process the 128 possible memory segments
	which can be Code, Data or Stack segments. MMU2 manages the
	segments 0-63 and MMU3 manages the segments 64-127. The switching
	between both MMUs is hardware-controlled, depending on the
	segment line. Both MMUs are programmed for segment 0..63.
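
The three states above can be sketched as a small selection function. This
only models the description in this mail, not the real control logic: the
enum, the function names and the idea of a "local" segment number are all
hypothetical, and the 64..127 range for MMU3 is an assumption based on a
7-bit segment number.

```c
#include <assert.h>

/* The three states handled by the MMU control logic, as listed above. */
enum mem_mode {
    SYS_SEGMENTED,     /* 1: segmented OS, system mode */
    USER_NONSEG,       /* 2: non-segmented user process (segment 63) */
    USER_SEGMENTED     /* 3: segmented user process */
};

/* Return which MMU (1, 2 or 3) would handle segment seg in the given state. */
int mmu_for(enum mem_mode mode, unsigned seg)
{
    assert(seg < 128);             /* 7-bit segment number */
    if (mode != USER_SEGMENTED)
        return 1;                  /* states 1 and 2: MMU1 only */
    return seg < 64 ? 2 : 3;       /* state 3: split at segment 64 */
}

/*
 * In state 3 both MMUs are said to be programmed for segments 0..63, so
 * the number seen by the selected MMU is assumed to be the low six bits.
 */
unsigned mmu_local_seg(unsigned seg)
{
    assert(seg < 128);
    return seg & 0x3F;
}
```

Under these assumptions segment 64 of a segmented user process would be
handled by MMU3 as its local segment 0.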

A colleague of mine wrote about this:
>>>>>
I've looked at your problems site and think I can imagine why the AND 0x7f00ffff
is there. Remember, the Z8000 segmentation concept is flawed in that a segment
address can wrap around without warning. Now, at a higher level, UNIX uses a flat 
address space and somewhere, this logical address needs to be translated into 
physical addresses. This is done by the MMU - however, if within a pointer arithmetic
an overflow beyond the 64k boundary happens, it can spill over into the segment number
which is - you might remember - in bits [30:24]. So as soon as a pointer is created by the
compiler, it is ANDed with 0x7f00 for the upper 16 bits to extract the segment number and 
with 0xffff for the lower 16-bit address to obtain the real logical address PC.

It would be really interesting to look at the implementation of malloc for memory blocks
greater than 64K byte. My assumption is that the compiler inserts this AND on its own for
any pointer arithmetic.
<<<<<
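
A small worked example of the overflow described above, assuming the segment
number sits in bits 30..24 and the offset in bits 15..0, and that the
address is already clean before the addition. The function names are
hypothetical. Flat 32-bit addition carries out of the offset into bit 16;
masking with 0x7F00FFFF afterwards discards that carry, which gives the same
result as letting the 16-bit offset wrap inside its segment.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MASK 0x7F00FFFFUL

/* Flat arithmetic: a carry out of the 16-bit offset lands in bit 16 and up. */
uint32_t add_flat(uint32_t addr, uint16_t n)
{
    return addr + n;
}

/* Segmented arithmetic: only the offset changes and wraps at 64K. */
uint32_t add_wrapped(uint32_t addr, uint16_t n)
{
    uint32_t off;

    assert((addr & ~ADDR_MASK) == 0);   /* expects an already clean address */
    off = (addr + n) & 0xFFFFUL;
    return (uint32_t)((addr & 0x7F000000UL) | off);
}

/* Flat addition followed by the mask seen in the kernel code. */
uint32_t add_then_mask(uint32_t addr, uint16_t n)
{
    return (uint32_t)(add_flat(addr, n) & ADDR_MASK);
}
```

Starting from 0x3F00FFFF and adding 1, flat arithmetic yields 0x3F010000,
while both the wrapped and the masked versions yield 0x3F000000.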

Maybe this helps...

   Greetings, Oliver

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




* [TUHS] Introduction
  2008-06-26 14:52         ` Oliver Lehmann
@ 2008-06-27 12:24           ` Jose R. Valverde
  2008-06-29  8:25             ` Oliver Lehmann
  0 siblings, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-27 12:24 UTC (permalink / raw)



Dear Oliver,

	well, the fact that prf.c uses the anding explicitly means that
it was actually used as such directly in the code. But...

On Thu, 26 Jun 2008 16:52:46 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Hi Jose,
> 
> first - thanks for taking the time helping me here on this issue.
> 
> 		s=(char *)(*(long *)adx & 0x7F00FFFF);
> 
> in prf.c compiles to:
>
Let me test if I understand this:
 
>         ldl     rr2,|_stkseg+~L1+8|(fp)

	rr2 = adx

>         ldl     rr4,@rr2

	rr4 = *adx

>         and     r4,#32512	
>         ldl     |_stkseg+~L1+12|(fp),rr4
	s = rr4
> 


Which is the equivalent to the code you describe in your problems page
except for the @

It would also look like what it is doing is

	s = (char *) ( (*(long *)adx) & 0x7F00FFFF );

reflecting how the compiler has read the line (giving higher precedence
to the * than to the &). Am I mistaken? Hence the & is applied to the value
pointed to by adx, not to adx itself before indirecting it.

That means that adx contains a pointer to a pointer instead of a pointer
to unsigned int as declared, and would explain the need for the 
first cast (long *), so that the value pointed to by adx is read as a long,
not an unsigned int. I wonder why adx would not have been declared as
unsigned long * directly...

Then that long (which is actually a pointer) is ANDed to fall in the 
stack, and finally coerced to be interpreted as a char *.

>  
> So the only thing in sys2.c's link() you wanted me to change in your
> previous mail was:
> 
> 	u.u_dirp.l = (caddr_t) ((long *) uap->linkname & 0x7F00FFFF);
> 
> right? Tried this, and got:
> 
> "sys2.c":305: operands of "&" have incompatible types 
> Error in file sys2.c: Error.  No assembly.
> 
My take is that whatever the original source was, it must have been very
close to my suggestion. If we assume the former interpretation
then, of course, in prf.c it compiles (as it is ANDing a pointer
coerced to long with the constant) and here it doesn't (as it
would be ANDing a long *).

Why don't you try to split the assignment into various statements
to reproduce the assembly and the recombine them? Like, e.g.

1:	r2 = uap->linkname;		/* ldl rr2,rr8(#4) */
2:	r4 = (long) r2;			/* ldl rr4,rr2 */
3:	r4 &= 0x7F00FFFF;		/* and rr4,#32512 */
4:	u.u_dirp.l = (caddr_t) r4;	/* ldl _u+78, rr4 */

If you can get it by parts, then you can work your way back
recombining with parentheses. I suspect line (2) above will give the
lead.

Another possibility is that some other conversion was used. I notice similar code 
in rdwr(), but there it is of the kind
	ldl rrX, something
	ldl rr4,rrX(#some offset)
	and r4,#32512 (or some other value, like 61440) 

So, what if it was written in such a way that the compiler decided to assign 
the value of rr2 to an auxiliary register believing there was an offset, 
but the offset was zero?

	u.u_dirp.l => ((saddr_t) (uap->linkname)).l

Maybe they first cast uap->linkname into a segmented address (as it points to
user data), leading to

		((saddr_t) uap->linkname).l

to get the segmented stack pointer that was to be fixed by the AND, and then
cast it to long for the AND

		(long) ((saddr_t) uap->linkname).l
giving
	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
	// this might force use of an aux. variable for the 0 offset and then anding it

or maybe the simpler form implicitly forces the code

	u.u_dirp = (saddr_t) (((long) uap->linkname) & 0x7F00FFFF);

Another possibility is that it was coded by hand in assembler, working from
assembly listings generated by the compiler: during development, prf.c
was probably coded early on, and then maybe they hand-coded that part using
prf.c as a template (reproducing the verbose, now unneeded ldl rr4,rr2 line).

> 
> > I notice that nsseg in mch.s may return %7F00 on some cases and is used
> > in machdep.c as stseg = nsseg(u_state->s_sp); so it seems the stack uses
> > segment 0x7F00. Then may be the & is shorthand to make sure the address
> > pointed by the ANDed pointer falls within the stack. It would probably
> > imply user programs have a maximum stack size of 65536 bytes as well.
> > 
> > That may explain why some pointers are ANDed and others not. I haven't
> > had a thorough look, but if the &0x7F00FFFF usage is consistent, then
> > that's is an explanation that may guide source reconstruction.
> 
> A memory segment is 64Kbyte of size. The hardware is a bit special here.
> The CPU can access the memory in a segmented and a nonsegmented mode. For
> this purpose 3(!) MMUs are existing. A special MMU control logic is
> implemented which can handle 3 states:
> 1: segmented OS (CPU works in system mode)
> 	The segments Code, Data and Stack are managed by MMU1. MMU2 and
> 	MMU3 are not active
> 2: userprocess not segmented (CPU works in normal-mode, segmentnumber 63)
> 	The segments Code, Data and Stack are managed by MMU1. MMU2 and
> 	MMU3 are not active. This is done by a special break register
> 3: userprocess segmented (CPU works in normal-mode)
> 	MMU2 and MMU3 are used to process the 128 possible memory segments
> 	which can be Code-, Data- or Stack-Segments. MMU2 manages the
> 	segments 0-63 and MMU3 manages the segments 64-127. The switching
> 	between both MMUs works hardwarecontrolled in dependence of the
> 	segmentline. Both MMUs are programmed for segment 0..63.
> 
> A colleague of mine wrote about this:
> >>>>>
> I've looked at your problems site and think I can imagine why the AND 0x7f00ffff
> is there. Remember, the Z8000 segmentation concept is flawed in that a segment
> address can wrap around without warning. Now, at a higher level, UNIX uses a flat 
> address space and somewhere, this logical address needs to be translated into 
> physical addresses. This is done by the MMU - however, if an overflow beyond the 64k
> boundary happens during pointer arithmetic, it can spill over into the segment number
> which is - you might remember - in bits [30:24]. So as soon as a pointer is created by the
> compiler, it is ANDed with 0x7f00 for the upper 16 bits to extract the segment number and 
> with 0xffff for the lower 16 bits to obtain the real logical address PC.
> 
> It would be really interesting to look at the implementation of malloc for memory blocks
> greater than 64K byte. My assumption is that the compiler inserts this AND on its own for
> any pointer arithmetic.
> <<<<<
> 
> Maybe this helps...
> 
>    Greetings, Oliver
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/
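
The wrap-around concern in the quoted note can be sketched as follows. This is an illustration only, assuming the layout given in the note (segment number in bits 30:24, offset in bits 15:0, everything else unused); seg_addr_add, seg_number and seg_offset are made-up names. Plain 32-bit arithmetic lets a carry escape from the offset field, and the AND with 0x7F00FFFF throws that carry away so the offset wraps inside its 64K segment.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_ADDR_MASK 0x7F00FFFFu  /* segment in bits 30:24, offset in bits 15:0 */

/* Add a 16-bit displacement to a segmented address using plain 32-bit
 * arithmetic, then AND away whatever carried out of the offset field. */
uint32_t seg_addr_add(uint32_t addr, uint16_t delta)
{
    /* The input is assumed to be well formed already. */
    assert((addr & ~SEG_ADDR_MASK) == 0);
    return (addr + delta) & SEG_ADDR_MASK;
}

unsigned seg_number(uint32_t addr) { return (unsigned)((addr >> 24) & 0x7Fu); }
unsigned seg_offset(uint32_t addr) { return (unsigned)(addr & 0xFFFFu); }
```

With these definitions, adding 1 to segment 1, offset 0xFFFF yields segment 1, offset 0: the carry into bit 16 is discarded by the mask.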


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-06-27 12:24           ` Jose R. Valverde
@ 2008-06-29  8:25             ` Oliver Lehmann
  2008-06-30  9:30               ` Jose R. Valverde
  0 siblings, 1 reply; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-29  8:25 UTC (permalink / raw)


Jose R. Valverde wrote:

> Why don't you try to split the assignment into various statements
> to reproduce the assembly and then recombine them? Like, e.g.
> 
> 1:	r2 = uap->linkname;		/* ldl rr2,rr8(#4) */
> 2:	r4 = (long) r2;			/* ldl rr4,rr2 */
> 3:	r4 &= 0x7F00FFFF;		/* and rr4,#32512 */
> 4:	u.u_dirp.l = (caddr_t) r4;	/* ldl _u+78, rr4 */

hm.. this won't work because the compiler starts handing out registers to
the register-declared variables with the highest register possible, so it
would start with rr10 or so.

> 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);

I've changed it to:

	u.u_dirp.l = (caddr_t) ((long) (((saddr_t *) uap->linkname)->l) & 0x7F00FFFF);

otherwise it won't compile. It compiles to:

        ldl     rr2,rr8(#4)
        ldl     rr4,@rr2
        and     r4,#32512
        ldl     _u+78,rr4

is it because I added a * and changed . to ->?

> 	u.u_dirp = (saddr_t) (((long) uap->linkname) & 0x7F00FFFF);

this generates:
"sys2.c":305: operands of CAST have incompatible types 
"sys2.c":305: operands of "=" have incompatible types 

:(

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-06-29  8:25             ` Oliver Lehmann
@ 2008-06-30  9:30               ` Jose R. Valverde
  2008-06-30 17:34                 ` Oliver Lehmann
  0 siblings, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-06-30  9:30 UTC (permalink / raw)



On Sun, 29 Jun 2008 10:25:23 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Jose R. Valverde wrote:
> 
> > Why don't you try to split the assignment into various statements
> > to reproduce the assembly and then recombine them? Like, e.g.
> > 
> > 1:	r2 = uap->linkname;		/* ldl rr2,rr8(#4) */
> > 2:	r4 = (long) r2;			/* ldl rr4,rr2 */
> > 3:	r4 &= 0x7F00FFFF;		/* and rr4,#32512 */
> > 4:	u.u_dirp.l = (caddr_t) r4;	/* ldl _u+78, rr4 */
> 
> hm.. this won't work because the compiler starts handing out registers
> the register-declared variables with the highest register possible so
> would start with rr10 or so.

But you would still be able to see what generated the code (barring
register numbers).
> 
> > 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
> 
> I've changed it to:
> 
> 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t *) uap->linkname)->l) & 0x7F00FFFF);
> 
> otherwise it won't compile. It compiles to:
> 
>         ldl     rr2,rr8(#4)
>         ldl     rr4,@rr2
>         and     r4,#32512
>         ldl     _u+78,rr4
> 
> is it because I added a * and changed . to ->?

Yes, but it also does not reflect the correct usage. linkname is not an saddr_t * but
an saddr_t. And you want to assign the value of uap->linkname directly, not what it
points to.

typedef	union	
{
    caddr_t		l;
    struct
    {
	unsigned	left;
	unsigned	right;
    }			half;
}			saddr_t;	/* segmented address with parts */

> 
> > 	u.u_dirp = (saddr_t) (((long) uap->linkname) & 0x7F00FFFF);
> 
> this generates:
> "sys2.c":305: operands of CAST have incompatible types 
> "sys2.c":305: operands of "=" have incompatible types 
> 
> :(

My fault. That's a typical beginner's mistake I made there. I'm starting
to feel embarrassed by how many mistakes I've been making lately. BTW, I'm on a
deadline so most probably my mind is not 100% in place, so do not take me
too seriously, especially when dealing with complex abstract data types.

That is because an saddr_t is a union. You cannot assign directly to a
union (u.u_dirp), you must assign to a union member (u.u_dirp.l), but
the union member is not an saddr_t, it is a caddr_t: the correct text
would be 
 	u.u_dirp.l = (caddr_t) (((long) uap->linkname) & 0x7F00FFFF);

which you know does not work. That is why I suggested the extra cast to
see if the compiler would be misled into using an unneeded zero-offset
assignment instruction to an auxiliary register.

	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);

that should be tantamount to

	u.u_dirp.l = (caddr_t) ((long) ((caddr_t) uap->linkname) & 0x7F00FFFF);

where due to the long cast the initial caddr_t cast would be redundant
reducing to

	u.u_dirp.l = (caddr_t) ((long) uap->linkname & 0x7F00FFFF);

but introducing an saddr_t cast that might fool the compiler into turning a
temporary assignment with a zero offset (the .l) into ldl rr4,rr2

And I still think that dividing the assignment into intermediate
statements and looking at the assembly might shed some light on
what is going on.
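
For readers following the union argument, a minimal sketch in present-day C. It assumes a 32-bit caddr_t, modelled here by the made-up typedef caddr32_t, and the helper name masked_dirp is hypothetical. Standard C has no cast from a scalar to a union type (some compilers offer one as an extension), which would fit the CAST diagnostic, so the value goes in through the member l, which sits at offset zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t caddr32_t;     /* assumption: stands in for a 32-bit caddr_t */

typedef union {
    caddr32_t l;                /* the whole segmented address */
    struct {
        uint16_t left;
        uint16_t right;
    } half;                     /* the same 32 bits as two 16-bit words */
} saddr_t;

saddr_t masked_dirp(caddr32_t linkname)
{
    saddr_t dirp;

    /* .l is the first union member, so reaching it adds a zero offset. */
    assert(offsetof(saddr_t, l) == 0);

    /* dirp = (saddr_t) linkname;   not valid in standard C */
    dirp.l = linkname & 0x7F00FFFFu;
    return dirp;
}
```

Which of half.left and half.right ends up holding the segment word depends on the byte order of the machine the sketch is compiled on, so the example only touches l.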

> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-06-30  9:30               ` Jose R. Valverde
@ 2008-06-30 17:34                 ` Oliver Lehmann
  2008-07-01 14:21                   ` Jose R. Valverde
  0 siblings, 1 reply; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-30 17:34 UTC (permalink / raw)


Jose R. Valverde wrote:

> But you would still be able to see what did generate the code (barring
> register number).

my C code:

        register char *r2;
        register long r4;

	r2 = uap->linkname;
	r4 = (long) r2;
	r4 &= 0x7F00FFFF;
	u.u_dirp.l = (caddr_t) r4;

leads to:

        ldl     rr2,rr8(#4)			/* r2 = uap->linkname; */
        ldl     |_stkseg+~L1|(fp),rr2		/* r2 = uap->linkname; */
        ldl     |_stkseg+~L1+4|(fp),rr2		/* r4 = (long) r2;     */
        ldl     rr4,rr2				/* r4 &= 0x7F00FFFF;   */
        and     r4,#32512			/* r4 &= 0x7F00FFFF;   */
        ldl     |_stkseg+~L1+4|(fp),rr4		/* r4 &= 0x7F00FFFF;   */
        ldl     _u+78,rr4			/* u.u_dirp.l = (caddr_t) r4; */

That doesn't look so bad - just the stores into the stack variables (no
idea why they are not kept in registers even though I told the compiler to
make them register variables - but "register" is only a hint anyway)



> That is why I suggested the extra cast to
> see if the compiler would be misled into using an unneeded zero-offset
> assignment instruction to an auxiliary register.
> 
> 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
> [...]
> but introducing a saddr_t cast that might fool the compiler into a
> temporary assignment with a zero offset (the .l) into ldl rr4,rr2

but not with that code :/

u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
"sys2_.c":50: operands of CAST have incompatible types 
"sys2_.c":50: warning: struct/union or struct/union pointer required

That's why I changed it last time... to * and ->.


-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-06-30 17:34                 ` Oliver Lehmann
@ 2008-07-01 14:21                   ` Jose R. Valverde
  2008-07-01 18:35                     ` Oliver Lehmann
  0 siblings, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-07-01 14:21 UTC (permalink / raw)



On Mon, 30 Jun 2008 19:34:50 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Jose R. Valverde wrote:
> 
> > But you would still be able to see what did generate the code (barring
> > register number).
> 
> my C code:
> 
>         register char *r2;
>         register long r4;
> 
> 	r2 = uap->linkname;
> 	r4 = (long) r2;
> 	r4 &= 0x7F00FFFF;
> 	u.u_dirp.l = (caddr_t) r4;
> 
> leads to:
> 
>         ldl     rr2,rr8(#4)			/* r2 = uap->linkname; */
>         ldl     |_stkseg+~L1|(fp),rr2		/* r2 = uap->linkname; */
>         ldl     |_stkseg+~L1+4|(fp),rr2		/* r4 = (long) r2;     */
>         ldl     rr4,rr2				/* r4 &= 0x7F00FFFF;   */
>         and     r4,#32512			/* r4 &= 0x7F00FFFF;   */
>         ldl     |_stkseg+~L1+4|(fp),rr4		/* r4 &= 0x7F00FFFF;   */
>         ldl     _u+78,rr4			/* u.u_dirp.l = (caddr_t) r4; */
> 
> looks not sooo bad - just the assigning into the stacked variables (no
> idea why no register bound is used here even if I told the compiler to
> make them register bound - but ,,register'' isn't that strong anyway)
>

So it means that you can reproduce (barring the stack assignments) the
behavior that has been puzzling you by using an auxiliary variable.
That is, you get exactly

>         ldl     rr2,rr8(#4)			/* r2 = uap->linkname; */
>         ldl     rr4,rr2			/* r4 = (long) r2   */
>         and     r4,#32512			/* r4 &= 0x7F00FFFF;   */
>         ldl     _u+78,rr4			/* u.u_dirp.l = (caddr_t) r4; */

So the problem now is to figure out how the compiler came to use an
additional internal variable (maybe by playing with parentheses) or
to figure out whether the original coder could sensibly have used an
additional variable.
> 
> but not with that code :/
> 
> u.u_dirp.l = (caddr_t) ((long) (((saddr_t) uap->linkname).l) & 0x7F00FFFF);
> "sys2_.c":50: operands of CAST have incompatible types 
> "sys2_.c":50: warning: struct/union or struct/union pointer required
> 
> Thats why I changed it the last time... to * and ->.
> 
This is starting to look nasty. My bet now is that the compiler is getting
confused parsing the line. One thing: try some additional parentheses to 
disambiguate

	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) (uap->linkname).l) & 0x7F00FFFF);

and see if that works or the same error repeats. Then there are two
possibilities: either the original code did look that ugly, and the author,
faced with a difficult-to-parse expression, broke it up with an auxiliary variable,

	saddr_t aux;

	aux.l = (caddr_t) uap->linkname;
	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);

or, now that I come to think of it, seeing the split example I just gave,
maybe it was all declared _the right way_ from the start: and instead of

	register struct a {
		char	*target;
		char	*linkname;
	} *uap;
	...
	u.u_dirp.l = (caddr_t)(((long)uap->linkname) & 0x7F00FFFF);	/* FIXME: this is not 100% compatible */

they actually had the original code

	register struct a {
		char	*target;
		saddr_t	*linkname;
	} *uap;
	...
	u.u_dirp.l = (caddr_t)(((long)uap->linkname.l & 0x7F00FFFF);

which would be absolutely clean, coherent with the way u_dirp is declared, and
would introduce a zero-offset union reference in the expression, leading the compiler
to allocate an additional auxiliary variable to evaluate the expression.
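
A sketch of the declaration hypothesised here, assuming linkname is an saddr_t itself rather than a pointer to one, and that pointers are 32 bits wide (modelled with uint32_t; struct link_args and link_dirp are made-up names). Under those assumptions linkname sits 4 bytes into the argument block, which matches the rr8(#4) operand in the listings, and .l adds a zero offset on top of that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union {
    uint32_t l;                         /* stands in for a 32-bit caddr_t */
    struct { uint16_t left, right; } half;
} saddr_t;

struct link_args {                      /* hypothetical name for "struct a" */
    uint32_t target;                    /* stands in for a 32-bit char * */
    saddr_t  linkname;                  /* assumption: the union, not a pointer */
};

/* The member access .l needs no cast and no extra indirection. */
uint32_t link_dirp(const struct link_args *uap)
{
    assert(uap != NULL);
    return uap->linkname.l & 0x7F00FFFFu;
}
```

Whether the WEGA compiler would turn the zero-offset member access into the extra ldl rr4,rr2 is exactly the open question; the sketch only shows that the declaration is self-consistent.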


					j
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-07-01 14:21                   ` Jose R. Valverde
@ 2008-07-01 18:35                     ` Oliver Lehmann
  2008-07-03 10:12                       ` Jose R. Valverde
  0 siblings, 1 reply; 25+ messages in thread
From: Oliver Lehmann @ 2008-07-01 18:35 UTC (permalink / raw)


Jose R. Valverde wrote:

> 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) (uap->linkname).l) & 0x7F00FFFF);

I've added a missing ) behind .l:

 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) (uap->linkname).l)) & 0x7F00FFFF);

And I've got:

"sys2.c":305: warning: struct/union or struct/union pointer required
"sys2.c":305: operands of CAST have incompatible types

> 	saddr_t aux;
> 
> 	aux.l = (caddr_t) uap->linkname;
> 	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);

ldl	rr2,rr8(#4)
ldl	|_stkseg+~L1|(fp),rr2
and	r2,#32512
ldl	_u+78,rr2

> they actually had the original code
> 
> 	register struct a {
> 		char	*target;
> 		saddr_t	*linkname;
> 	} *uap;
> 	...
> 	u.u_dirp.l = (caddr_t)(((long)uap->linkname.l & 0x7F00FFFF);

Hm - my man page states that link() takes a char * as its 2nd parameter, but
I've tested it anyway:

"sys2.c":305: operands of "&" have incompatible types
"sys2.c":305: illegal combination of pointer and integer
"sys2.c":305: syntax error

I also tried 

u.u_dirp.l = (caddr_t)(((long)((uap->linkname).l) & 0x7F00FFFF);

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-07-01 18:35                     ` Oliver Lehmann
@ 2008-07-03 10:12                       ` Jose R. Valverde
  2008-07-06 16:14                         ` Oliver Lehmann
  0 siblings, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-07-03 10:12 UTC (permalink / raw)



On Tue, 1 Jul 2008 20:35:41 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Jose R. Valverde wrote:
> 
> > 	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) (uap->linkname).l) & 0x7F00FFFF);
> 
> I've added a missing ) behind .l:
> 
>  	u.u_dirp.l = (caddr_t) ((long) (((saddr_t) (uap->linkname).l)) &
> 0x7F00FFFF);
> 
> And I've got:
> 
> "sys2.c":305: warning: struct/union or struct/union pointer required
> "sys2.c":305: operands of CAST have incompatible types
> 
>
That looks like the compiler is ignoring the parentheses... So,
maybe what happened was that the original author also tried several 
of these combinations and failed as well, and maybe -as I mentioned-

On Tue, 1 Jul 2008 16:21:02 +0200
"Jose R. Valverde" <jrvalverde at cnb.csic.es> wrote:
> the author faced a difficult to parse
> expression and broke it with an auxiliary variable,
> 
> 	saddr_t aux;
> 
> 	aux.l = (caddr_t) uap->linkname;
> 	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);

or some such.

					j
-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-07-03 10:12                       ` Jose R. Valverde
@ 2008-07-06 16:14                         ` Oliver Lehmann
  2008-07-07  9:25                           ` [TUHS] SysIII/PDP-11 on SIMH (was Re: Introduction) Jose R. Valverde
  2008-07-07  9:32                           ` [TUHS] Introduction Jose R. Valverde
  0 siblings, 2 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-07-06 16:14 UTC (permalink / raw)


Hmmmm

but still:

Jose R. Valverde wrote:

> > 	saddr_t aux;
> > 
> > 	aux.l = (caddr_t) uap->linkname;
> > 	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);

ldl	rr2,rr8(#4)
ldl	|_stkseg+~L1+8|(fp),rr2
and	r2,#32512
ldl	_u+78,rr2

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] SysIII/PDP-11 on SIMH (was Re:  Introduction)
  2008-07-06 16:14                         ` Oliver Lehmann
@ 2008-07-07  9:25                           ` Jose R. Valverde
  2008-07-07  9:32                           ` [TUHS] Introduction Jose R. Valverde
  1 sibling, 0 replies; 25+ messages in thread
From: Jose R. Valverde @ 2008-07-07  9:25 UTC (permalink / raw)



This is puzzling me... so I decided to check whether I could somehow reproduce
the problem here, assuming your compiler derives from the one in System III.

So, I have installed System III on SIMH from the tapes in TUHS. Silly me, I
didn't realize that I was using the ones for PDP-11, and so they are probably
farther from WEGA than the tapes for VAX.

Still, while remaining within 16 bits, it seems like I can sort of reproduce
some similar behavior on the PDP-11 System III. If I find some spare time I
will try to install SysIII on the VAX simulator as well and see if that works
better.

For the curious, I started off with Hellwig Geisse's distro of V7 on PDP-11

	http://homepages.fh-giessen.de/~hg53/pdp11-unix/

and succeeded in installing the mini-root. Then I had to use an intermediate V7
to restore the tar archive of the full file system from tape. After a few
extra changes, I got System III apparently up and running (at least it reaches
single user and compiles itself).

I don't know if there is any additional interest in this, but should there be,
I can make the UNIX System III distribution with instructions available to
anybody interested.

This distro comes with instructions AND a working system image, hence it
is big, much bigger than Geisse's V7, but by today's standards it is a
paltry 35M.

				j

On Sun, 6 Jul 2008 18:14:19 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Hmmmm
> 
> but still:
> 
> Jose R. Valverde wrote:
> 
> > > 	saddr_t aux;
> > > 
> > > 	aux.l = (caddr_t) uap->linkname;
> > > 	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);
> 
> ldl	rr2,rr8(#4)
> ldl	|_stkseg+~L1+8|(fp),rr2
> and	r2,#32512
> ldl	_u+78,rr2
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-07-06 16:14                         ` Oliver Lehmann
  2008-07-07  9:25                           ` [TUHS] SysIII/PDP-11 on SIMH (was Re: Introduction) Jose R. Valverde
@ 2008-07-07  9:32                           ` Jose R. Valverde
  2008-07-07 14:45                             ` Oliver Lehmann
  1 sibling, 1 reply; 25+ messages in thread
From: Jose R. Valverde @ 2008-07-07  9:32 UTC (permalink / raw)



Then, all I can think of is the other approach I mentioned: that the 
original authors first wrote prf.c, had a look at the assembly generated
and then tweaked the assembly code generated from sys*.c by hand and
reproduced some similar patterns.

If the original authors used sys*.c as "templates" to then further fine
tune the derived assembly listings (and in doing so left behind these
puzzling traces) then that might explain why your sources contained the
asm listings instead of the C ones.

It might have made sense for them not to tune prf.c which is rarely used
but try to tune sys*.c which are heavily used to account for their "new"
segmented architecture.

Well, if I can find some time to install System III for VAX on SIMH, maybe
I will be able to reproduce this, but I seriously doubt it as they
are different architectures.

					j

On Sun, 6 Jul 2008 18:14:19 +0200
Oliver Lehmann <lehmann at ans-netz.de> wrote:
> Hmmmm
> 
> but still:
> 
> Jose R. Valverde wrote:
> 
> > > 	saddr_t aux;
> > > 
> > > 	aux.l = (caddr_t) uap->linkname;
> > > 	u.u_dirp.l = (caddr_t) ((long) aux.l & 0x7F00FFFF);
> 
> ldl	rr2,rr8(#4)
> ldl	|_stkseg+~L1+8|(fp),rr2
> and	r2,#32512
> ldl	_u+78,rr2
> 
> -- 
>  Oliver Lehmann
>   http://www.pofo.de/
>   http://wishlist.ans-netz.de/


-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
  2008-07-07  9:32                           ` [TUHS] Introduction Jose R. Valverde
@ 2008-07-07 14:45                             ` Oliver Lehmann
  0 siblings, 0 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-07-07 14:45 UTC (permalink / raw)


Jose R. Valverde wrote:

> Then, all I can think of is the other approach I mentioned: that the 
> original authors first wrote prf.c, had a look at the assembly generated
> and then tweaked the assembly code generated from sys*.c by hand and
> reproduced some similar patterns.

prf.c has a different author than the source files I have the problem with.
prf.c was "developed" by the guys at EAW who created WEGA. Developed means
here that they disassembled the ZEUS objects and created C files out of the
ASM listing like I do. Probably they didn't figure out the "ldl rr4,rr2"
part of the original ZEUS object (which was probably there) either, and they
decided that the way it is done now has the same effect.
So - you can say all sources which contain German text were created from
disassembled ZEUS objects, so they are not "original" ZEUS. All the
source files containing a "whatstring" from the version control system
are files created by me by disassembling the original ZEUS objects.

One interesting thing for example can be found in the WEGA-debug.c:

	callr	_gethex
	and	r2,#16128
	ldl	rr4,rr2
	and	r4,#32512
	ldl	|_stkseg+~L1+8|(fp),rr4

is the ASM code from the original object

What they tried at EAW when writing the debug.c file was:

#define UMASK	0x3f00ffffL
#define KMASK	0x7f00ffffL
[...]
kadr = (unsigned int *)((long)(gethex() & UMASK) & KMASK);

But this gets compiled into:

	callr	_gethex
	and	r2,#16128
	ldl	|_stkseg+~L1+8|(fp),rr2

which has the same effect: once the AND with 16128 (0x3F00) has happened, an
AND with 32512 (0x7F00) cannot change the register any more, because every
bit still set is also set in 0x7F00 - so the optimizer probably decided to
remove the AND with 32512 - which is not removed in the original object....
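
The redundancy can be checked in a few lines of C. This is only a sketch: UMASK and KMASK are the values quoted from debug.c (written here with an unsigned suffix), and the function names are made up. Every bit set in UMASK is also set in KMASK, so the second AND can never clear anything the first one left standing.

```c
#include <assert.h>
#include <stdint.h>

#define UMASK 0x3f00ffffUL
#define KMASK 0x7f00ffffUL

/* What the C source asks for: two ANDs in a row. */
uint32_t apply_both(uint32_t value)
{
    /* The outer AND is a no-op once the inner one has been applied. */
    assert(((value & UMASK) & KMASK) == (value & UMASK));
    return (uint32_t)((value & UMASK) & KMASK);
}

/* What the compiled code does: the UMASK AND only. */
uint32_t apply_umask(uint32_t value)
{
    return (uint32_t)(value & UMASK);
}

/* Nonzero when KMASK covers every bit of UMASK. */
int kmask_is_redundant(void)
{
    return (UMASK & KMASK) == UMASK;
}
```

So a compiler that folds the two constants together produces correct code, and the extra "and r4,#32512" in the original object has to come from somewhere else, for instance an intermediate variable or a different compiler version.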

> Well, if I can find some time to install SystemIII for VAX on SIMH, may
> be I will be able to reproduce this, but I seriously doubt it as they
> are different architectures.

A friend of mine is working on a P8000 emulator - the 8Bit part is more or
less done, the 16Bit part is "in progress" - no idea how long it will take
him to complete it since I have no circuit diagrams of the 16Bit part - only
the original (well documented) firmware sources.

When it is done and can be released officially, you can all enjoy the
feeling of working with a P8000! ;)

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/



^ permalink raw reply	[flat|nested] 25+ messages in thread

* [TUHS] Introduction
@ 2008-06-03  4:18 Oliver Lehmann
  0 siblings, 0 replies; 25+ messages in thread
From: Oliver Lehmann @ 2008-06-03  4:18 UTC (permalink / raw)


Hi everybody,

I don't know if it's usual or not to write an introduction but I'll just
do so, focusing mostly on the computer system I own.
If you don't care just skip this mail ;)

As my From header states, my name is Oliver, I live in Germany and right
now I'm 27 years old. That should be enough about me - now let me
tell you a bit more about the computer system I own ;)

EAW P8000

This system was built between 1987 and the breakdown of the former GDR -
the eastern part of Germany - in 1990. The system itself is split up into
two "towers" connected together. The first tower, called "P8000 Computer",
contains an 8Bit system (Z80) and a 16Bit system (Z8001). The 2nd case -
the "P8000 Winchester" - contains a Winchester Disc Controller which runs
with a Z80 CPU and is connected to the 16Bit part of the "P8000
Computer". Up to three MFM drives (all with the same geometry, while the
geometry itself can be configured) can be connected to the WDC.

The 8Bit part is built on a single board, has 64KB SRAM, 2 SIOs to
connect up to 4 terminals to it, one PIO to connect an EPROM programmer,
and one PIO to establish a connection to the 16Bit part. It has 2 5.25"
floppy drives with an external connector to connect two further 5.25" or
8" floppy drives. The system monitor is loaded from two 2732 EPROMs.
The system originally supported three operating systems, of which two
have survived. I own UDOS which is a Z80-RIO clone and OS/M
which is a CP/M clone. There also was an OS called IS/M which was an ISIS
clone.

The more interesting (at least for me) part is the 16Bit part. The 16Bit
part is built on a single board too (6 layers), while the DRAM sits on
separate boards which can be hooked up onto the mainboard.
The system runs a Z8001 with 3 MMUs and Z80 peripheral ICs (PIO, SIO...).
It also has 2 SIOs for 4 terminal connections, and one PIO to connect the
WDC. The system also has two further PIO chips to establish a connection
to the 8Bit system. The system runs with up to 4MB of DRAM but it might
run with more RAM with self-made RAM modules. There also exists an RTC for
the system and an extension to connect an 80286 CPU + 1MB RAM to the 16Bit
port to run an x86 OS on it while steering it from the OS running on the
16Bit system.
The operating system running on the 16Bit part is WEGA - a ZiLOG ZEUS
clone.
To boot WEGA at first the 8Bit system has to be booted up with UDOS (the
Z80-RIO clone) to load a communication software which handles the
communication over the 8Bit-PIO. After this is done the system switches
over to the 16Bit system and the system monitor there gets loaded. The
WEGA-Kernel (most parts are still original ZEUS objects) itself has the
corresponding part for the 8<->16Bit communication interface in it.
This was done to get access to the floppy drives, the EPROM programmer
and the 4 8Bit-terminal connections which are all connected to the Z80
on the 8Bit-system.
To access a floppy, for example, the WEGA kernel has to send the request
over the PIO connection to the 8Bit system, which handles it and sends
the results back to the WEGA kernel on the 16Bit system. The same goes
for the WDC, which is connected through another PIO directly to the 16Bit
system: command codes are sent to the Z80 on the WDC, which handles the
codes and sends the results back to the 16Bit system. Not that fast, but
it works well.
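
The request/reply pattern described above can be sketched in C roughly as
follows. This is only an illustration of the idea, not the WEGA or UDOS
code: the command codes, the block size, the struct layouts and all
function names are hypothetical, and the PIO link is replaced by a plain
function call into a fake in-memory "floppy".

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 128          /* hypothetical transfer unit */
#define NUM_BLOCKS 4            /* size of the fake floppy below */

enum { REQ_READ = 1, REQ_WRITE = 2 };   /* hypothetical command codes */
enum { ST_OK = 0, ST_ERROR = 1 };       /* hypothetical status codes */

struct request {
    unsigned char cmd;
    unsigned char block;
    unsigned char data[BLOCK_SIZE];
};

struct reply {
    unsigned char status;
    unsigned char data[BLOCK_SIZE];
};

/* Stand-in for the floppy that only the 8Bit side can reach. */
static unsigned char fake_floppy[NUM_BLOCKS][BLOCK_SIZE];

/* "8Bit side": service one request and fill in the reply. */
void serve_request(const struct request *rq, struct reply *rp)
{
    memset(rp, 0, sizeof *rp);
    rp->status = ST_ERROR;
    if (rq->block >= NUM_BLOCKS)
        return;
    if (rq->cmd == REQ_READ) {
        memcpy(rp->data, fake_floppy[rq->block], BLOCK_SIZE);
        rp->status = ST_OK;
    } else if (rq->cmd == REQ_WRITE) {
        memcpy(fake_floppy[rq->block], rq->data, BLOCK_SIZE);
        rp->status = ST_OK;
    }
}

/* "16Bit side": build a read request, hand it over, wait for the reply.
 * On real hardware the hand-over would be bytes moved across the PIO. */
int remote_read(unsigned char block, unsigned char *buf)
{
    struct request rq;
    struct reply rp;

    memset(&rq, 0, sizeof rq);
    rq.cmd = REQ_READ;
    rq.block = block;
    serve_request(&rq, &rp);
    if (rp.status != ST_OK)
        return -1;
    memcpy(buf, rp.data, BLOCK_SIZE);
    return 0;
}

/* "16Bit side": same round trip for a write. */
int remote_write(unsigned char block, const unsigned char *buf)
{
    struct request rq;
    struct reply rp;

    memset(&rq, 0, sizeof rq);
    rq.cmd = REQ_WRITE;
    rq.block = block;
    memcpy(rq.data, buf, BLOCK_SIZE);
    serve_request(&rq, &rp);
    return rp.status == ST_OK ? 0 : -1;
}
```

Every access costs one full round trip in this model, which is one
plausible reason why such a setup is "not that fast".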

Pictures and so on are all collected on my homepage
http://pofo.de/P8000/ although most (if not all) of the original
documents are written in German...

So - what do I do with the system? I use it to learn more about the
hardware itself and about assembler, and to get a deeper UNIX knowledge,
which is easier to start with there than with today's UNIX systems.

My last project was to get TCP/IP working, and I succeeded by using K5JB
to get FTP and ping to work via SLIP. Because the speed was damn slow
(and not just because of the baud rate), I came to the conclusion that
better performance could be achieved by implementing TCP/IP in kernel
space instead of having it run in user space.
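
For reference, the SLIP framing itself is small. Below is a minimal
sketch of an encoder following RFC 1055. It is not taken from K5JB; the
function name and buffer interface are made up for illustration, only the
four special byte values come from the RFC.

```c
#include <stddef.h>
#include <stdint.h>

/* Special bytes defined by RFC 1055. */
#define SLIP_END     0xC0   /* frame delimiter */
#define SLIP_ESC     0xDB   /* escape introducer */
#define SLIP_ESC_END 0xDC   /* ESC followed by this means a literal END */
#define SLIP_ESC_ESC 0xDD   /* ESC followed by this means a literal ESC */

/*
 * Encode one packet into a SLIP frame.
 * Returns the number of bytes written to out, or 0 if out is too small.
 */
size_t slip_encode(const uint8_t *in, size_t in_len,
                   uint8_t *out, size_t out_cap)
{
    size_t n = 0;
    size_t i;

    if (out_cap < 2)
        return 0;
    out[n++] = SLIP_END;            /* leading END flushes line noise */

    for (i = 0; i < in_len; i++) {
        uint8_t b = in[i];

        if (b == SLIP_END || b == SLIP_ESC) {
            /* need two bytes here plus one for the trailing END */
            if (n + 2 >= out_cap)
                return 0;
            out[n++] = SLIP_ESC;
            out[n++] = (b == SLIP_END) ? SLIP_ESC_END : SLIP_ESC_ESC;
        } else {
            /* need one byte here plus one for the trailing END */
            if (n + 1 >= out_cap)
                return 0;
            out[n++] = b;
        }
    }
    out[n++] = SLIP_END;            /* trailing END closes the frame */
    return n;
}
```

In the worst case a frame is twice the packet size plus two bytes, so the
framing alone cannot make things arbitrarily slow.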

So my goal now is to get the kernel sources right so I can make the
necessary changes to get TCP/IP running in the kernel. As you might
guess, this is not as easy as it sounds. The sources for some objects
of the kernel have survived, but many are missing. I have now been
sitting here for a month disassembling the original kernel objects and
writing the disassembled code back in C. I started this with, let's
say, next-to-zero ASM knowledge, and I'm making good progress. Not much
is left, but from time to time the C files do not compile to
exactly the same object that is in the kernel. Sometimes other
temporary registers are used for operations, or I can't arrive at the
same code no matter what I try, and so on. I'm trying to get 100%
the same object to be 100% sure I have the same code the object was built
from. The compiler on that system should be the same, but of course I
can't guarantee that.
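
The "100% the same object" check boils down to a byte-for-byte comparison
of the rebuilt object against the original one. A minimal sketch in C,
assuming both images are already loaded into memory (the function name is
made up, and it compares the whole buffer rather than only the code
segment, which a real check might prefer):

```c
#include <stddef.h>

/*
 * Return the offset of the first byte at which two object images
 * differ, or -1 if they are identical in both length and content.
 * If one image is a prefix of the other, the shorter length is returned.
 */
long first_mismatch(const unsigned char *a, size_t a_len,
                    const unsigned char *b, size_t b_len)
{
    size_t n = a_len < b_len ? a_len : b_len;
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return (long)i;
    }
    if (a_len != b_len)
        return (long)n;
    return -1;
}
```

A different temporary register shows up here as a mismatch at the
offset of the affected instruction, which gives a starting point for
looking at the disassembly.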

I'll put a web page together with my open C<->ASM questions, because I
think I can format things better there, so asking and reading will be
easier (mainly because it is a lot of text).

My progress can be seen here: http://pofo.de/P8000/kernel.php
And the sources I got so far are here: 
  http://cvs.laladev.org/index.html/WEGA/src/uts/

I hope you can help me a bit by answering the things I can't find an
answer to myself ;)

-- 
 Oliver Lehmann
  http://www.pofo.de/
  http://wishlist.ans-netz.de/




