The Unix Heritage Society mailing list
* [TUHS] Design of the AT&T assembly syntax
@ 2019-10-28 20:07 Robert Clausecker
  2019-10-28 22:08 ` Warner Losh
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Clausecker @ 2019-10-28 20:07 UTC
  To: tuhs

Some time ago, I wrote a piece [1] about the design of the AT&T
assembler syntax.  While I'm still not quite sure if everything in there
is correct, this explanation seemed plausible to me; the PDP-11
assembler being adapted for the 8086, then the 80386 and then ELF
targets, giving us today's convoluted syntax.

The one thing in this chain I have never found is an AT&T style
assembler for x86 before ELF was introduced.  Supposedly, it would get
away without % as a register prefix, thus being much less obnoxious to
use.  Any idea if such an assembler ever existed and if yes where?
I suppose Xenix might have shipped something like that.

The only AT&T syntax assemblers I know today are those from Solaris,
the GNU project, the LLVM project, and possibly whatever macOS ships.
Are there (or were there) any other x86 AT&T assemblers?  Who was
the first party to introduce this?

Yours,
Robert Clausecker

[1]: https://stackoverflow.com/a/42250270/417501

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments


* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 20:07 [TUHS] Design of the AT&T assembly syntax Robert Clausecker
@ 2019-10-28 22:08 ` Warner Losh
  2019-10-28 22:24   ` Robert Clausecker
  0 siblings, 1 reply; 10+ messages in thread
From: Warner Losh @ 2019-10-28 22:08 UTC
  To: Robert Clausecker; +Cc: The Eunuchs Hysterical Society

On Mon, Oct 28, 2019 at 2:16 PM Robert Clausecker <fuz@fuz.su> wrote:

> Some time ago, I wrote a piece [1] about the design of the AT&T
> assembler syntax.  While I'm still not quite sure if everything in there
> is correct, this explanation seemed plausible to me; the PDP-11
> assembler being adapted for the 8086, then the 80386 and then ELF
> targets, giving us today's convoluted syntax.
>
> The one thing in this chain I have never found is an AT&T style
> assembler for x86 before ELF was introduced.  Supposedly, it would get
> away without % as a register prefix, thus being much less obnoxious to
> use.  Any idea if such an assembler ever existed and if yes where?
> I suppose Xenix might have shipped something like that.
>
> The only AT&T syntax assemblers I know today are those from Solaris,
> the GNU project, the LLVM project, and possibly whatever macOS ships.
> Are there (or were there) any other x86 AT&T assemblers?  Who was
> the first party to introduce this?
>

VENIX 2.0 had this. It was a Pure AT&T syntax w/o % signs:

eg
|
| VENIX/86 start off (bootstrap starts execution at location 0 `start').
|
| Relocate complete kernel down to low memory.
        .text
start:  cli
        mov     dx,#LOWMEM      | base of relocated kernel
        mov     cx,cs
        cmp     cx,dx           | are we there (put there by bootstrap) ?
        beq     L0002           | Yes.
        mov     ds,cx

which is clearly op dst, src.

VENIX's compiler was from the MIT compiler collection which was a port of
the portable C compiler to x86 that everybody used (it seems, I don't have
a reference for that, just speculation).  You can find a version of
this code in the TUHS archive in Applications/Portable_CC which has the
8086.zip.

There's follow on work from a university in Queens in 286.zip that adds
near/far stuff (the original one didn't, and the VENIX code assumes none of
the segment registers change in userland code for its context switching
code). I've not looked at this code.

All this code is dyed in the wool K&R code from a V7-level C compiler, so
it won't compile on newer systems. And it's a right-royal pain in the
backside to convert on the fly because it wasn't written to be portable to
ANSI compilers and modern C compilers no longer have a K&R mode...

Thanks again to Al Kossow for this being in the archive. It's possible to
find this on FTP sites if you look hard enough. I found them in the past,
but I can't find it now that I went looking, so I'm quite happy that it's
in the archive. VENIX 2.1 released a newer version of the compiler than was
in VENIX 2.0. I don't know if those pre-date or post-date this stuff.

Sadly, the modern PCC project no longer works with 16-bit code, but I
suppose that's par for the course these days.

Warner

> Yours,
> Robert Clausecker
>
> [1]: https://stackoverflow.com/a/42250270/417501
>



* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 22:08 ` Warner Losh
@ 2019-10-28 22:24   ` Robert Clausecker
  2019-10-28 22:29     ` Warner Losh
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Clausecker @ 2019-10-28 22:24 UTC
  To: tuhs

Hi Warner,

On Mon, Oct 28, 2019 at 04:08:53PM -0600, Warner Losh wrote:
> VENIX 2.0 had this. It was a Pure AT&T syntax w/o % signs:
> 
> eg
> |
> | VENIX/86 start off (bootstrap starts execution at location 0 `start').
> |
> | Relocate complete kernel down to low memory.
>         .text
> start:  cli
>         mov     dx,#LOWMEM      | base of relocated kernel
>         mov     cx,cs
>         cmp     cx,dx           | are we there (put there by bootstrap) ?
>         beq     L0002           | Yes.
>         mov     ds,cx
> 
> which is clearly op dst, src.

op dst, src is Intel syntax.  AT&T syntax has op src, dst like MACRO-11.
There are a number of other differences: (a) | instead of / or # as the comment
character, (b) different mnemonics (beq instead of je), and (c) # instead of $
as the immediate-operand prefix.

Without seeing some more code, I'd say it's not AT&T syntax.

> VENIX's compiler was from the MIT compiler collection which was a port of
> the portable C compiler to x86 that everybody used (it seems, I don't have
> a reference for that, just speculation).  You can find a version of
> this code in the TUHS archive in Applications/Portable_CC which has the
> 8086.zip.
>
> There's follow on work from a university in Queens in 286.zip that adds
> near/far stuff (the original one didn't, and the VENIX code assumes none of
> the segment registers change in userland code for its context switching
> code). I've not looked at this code.

Will have a look!

> All this code is dyed in the wool K&R code from a V7-level C compiler, so
> it won't compile on newer systems. And it's a right-royal pain in the
> backside to convert on the fly because it wasn't written to be portable to
> ANSI compilers and modern C compilers no longer have a K&R mode...
> 
> Thanks again to Al Kossow for this being in the archive. It's possible to
> find this on FTP sites if you look hard enough. I found them in the past,
> but I can't find it now that I went looking, so I'm quite happy that it's
> in the archive. VENIX 2.1 released a newer version of the compiler than was
> in VENIX 2.0. I don't know if those pre-date or post-date this stuff.

Thank you for trying to dig up the source.

> Sadly, the modern PCC project no longer works with 16-bit code, but I
> suppose that's par for the course these days.

OpenWatcom still works, but it's not too compatible.

> Warner
> 
> Yours,
> > Robert Clausecker
> >
> > [1]: https://stackoverflow.com/a/42250270/417501

Yours,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments


* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 22:24   ` Robert Clausecker
@ 2019-10-28 22:29     ` Warner Losh
  0 siblings, 0 replies; 10+ messages in thread
From: Warner Losh @ 2019-10-28 22:29 UTC
  To: Robert Clausecker; +Cc: The Eunuchs Hysterical Society

On Mon, Oct 28, 2019 at 4:24 PM Robert Clausecker <fuz@fuz.su> wrote:

> Hi Warner,
>
> On Mon, Oct 28, 2019 at 04:08:53PM -0600, Warner Losh wrote:
> > VENIX 2.0 had this. It was a Pure AT&T syntax w/o % signs:
> >
> > eg
> > |
> > | VENIX/86 start off (bootstrap starts execution at location 0 `start').
> > |
> > | Relocate complete kernel down to low memory.
> >         .text
> > start:  cli
> >         mov     dx,#LOWMEM      | base of relocated kernel
> >         mov     cx,cs
> >         cmp     cx,dx           | are we there (put there by bootstrap) ?
> >         beq     L0002           | Yes.
> >         mov     ds,cx
> >
> > which is clearly op dst, src.
>
> op dst, src is Intel syntax.  AT&T syntax has op src, dst like MACRO-11.
> There are a number of other differences: (a) | instead of / or # as the
> comment character, (b) different mnemonics (beq instead of je), and
> (c) # instead of $ as the immediate-operand prefix.
>
> Without seeing some more code, I'd say it's not AT&T syntax.
>

Doh! I've been mixing the two up since the 90s :(. Yea, this stuff isn't
AT&T syntax...  It's from a compiler from MIT... I should have taken the
hint that it used MIT sequence :)

Warner



* [TUHS] Design of the AT&T assembly syntax
@ 2019-10-29 14:38 Alexander Voropay
  0 siblings, 0 replies; 10+ messages in thread
From: Alexander Voropay @ 2019-10-29 14:38 UTC
  To: The Eunuchs Hysterical Society

Robert Clausecker <fuz@fuz.su> wrote:

> > I've tried Microport SystemV /386 (SysV R3.2). It uses COFF
> Nice find!  It seems to use lcall to selector 7 for system calls.  A
> similar choice was made in 386BSD all the way through FreeBSD 2.2.8
> where it was replaced with int $0x80 as in Linux.

Technically speaking,
lcall $0x07,$0
uses descriptor index 0 with RPL=3 (bits 0 and 1 set) in the LDT (bit 2 set).
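
As a rough illustration (not part of the original message), the fields of a
16-bit x86 segment selector can be pulled apart as below.  The helper name
decode_selector is made up for this sketch; the bit layout is the standard
one: bits 0-1 are the RPL, bit 2 picks GDT or LDT, and bits 3-15 are the
descriptor index.

```python
# Hypothetical helper for illustration: split a 16-bit x86 segment
# selector into its three fields.
def decode_selector(selector: int) -> dict:
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return {
        "rpl": selector & 0b11,                         # bits 0-1: requested privilege level
        "table": "LDT" if selector & 0b100 else "GDT",  # bit 2: table indicator
        "index": selector >> 3,                         # bits 3-15: descriptor index
    }

# The selector used by lcall $0x07,$0
print(decode_selector(0x07))  # {'rpl': 3, 'table': 'LDT', 'index': 0}
```

So 0x07 names the first LDT entry, requested at ring 3, which is presumably
where these kernels installed their system call gate.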

It seems to be the oldest way to call the kernel from userspace on the x86
architecture.  AT&T's programmers used this syscall convention for SysVR3 and
SysVR4 on i386 (not sure about SysVR2 on i286).
There are very few examples of lcall-type syscalls, e.g.
http://www.sco.com/developers/devspecs/abi386-4.pdf
(figure 3-26)
(and the leaked SysVR4 i386 sources)

William Jolitz used this convention in his amazing articles about
porting BSD4.3 to the i386 (c)1991
http://www.informatica.co.cr/unix-source-code/research/1991/0101.html
(p."System Call Interface"). See also 386BSD 0.0:
https://github.com/386bsd/386bsd/blob/0.0/arch/i386/i386/locore.s#L361
(Did he run AT&T userspace on his kernel ???)
As you mentioned, most early *BSD systems on i386 also used lcall.

Linus chose to use a "DOS-style" call with INT 0x80.
More recent BSDs on i386 also use INT.
https://john-millikin.com/unix-syscalls
http://asm.sourceforge.net/intro/hello.html

Solaris on x86 (ex. SysVR4) also uses lcall. See
https://www.cs.dartmouth.edu/sergey/cs258/solaris-on-x86.pdf
p.4.2.3
and the Solaris (later OpenSolaris and later Illumos) source code.


* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 21:06 ` Seth Morabito
@ 2019-10-29  0:31   ` Nemo Nusquam
  0 siblings, 0 replies; 10+ messages in thread
From: Nemo Nusquam @ 2019-10-29  0:31 UTC
  To: tuhs

On 10/28/19 17:06, Seth Morabito wrote:

>>
>> Thanks for pointing this out, I think it's a really interesting read.
>>
>> [...] because I actually like and prefer AT&T syntax.
+1


>   I think most of the other developers I know find my attitude to be abhorrent :)
>
> -Seth



* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 21:48 Alexander Voropay
@ 2019-10-28 21:59 ` Robert Clausecker
  0 siblings, 0 replies; 10+ messages in thread
From: Robert Clausecker @ 2019-10-28 21:59 UTC
  To: Alexander Voropay; +Cc: tuhs

Hi Alexander,

On Tue, Oct 29, 2019 at 12:48:29AM +0300, Alexander Voropay wrote:
> Robert Clausecker <fuz@fuz.su> wrote:
> 
> > The one thing in this chain I have never found is an AT&T style
> > assembler for x86 before ELF was introduced.
> 
> There were a lot of ports of the AT&T codebase to the x86 architecture besides
> Xenix: Microport, INTERACTIVE, Everex, Wyse, etc., using AT&T x86 syntax.
> 
> I've tried Microport SystemV /386 (SysV R3.2). It uses COFF
> as format for executables:
> See:
> http://www.vcfed.org/forum/showthread.php?67736-History-behind-the-disk-images-of-AT-amp-T-UNIX-System-V-Release-4-Version-2-1-for-386&p=560039#post560039
> (Rather interesting kernel ABI/Call convention)

Nice find!  It seems to use lcall to selector 7 for system calls.  A
similar choice was made in 386BSD all the way through FreeBSD 2.2.8
where it was replaced with int $0x80 as in Linux.

> and
> https://gunkies.org/wiki/Unix_SYSVr3
> 
> There were also System V R2 ports to the i286, e.g.:
> https://gunkies.org/wiki/Microport_System_V
> with a.out binary format.

I'll have a look at that.

Thank you for the help!

Yours,
Robert Clausecker

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments


* [TUHS] Design of the AT&T assembly syntax
@ 2019-10-28 21:48 Alexander Voropay
  2019-10-28 21:59 ` Robert Clausecker
  0 siblings, 1 reply; 10+ messages in thread
From: Alexander Voropay @ 2019-10-28 21:48 UTC
  To: The Eunuchs Hysterical Society

Robert Clausecker <fuz@fuz.su> wrote:

> The one thing in this chain I have never found is an AT&T style
> assembler for x86 before ELF was introduced.

There were a lot of ports of the AT&T codebase to the x86 architecture besides
Xenix: Microport, INTERACTIVE, Everex, Wyse, etc., using AT&T x86 syntax.

I've tried Microport SystemV /386 (SysV R3.2). It uses COFF
as format for executables:
See:
http://www.vcfed.org/forum/showthread.php?67736-History-behind-the-disk-images-of-AT-amp-T-UNIX-System-V-Release-4-Version-2-1-for-386&p=560039#post560039
(Rather interesting kernel ABI/Call convention)

and
https://gunkies.org/wiki/Unix_SYSVr3

There were also System V R2 ports to the i286, e.g.:
https://gunkies.org/wiki/Microport_System_V
with a.out binary format.


* Re: [TUHS] Design of the AT&T assembly syntax
  2019-10-28 20:14 Robert Clausecker
@ 2019-10-28 21:06 ` Seth Morabito
  2019-10-29  0:31   ` Nemo Nusquam
  0 siblings, 1 reply; 10+ messages in thread
From: Seth Morabito @ 2019-10-28 21:06 UTC
  To: tuhs

On Mon, Oct 28, 2019, at 1:14 PM, Robert Clausecker wrote:
> Some time ago, I wrote a piece [1] about the design of the AT&T
> assembler syntax.  


Thanks for pointing this out, I think it's a really interesting read.

I'm a bit of an oddball among my friends, because I actually like and prefer AT&T syntax. I think most of the other developers I know find my attitude to be abhorrent :)

-Seth
-- 
  Seth Morabito
  Poulsbo, WA
  web@loomcom.com


* [TUHS] Design of the AT&T assembly syntax
@ 2019-10-28 20:14 Robert Clausecker
  2019-10-28 21:06 ` Seth Morabito
  0 siblings, 1 reply; 10+ messages in thread
From: Robert Clausecker @ 2019-10-28 20:14 UTC
  To: tuhs

Some time ago, I wrote a piece [1] about the design of the AT&T
assembler syntax.  While I'm still not quite sure if everything in there
is correct, this explanation seemed plausible to me; the PDP-11
assembler being adapted for the 8086, then the 80386 and then ELF
targets, giving us today's convoluted syntax.

The one thing in this chain I have never found is an AT&T style
assembler for x86 before ELF was introduced.  Supposedly, it would get
away without % as a register prefix, thus being much less obnoxious to
use.  Any idea if such an assembler ever existed and if yes where?
I suppose Xenix might have shipped something like that.

The only AT&T syntax assemblers I know today are those from Solaris,
the GNU project, the LLVM project, and possibly whatever macOS ships.
Are there (or were there) any other x86 AT&T assemblers?  Who was
the first party to introduce this?

Yours,
Robert Clausecker

[1]: https://stackoverflow.com/a/42250270/417501

-- 
()  ascii ribbon campaign - for an 8-bit clean world 
/\  - against html email  - against proprietary attachments

