The Unix Heritage Society mailing list
* [TUHS] Query on PDP-11 assembly
@ 2008-04-30 11:56 Warren Toomey
  2008-04-30 13:55 ` Brantley Coile
  2008-04-30 15:08 ` Carl Lowenstein
  0 siblings, 2 replies; 10+ messages in thread
From: Warren Toomey @ 2008-04-30 11:56 UTC


All, I'm trying to write a PDP-11 disassembler for a.out files. I'm having
trouble dealing with jsrs. Take, for example, the code here:
http://minnie.tuhs.org/UnixTree/1972_stuff/s1/frag19.html

I can happily deal with the   jsr pc,do   type of jsr, but the ones
involving r5 have me stumped, e.g.:

	jsr	r5,questf; < nonexistent\n\0>; .even

It appears that data is being inserted into the executable directly
after the jsr instruction. How does the rts which returns from the jsr
know how much data to skip, and what is the involvement of r5 here?

Thanks,
	Warren




* [TUHS] Query on PDP-11 assembly
  2008-04-30 11:56 [TUHS] Query on PDP-11 assembly Warren Toomey
@ 2008-04-30 13:55 ` Brantley Coile
  2008-04-30 14:41   ` Naoki Hamada
  2008-04-30 16:53   ` Milo Velimirovic
  2008-04-30 15:08 ` Carl Lowenstein
  1 sibling, 2 replies; 10+ messages in thread
From: Brantley Coile @ 2008-04-30 13:55 UTC


In your example, -(sp) = r5; r5 = pc; pc = questf. Here pc is the address
just past the jsr instruction, i.e. the start of the inline data.
Questf will have to bump r5 as it consumes the parameters.
Rts r5 means pc = r5; r5 = (sp)+.
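
As a rough illustration of those register transfers, here is a small Python
model. It is only a sketch: the addresses, the regs dictionary and the helper
names are made up for the example and do not come from any real emulator. It
assumes the callee consumes an inline NUL-terminated string, as in the questf
call, and then word-aligns r5 to match the caller's .even.

```python
# Toy model of "jsr r5,target" / "rts r5" with an inline string argument.
# All names and addresses here are hypothetical.

def jsr(regs, mem, link, target):
    """jsr link,target: -(sp) = link; link = pc; pc = target."""
    regs["sp"] -= 2
    mem[regs["sp"]] = regs[link] & 0xFF        # little-endian word push
    mem[regs["sp"] + 1] = (regs[link] >> 8) & 0xFF
    regs[link] = regs["pc"]                    # pc already points past the jsr
    regs["pc"] = target

def rts(regs, mem, link):
    """rts link: pc = link; link = (sp)+."""
    regs["pc"] = regs[link]
    regs[link] = mem[regs["sp"]] | (mem[regs["sp"] + 1] << 8)
    regs["sp"] += 2

def consume_string(regs, mem, link):
    """What the callee has to do: walk the inline bytes up to the NUL,
    step over it, then round the link register up to an even address."""
    start = regs[link]
    while mem[regs[link]] != 0:
        regs[link] += 1
    text = bytes(mem[start:regs[link]]).decode("ascii")
    regs[link] += 1                            # step over the NUL itself
    regs[link] += regs[link] & 1               # word-align, like .even
    return text

def call_with_inline_string(text, data_addr=0o1004, target=0o2004):
    """Place text + NUL where the inline data would sit, then call and return."""
    mem = bytearray(0o10000)
    payload = text.encode("ascii") + b"\0"
    mem[data_addr:data_addr + len(payload)] = payload
    regs = {"pc": data_addr, "sp": 0o7000, "r5": 0o123}
    jsr(regs, mem, "r5", target)
    seen = consume_string(regs, mem, "r5")
    rts(regs, mem, "r5")
    return seen, regs

seen, regs = call_with_inline_string(" nonexistent\n")
print(repr(seen), oct(regs["pc"]), oct(regs["r5"]))
```

After the return, pc sits just past the padded string and r5 holds its old
value again, so the rts itself never needs to know how long the data was.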

Hope this helps.





* [TUHS] Query on PDP-11 assembly
  2008-04-30 13:55 ` Brantley Coile
@ 2008-04-30 14:41   ` Naoki Hamada
  2008-05-01 23:47     ` Warren Toomey
  2008-04-30 16:53   ` Milo Velimirovic
  1 sibling, 1 reply; 10+ messages in thread
From: Naoki Hamada @ 2008-04-30 14:41 UTC


Hi,

This technique was a very common one in assembler sources for PDP-11
versions of UNIX. For example, m40.s of Version 6 UNIX shows a line
"jsr     r0,call1; _trap" in its trap routine. It feels funny to be
telling this to Warren, the author of apout!

Anyway, I guess this could be one source of the C language's
null-terminated representation of character strings.

Naoki Hamada
nao at tom-yam.or.jp




* [TUHS] Query on PDP-11 assembly
  2008-04-30 11:56 [TUHS] Query on PDP-11 assembly Warren Toomey
  2008-04-30 13:55 ` Brantley Coile
@ 2008-04-30 15:08 ` Carl Lowenstein
  1 sibling, 0 replies; 10+ messages in thread
From: Carl Lowenstein @ 2008-04-30 15:08 UTC


On Wed, Apr 30, 2008 at 4:56 AM, Warren Toomey <wkt at tuhs.org> wrote:
> All, I'm trying to write a PDP-11 disassembler for a.out files. I'm having
>  trouble dealing with jsrs. Take, for example, the code here:
>  http://minnie.tuhs.org/UnixTree/1972_stuff/s1/frag19.html
>
>  I can happily deal with the   jsr pc,do   type of jsr, but the ones
>  involving r5 have me stumped, e.g.:
>
>         jsr     r5,questf; < nonexistent\n\0>; .even
>
>  It appears that data is being inserted into the executable directly
>  after the jsr instruction. How does the rts which returns from the jsr
>  know how much data to skip, and what is the involvement of r5 here?

Standard subroutine calling sequence.

The called routine must know how many parameters it is called with.
It retrieves them by MOV (R5)+, <somewhere>.
This advances R5 so that eventually it points to the return address,
and the return is done as RTS R5.

A more advanced calling sequence could insert the number of parameters
as the first value after the JSR, and the called routine would then
retrieve that number and use it to tell when it had fetched the right
amount of data.
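
A sketch of that counted variant, modelled in Python rather than assembly.
The memory layout (a count word followed by that many argument words) and
the function names are assumptions made for the example; fetch_word stands
in for a MOV (R5)+ style fetch.

```python
# Toy model of a callee for "jsr r5,fn; count; arg1; ... argn".
# Layout and names are hypothetical.

def store_words(mem, addr, words):
    """Write 16-bit little-endian words; return the address after the last."""
    for w in words:
        mem[addr] = w & 0xFF
        mem[addr + 1] = (w >> 8) & 0xFF
        addr += 2
    return addr

def fetch_word(mem, regs, reg):
    """Model of the (reg)+ addressing mode for a word operand."""
    addr = regs[reg]
    regs[reg] += 2
    return mem[addr] | (mem[addr + 1] << 8)

def counted_callee(mem, regs):
    """Fetch the count, then exactly that many arguments, through r5."""
    count = fetch_word(mem, regs, "r5")
    args = [fetch_word(mem, regs, "r5") for _ in range(count)]
    return args                      # r5 is now the return address for rts r5

mem = bytearray(0o4000)
end = store_words(mem, 0o1004, [3, 0o10, 0o20, 0o30])
regs = {"r5": 0o1004}
args = counted_callee(mem, regs)
print(args, oct(regs["r5"]), oct(end))
```

Once the loop is done, r5 equals the address of the first word after the
argument list, which is where execution should resume.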

    carl
-- 
    carl lowenstein         marine physical lab     u.c. san diego
                                                 clowenstein at ucsd.edu




* [TUHS] Query on PDP-11 assembly
  2008-04-30 13:55 ` Brantley Coile
  2008-04-30 14:41   ` Naoki Hamada
@ 2008-04-30 16:53   ` Milo Velimirovic
  2008-04-30 17:00     ` Larry McVoy
  1 sibling, 1 reply; 10+ messages in thread
From: Milo Velimirovic @ 2008-04-30 16:53 UTC



A subprogram using this calling convention would look something like  
this:

questf:	mov (r5)+,r0
	/ play with string pointed to by r0.
	rts r5

On Apr 30, 2008, at 8:55 AM, Brantley Coile wrote:

> In your example, -(sp) = r5; r5 = pc; pc = questf.
> Questf will have to bump r5 as it consumes the parameters.
> Rts r5 means pc = r5; r5 = (sp)+.
>
> Hope this helps.
>
> Warren Toomey wrote:
>> All, I'm trying to write a PDP-11 disassembler for a.out files. I'm  
>> having
>> trouble dealing with jsrs. Take, for example, the code here:
>> http://minnie.tuhs.org/UnixTree/1972_stuff/s1/frag19.html
>>
>> I can happily deal with the   jsr pc,do   type of jsr, but the ones
>> involving r5 have me stumped, e.g.:
>>
>> 	jsr	r5,questf; < nonexistent\n\0>; .even
>>
>> It appears that data is being inserted into the executable directly
>> after the jsr instruction. How does the rts which returns from the  
>> jsr
>> know how much data to skip, and what is the involvement of r5 here?

The rts doesn't know anything about how much data to skip. In this  
snippet r5 is a linkage register that's doing double duty: it's both  
an argument pointer to the location immediately following the jsr and  
once r5 has been adjusted to point to the location after the argument  
list it becomes a return address. This programming technique is  
dependent on the subprogram and its callers agreeing on the number of  
arguments though it's possible to do a vararg style as well. It's  
necessary for the subprogram to pick up the arguments and adjust the  
linkage register accordingly AND to return to the caller with the same  
register named in the rts instruction as was used in the calling jsr.

/ variable argument list with linkage register
	jsr r5,some_fn; $argc; arg1; arg2; ... argn

/ in the function it's necessary to pick up all the args

some_fn: mov	(r5)+,r0	/ pick up argc
	beq	2f		/ if no arguments, process simplest case.
1:
/ pick up an argument:
	mov	(r5)+, somewhere
/ or
	mov	*(r5)+, somewhere
	...
	dec	r0
	bne	1b	/ process remaining arguments.
2:	...
	rts r5


I've seen other ways of processing arguments passed with this method  
that involved using indexing and adding the argument count to the  
linkage register. Tastes and mileage may vary.
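
For comparison, here is a hedged Python sketch of that indexed style. The
layout (count word first, then the arguments) and all names are assumptions
for the example: the callee reads arguments at fixed offsets from r5 without
changing it, then adjusts r5 once by the size of the whole list.

```python
# Toy model of the "index off r5, then add the count" style.
# Layout and names are hypothetical.

def read_word(mem, addr):
    """Read a 16-bit little-endian word without touching any register."""
    return mem[addr] | (mem[addr + 1] << 8)

def indexed_callee(mem, regs):
    base = regs["r5"]
    argc = read_word(mem, base)                  # count word at 0(r5)
    # Argument i lives at offset 2*i from r5; r5 itself is left alone here.
    args = [read_word(mem, base + 2 * i) for i in range(1, argc + 1)]
    # One adjustment at the end: skip the count word plus argc argument words.
    regs["r5"] = base + 2 * (argc + 1)
    return args

mem = bytearray(64)
for offset, word in enumerate([2, 0o111, 0o222]):
    mem[8 + 2 * offset] = word & 0xFF
    mem[9 + 2 * offset] = (word >> 8) & 0xFF
regs = {"r5": 8}
args = indexed_callee(mem, regs)
print(args, regs["r5"])
```

Either style leaves r5 at the same place; the difference is only whether the
adjustment happens one word at a time or in a single step.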

Regards,
Milo

--
Milo Velimirović,  Unix Computer Network Administrator
608.785.6618 Office -  608.386.2817 Cell
University of Wisconsin - La Crosse
La Crosse, Wisconsin 54601 USA   43 48 48 N 91 13 53 W







* [TUHS] Query on PDP-11 assembly
  2008-04-30 16:53   ` Milo Velimirovic
@ 2008-04-30 17:00     ` Larry McVoy
  2008-04-30 17:47       ` John Cowan
  0 siblings, 1 reply; 10+ messages in thread
From: Larry McVoy @ 2008-04-30 17:00 UTC


On Wed, Apr 30, 2008 at 11:53:52AM -0500, Milo Velimirovic wrote:
> A subprogram using this calling convention would look something like  
> this:
> 
> questf:	mov (r5)+,r0
> 	/ play with string pointed to by r0.
> 	rts r5

It warms my heart to see pdp-11 assembly again.  What a pleasant instruction
set.  m68k was close but already starting to get weird and x86 is the pits.
-- 
---
Larry McVoy                lm at bitmover.com           http://www.bitkeeper.com




* [TUHS] Query on PDP-11 assembly
  2008-04-30 17:00     ` Larry McVoy
@ 2008-04-30 17:47       ` John Cowan
  2008-04-30 17:59         ` Larry McVoy
  0 siblings, 1 reply; 10+ messages in thread
From: John Cowan @ 2008-04-30 17:47 UTC


Larry McVoy scripsit:

> It warms my heart to see pdp-11 assembly again.  What a pleasant instruction
> set.

Indeed.  PDP-8 and PDP-11 assembly are the only assembly languages I've
ever used, and I liked both of them a lot.

-- 
John Cowan   <cowan at ccil.org>   http://www.ccil.org/~cowan
One time I called in to the central system and started working on a big
thick 'sed' and 'awk' heavy duty data bashing script.  One of the geologists
came by, looked over my shoulder and said 'Oh, that happens to me too.
Try hanging up and phoning in again.'  --Beverly Erlebacher




* [TUHS] Query on PDP-11 assembly
  2008-04-30 17:47       ` John Cowan
@ 2008-04-30 17:59         ` Larry McVoy
  0 siblings, 0 replies; 10+ messages in thread
From: Larry McVoy @ 2008-04-30 17:59 UTC


On Wed, Apr 30, 2008 at 01:47:39PM -0400, John Cowan wrote:
> Larry McVoy scripsit:
> 
> > It warms my heart to see pdp-11 assembly again.  What a pleasant instruction
> > set.
> 
> Indeed.  PDP-8 and PDP-11 assembly are the only assembly languages I've
> ever used, and I liked both of them a lot.

I had a TA that could read PDP-11 octal the way other people read C.
I used to go over to his apartment with a listing and a 6-pack and
let him debug (lazy bastard that I was :)  Ken Witte, wonder where 
he is now.  Well there he is: http://www.linkedin.com/pub/4/B08/637
-- 
---
Larry McVoy                lm at bitmover.com           http://www.bitkeeper.com




* [TUHS] Query on PDP-11 assembly
  2008-04-30 14:41   ` Naoki Hamada
@ 2008-05-01 23:47     ` Warren Toomey
  0 siblings, 0 replies; 10+ messages in thread
From: Warren Toomey @ 2008-05-01 23:47 UTC


On Wed, Apr 30, 2008 at 11:41:17PM +0900, Naoki Hamada wrote:
> This technique was a very common one in assembler sources for PDP-11
> versions of UNIX. For example, m40.s of Version 6 UNIX shows a line
> "jsr     r0,call1; _trap" in its trap routine. I feel very funny to
> tell this to Warren, the author of apout!

Well, I borrowed the PDP-11 instruction simulation code from Eric Edwards,
and spent most of my time on emulating the syscalls, so there are still
things I don't know!

I've got the disassembler to a point where it works OK on the instructions,
but I haven't worked on the data side yet. I have put it aside for a bit, due
to increasing frustration levels and other things to do. If someone wants
a copy, e-mail me.

Thanks for all your responses, they were very useful.

Cheers,
	Warren




* [TUHS] Query on PDP-11 assembly
@ 2008-04-30 16:20 James A. Markevitch
  0 siblings, 0 replies; 10+ messages in thread
From: James A. Markevitch @ 2008-04-30 16:20 UTC


> I can happily deal with the   jsr pc,do   type of jsr, but the ones
> involving r5 have me stumped, e.g.:
> 
>         jsr     r5,questf; < nonexistent\n\0>; .even

I have encountered this type of construct a lot when doing disassemblers
over the years.  My usual strategy for dealing with this is:

1. If it's quick and dirty and I am not running huge amounts of code,
then the disassembler allows the user to provide a list of "hints" to
it.  The hints for this would describe the arguments to each subroutine.
For illustrative purposes, you might have a side file that contains
the following:

	subr 002004 questf string

meaning that location 002004 is a subroutine named questf that expects
a null-terminated string as the argument.  As an additional benefit,
you get a nice name for the subroutine that the disassembler can put
into the output.

And if a subroutine takes two 16-bit arguments, you might have:

	subr 003436 mysub arg16 arg16

If the disassembler identifies each of the targets of the jsr
instructions, then you can usually do a quick look at the code to
see what it expects, then add to the side file, then re-run the
disassembler.

2. If you want to be less quick and dirty, you can have the disassembler
do a partial flow analysis of the code to figure out what is expected
for arguments.  This is usually much more involved and you still often
need to add hints for cases where the '60s or '70s programmer did some
kind of "neat trick" when coding.

My philosophy on these is to use tools to get to the 95%+ level of
automation and provide hints to pick up the rest.  Using strategy
number 1 above will probably get you a lot of success with a small
amount of coding in your disassembler.
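
To make strategy 1 concrete, here is a minimal Python sketch of reading such
a side file and using it to decode the data that follows a jsr. It assumes
the illustrative "subr <octal address> <name> <kinds...>" line format shown
above, with only the two kinds "string" and "arg16"; the function names are
made up for the example.

```python
# Sketch of a hints side file for a disassembler. The file format and the
# kinds "string" / "arg16" follow the illustration above; names are hypothetical.

def parse_hints(text):
    """Map subroutine address -> (name, [argument kinds])."""
    hints = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "subr":
            continue                           # ignore blank or unrelated lines
        hints[int(fields[1], 8)] = (fields[2], fields[3:])
    return hints

def decode_inline(mem, addr, kinds):
    """Decode the inline arguments starting at addr.
    Returns (items, address where instruction decoding should resume)."""
    items = []
    for kind in kinds:
        if kind == "arg16":
            items.append(mem[addr] | (mem[addr + 1] << 8))
            addr += 2
        elif kind == "string":
            end = mem.index(0, addr)           # position of the terminating NUL
            items.append(bytes(mem[addr:end]).decode("ascii"))
            addr = end + 1
            addr += addr & 1                   # skip the pad byte, if any
        else:
            raise ValueError(f"unknown hint kind: {kind}")
    return items, addr

hints = parse_hints("subr 002004 questf string\nsubr 003436 mysub arg16 arg16\n")
mem = bytearray(64)
mem[4:18] = b" nonexistent\n\0"
name, kinds = hints[0o2004]
items, resume = decode_inline(mem, 4, kinds)
print(name, items, resume)
```

When the disassembler sees a jsr whose target is in the table, it can emit
the decoded items as data and carry on decoding instructions at the returned
address.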

James Markevitch




end of thread, other threads:[~2008-05-01 23:47 UTC | newest]

Thread overview: 10+ messages
2008-04-30 11:56 [TUHS] Query on PDP-11 assembly Warren Toomey
2008-04-30 13:55 ` Brantley Coile
2008-04-30 14:41   ` Naoki Hamada
2008-05-01 23:47     ` Warren Toomey
2008-04-30 16:53   ` Milo Velimirovic
2008-04-30 17:00     ` Larry McVoy
2008-04-30 17:47       ` John Cowan
2008-04-30 17:59         ` Larry McVoy
2008-04-30 15:08 ` Carl Lowenstein
2008-04-30 16:20 James A. Markevitch
