The Unix Heritage Society mailing list
* [Unix-jun72] Disassembler in progress
@ 2008-04-30 10:21 Warren Toomey
  2008-04-30 13:01 ` [Unix-jun72] ocr'd e03 Brad Parker
  0 siblings, 1 reply; 11+ messages in thread
From: Warren Toomey @ 2008-04-30 10:21 UTC


Guys, I'm writing a PDP-11 a.out disassembler. I think it will be useful for
a couple of reasons:

 - we will be able to convert the extant 1972 binaries back into some form
   of source code. It won't be as good as the real thing, but it will be
   better than the binary.
 - we have some source code in fragmentary form on the s1 tape, see
   http://minnie.tuhs.org/UnixTree/1972_stuff/. Some of the fragments
   are identifiable, some are not. We might be able to use the
   disassembled binaries to identify some of the fragments, and even
   reconstruct a hybrid original/disassembled version of the source
   for some of the 1972 applications.

Right now, here's what I've got: disassembly of the top of 1972 ls:

    sys        break: 00
    mov        $01,044260
    mov        sp,r5
    mov        (r5)+,043732
    tst        (r5)+
    dec        043732
    mov        043732,043734
    bgt        040056
    mov        $042542,r5
    mov        (r5)+,r4
    cmpb        (r4)+,$055
    bne        040174
    dec        043734

and the top of the frag19 file:

        sys     break; end+512.
        mov     $1,obuf
        mov     sp,r5
        mov     (r5)+,count
        tst     (r5)+
        dec     count
        mov     count,ocount
        bgt     loop
        mov     $dotp,r5
loop:
        mov     (r5)+,r4
        cmpb    (r4)+,$'-
        bne     1f
        dec     ocount
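
Ignoring operands, the two listings above use the same instruction sequence, which is what makes matching a fragment to a binary plausible. A minimal sketch of that comparison follows; the helper names `mnemonics` and `similarity` are made up for illustration, and the two samples are the tails of the listings above.

```python
import difflib

# Tail of the disassembled ls binary (absolute addresses, no labels).
LS_TAIL = """
    bgt        040056
    mov        $042542,r5
    mov        (r5)+,r4
    cmpb        (r4)+,$055
    bne        040174
    dec        043734
"""

# Tail of the frag19 source fragment (symbolic names and a label).
FRAG19_TAIL = """
        bgt     loop
        mov     $dotp,r5
loop:
        mov     (r5)+,r4
        cmpb    (r4)+,$'-
        bne     1f
        dec     ocount
"""

def mnemonics(listing):
    """Return the opcode mnemonics of a listing, ignoring labels and operands."""
    ops = []
    for line in listing.splitlines():
        tokens = line.split()
        # Drop a leading "label:" token if there is one.
        if tokens and tokens[0].endswith(":"):
            tokens = tokens[1:]
        if tokens:
            ops.append(tokens[0])
    return ops

def similarity(listing_a, listing_b):
    """Score two listings between 0.0 and 1.0 by their mnemonic sequences."""
    matcher = difflib.SequenceMatcher(None, mnemonics(listing_a), mnemonics(listing_b))
    return matcher.ratio()

if __name__ == "__main__":
    print(similarity(LS_TAIL, FRAG19_TAIL))
```

Comparing mnemonics only is a deliberately crude choice: it survives the loss of symbol names, but short fragments built from common instructions will match many binaries, so a score like this is a hint rather than proof.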

At the moment it's a 1-pass disassembler. I want to make it 2-pass: on the
first pass I will try to identify labels for branches, functions, strings and
variable locations (and give them arbitrary names); on the second pass
I'll print out the instructions with reference to the labels.
None of the binaries have symbol tables, unfortunately.
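
The two-pass idea can be sketched for branch targets alone. This is only an illustration, not the disassembler described above: it assumes every word is a separate instruction (a real decoder has to skip operand words), decodes only the PDP-11 branch instructions, and invents label names `L1`, `L2`, ... in address order.

```python
# PDP-11 branch opcodes live in the high byte of the word; the low byte is
# a signed word offset relative to the address of the next instruction.
BRANCHES = {
    0o000400: "br",  0o001000: "bne", 0o001400: "beq",  0o002000: "bge",
    0o002400: "blt", 0o003000: "bgt", 0o003400: "ble",  0o100000: "bpl",
    0o100400: "bmi", 0o101000: "bhi", 0o101400: "blos", 0o102000: "bvc",
    0o102400: "bvs", 0o103000: "bcc", 0o103400: "bcs",
}

def decode_branch(addr, word):
    """Return (mnemonic, target address) if word is a branch, else None."""
    name = BRANCHES.get(word & 0o177400)
    if name is None:
        return None
    offset = word & 0o377
    if offset >= 0o200:          # sign-extend the 8-bit offset
        offset -= 0o400
    return name, (addr + 2 + 2 * offset) & 0o177777

def disassemble(words, base):
    """Two-pass listing of 16-bit words loaded at address base."""
    # Pass 1: collect every branch target and give it an arbitrary name.
    targets = set()
    for i, word in enumerate(words):
        hit = decode_branch(base + 2 * i, word)
        if hit:
            targets.add(hit[1])
    labels = {t: "L%d" % n for n, t in enumerate(sorted(targets), 1)}

    # Pass 2: print each word, using the labels instead of raw addresses.
    out = []
    for i, word in enumerate(words):
        addr = base + 2 * i
        if addr in labels:
            out.append(labels[addr] + ":")
        hit = decode_branch(addr, word)
        if hit:
            out.append("\t%s\t%s" % (hit[0], labels[hit[1]]))
        else:
            out.append("\t.word\t%06o" % word)   # not decoded in this sketch
    return out

if __name__ == "__main__":
    sample = [0o005000, 0o003002, 0o005200, 0o000240, 0o001373]
    print("\n".join(disassemble(sample, 0o040000)))
```

The same structure extends to function, string and variable labels: pass 1 only has to record more kinds of address, and pass 2 stays a lookup.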

It's a start, anyway.
	Warren




* [Unix-jun72] ocr'd e03
  2008-04-30 10:21 [Unix-jun72] Disassembler in progress Warren Toomey
@ 2008-04-30 13:01 ` Brad Parker
  2008-04-30 17:38   ` Tim Newsham
  0 siblings, 1 reply; 11+ messages in thread
From: Brad Parker @ 2008-04-30 13:01 UTC



Hi,

I'm new to this (just discovered it - way cool!), so as an experiment I
opened the scanned pdf and cut and pasted e03-01,02,03,04 into gimp,
shrunk them to 3000x3000 and sent them to the tesseract web site.  It
does an amazing job.  A little emacs work and the source looks good.

Anyway, I know e03 is assigned to someone else, but they were not in
the svn.  Should I check them in?  (I just did it as an experiment, and
I don't want to step on anyone;)

I'm also curious how we bootstrap this.  In the end I assume we need a
binary image which one of the sims can read.  I have 0.5 a mind to write
a quick and dirty assembler which outputs a binary file... 

But I suppose it would be better to use the original as/as2.  Can this
be run with apout? (I'd be curious to hear how people are doing it).

I'm happy to keep plugging through the remaining un-ocr'd pages if no
one screams, sending email first of course.

-brad




* [Unix-jun72] ocr'd e03
  2008-04-30 13:01 ` [Unix-jun72] ocr'd e03 Brad Parker
@ 2008-04-30 17:38   ` Tim Newsham
  2008-04-30 18:49     ` Brad Parker
                       ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Tim Newsham @ 2008-04-30 17:38 UTC


> I'm new to this (just discovered it - way cool!), so as an experiment I
> opened the scanned pdf and cut and pasted e03-01,02,03,04 into gimp,
> shrunk them to 3000x3000 and sent them to the tesseract web site.  It
> does an amazing job.  A little emacs work and the source looks good.
>
> Anyway, I know e03 is assigned to someone else, but they were not in
> the svn.  Should I check them in?  (I just did it as an experiment, and
> I don't want to step on anyone;)

Yes, please.  I'll drop a note to the person this was assigned to
and start talking to current assignees who haven't yet had a chance
to do work to see if any of it should be reclaimed.

> I'm also curious how we bootstrap this.  In the end I assume we need a
> binary image which one of the sims can read.  I have 0.5 a mind to write
> a quick and dirty assembler which outputs a binary file...

It seems like we can build this using the V7 assembler, or an earlier
one such as the "as" in the 1972 bits that are around, using the
apout emulator.  I'm probably going to stick with the V7 assembler
for now due to the "mount" issues I ran across in the 1972 assembler.
For quick and dirty bootstrapping I was thinking of writing out
a simh script with a bunch of "deposit" commands to put the image
directly in memory.  At some point later we can figure out if it
would be possible to reconstruct the original boot process (documented
in the 1ed manuals).
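
A rough sketch of generating such a script follows. It assumes simh's default octal radix for the PDP-11, a little-endian memory image, and a load address chosen by the caller; `words_from_image` and `deposit_script` are hypothetical helper names, not part of any existing tool.

```python
import struct

def words_from_image(image):
    """Split a byte string into 16-bit little-endian PDP-11 words."""
    if len(image) % 2:
        image += b"\0"           # pad an odd-length image with a zero byte
    return list(struct.unpack("<%dH" % (len(image) // 2), image))

def deposit_script(image, base):
    """Return simh commands that place image in memory starting at base."""
    lines = []
    for i, word in enumerate(words_from_image(image)):
        # Address and value are both written in octal.
        lines.append("deposit %o %06o" % (base + 2 * i, word))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Two words, 012700 and 000001, loaded at address 01000.
    print(deposit_script(b"\xc0\x15\x01\x00", 0o1000), end="")
```

One command per word is slow for a large image but keeps the script readable and easy to diff against a listing.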

> But I suppose it would be better to use the original as/as2.  Can this
> be run with apout? (I'd be curious to hear how people are doing it).

Yup.  I've already tried it on some of the completed sections.  The
one problem I ran across is that the "ux" section defines "mount"
as a label where the 1972 "as" predefines the "mount" system call.
This problem doesn't exist in the 7ed assembler because it doesn't
predefine the system calls.  So I wrote up a "sys.s" (in svn) with
the system call definitions and I left out the definition for "mount"
(which isn't referenced in the kernel code).

> I'm happy to keep plugging through the remaining un-ocr'd pages if no
> one screams, sending email first of course.

Great.  I'll poke at current assignees to see which section would be
most appropriate.

> -brad

Tim Newsham
http://www.thenewsh.com/~newsham/




* [Unix-jun72] ocr'd e03
  2008-04-30 17:38   ` Tim Newsham
@ 2008-04-30 18:49     ` Brad Parker
  2008-04-30 19:19       ` Hellwig Geisse
       [not found]     ` <10901.1209581283@mini>
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Brad Parker @ 2008-04-30 18:49 UTC



I checked in the missing pages from e3, e4 and e8.  I have not tried
to assemble them yet.

Looks like e1 has a few missing pages at the end but my "free time" is
gone for the day.

I'll try and fire up the v7 as later on.

-brad





* [Unix-jun72] ocr'd e03
  2008-04-30 18:49     ` Brad Parker
@ 2008-04-30 19:19       ` Hellwig Geisse
  2008-04-30 19:52         ` Tim Newsham
  0 siblings, 1 reply; 11+ messages in thread
From: Hellwig Geisse @ 2008-04-30 19:19 UTC


On Wed, 2008-04-30 at 14:49 -0400, Brad Parker wrote:
> I checked in the missing pages from e3, e4 and e8.  I have not tried
> to assemble them yet.

Please forgive me, but I think it is not a brilliant
idea to check in pages which obviously haven't been
proof-read. Especially typos in the comments will
not even be noticed by the assembler.

> Looks like e1 has a few missing pages at the end but my "free time" is
> gone for the day.

Yes, these are part of my batch and will be submitted
when I have had time to check them.

Hellwig





* [Unix-jun72] ocr'd e03
       [not found]     ` <10901.1209581283@mini>
@ 2008-04-30 19:24       ` Tim Newsham
  2008-04-30 20:43         ` Brad Parker
  0 siblings, 1 reply; 11+ messages in thread
From: Tim Newsham @ 2008-04-30 19:24 UTC


> Can you show me how you are running it?  (and feel free to cc the list)

(I think it's mentioned in an earlier post already).  I copy the
files to my 7ed system (make a tar, put it on a tape image, and
attach it in simh, then tar x to get contents).  Probably easier
if you're using apout and local filesystem...  I'm using the following
script (in my tools but not checked in because I'm using nonstandard
conv2):

    tools/rebuild
    (cd rebuilt; gtar -O -cf ../u.tar u?.s)
    ./conv2 -o tape.tm u.tar
    cp tape.tm ~/work/simh/unix-v7-4/run/

Anyway to assemble I run:

     as - sys.s u0.s u1.s ux.s

btw, I noticed some unicode characters in the files you committed.
I haven't had a chance to spend time editing them yet.  The ocr
often uses unicode for things like "-".
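
A small sketch for finding such characters in an OCR'd source file. The list of dash lookalikes is an assumption about what the OCR tends to emit, not a survey of the committed files, and `find_non_ascii` and `fix_dashes` are illustrative names.

```python
import unicodedata

# Unicode characters that look like an ASCII "-" (assumed, not exhaustive).
DASH_LOOKALIKES = {
    "\u2010": "-",  # HYPHEN
    "\u2011": "-",  # NON-BREAKING HYPHEN
    "\u2012": "-",  # FIGURE DASH
    "\u2013": "-",  # EN DASH
    "\u2014": "-",  # EM DASH
    "\u2212": "-",  # MINUS SIGN
}

def find_non_ascii(text):
    """Return (line, column, character name) for every non-ASCII character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ord(ch) > 127:
                hits.append((lineno, col, unicodedata.name(ch, "UNKNOWN")))
    return hits

def fix_dashes(text):
    """Replace the dash lookalikes with a plain ASCII hyphen-minus."""
    return text.translate(str.maketrans(DASH_LOOKALIKES))

if __name__ == "__main__":
    sample = "mov r0,\u2013(sp)\nrts pc"
    print(find_non_ascii(sample))
    print(fix_dashes(sample))
```

Reporting first and replacing second leaves room to review anything that is not a dash, since other OCR substitutions may need a different fix.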

> I think there is a binary format.  I think I figured it out once and
> wrote something to turn an a.out into it.  hmmm.  I'll go digging.

a.out is so simple, it wouldn't be hard to reproduce if we had to.
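
For reference, a sketch of reading the header, assuming the V7 layout from a.out(5): eight 16-bit little-endian words (magic, text size, data size, bss size, symbol table size, entry point, an unused word, and a relocation flag), followed by the text and data segments. The 1972 binaries may use a different header, so treat the field list as an assumption; `parse_header` and `load_image` are illustrative names.

```python
import struct

# Field order assumed from the V7 "struct exec".
FIELDS = ("magic", "text", "data", "bss", "syms", "entry", "unused", "flag")
MAGICS = (0o407, 0o410, 0o411)
HEADER_SIZE = 16                 # eight 16-bit words

def parse_header(blob):
    """Decode the header words at the start of an a.out file."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("file too short for an a.out header")
    header = dict(zip(FIELDS, struct.unpack("<8H", blob[:HEADER_SIZE])))
    if header["magic"] not in MAGICS:
        raise ValueError("unexpected magic %o" % header["magic"])
    return header

def load_image(blob):
    """Return the text and data bytes that follow the header."""
    header = parse_header(blob)
    end = HEADER_SIZE + header["text"] + header["data"]
    return blob[HEADER_SIZE:end]

if __name__ == "__main__":
    # A made-up file: 4 bytes of text, 2 bytes of data, no symbols.
    demo = struct.pack("<8H", 0o407, 4, 2, 0, 0, 0, 0, 1) + b"\xc0\x15\x01\x00\x02\x00"
    print(parse_header(demo))
    print(load_image(demo))
```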

> I checked in the missing pages from e3, e4 and e8.  I have not tried
> to assemble them yet, however.

I noticed that.  Thank you.

> -brad

Tim Newsham
http://www.thenewsh.com/~newsham/




* [Unix-jun72] ocr'd e03
  2008-04-30 19:19       ` Hellwig Geisse
@ 2008-04-30 19:52         ` Tim Newsham
  0 siblings, 0 replies; 11+ messages in thread
From: Tim Newsham @ 2008-04-30 19:52 UTC


> Please forgive me, but I think it is not a brilliant
> idea to check in pages which obviously haven't been
> proof-read. Especially typos in the comments will
> not even be noticed by the assembler.

I think it is fine as long as they are not marked as having been
reviewed yet.  Other people can assist in the review process if
the imperfect files are in the SVN.  Hopefully by the end of this
process every file will have been reviewed by the original author
and at least one other person on a line-by-line basis.

> Hellwig

Tim Newsham
http://www.thenewsh.com/~newsham/




* [Unix-jun72] Mar 72 kernel subroutine description
  2008-04-30 17:38   ` Tim Newsham
  2008-04-30 18:49     ` Brad Parker
       [not found]     ` <10901.1209581283@mini>
@ 2008-04-30 20:26     ` Al Kossow
  2008-04-30 20:48       ` Brad Parker
  2008-05-01 16:23     ` [Unix-jun72] ocr'd e03 Tim Bradshaw
  3 siblings, 1 reply; 11+ messages in thread
From: Al Kossow @ 2008-04-30 20:26 UTC


The hand-written subroutine description document is now
up as
http://bitsavers.org/pdf/bellLabs/unix/Kernel_Subroutine_Descriptions_Mar72.pdf






* [Unix-jun72] ocr'd e03
  2008-04-30 19:24       ` Tim Newsham
@ 2008-04-30 20:43         ` Brad Parker
  0 siblings, 0 replies; 11+ messages in thread
From: Brad Parker @ 2008-04-30 20:43 UTC



Tim Newsham wrote:
>> Can you show me how you are running it?  (and feel free to cc the list)
>
>(I think it's mentioned in an earlier post already).  I copy the
>files to my 7ed system (make a tar, put it on a tape image, and
>attach it in simh, then tar x to get contents).  Probably easier

Interesting.  and authentic! :-)  I am too lazy and used apout with
a v7 tree.  I used this script:

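#!/bin/sh
# Assemble the rebuilt kernel sources with the V7 "as" running under apout.
export APOUT_ROOT=/backup/raid2/pdp11/v7
R=./rebuilt
# Kernel source files u0.s through u9.s, in order; ux.s goes last.
W="$R/u0.s $R/u1.s $R/u2.s $R/u3.s $R/u4.s $R/u5.s $R/u6.s $R/u7.s $R/u8.s $R/u9.s"
apout "$APOUT_ROOT/bin/as" ./sys1.s $W $R/ux.s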


Which seems to work pretty well.

>btw, I noticed some unicode characters in the files you committed.

Yes, sorry.  I think I just committed fixes for all of them.

>The ocr often uses unicode for things like "-".

Yes, that's exactly what it did.  Took me a bit to figure that out since
they look very similar.

I also did a bunch of clean up of existing pages (probably should wait for
someone to review them but what the heck).  Mostly simple ocr errors.

I'm not sure we need to define the system traps; some (like mkdir) collide.
I turned off the def for now.

Also, some of the registers for math (mq, ac) don't seem to be defined
by the v7 as.  Not sure how that works to be honest (fp & mult are
beyond my pdp-11 knowledge).

It's closer now.  Still some editing to be done and some missing pages,
but closer.

-brad

Brad Parker
Heeltoe Consulting
+1-781-483-3101
http://www.heeltoe.com






* [Unix-jun72] Mar 72 kernel subroutine description
  2008-04-30 20:26     ` [Unix-jun72] Mar 72 kernel subroutine description Al Kossow
@ 2008-04-30 20:48       ` Brad Parker
  0 siblings, 0 replies; 11+ messages in thread
From: Brad Parker @ 2008-04-30 20:48 UTC



Al Kossow wrote:
>The hand-written subroutine description document is now
>up as
>http://bitsavers.org/pdf/bellLabs/unix/Kernel_Subroutine_Descriptions_Mar72.pdf

Wow.  That's an amazing document. 

Someone from Bell Labs?

-brad




* [Unix-jun72] ocr'd e03
  2008-04-30 17:38   ` Tim Newsham
                       ` (2 preceding siblings ...)
  2008-04-30 20:26     ` [Unix-jun72] Mar 72 kernel subroutine description Al Kossow
@ 2008-05-01 16:23     ` Tim Bradshaw
  3 siblings, 0 replies; 11+ messages in thread
From: Tim Bradshaw @ 2008-05-01 16:23 UTC


On 30 Apr 2008, at 18:38, Tim Newsham wrote:
>
> Yes, please.  I'll drop a note to the person this was assigned to
> and start talking to current assignees who haven't yet had a chance
> to do work to see if any of it should be reclaimed.

FWIW that was me, and I clearly was rash to agree given I have no time
and/or am lazy.  Sorry!

--tim



