The Unix Heritage Society mailing list
* [pups] extract old archive format?
       [not found] <337550.74945.qm@web82407.mail.mud.yahoo.com>
@ 2010-04-09  5:35 ` Jeremy C. Reed
  0 siblings, 0 replies; 14+ messages in thread
From: Jeremy C. Reed @ 2010-04-09  5:35 UTC (permalink / raw)



On Thu, 8 Apr 2010, Michael Davidson wrote:

> Your modern compiler is almost certainly inserting two bytes of padding
> after ar_name[] so that the int32_t ar_date is aligned on a 4-byte
> boundary. That shifts everything else down by 2 bytes and means that
> ar_size lines up with the last 2 bytes of the size in the header which,
> as luck would have it, is the low-order 16 bits of the size as it would
> have been stored in a 32-bit long.
> 
> Something like this should work on a modern little-endian processor:
> 
> struct {
>         char    ar_name[14];
>         int16_t ar_date_16_31;
>         int16_t ar_date_00_15;
>         char    ar_uid;
>         char    ar_gid;
>         uint16_t        ar_mode;
>         uint16_t        ar_size_16_31;
>         uint16_t        ar_size_00_15;
> } ar_buf;

Thank you! That works for me! I can now get correct sizes, names, and 
data. I will clean up my little ar extractor over the next few days and 
share it.
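
For illustration, a minimal sketch of how the two 16-bit halves in the
structure quoted above might be joined into 32-bit values. The struct tag
old_ar_hdr and the helpers join_words, old_ar_size, and old_ar_date are
hypothetical names, not taken from the original post. Reading raw file
bytes straight into this struct assumes a little-endian host.

```c
#include <stdint.h>

/* Layout suggested in the quoted message: every 32-bit field is split
 * into two 16-bit halves, high half first, so no padding is needed. */
struct old_ar_hdr {
    char     ar_name[14];
    int16_t  ar_date_16_31;
    int16_t  ar_date_00_15;
    char     ar_uid;
    char     ar_gid;
    uint16_t ar_mode;
    uint16_t ar_size_16_31;
    uint16_t ar_size_00_15;
};

/* The on-disk header is 26 bytes; fail the build if the compiler
 * lays the struct out differently (C11). */
_Static_assert(sizeof(struct old_ar_hdr) == 26, "unexpected padding");

/* Join a high and a low 16-bit half into one 32-bit value. */
uint32_t join_words(uint16_t hi, uint16_t lo)
{
    return ((uint32_t)hi << 16) | lo;
}

/* Member size in bytes. */
uint32_t old_ar_size(const struct old_ar_hdr *h)
{
    return join_words(h->ar_size_16_31, h->ar_size_00_15);
}

/* Modification time, seconds since the Unix epoch. */
uint32_t old_ar_date(const struct old_ar_hdr *h)
{
    return join_words((uint16_t)h->ar_date_16_31,
                      (uint16_t)h->ar_date_00_15);
}
```

Shifting and OR-ing the halves keeps the joining step independent of the
host's word order; only the struct overlay itself depends on the host
being little-endian.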



* [pups] extract old archive format?
  2010-04-09  4:39   ` Carl Lowenstein
@ 2010-04-09 10:23     ` Johnny Billquist
  0 siblings, 0 replies; 14+ messages in thread
From: Johnny Billquist @ 2010-04-09 10:23 UTC (permalink / raw)


Carl Lowenstein wrote:
> On Thu, Apr 8, 2010 at 7:40 PM, John  Holden <johnh at psych.usyd.edu.au> wrote:
>>> Well I found the ar specification (in ar.5 not ar.1).
>>>
>>>              struct ar_hdr {
>>>                      char      ar_name[14];
>>>                      long      ar_date;
>>>                      char      ar_uid;
>>>                      char      ar_gid;
>>>                      int       ar_mode;
>>>                      long      ar_size;
>>>              };
>> Endian should not be a problem on an Intel/AMD processor. More likely your C
>> compiler is padding out the array for alignment. Try a '-fpack-struct' or
>> more safely, read the elements individually rather than a structure.
>>
> 
> In the PDP-11 long is 32 bits, int 16 bits.   And the PDP-11 is
> determinedly little-endian if you stick to integers.

Um? Longs are integers, and they are middle-endian...

> They got floating-point software right in 1971, but somebody screwed
> up the word order when building FP hardware, which led to the
> middle-endian mess.

...but yes. The middle-endian long issue appeared with the FPP, since 
that is the only hardware that deals natively with long integers.

	Johnny





* [pups] extract old archive format?
  2010-04-09  2:40 John Holden
       [not found] ` <n2m5904d5731004082137u5b054823wd4a9ce55113b1dee@mail.gmail.com>
@ 2010-04-09 10:21 ` Johnny Billquist
  1 sibling, 0 replies; 14+ messages in thread
From: Johnny Billquist @ 2010-04-09 10:21 UTC (permalink / raw)


John Holden wrote:
>> Well I found the ar specification (in ar.5 not ar.1).
>>
>>              struct ar_hdr {
>>                      char      ar_name[14];
>>                      long      ar_date;
>>                      char      ar_uid;
>>                      char      ar_gid;
>>                      int       ar_mode;
>>                      long      ar_size;
>>              };
> 
> Endian should not be a problem on an Intel/AMD processor. More likely your C
> compiler is padding out the array for alignment. Try a '-fpack-struct' or
> more safely, read the elements individually rather than a structure.
> 
> PS
> 
> To check, see what 'sizeof (struct ar_hdr)' returns.

Well, you are correct in that alignment is part of the problem. 
However, endianness is also a problem with longs, since they are not 
little-endian on a PDP11. :-)

	Johnny




* [pups] extract old archive format?
  2010-04-09  2:11 ` Jeremy C. Reed
@ 2010-04-09 10:15   ` Johnny Billquist
  0 siblings, 0 replies; 14+ messages in thread
From: Johnny Billquist @ 2010-04-09 10:15 UTC (permalink / raw)


Jeremy C. Reed wrote:
> Well I found the ar specification (in ar.5 not ar.1).
> 
>              struct ar_hdr {
>                      char      ar_name[14];
>                      long      ar_date;
>                      char      ar_uid;
>                      char      ar_gid;
>                      int       ar_mode;
>                      long      ar_size;
>              };

Simple enough... :-)

> This is the same as the old ar.c source.
> 
> (plus more in the manual page.)
> 
> Now my problem is I don't know what "long" or "int" is on the old PDP-11 
> / system 5 this was made on.

An int on the pdp11 is 16 bits, and a long is 32.
Remember? int is whatever size is most convenient for the architecture? :-)

> And I read about PDP-11 "middle endianness" (first time I heard of 
> "middle").

A mess, but we have to live with it.
In short, the bytes of a long on a PDP11 are likely laid out like this 
(byte 1 being the least significant):
3412

So, each 16-bit value is little-endian, but the two 16-bit values, 
viewed as a 32-bit quantity, are laid out in big-endian order.
Thus middle-endian... :-)
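
The 3412 layout can be sketched in C as follows. This is only an
illustration; the function name pdp11_long is hypothetical, and the code
works on raw bytes so it does not depend on the host's own byte order.

```c
#include <stdint.h>

/* Decode a 32-bit PDP-11 long from its four bytes as stored in memory.
 * Storage order is 3,4,1,2 (byte 1 = least significant): the high
 * 16-bit word comes first, and each word is itself little-endian. */
uint32_t pdp11_long(const unsigned char b[4])
{
    uint32_t hi = (uint32_t)b[0] | ((uint32_t)b[1] << 8); /* bytes 3,4 */
    uint32_t lo = (uint32_t)b[2] | ((uint32_t)b[3] << 8); /* bytes 1,2 */
    return (hi << 16) | lo;
}
```

For example, the value 0x04030201 would be stored as the byte sequence
03 04 01 02.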

> So I had (wrong but gets ar_name and ar_size correct for my few tests 
> for the first header but chops two characters into the data section).
> 
> struct {
>         char    ar_name[14];
>         int32_t ar_date;
>         char    ar_uid;
>         char    ar_gid;
>         uint16_t        ar_mode;
>         uint16_t        ar_size;
> } ar_buf;
> 
> Well I know above is wrong because ar_size and ar_date should be the 
> same. But I get ar_size correct each time. But it also loses the next 
> two bytes from the data. So I am guessing I have some endian issue where 
> I am getting some things reversed.
> 
> Any ideas?

As others already said, it's your compiler padding the structure for 
alignment.
One solution (already presented) is to just play with 16-bit values.
You could also tell the compiler not to pad the structure, but since you 
have to deal with the middle-endianness anyway, you are probably better 
off just looking at 16-bit values and combining them into 32-bit values 
as needed yourself.

	Johnny
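
A minimal sketch of the "combine 16-bit values yourself" approach, reading
the 26-byte header field by field from a byte buffer instead of overlaying
a struct. The names ar_member, word_at, long_at, and parse_old_ar_hdr are
hypothetical; the offsets simply follow the struct ar_hdr quoted above
with a 16-bit int and a 32-bit long.

```c
#include <stdint.h>
#include <string.h>

/* Host-side view of one archive member header. */
struct ar_member {
    char     name[15];   /* 14 bytes from the header plus a NUL */
    uint32_t date;
    uint8_t  uid;
    uint8_t  gid;
    uint16_t mode;
    uint32_t size;
};

/* 16-bit little-endian word at p. */
static uint16_t word_at(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit PDP-11 long at p: high word first, then low word. */
static uint32_t long_at(const unsigned char *p)
{
    return ((uint32_t)word_at(p) << 16) | word_at(p + 2);
}

/* Decode one raw 26-byte header; no struct overlay, so compiler
 * padding and host byte order do not matter. */
void parse_old_ar_hdr(const unsigned char raw[26], struct ar_member *out)
{
    memcpy(out->name, raw, 14);
    out->name[14] = '\0';
    out->date = long_at(raw + 14);
    out->uid  = raw[18];
    out->gid  = raw[19];
    out->mode = word_at(raw + 20);
    out->size = long_at(raw + 22);
}
```

Because every field is assembled from individual bytes, the same code
should behave identically on little-endian and big-endian hosts.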





* [pups] extract old archive format?
  2010-04-09  1:13 Norman Wilson
  2010-04-09  2:11 ` Jeremy C. Reed
@ 2010-04-09  5:49 ` Bob Eager
  1 sibling, 0 replies; 14+ messages in thread
From: Bob Eager @ 2010-04-09  5:49 UTC (permalink / raw)


On Thu, 08 Apr 2010 21:13:55 -0400 (EDT)
Norman Wilson <norman at oclsc.org> wrote:

>   The 'ar' format of that vintage is trivial, and documentation easily
>   found. I wrote programs to read it back in 1976!
> 
> ======
> 
> That's nothing.  Either Ken or Dennis wrote such a program
> years before that!

I know...I used it, in July 1976!

Mine was in PDP-11 assembler, for a different operating system.

> Seriously, it's a binary format, so I don't know that
> it would be easy to process in awk.

I forgot that the first version was partially binary, but still easy in
most modern languages.

If I had more time, I'd have a go...again!

-- 
Bob





* [pups] extract old archive format?
  2010-04-08 13:25 Jeremy C. Reed
                   ` (2 preceding siblings ...)
  2010-04-08 17:09 ` Bob Eager
@ 2010-04-09  5:11 ` Warren Toomey
  3 siblings, 0 replies; 14+ messages in thread
From: Warren Toomey @ 2010-04-09  5:11 UTC (permalink / raw)


On Thu, Apr 08, 2010 at 08:25:29AM -0500, Jeremy C. Reed wrote:
> I want to look in some .a files identified by file(1) as "old PDP-11 
> archive".
> 
> Anyone know what tool I can use on a modern *BSD or Linux system to 
> extract the files from an "old PDP-11" ar archive?

As an alternative to doing it by hand, you can compile and install my
Apout user-mode simulator, then run a PDP-11 ar inside Apout. This gives
you direct access to your normal filesystem, which makes extraction easier.

The latest Apout can be found in the svn snapshot here:
http://code.google.com/p/unix-jun72/downloads/list

You will also need to grab some V7 binaries, at least /bin/sh and /bin/ar,
from http://minnie.tuhs.org/Archive/PDP-11/Distributions/research/Henry_Spencer_v7/v7.tar.gz

Cheers,
	Warren




* [pups] extract old archive format?
       [not found] ` <n2m5904d5731004082137u5b054823wd4a9ce55113b1dee@mail.gmail.com>
@ 2010-04-09  4:39   ` Carl Lowenstein
  2010-04-09 10:23     ` Johnny Billquist
  0 siblings, 1 reply; 14+ messages in thread
From: Carl Lowenstein @ 2010-04-09  4:39 UTC (permalink / raw)



On Thu, Apr 8, 2010 at 7:40 PM, John  Holden <johnh at psych.usyd.edu.au> wrote:
>
>> Well I found the ar specification (in ar.5 not ar.1).
>>
>>              struct ar_hdr {
>>                      char      ar_name[14];
>>                      long      ar_date;
>>                      char      ar_uid;
>>                      char      ar_gid;
>>                      int       ar_mode;
>>                      long      ar_size;
>>              };
>
> Endian should not be a problem on an Intel/AMD processor. More likely your C
> compiler is padding out the array for alignment. Try a '-fpack-struct' or
> more safely, read the elements individually rather than a structure.
>

In the PDP-11 long is 32 bits, int 16 bits.   And the PDP-11 is
determinedly little-endian if you stick to integers.

They got floating-point software right in 1971, but somebody screwed
up the word order when building FP hardware, which led to the
middle-endian mess.

   carl
--
   carl lowenstein         marine physical lab     u.c. san diego
                                                clowenstein at ucsd.edu




* [pups] extract old archive format?
@ 2010-04-09  2:40 John Holden
       [not found] ` <n2m5904d5731004082137u5b054823wd4a9ce55113b1dee@mail.gmail.com>
  2010-04-09 10:21 ` Johnny Billquist
  0 siblings, 2 replies; 14+ messages in thread
From: John Holden @ 2010-04-09  2:40 UTC (permalink / raw)



> Well I found the ar specification (in ar.5 not ar.1).
>
>              struct ar_hdr {
>                      char      ar_name[14];
>                      long      ar_date;
>                      char      ar_uid;
>                      char      ar_gid;
>                      int       ar_mode;
>                      long      ar_size;
>              };

Endian should not be a problem on an Intel/AMD processor. More likely your C
compiler is padding out the array for alignment. Try a '-fpack-struct' or
more safely, read the elements individually rather than a structure.

PS

To check, see what 'sizeof (struct ar_hdr)' returns.

John
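
The sizeof check can be sketched like this. It is an illustration only:
the names naive_hdr, packed_hdr, and the three helper functions are
hypothetical, the 28-byte result assumes a typical ABI that aligns int32_t
on 4 bytes (x86-64, for instance), and __attribute__((packed)) is a
GCC/Clang extension rather than standard C.

```c
#include <stddef.h>
#include <stdint.h>

#define ON_DISK_HDR_SIZE 26   /* 14 + 4 + 1 + 1 + 2 + 4 */

/* Direct translation of struct ar_hdr with fixed-width types. */
struct naive_hdr {
    char     ar_name[14];
    int32_t  ar_date;    /* compiler inserts 2 padding bytes before this */
    char     ar_uid;
    char     ar_gid;
    uint16_t ar_mode;
    int32_t  ar_size;
};

/* Same fields with padding suppressed (GCC/Clang extension). */
struct packed_hdr {
    char     ar_name[14];
    int32_t  ar_date;
    char     ar_uid;
    char     ar_gid;
    uint16_t ar_mode;
    int32_t  ar_size;
} __attribute__((packed));

/* How many bytes larger the padded struct is than the on-disk header. */
size_t naive_padding(void)
{
    return sizeof(struct naive_hdr) - ON_DISK_HDR_SIZE;
}

/* Offset of ar_date in the padded struct (14 on disk). */
size_t naive_date_offset(void)
{
    return offsetof(struct naive_hdr, ar_date);
}

/* Size of the packed variant; should match the on-disk header. */
size_t packed_size(void)
{
    return sizeof(struct packed_hdr);
}
```

Packing only removes the padding. The 32-bit fields would still need
their 16-bit halves swapped, because the PDP-11 stores the high word of a
long first.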




* [pups] extract old archive format?
  2010-04-09  1:13 Norman Wilson
@ 2010-04-09  2:11 ` Jeremy C. Reed
  2010-04-09 10:15   ` Johnny Billquist
  2010-04-09  5:49 ` Bob Eager
  1 sibling, 1 reply; 14+ messages in thread
From: Jeremy C. Reed @ 2010-04-09  2:11 UTC (permalink / raw)


On Thu, 8 Apr 2010, Norman Wilson wrote:

> Bob Eager:
> 
>   The 'ar' format of that vintage is trivial, and documentation easily
>   found. I wrote programs to read it back in 1976!
> 
> ======
> 
> That's nothing.  Either Ken or Dennis wrote such a program
> years before that!
> 
> Warren even has a binary somewhere to prove it!
> 
> Seriously, it's a binary format, so I don't know that
> it would be easy to process in awk.  (At least not in
> awk-classic; stuff that works only in ghootandwaveawk
> is not all that interesting to me.)  But the format is
> simple, and any language new or old that can handle
> binary data without tears should do.

I can't see how to do it in awk either.

> If I didn't have an overfull plate already (and a
> visit to the Auto-Electrocution Consultant tomorrow,
> and one to the Canal Rooting Clinic Monday--proving
> that one should follow Father's advice and Stay Away
> From The Canal, Neddie) it would be interesting to
> collect the different specifications for ar headers
> over the years, and write a small suite of programs
> to read them.  Perhaps in Python, just to be difficult.
> (Why isn't there a language called Goon, Warren?)

Well I found the ar specification (in ar.5 not ar.1).

             struct ar_hdr {
                     char      ar_name[14];
                     long      ar_date;
                     char      ar_uid;
                     char      ar_gid;
                     int       ar_mode;
                     long      ar_size;
             };

This is the same as the old ar.c source.

(plus more in the manual page.)

Now my problem is I don't know what "long" or "int" is on the old PDP-11 
/ system 5 this was made on.

And I read about PDP-11 "middle endianness" (first time I heard of 
"middle").

So I had (wrong but gets ar_name and ar_size correct for my few tests 
for the first header but chops two characters into the data section).

struct {
        char    ar_name[14];
        int32_t ar_date;
        char    ar_uid;
        char    ar_gid;
        uint16_t        ar_mode;
        uint16_t        ar_size;
} ar_buf;

Well I know above is wrong because ar_size and ar_date should be the 
same. But I get ar_size correct each time. But it also loses the next 
two bytes from the data. So I am guessing I have some endian issue where 
I am getting some things reversed.

Any ideas?

Note I am not using any system 5 or PDP-11 system. I am using a modern 
little endian (amd64) system to extract the files that were created in 
1970s.

Once I figure out the structure and endianness (if applicable) I will 
share back my code so others can extract ...
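
As a rough sketch of what such an extractor's main loop could look like,
the code below walks an archive already loaded into memory and hands each
member to a callback. Several things here are assumptions rather than
facts from this thread: the 16-bit magic word 0177545 at the start of the
file and the padding of odd-sized members to an even boundary are recalled
from the V7 ar format, and the names old_ar_walk and member_fn are made up
for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OLD_AR_MAGIC    0177545u  /* assumed magic word (V7 ar format) */
#define OLD_AR_HDR_SIZE 26

/* Called once per member with its name, data pointer, and size. */
typedef void (*member_fn)(const char *name, const unsigned char *data,
                          uint32_t size, void *ctx);

/* 16-bit little-endian word at p. */
static uint16_t le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit PDP-11 long at p: high word first, then low word. */
static uint32_t pdp32(const unsigned char *p)
{
    return ((uint32_t)le16(p) << 16) | le16(p + 2);
}

/* Walk every member in buf; returns the member count, or -1 if the
 * archive looks malformed. visit may be NULL to just count members. */
long old_ar_walk(const unsigned char *buf, size_t len,
                 member_fn visit, void *ctx)
{
    size_t pos = 2;   /* skip the 2-byte magic word */
    long count = 0;

    if (len < 2 || le16(buf) != OLD_AR_MAGIC)
        return -1;
    while (pos < len) {
        char name[15];
        uint32_t size;

        if (len - pos < OLD_AR_HDR_SIZE)
            return -1;
        memcpy(name, buf + pos, 14);
        name[14] = '\0';
        size = pdp32(buf + pos + 22);   /* ar_size is at offset 22 */
        pos += OLD_AR_HDR_SIZE;
        if (size > len - pos)
            return -1;
        if (visit)
            visit(name, buf + pos, size, ctx);
        count++;
        pos += size + (size & 1);       /* assumed even-boundary padding */
    }
    return count;
}
```

A real tool would read the file into buf and have the callback write each
member out under its name; both the magic value and the padding rule
should be checked against ar(5) for the system that produced the archive.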




* [pups] extract old archive format?
@ 2010-04-09  1:13 Norman Wilson
  2010-04-09  2:11 ` Jeremy C. Reed
  2010-04-09  5:49 ` Bob Eager
  0 siblings, 2 replies; 14+ messages in thread
From: Norman Wilson @ 2010-04-09  1:13 UTC (permalink / raw)


Bob Eager:

  The 'ar' format of that vintage is trivial, and documentation easily
  found. I wrote programs to read it back in 1976!

======

That's nothing.  Either Ken or Dennis wrote such a program
years before that!

Warren even has a binary somewhere to prove it!

Seriously, it's a binary format, so I don't know that
it would be easy to process in awk.  (At least not in
awk-classic; stuff that works only in ghootandwaveawk
is not all that interesting to me.)  But the format is
simple, and any language new or old that can handle
binary data without tears should do.

If I didn't have an overfull plate already (and a
visit to the Auto-Electrocution Consultant tomorrow,
and one to the Canal Rooting Clinic Monday--proving
that one should follow Father's advice and Stay Away
From The Canal, Neddie) it would be interesting to
collect the different specifications for ar headers
over the years, and write a small suite of programs
to read them.  Perhaps in Python, just to be difficult.
(Why isn't there a language called Goon, Warren?)

Norman Wilson
Toronto ON
(owwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww)




* [pups] extract old archive format?
  2010-04-08 13:25 Jeremy C. Reed
  2010-04-08 14:49 ` Tim Bradshaw
  2010-04-08 15:16 ` Brantley Coile
@ 2010-04-08 17:09 ` Bob Eager
  2010-04-09  5:11 ` Warren Toomey
  3 siblings, 0 replies; 14+ messages in thread
From: Bob Eager @ 2010-04-08 17:09 UTC (permalink / raw)


On Thu, 8 Apr 2010 08:25:29 -0500 (CDT)
"Jeremy C. Reed" <reed at reedmedia.net> wrote:

> Anyone know what tool I can use on a modern *BSD or Linux system to 
> extract the files from an "old PDP-11" ar archive?
> 
> GNU ar complains "File format not recognized". ar tells me:
> 
>  ar: supported targets: elf64-x86-64 elf32-i386 a.out-i386-netbsd 
>  coff-i386 efi-app-ia32 elf64-little elf64-big elf32-little elf32-big 
>  srec symbolsrec tekhex binary ihex netbsd-core

The 'ar' format of that vintage is trivial, and documentation easily
found. I wrote programs to read it back in 1976!

A simple C program, or even an awk script, should do it.
-- 
Bob





* [pups] extract old archive format?
  2010-04-08 13:25 Jeremy C. Reed
  2010-04-08 14:49 ` Tim Bradshaw
@ 2010-04-08 15:16 ` Brantley Coile
  2010-04-08 17:09 ` Bob Eager
  2010-04-09  5:11 ` Warren Toomey
  3 siblings, 0 replies; 14+ messages in thread
From: Brantley Coile @ 2010-04-08 15:16 UTC (permalink / raw)


read the ar(1) entry in the 7th edition manual, located at ...

http://plan9.bell-labs.com/7thEdMan/index.html

and write the tiny bit of C it takes to read the archive.

Brantley

On Apr 8, 2010, at 9:25 AM, Jeremy C. Reed wrote:

> I want to look in some .a files identified by file(1) as "old PDP-11 
> archive".
> 
> Anyone know what tool I can use on a modern *BSD or Linux system to 
> extract the files from an "old PDP-11" ar archive?
> 
> GNU ar complains "File format not recognized". ar tells me:
> 
> ar: supported targets: elf64-x86-64 elf32-i386 a.out-i386-netbsd 
> coff-i386 efi-app-ia32 elf64-little elf64-big elf32-little elf32-big 
> srec symbolsrec tekhex binary ihex netbsd-core
> 
> But I have no idea how to try different targets. The GNU ar manual page 
> doesn't tell me much.
> 
> Or how can I use modern pcc or gcc to compile old pre-ansi ar.c?
> 
> Any suggestions?
> 
> Now I found a simtools.zip via http://simh.trailing-edge.com/ which is 
> "a collection of tools for manipulating simulator file formats and for 
> cross-assembling code for the PDP-1, PDP-7, PDP-8, and PDP-11." But I am 
> not sure if this is related. On that note, any ideas how to extract 
> files from a ".tap" file used by simh? (For now I use view or strings to 
> look at it.)
> 
> Thanks,
> 
>  Jeremy C. Reed
> 
> echo 'EhZ[h ^jjf0%%h[[Zc[Z_W$d[j%Xeeai%ZW[ced#]dk#f[d]k_d%' | \
>  tr            '#-~'            '\-.-{'





* [pups] extract old archive format?
  2010-04-08 13:25 Jeremy C. Reed
@ 2010-04-08 14:49 ` Tim Bradshaw
  2010-04-08 15:16 ` Brantley Coile
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Tim Bradshaw @ 2010-04-08 14:49 UTC (permalink / raw)


On 8 Apr 2010, at 14:25, Jeremy C. Reed wrote:

> Anyone know what tool I can use on a modern *BSD or Linux system to
> extract the files from an "old PDP-11" ar archive?

I think the politically correct approach to this would clearly be:  
install a PDP-11 simulator on the host, and a suitable Unix on it, and  
unpack the archives with ar.

Actually, that's wrong: the politically correct thing to do would be  
to *buy a PDP-11*.  But that seems to be fairly hard nowadays.

The former approach might be fairly practical though.




* [pups] extract old archive format?
@ 2010-04-08 13:25 Jeremy C. Reed
  2010-04-08 14:49 ` Tim Bradshaw
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Jeremy C. Reed @ 2010-04-08 13:25 UTC (permalink / raw)


I want to look in some .a files identified by file(1) as "old PDP-11 
archive".

Anyone know what tool I can use on a modern *BSD or Linux system to 
extract the files from an "old PDP-11" ar archive?

GNU ar complains "File format not recognized". ar tells me:

 ar: supported targets: elf64-x86-64 elf32-i386 a.out-i386-netbsd 
 coff-i386 efi-app-ia32 elf64-little elf64-big elf32-little elf32-big 
 srec symbolsrec tekhex binary ihex netbsd-core

But I have no idea how to try different targets. The GNU ar manual page 
doesn't tell me much.

Or how can I use modern pcc or gcc to compile old pre-ansi ar.c?

Any suggestions?

Now I found a simtools.zip via http://simh.trailing-edge.com/ which is 
"a collection of tools for manipulating simulator file formats and for 
cross-assembling code for the PDP-1, PDP-7, PDP-8, and PDP-11." But I am 
not sure if this is related. On that note, any ideas how to extract 
files from a ".tap" file used by simh? (For now I use view or strings to 
look at it.)

Thanks,

  Jeremy C. Reed

echo 'EhZ[h ^jjf0%%h[[Zc[Z_W$d[j%Xeeai%ZW[ced#]dk#f[d]k_d%' | \
  tr            '#-~'            '\-.-{'



