The Unix Heritage Society mailing list
From: reed@reedmedia.net (Jeremy C. Reed)
Subject: [pups] extract old archive format?
Date: Thu, 8 Apr 2010 21:11:33 -0500 (CDT)
Message-ID: <alpine.NEB.2.01.1004082057300.13963@t1.m.reedmedia.net>
In-Reply-To: <1270775431.29470.for-standards-violators@oclsc.org>

On Thu, 8 Apr 2010, Norman Wilson wrote:

> Bob Eager:
> 
>   The 'ar' format of that vintage is trivial, and documentation easily
>   found. I wrote programs to read it back in 1976!
> 
> ======
> 
> That's nothing.  Either Ken or Dennis wrote such a program
> years before that!
> 
> Warren even has a binary somewhere to prove it!
> 
> Seriously, it's a binary format, so I don't know that
> it would be easy to process in awk.  (At least not in
> awk-classic; stuff that works only in ghootandwaveawk
> is not all that interesting to me.)  But the format is
> simple, and any language new or old that can handle
> binary data without tears should do.

I can't see how to do it in awk either.

> If I didn't have an overfull plate already (and a
> visit to the Auto-Electrocution Consultant tomorrow,
> and one to the Canal Rooting Clinic Monday--proving
> that one should follow Father's advice and Stay Away
> From The Canal, Neddie) it would be interesting to
> collect the different specifications for ar headers
> over the years, and write a small suite of programs
> to read them.  Perhaps in Python, just to be difficult.
> (Why isn't there a language called Goon, Warren?)

Well I found the ar specification (in ar.5 not ar.1).

             struct ar_hdr {
                     char      ar_name[14];
                     long      ar_date;
                     char      ar_uid;
                     char      ar_gid;
                     int       ar_mode;
                     long      ar_size;
             };

This is the same as in the old ar.c source.

(plus more in the manual page.)

Now my problem is that I don't know what size "long" or "int" is on the 
old PDP-11 / system 5 machine this was made on.

And I read about PDP-11 "middle endianness" (first time I heard of 
"middle").
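
A minimal sketch of what "middle endian" means for a 32-bit value, 
assuming the commonly described PDP-11 convention: the two 16-bit words 
are stored high-order word first, and each word is stored low byte 
first. The helper names pdp11_word and pdp11_long are made up for 
illustration and do not come from ar.c.

```c
#include <stdint.h>

/* Decode one 16-bit word stored low byte first (little-endian). */
static uint16_t pdp11_word(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * Decode a 32-bit long under the assumed PDP-11 layout:
 * high-order 16-bit word first, then the low-order word,
 * each word itself little-endian.
 */
static uint32_t pdp11_long(const unsigned char *p)
{
    return ((uint32_t)pdp11_word(p) << 16) | pdp11_word(p + 2);
}
```

Under that assumption the value 0x12345678 sits on disk as the bytes 
34 12 78 56, which is neither plain little-endian nor plain big-endian.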

So I tried the following. It is wrong, but in my few tests it gets 
ar_name and ar_size correct for the first header; it also chops two 
characters off the start of the data section.

struct {
        char    ar_name[14];
        int32_t ar_date;
        char    ar_uid;
        char    ar_gid;
        uint16_t        ar_mode;
        uint16_t        ar_size;
} ar_buf;

I know the above is wrong because ar_size and ar_date should be the 
same type. But I get ar_size correct each time, and it also loses the 
next two bytes of the data. So I am guessing I have some endian issue 
where I am getting some things reversed.

Any ideas?

Note I am not using any system 5 or PDP-11 system. I am using a modern 
little-endian (amd64) system to extract files that were created in the 
1970s.

Once I figure out the structure and endianness (if applicable) I will 
share back my code so others can extract ...
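
One possible explanation, offered as a guess rather than a confirmed 
diagnosis: if int was 16 bits and long 32 bits on the original machine, 
the on-disk header is 26 bytes with no padding. On amd64 the compiler 
aligns int32_t to a 4-byte boundary, so the struct with int32_t ar_date 
gets two padding bytes after ar_name and two more at the end, for 28 
bytes in total. Its uint16_t ar_size then lands at offset 24, which 
would be the low-order word of the real size. That would account for 
both the correct sizes on small files and the two lost data bytes. A 
sketch that avoids struct layout entirely by decoding a byte buffer at 
explicit offsets follows; old_ar_hdr, parse_old_ar_hdr, rd16, rd32 and 
OLD_AR_HDR_SIZE are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* 14 + 4 + 1 + 1 + 2 + 4 bytes, ASSUMING 16-bit int and 32-bit long. */
#define OLD_AR_HDR_SIZE 26

/* Host-side copy of the header; its layout need not match the file. */
struct old_ar_hdr {
    char     name[15];  /* 14 bytes from the archive plus a NUL */
    uint32_t date;
    uint8_t  uid;
    uint8_t  gid;
    uint16_t mode;
    uint32_t size;
};

/* 16-bit word, low byte first. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit long, assumed stored high word first, each word low byte first. */
static uint32_t rd32(const unsigned char *p)
{
    return ((uint32_t)rd16(p) << 16) | rd16(p + 2);
}

/* Decode one raw 26-byte header at the assumed on-disk offsets. */
static void parse_old_ar_hdr(const unsigned char *raw, struct old_ar_hdr *h)
{
    memcpy(h->name, raw, 14);
    h->name[14] = '\0';
    h->date = rd32(raw + 14);
    h->uid  = raw[18];
    h->gid  = raw[19];
    h->mode = rd16(raw + 20);
    h->size = rd32(raw + 22);
}
```

The caller would read exactly OLD_AR_HDR_SIZE bytes per header, not 
sizeof a host struct, and then read size bytes of member data. Any 
leading magic number or padding between members should be checked 
against ar.5 rather than taken from this sketch.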



