The Unix Heritage Society mailing list
From: Johnny Billquist <bqt@softjar.se>
To: simh@groups.io, Will Senn <will.senn@gmail.com>
Cc: TUHS main list <tuhs@minnie.tuhs.org>
Subject: Re: [TUHS] [simh]  2bsd tarball
Date: Wed, 29 Jul 2020 11:50:25 +0200
Message-ID: <2cb3ad2a-f8c5-a003-661c-e257f7cbe38c@softjar.se>
In-Reply-To: <CAC20D2NRF2CHESt_Virro2Op4mVDH2JCRBN7g5a2CvU1X=kUAw@mail.gmail.com>

Just a small comment. Whoever it was that thought DECtape was a tape was 
making a serious mistake. DECtapes are very different from magtapes.

   Johnny

On 2020-07-29 02:21, Clement T Cole wrote:
> 
> Cross posting to simh - since much of this has been discussed in the 
> last few days there also....
> 
> in for penny, in for pound ... here is the history ...  man ... I lived 
> this and I'll need a strong drink later tonight after I write it all up.
> 
> 
> On Tue, Jul 28, 2020 at 7:04 PM Will Senn <will.senn@gmail.com 
> <mailto:will.senn@gmail.com>> wrote:
> 
>     I recall having to do something with cont.a files, which are not
>     present on these images. So, my question is, does anyone know of or
>     have an actual 2bsd tape/tape image?
> 
> cont.a is a tp-v6 and earlier ism.
> 
> DECtape had a directory at the front of the tape (think 
> superblock/ilist), but could do cool things and be treated more like a disk.
> When tp was created for a very early version of Unix (I'm not sure 
> which, could be V2), Ken/Dennis et al had DECtape units and so the 
> original scheme followed the media.   This also meant that the program 
> could write files and go back and update the directory, which is a no-no 
> with many tape systems.  Then research got a 9-track unit.   So tp was 
> changed to calculate how much space was going to be needed, write the 
> directory, then the datablocks.  All good ... except...
> 
> 9-track could write more files than the directory could take.   So for 
> many years, people would use the ar(1) program to take a number of files 
> in a directory, create a file called cont.a and then delete the files.  
> Then the tree would be written with tp; when you read it back, you 
> reversed the ar(1) process.  If you look at the USENIX/Harvard tape in 
> the TUHS archive you'll see this scheme in use.
> 
> BTW: tp was written in assembler and all the data structures were using 
> PDP-11 binary formats.  Eventually, Harvard wrote stp (super-tp) in C  
> (which is on the USENIX tape Warren has in the archives) that worked 
> like the original assembler tp but also put a redundant directory at the 
> end of the tape.  Another issue with tp was that if you had a bad block 
> in the first few blocks you could not decode the rest of the tape.  
> [There were some other issues with the UNIX tree structure as disks got 
> bigger but I'm going to ignore all that - other than to say, tp had 
> lived its life].
> 
> Enter Mashey and the PWB 1.0 folks (which is based on V6).  Someone in 
> USG created cpio (and volcpy) as part of the PWB 1.0.   Like tp it was a 
> PDP-11 binary format, but unlike tp, the tape directory is threaded. 
> /i.e./ block one describes the first file only and includes the size of 
> the following file, then the file itself, then a new directory block for 
> the next file and again that file (rinse and repeat).  So it improved on 
> tp in the directory threading, but was still binary and, for reasons 
> I'll leave out, had a different user interface.
> 
> As part of V7, Ken wrote a new program, tar [you can ask him why].  
> Like cpio it used a threaded tape directory, but unlike cpio it was 
> always ASCII and not PDP-11 specific.  Furthermore, the user interface 
> was made to parrot tp.  So it had the advantage that changing tp 
> scripts to use tar was pretty easy, i.e. s/tp/tar/ ... not so for 
> cpio.  And it was muscle memory compliant.
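A minimal sketch of what "threaded" means in practice, assuming the
512-byte block layout used by V7 tar and its descendants (name at bytes
0-99, size as octal ASCII at bytes 124-135). Each header says how much
data follows, and the next header sits right after that data, so a
reader can hop from header to header without any central directory.
The helper names walk_headers and sample_archive are made up for this
illustration, and Python's standard tarfile module is used only to
produce some bytes to walk over.

```python
import io
import tarfile

BLOCK = 512  # tar reads and writes in 512-byte blocks


def walk_headers(archive: bytes):
    """Yield (name, size) by hopping from header block to header block."""
    offset = 0
    while offset + BLOCK <= len(archive):
        header = archive[offset:offset + BLOCK]
        if header == bytes(BLOCK):       # an all-zero block marks the end
            break
        name = header[0:100].split(b"\0", 1)[0].decode("ascii")
        size_text = header[124:136].split(b"\0", 1)[0].decode("ascii").strip()
        size = int(size_text or "0", 8)  # sizes are stored as octal ASCII
        yield name, size
        # The file data is padded to a whole number of blocks; the next
        # header starts immediately after it.
        data_blocks = (size + BLOCK - 1) // BLOCK
        offset += BLOCK * (1 + data_blocks)


def sample_archive(files: dict) -> bytes:
    """Build a small in-memory archive of regular files (illustration only)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


listing = list(walk_headers(sample_archive({"a.txt": b"x" * 600, "b.txt": b""})))
```

A damaged block only costs the files from that point on until the
reader can resynchronise, which is the practical improvement over a
single directory at the front of the tape.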
> 
> For PWB 2.0, cpio was updated to allow a -c option to write the header 
> in ascii and -s to byte swap the binary.   But the damage had been done ...
> 
> Thus began 'tar wars' which was a battle that raged officially over tape 
> archive formats, but really was an argument about user interfaces.  
> Since tar was part of Research and the Universities and commercial 
> people used it, only USG and the folks inside the Bell System were using 
> cpio, as officially none of us had it since PWB was not released to us 
> (although thanks to many AT&T employees doing an OYOC year, many schools 
> like UCB, MIT and CMU all had the sources to cpio anyway -- for instance 
> you'll see it hidden away on Kirk's CD).
> 
> I personally had used both, and preferred tar for ease of use and ASCII 
> directories.  Note that I had written car at Masscomp, but never tpio.  
> car was our trick to use the file list scripting that cpio could do 
> while creating tar format tapes - which was handy.  tpio would have 
> been cpio format with the tp/tar user interface; I never wrote it 
> because I did not need it.
> 
> Roll forward to the /usr/group UNIX standard that Heinz chaired.  We 
> ended up not being able to agree on a distribution format, but the ISVs 
> were PO because now they could create UNIX programs that might actually 
> work across systems, but they had no standard way to distribute them.
> Roll forward again to IEEE.  Heinz's committee was officially disbanded 
> (story discussed elsewhere) and we were created as IEEE P1003 with Jim 
> Isaak as Chair. This time the ISVs said we had to have a distribution 
> format.  Since *.1 was only an API we were allowed to avoid the user 
> interface issue and only examine the on-tape format.
> 
> It turns out while it seems to have been unintended, Ken's original V7 
> implementation has an interesting coding feature/bug which turns out to 
> be what clinched the deal.   When Ken created the directory block for 
> each file, he did a bcopy of 0's to the buffer before he wrote the data 
> that fills it in.  Then when he calculated the checksum for the 
> directory header block, he summed the entire block (the unused part of 
> which, because of the bcopy, was zeros).  This means that if you write 
> beyond the end of Ken's original header and include that extra data in 
> the chksum, the original program will ignore the new information but 
> accept the directory block as valid.  i.e. he had built an extension 
> mechanism into the tar on-tape format.
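The checksum behaviour described above can be sketched as follows. This
is an illustration under stated assumptions, not Ken's code: it assumes
the usual V7 header layout (the defined fields end at byte 257 of a
512-byte block, with the chksum field at bytes 148-155 counted as
spaces while summing). The names make_header, checksum and v7_accepts
are invented for the example, and the "extra" bytes stand in for
whatever a later format chooses to put after the original fields.

```python
BLOCK = 512
V7_HEADER_END = 257        # assumed end of the original V7 header fields
CHKSUM = slice(148, 156)   # the 8-byte chksum field


def checksum(block: bytes) -> int:
    """Sum every byte of the block, counting the chksum field as spaces."""
    work = bytearray(block)
    work[CHKSUM] = b" " * 8
    return sum(work)


def make_header(name: bytes, size: int, extra: bytes = b"") -> bytes:
    """Build a header block; `extra` is written past the V7 fields."""
    assert len(name) <= 100 and V7_HEADER_END + len(extra) <= BLOCK
    block = bytearray(BLOCK)                 # zero-filled, like the bcopy
    block[0:len(name)] = name
    block[100:108] = b"000644 \0"            # mode
    block[108:116] = b"000000 \0"            # uid
    block[116:124] = b"000000 \0"            # gid
    block[124:136] = b"%011o " % size        # size, octal ASCII
    block[136:148] = b"%011o " % 0           # mtime
    block[V7_HEADER_END:V7_HEADER_END + len(extra)] = extra
    block[CHKSUM] = b"%06o\0 " % checksum(block)
    return bytes(block)


def v7_accepts(block: bytes) -> bool:
    """An original-style reader: sums the whole block, knows no new fields."""
    stored = int(block[148:154].decode("ascii"), 8)
    return stored == checksum(block)


plain = make_header(b"hello.c", 10)
extended = make_header(b"hello.c", 10, extra=b"ustar\0clem")
both_ok = v7_accepts(plain) and v7_accepts(extended)
```

Because the writer and the old reader both sum all 512 bytes, the
extended block still verifies, and the old reader simply never looks at
the bytes past its own fields.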
> 
> cpio's ASCII on tape format was not able to do that as the checksum used 
> a sizeof(header struct) in the checksum routine.
> 
> USTAR was born ... Ken had written things like the UID/GID as ASCII 
> representations of the integer value in the original header.  USTAR 
> added the ASCII representation of the username and the group name since 
> that was more often portable between systems than the numbers.   There 
> were other additions like more room for the pathname, new file types, 
> /etc/.  But the key is that a USTAR tape can be read by the original V7 
> (and follow-on) tar programs, although they may not recognize all the 
> file types or use all of the new information.
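To make the USTAR additions concrete, here is a small sketch that
writes one file with Python's standard tarfile module in USTAR format
and then reads the raw header bytes back. The field offsets are the
ones from the POSIX ustar layout (uid at 108, magic at 257, uname at
265, gname at 297); the file name, ids and user/group names are made up
for the example.

```python
import io
import tarfile


def build_ustar(name, data, uid, gid, uname, gname) -> bytes:
    """Write a single regular file into an in-memory USTAR archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tf:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        info.uid, info.gid = uid, gid          # numeric ids, as in V7 tar
        info.uname, info.gname = uname, gname  # the USTAR additions
        tf.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def field(block: bytes, start: int, end: int) -> str:
    """Return a NUL-terminated ASCII header field as text."""
    return block[start:end].split(b"\0", 1)[0].decode("ascii")


archive = build_ustar("notes.txt", b"hello\n", 1001, 100, "clem", "staff")
header = archive[:512]

uid = int(field(header, 108, 116).strip(), 8)   # octal ASCII, original field
magic = field(header, 257, 263)                 # starts past the V7 fields
uname = field(header, 265, 297)
gname = field(header, 297, 329)
```

The numeric uid is still where an old reader expects it, while the
names live in the area the original header left as zeros.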
> 
> A few years later during *.2 discussions, we finally got into the user 
> interface stuff and pax(1) was born.  Knowing my hack with car, Keith 
> Bostic, Jim McGuiness and I wrote up a description of a program that 
> could work with both user interface schemes.  USENIX provided funding 
> for a student to implement it and put the sources out on 
> comp.unix.sources at some point.  That proposal was originally accepted 
> as the first tape user interface program in *.2 [a few years later, 
> after I stopped being part of the committee, the USG folks did get an 
> alternate CPIO format accepted and cpio as an allowed program.   USENIX 
> paid to have the program updated to operate like cpio if it was called 
> that, pure V7 tar if called that, and if called pax, use USTAR].
> 
> 'nuf said ... I hope.
> 
> Clem
> 

-- 
Johnny Billquist                  || "I'm on a bus
                                   ||  on a psychedelic trip
email: bqt@softjar.se             ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
