The Unix Heritage Society mailing list
From: John Cowan <cowan@ccil.org>
To: Tyler Adams <coppero1237@gmail.com>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: Re: [TUHS] Origins of globbing
Date: Tue, 6 Oct 2020 11:17:58 -0400
Message-ID: <CAD2gp_QmPMYiWrRN+RvaF+4VyXfTZLn-oWZ_gg3Rs3LAVswzWA@mail.gmail.com>
In-Reply-To: <CAEuQd1ArwELtQH=+KAoQ4CAjTjFg2Dvu5ca1p8mttsPZwO3XFw@mail.gmail.com>


On Tue, Oct 6, 2020 at 5:54 AM Tyler Adams <coppero1237@gmail.com> wrote:

> How did globbing come about in unix?
>

It's been present at least since the PDP-11 migration. The Thompson shell,
used in the 1st through 6th Editions, used a separate program called
/etc/glob to do the dirty work, presumably in order to keep /bin/sh as
small as possible. Unfortunately, glob never got its own man page, so its
protocol for communicating with the shell is lost, unless someone remembers
it and writes it down (hint, hint).

> Related, as regexes were already well known because of qed/ed, why wasn't a
> subset of regular expressions used instead?
>

The use of * and ? along with file extensions preceded by dot (as in ".c"
and ".o") is, or so it seems to me, an inheritance from the DEC operating
systems, starting with Monitor (later called TOPS-10) in 1964 and going
right through OpenVMS.  In the file systems used by those OSes, the
"filename" (typically up to 6 characters) and the "extension" (typically up
to 3 characters) were stored separately both on disk and in memory, and the
separating dot was parsed by user programs before invoking the appropriate
kernel routine.  (That is why it is still true in Windows that "foo" and
"foo." refer to the same file.)
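To make the separate storage concrete, here is a minimal Python sketch of
how a user program might split a file spec at the dot before calling the
kernel. The function name parse_dec_name, the space padding, and the silent
truncation are assumptions for illustration only; the real DEC monitors
used their own encodings and calling conventions.

```python
def parse_dec_name(spec: str, name_len: int = 6, ext_len: int = 3) -> tuple[str, str]:
    """Split 'NAME.EXT' into two fixed-width fields (hypothetical helper)."""
    # The dot is only a separator in the typed spec; it is not stored.
    name, _, ext = spec.partition(".")
    # Assumed behavior: pad short fields with spaces, truncate long ones.
    return name[:name_len].ljust(name_len), ext[:ext_len].ljust(ext_len)


# "foo" and "foo." both yield an empty extension field, so they name the same file.
print(parse_dec_name("foo") == parse_dec_name("foo."))  # True
```

Since the dot never reaches the stored fields, a trailing dot with nothing
after it cannot distinguish one name from another.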

Because dot was not in any way magic to the Unix file system, and because
file names were limited to 14 characters, extensions were kept short.
However, the path that leads from DEC OSes to CP/M to MS-DOS to Windows has
kept the 3-letter extension alive, and we now see plenty of it in
Unix-style OSes.  Thus using dot to mean "any character" would seriously
collide with this well-established usage as the extension separator.
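The collision is easy to demonstrate. The sketch below uses Python's
fnmatch and re modules as stand-ins for shell globbing and ed-style
regexes (an assumption: neither reproduces the historical implementations
exactly), with a made-up list of file names.

```python
import fnmatch
import re

names = ["main.c", "main.o", "magic", "abc"]  # hypothetical directory listing

# Glob: the dot is literal, so only the real ".c" file matches.
glob_hits = [n for n in names if fnmatch.fnmatchcase(n, "*.c")]

# Regex written the way one would type a glob: the dot means "any character",
# so anything ending in "c" with at least one character before it matches.
naive_hits = [n for n in names if re.fullmatch(r".*.c", n)]

# Regex with the dot escaped: correct, but noisier to type at a prompt.
escaped_hits = [n for n in names if re.fullmatch(r".*\.c", n)]

print(glob_hits)     # ['main.c']
print(naive_hits)    # ['main.c', 'magic', 'abc']
print(escaped_hits)  # ['main.c']
```

With regex syntax, every extension typed at the shell would need a
backslash to avoid the second result.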

Globbing was uninterpreted by the shell-equivalent in the DEC OSes, and was
understood only by a few programs, those responsible for listing
directories and copying, renaming, and deleting files.  Universal globbing
in the shell was AFAIK original with Unix, though Prime Computer's PRIMOS
also had it and may have been earlier by a year or two.  "It steam-engines
when it comes steam-engine time."  Both were direct descendants of Multics;
I have not been able to find out anything about

TIL that GNU find(1) supplements the standard -name option (which globs
against the filename) with -regex (which matches the regex against the
whole path).
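The difference between the two options can be sketched in Python. The
helpers match_name and match_regex are hypothetical names, and Python's re
syntax is not the same as the regex dialect GNU find uses by default, so
treat this as an illustration of "basename glob" versus "whole-path regex"
rather than a model of find itself.

```python
import fnmatch
import posixpath
import re


def match_name(path: str, pattern: str) -> bool:
    """Like -name: glob against the final path component only."""
    return fnmatch.fnmatchcase(posixpath.basename(path), pattern)


def match_regex(path: str, pattern: str) -> bool:
    """Like -regex: the regex must match the entire path, not a substring."""
    return re.fullmatch(pattern, path) is not None


path = "./src/main.c"
print(match_name(path, "*.c"))         # True
print(match_regex(path, r"main\.c"))   # False: does not cover "./src/"
print(match_regex(path, r".*/main\.c"))  # True
```

The second call is the usual surprise: a regex that matches only the file
name fails because it has to account for the leading directories too.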



John Cowan          http://vrici.lojban.org/~cowan        cowan@ccil.org
The Imperials are decadent, 300 pound free-range chickens (except they have
teeth, arms instead of wings, and dinosaurlike tails).  --Elyse Grasso
