The Unix Heritage Society mailing list
 help / color / mirror / Atom feed
From: crossd@gmail.com (Dan Cross)
Subject: [TUHS] redirection wildness in v7
Date: Thu, 9 Nov 2017 10:30:33 -0500	[thread overview]
Message-ID: <CAEoi9W50vhvY6zHXS__eTCwSuXv5Z1RW4N4Ug2vLpdKf=kLTkg@mail.gmail.com> (raw)
In-Reply-To: <3c417dcc-d3b4-0128-737a-7e510c6d0ffc@gmail.com>

On Thu, Nov 9, 2017 at 9:34 AM, Will Senn <will.senn at gmail.com> wrote:
> Why does the first of these incantations not present text, but the second
> does (word is a file)? Neither errors out.

When dealing with obscure pipeline surprises, I find it best to try
and reason through what's happening, step-by-step.

> $ <word | sed 20q

well, here you have (at least) two processes: sed, and another (more
on that in a second). The standard input of `sed` is connected to the
read-end of a pipe; the write-end of that pipe is connected to that
unnamed process. The unnamed process's input is connected to the file
`word` in the current directory.

So...what's that unnamed process? To figure out what's going on, we'll
have to look at the source for the shell itself. In this case, the
answer to the mystery lies in /usr/src/cmd/sh/xec.c, in the (extremely
long) function `execute`. This is the heart of the shell's
interpreter. Basically, the string entered by the user has been parsed
into an in-memory data structure (essentially an abstract syntax
tree), and this function is responsible for interpreting that data
structure and invoking the entered commands. It is called recursively;
you'll note that the bulk of it is a switch statement on the token
type for the current node in the syntax tree.

The magic here is in the case for TFIL, which sets up a pipe between
the left-hand side of tree and the right-hand side.

In this case, the left-hand-side is an empty command; basically,
equivalent to typing return at the shell prompt. Note that that just
happens to be perfectly syntactically valid, so the parser doesn't
generate an error. We may reasonably assume that this results in a
left-hand child of the pipe node in the AST that is set with TFORK and
an empty command vector: indeed, we see this in the `term` function in
cmd.c.

In the context of a pipeline, a new copy of the shell *is* forked for
this, and if we follow the TFORK logic, we can see that, in the
handling of the child, we do things like setting up pipes and I/O
redirection and then either execute a builtin command, or *if the
command is not empty* we execute the command via `execa`. This is not
a builtin command, but since this command is empty, we don't do
anything in the child other than exit.

Thus, the empty command produces no output, despite having its input
redirected from a file and therefore nothing is put into the write-end
of the pipe and sed sees nothing on the read-end (other than EOF).

> $ <word sed 20q

This one is easy: you're redirecting the input to `sed 20q` from the
file `word` in the current directory. This is semantically the same
as,

$ sed 20q < word

Except that the shell allows you to play some syntactic shenanigans
with putting the redirection before the command. This is simply a
consequence of how the command parser was written.

        - Dan C.


  parent reply	other threads:[~2017-11-09 15:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-09 14:34 Will Senn
2017-11-09 15:04 ` Ralph Corderoy
2017-11-09 15:38   ` Will Senn
2017-11-20 23:38     ` Ralph Corderoy
2017-11-09 15:30 ` Chet Ramey
2017-11-09 21:36   ` Dave Horsfall
2017-11-09 21:39     ` Chet Ramey
2017-11-09 15:30 ` Dan Cross [this message]
2017-11-09 15:42   ` Will Senn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEoi9W50vhvY6zHXS__eTCwSuXv5Z1RW4N4Ug2vLpdKf=kLTkg@mail.gmail.com' \
    --to=crossd@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).