The Unix Heritage Society mailing list
* [TUHS] redirection wildness in v7
@ 2017-11-09 14:34 Will Senn
  2017-11-09 15:04 ` Ralph Corderoy
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Will Senn @ 2017-11-09 14:34 UTC


Why does the first of these incantations not present text, but the 
second does (word is a file)? Neither errors out.

$ <word | sed 20q
$ <word sed 20q

Thanks,

Will

-- 
GPG Fingerprint: 68F4 B3BD 1730 555A 4462  7D45 3EAA 5B6D A982 BAAF




* [TUHS] redirection wildness in v7
  2017-11-09 14:34 [TUHS] redirection wildness in v7 Will Senn
@ 2017-11-09 15:04 ` Ralph Corderoy
  2017-11-09 15:38   ` Will Senn
  2017-11-09 15:30 ` Chet Ramey
  2017-11-09 15:30 ` Dan Cross
  2 siblings, 1 reply; 9+ messages in thread
From: Ralph Corderoy @ 2017-11-09 15:04 UTC


Hi Will,

> Why does the first of these incantations not present text, but the
> second does (word is a file)? Neither errors out.
>
> $ <word | sed 20q
> $ <word sed 20q

That's still the case with modern-day sh(1).

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
explains that a simple command doesn't need to result in a command name
to execute.  In your first pipeline, nothing copies the data from the
first subshell's stdin, redirected from ./word, to that subshell's
stdout, which is piped into sed's stdin.  Adding a command to do the
copy works.

    <word cat | sed 20q

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy



* [TUHS] redirection wildness in v7
  2017-11-09 14:34 [TUHS] redirection wildness in v7 Will Senn
  2017-11-09 15:04 ` Ralph Corderoy
@ 2017-11-09 15:30 ` Chet Ramey
  2017-11-09 21:36   ` Dave Horsfall
  2017-11-09 15:30 ` Dan Cross
  2 siblings, 1 reply; 9+ messages in thread
From: Chet Ramey @ 2017-11-09 15:30 UTC


On 11/9/17 9:34 AM, Will Senn wrote:
> Why does the first of these incantations not present text, but the second
> does (word is a file)? Neither errors out.
> 
> $ <word | sed 20q

A null command consisting of only a redirection (effectively a no-op).

> $ <word sed 20q

Equivalent to 'sed 20q < word', which has the obvious meaning.


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://cnswww.cns.cwru.edu/~chet/



* [TUHS] redirection wildness in v7
  2017-11-09 14:34 [TUHS] redirection wildness in v7 Will Senn
  2017-11-09 15:04 ` Ralph Corderoy
  2017-11-09 15:30 ` Chet Ramey
@ 2017-11-09 15:30 ` Dan Cross
  2017-11-09 15:42   ` Will Senn
  2 siblings, 1 reply; 9+ messages in thread
From: Dan Cross @ 2017-11-09 15:30 UTC


On Thu, Nov 9, 2017 at 9:34 AM, Will Senn <will.senn at gmail.com> wrote:
> Why does the first of these incantations not present text, but the second
> does (word is a file)? Neither errors out.

When dealing with obscure pipeline surprises, I find it best to try
and reason through what's happening, step-by-step.

> $ <word | sed 20q

Well, here you have (at least) two processes: sed, and another (more
on that in a second). The standard input of `sed` is connected to the
read-end of a pipe; the write-end of that pipe is connected to that
unnamed process. The unnamed process's input is connected to the file
`word` in the current directory.

So...what's that unnamed process? To figure out what's going on, we'll
have to look at the source for the shell itself. In this case, the
answer to the mystery lies in /usr/src/cmd/sh/xec.c, in the (extremely
long) function `execute`. This is the heart of the shell's
interpreter. Basically, the string entered by the user has been parsed
into an in-memory data structure (essentially an abstract syntax
tree), and this function is responsible for interpreting that data
structure and invoking the entered commands. It is called recursively;
you'll note that the bulk of it is a switch statement on the token
type for the current node in the syntax tree.

The magic here is in the case for TFIL, which sets up a pipe between
the left-hand side of the tree and the right-hand side.

In this case, the left-hand-side is an empty command; basically,
equivalent to typing return at the shell prompt. Note that that just
happens to be perfectly syntactically valid, so the parser doesn't
generate an error. We may reasonably assume that this results in a
left-hand child of the pipe node in the AST that is set with TFORK and
an empty command vector: indeed, we see this in the `term` function in
cmd.c.

In the context of a pipeline, a new copy of the shell *is* forked for
this, and if we follow the TFORK logic, we can see that, in the
handling of the child, we do things like setting up pipes and I/O
redirection and then either execute a builtin command, or *if the
command is not empty* we execute the command via `execa`. The empty
command is not a builtin, and since it is empty there is nothing to
exec, so we don't do anything in the child other than exit.

Thus, the empty command produces no output, despite having its input
redirected from a file, and therefore nothing is put into the write-end
of the pipe and sed sees nothing on the read-end (other than EOF).
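
The behaviour is easy to reproduce. Here is a minimal sketch, assuming a
POSIX-style sh (dash or bash, say) rather than the v7 shell itself. The
scratch directory and the three-line `word` file are made up for the
demonstration. Some shells substitute a default command for a bare
redirection (zsh's NULLCMD, for one) and will behave differently.

```sh
# Hypothetical scratch directory and sample file, for illustration only.
tmp=${TMPDIR:-/tmp}/nullcmd-demo.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1
printf 'alpha\nbeta\ngamma\n' > word

# Left side of the pipe is a null command: its stdin is word, but nothing
# reads it, so nothing is ever written into the pipe.
empty=$(<word | sed 20q)

# With cat on the left side there is a process to copy stdin to stdout.
copied=$(<word cat | sed 20q)

printf 'null command: [%s]\n' "$empty"
printf 'with cat:     [%s]\n' "$copied"
```

The first capture should come back empty and the second should hold the
contents of `word`.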

> $ <word sed 20q

This one is easy: you're redirecting the input to `sed 20q` from the
file `word` in the current directory. This is semantically the same
as,

$ sed 20q < word

Except that the shell allows you to play some syntactic shenanigans
with putting the redirection before the command. This is simply a
consequence of how the command parser was written.
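
A quick way to check that the position of the redirection does not
matter, sketched under the assumption of a POSIX-style sh; the scratch
directory and file contents are invented for the example.

```sh
# Hypothetical scratch directory and sample file, for illustration only.
tmp=${TMPDIR:-/tmp}/redir-demo.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1
printf 'one\ntwo\nthree\n' > word

# The same simple command with the redirection in three different places.
before=$(<word sed 2q)
after=$(sed 2q <word)
middle=$(sed <word 2q)

if [ "$before" = "$after" ] && [ "$after" = "$middle" ]; then
    echo "all three forms agree"
fi
```

In each case the shell strips the redirection out of the word list before
running sed, so sed only ever sees the argument `2q`.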

        - Dan C.



* [TUHS] redirection wildness in v7
  2017-11-09 15:04 ` Ralph Corderoy
@ 2017-11-09 15:38   ` Will Senn
  2017-11-20 23:38     ` Ralph Corderoy
  0 siblings, 1 reply; 9+ messages in thread
From: Will Senn @ 2017-11-09 15:38 UTC


Hi Ralph,

This is a good answer. I thought it was great until I saw Dan's :).

I didn't realize that the Open Group standard was online and accessible.
Thanks for the link.

Will

On 11/09/2017 09:04 AM, Ralph Corderoy wrote:
> Hi Will,
>
>> Why does the first of these incantations not present text, but the
>> second does (word is a file)? Neither errors out.
>>
>> $ <word | sed 20q
>> $ <word sed 20q
> That's still the case with modern-day sh(1).
>
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01
> explains that a simple command doesn't need to result in a command name
> to execute.  In your first pipeline, nothing copies the data from the
> first subshell's stdin, redirected from ./word, to that subshell's
> stdout, which is piped into sed's stdin.  Adding a command to do the
> copy works.
>
>      <word cat | sed 20q
>




* [TUHS] redirection wildness in v7
  2017-11-09 15:30 ` Dan Cross
@ 2017-11-09 15:42   ` Will Senn
  0 siblings, 0 replies; 9+ messages in thread
From: Will Senn @ 2017-11-09 15:42 UTC




On 11/09/2017 09:30 AM, Dan Cross wrote:
> So...what's that unnamed process? To figure out what's going on, we'll
> have to look at the source for the shell itself. In this case, the
> answer to the mystery lies in /usr/src/cmd/sh/xec.c, in the (extremely
> long) function `execute`. This is the heart of the shell's
> interpreter. Basically, the string entered by the user has been parsed
> into an in-memory data structure (essentially an abstract syntax
> tree), and this function is responsible for interpreting that data
> structure and invoking the entered commands. It is called recursively;
> you'll note that the bulk of it is a switch statement on the token
> type for the current node in the syntax tree.
>
> The magic here is in the case for TFIL, which sets up a pipe between
> the left-hand side of the tree and the right-hand side.
>
> In this case, the left-hand-side is an empty command; basically,
> equivalent to typing return at the shell prompt. Note that that just
> happens to be perfectly syntactically valid, so the parser doesn't
> generate an error. We may reasonably assume that this results in a
> left-hand child of the pipe node in the AST that is set with TFORK and
> an empty command vector: indeed, we see this in the `term` function in
> cmd.c.
>
> In the context of a pipeline, a new copy of the shell *is* forked for
> this, and if we follow the TFORK logic, we can see that, in the
> handling of the child, we do things like setting up pipes and I/O
> redirection and then either execute a builtin command, or *if the
> command is not empty* we execute the command via `execa`. The empty
> command is not a builtin, and since it is empty there is nothing to
> exec, so we don't do anything in the child other than exit.
>
> Thus, the empty command produces no output, despite having its input
> redirected from a file, and therefore nothing is put into the write-end
> of the pipe and sed sees nothing on the read-end (other than EOF).
>

Dan,
This is a great answer. I appreciate the detail and the source pointers 
(although that code is pretty complicated). My mental model is 
significantly enhanced, if not totally accurate, by your description. 
I've seen some other constructs along these same lines and they make 
more sense now - such as:

 > myfile

to create a file.
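
A small sketch of that construct, assuming a POSIX-style sh with default
options (no noclobber); the scratch directory is invented for the
example.

```sh
# Hypothetical scratch directory, for illustration only.
tmp=${TMPDIR:-/tmp}/create-demo.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1

> myfile                      # no command, just the redirection
if [ -f myfile ]; then created=yes; else created=no; fi

echo 'some text' > myfile     # give the file some contents
> myfile                      # the same construct now truncates it
size=$(wc -c < myfile)

echo "created: $created, bytes after truncation: $size"
```

The shell opens the file for writing as part of setting up the
redirection, and with no command to run that open (create or truncate)
is all that happens.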





* [TUHS] redirection wildness in v7
  2017-11-09 15:30 ` Chet Ramey
@ 2017-11-09 21:36   ` Dave Horsfall
  2017-11-09 21:39     ` Chet Ramey
  0 siblings, 1 reply; 9+ messages in thread
From: Dave Horsfall @ 2017-11-09 21:36 UTC


On Thu, 9 Nov 2017, Chet Ramey wrote:

>> $ <word | sed 20q
>
> A null command consisting of only a redirection (effectively a no-op).

With the side-effect of opening "word" (which could be a device).

-- 
Dave Horsfall DTM (VK2KFU)  "Those who don't understand security will suffer."



* [TUHS] redirection wildness in v7
  2017-11-09 21:36   ` Dave Horsfall
@ 2017-11-09 21:39     ` Chet Ramey
  0 siblings, 0 replies; 9+ messages in thread
From: Chet Ramey @ 2017-11-09 21:39 UTC


On 11/9/17 4:36 PM, Dave Horsfall wrote:
> On Thu, 9 Nov 2017, Chet Ramey wrote:
> 
>>> $ <word | sed 20q
>>
>> A null command consisting of only a redirection (effectively a no-op).
> 
> With the side-effect of opening "word" (which could be a device).

Quite true. The redirection is the only thing that happens.
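
One way to see that the open really is attempted, sketched under the
assumption of a POSIX-style sh: the null command succeeds when the file
can be opened and fails when it cannot. The scratch directory and the
file names `word` and `missing` are made up for the example.

```sh
# Hypothetical scratch directory and sample file, for illustration only.
tmp=${TMPDIR:-/tmp}/open-demo.$$
mkdir -p "$tmp" && cd "$tmp" || exit 1
printf 'x\n' > word

# Run each null command in a subshell so that a failed redirection
# cannot affect the calling shell; the error message is discarded.
( <word ) 2>/dev/null
present_status=$?
( <missing ) 2>/dev/null
missing_status=$?

echo "existing file: $present_status, missing file: $missing_status"
```

The exact non-zero status for the missing file varies between shells, so
only zero versus non-zero is meaningful here.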

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    chet at case.edu    http://cnswww.cns.cwru.edu/~chet/



* [TUHS] redirection wildness in v7
  2017-11-09 15:38   ` Will Senn
@ 2017-11-20 23:38     ` Ralph Corderoy
  0 siblings, 0 replies; 9+ messages in thread
From: Ralph Corderoy @ 2017-11-20 23:38 UTC


> This is a good answer. I thought it was great until I saw Dan's :).

I'm here to relive the 70s.  Not last week.
Admin, a purge of the list moderators is required.  They're tripping.



Thread overview: 9+ messages
2017-11-09 14:34 [TUHS] redirection wildness in v7 Will Senn
2017-11-09 15:04 ` Ralph Corderoy
2017-11-09 15:38   ` Will Senn
2017-11-20 23:38     ` Ralph Corderoy
2017-11-09 15:30 ` Chet Ramey
2017-11-09 21:36   ` Dave Horsfall
2017-11-09 21:39     ` Chet Ramey
2017-11-09 15:30 ` Dan Cross
2017-11-09 15:42   ` Will Senn
