rc-list - mailing list for the rc(1) shell
* Re: username expansion, gnu readline, and code bloat
@ 1991-07-02 17:25 Paul Haahr
  1991-07-04  8:48 ` David Hogan
  0 siblings, 1 reply; 5+ messages in thread
From: Paul Haahr @ 1991-07-02 17:25 UTC
  To: rc

David Hogan writes (excerpted ruthlessly)

> Well, I'll put my vote in.  I think username expansion _should_ be in rc.
> Maybe it should be #ifdef'd, but it should be available.  As for the
> issue of changing the language, well, username expansion need not change
> the language, merely add to it.

adding to it does change it.  #ifdef'ing means there would be different
dialects of the language, which is problematic.  if username expansion
does go into rc (which, as i've said, i strongly argue against) it should
go into all versions.  

> > 	- another special character is a bad idea.  [...]

> You'd get used to it.  After all, ~ is already special in rc.  I mean,
> should we remove ^ from rc because it's a special character?

no, but we shouldn't gratuitously add new lexical tokens.  and what should
be done about the old meaning for ~?  yet another context-dependent meaning?
i hope not.

> > 	- what makes this functionality special enough that it
> > 	  should be in the shell itself?  why is username expansion
> > 	  more important than, say, naming specific paths in my
> > 	  home directory, or important system directories?

> What makes shell functions special enough that they should be in the
> shell?  The answer is that they are convenient, and make it possible to
> do things that you couldn't otherwise do.

we agree.  on the other hand, it bothers me that i have to have a shell
script that corresponds to most of my shell functions for when they are
invoked by system().

>					     Just like username expansion.
> Username expansion saves me a lot of time every day, from having to type
> in long pathnames when I want to get to some program that is in someone
> else's bin.  Typing ~user is a way of life!  After 3 years of using shells
> which support it, I would not even consider using one which didn't.

job control is a way of life.  emacs is a way of life.  why is that the
ultimate defense?

>									As
> for the important system directories, well, look at the following lines
> from our passwd file: 

> s:*:7:7:System Source:/usr/src:
> man:*:8:1:0000-Admin(0000):/usr/local/man:
> i:*:11:11:Include Files:/usr/include:

why not
	s=/usr/src
	man=/usr/local/man
	i=/usr/include
and be done with it?  why is a ``user'' the appropriate concept for
this group of files?

> > 	- in general, i don't like ~ expansion because it is not
> > 	  recognized by the kernel; [...]

> Do you then hate backquote expansion because it's not in the kernel?
> What about globbing??  Come on, everybody, lets add globbing to
> the kernel! ;-)  No, symbolic expansion belongs to the shell, whether
> it be metacharacters in filenames or username expansion.

valid point.  but $ expansion and globbing are more general than just
a shorthand for a few specific path names.

> > [i suggest several alternatives:]

> > `{u user}
> Much too awkward to type.  Might as well be typing full pathnames.

agreed.

> > $u/user	# where $u is not a system wide directory
> This would waste more disk space than adding username expansion to
> rc, and waste more time searching the directory than would be used up
> loading the extra bytes in when you run rc.

actually, i doubt it would add much more disk space on any modern unix system
than the 130k cited earlier for yellow pages support.  (i know that you find
yp useless baggage, but sites running large numbers of diskless or dataless
workstations have very few reasonable alternatives.)  besides, the reason to
fear code bloat has little to do with disk space.

> > $user	# where $user is defined in the environment
> On our system, if you did that, you wouldn't have any environment space left.

(is that rc's fault? :-)  it's probably not a good idea to add that much
to your environment, regardless of whether your machine supports it.

> > (1) cartesian products

> Come, come, you can't be serious!  How often are you going to need such
> a feature?  Once in 5 years perhaps?

needed it:  so far, twice since i switched to rc as my full-time shell.  (six
months or so.)  wanted it: a dozen or so times more.  often i do an `{ls|egrep}
or somesuch where ^^ would have been far more convenient.

>					Lets restrict ourselves to putting
> _useful_ features into the shell (such as username expansion -- I use
> this every day) and not some feature that noone will ever need.

> > (2) ranges in variable subscripts
> > (3) backslash escapes

> Both (2) and (3) here are gratuitous changes to the syntax of rc.

you say gratuitous, i say necessary.  you say useful, i say code bloat.
you say no one, i say everyone.  the truth lies somewhere in between.

paul


* username expansion, gnu readline, and code bloat
@ 1991-06-30  3:23 Paul Haahr
  1991-07-01  9:37 ` Boyd Roberts
  1991-07-02  7:03 ` David Hogan
  0 siblings, 2 replies; 5+ messages in thread
From: Paul Haahr @ 1991-06-30  3:23 UTC
  To: boyd, john, noel, rc

John Mackin argues strongly for username expansion in the shell,
but disparages the ``foul stench'' of the gnu readline library.
other people argued vociferously enough for gnu readline that Byron
added conditional support for it.  the 4.4bsd people would like to
see rc include job control.  John prefers not to have an echo
builtin, but i use echo enough that i don't mind losing a little
flexibility for performance in that case.

i would argue that support for gnu readline() and whether echo is
a builtin are fundamentally different issues from username expansion,
because the former two do not alter the language rc accepts, whereas
username expansion changes its syntax.  that distinction makes a strong
case to me that either
username expansion should be in all versions of rc or in none: we
would like this shell to be the same language everywhere.  [it is
my real hope that the plan9 folks adopt Byron's improvements to
the language---his version of rc feels much cleaner than Tom Duff's
original.]

we're never going to have complete agreement on what features rc
should include, and what's too much bloat, or inappropriate for a
shell, or absolutely essential.  one of the most appealing things
for me about rc is its small size;  there's almost no fat, and i
probably use 90-95% of the language every week.  in order
to keep its size down, that probably means erring on the side of
leaving things out when there is doubt about their usefulness.

that said, i will put forth my own opinion that rc would suffer
from having username expansion added.  here's my reasoning:

	- code size costs.  John mentions that a local user hacked
	  a version of rc that did username expansion by what i
	  presume was a call to getpwnam and it more than doubled
	  the size of rc.  he then hacked a smaller version that
	  did the lookup itself.  my question:  does that smaller
	  version look in /etc/passwd? what does it do in systems
	  that use yellow pages?  how about next's netinfo?  getpwnam
	  is the only portable way i know of getting the information.

	- another special character is a bad idea.  one of the nice
	  things about rc is that i don't have to remember as many
	  special purpose characters as i did when using /bin/sh
	  not to mention all the things pre-empted by csh.  (remember
	  the bad old days of ! mail paths plus csh history---typing
	  \! enough was bad enough that i hacked a version of csh
	  together which expanded "!foo" to "!foo" if there was no
	  history event matching "foo")

	- what makes this functionality special enough that it
	  should be in the shell itself?  why is username expansion
	  more important than, say, naming specific paths in my
	  home directory, or important system directories?

	- in general, i don't like ~ expansion because it is not
	  recognized by the kernel; i can type that kind of filename
	  to some shells, some editors, maybe my debugger, not my
	  mailer, and certainly not a typical program that expects
	  a file name.  (this same argument applies to $ expansion,
	  but those are more clearly shell variables.)  if ~user
	  functionality is really desirable, i think the most
	  appropriate place for it would be somewhere it could be
	  universal, i.e., the kernel or the shared library that
	  contains open().  (this has real disadvantages, i know,
	  i just don't like that i can give one file name to some
	  tools but not to others.)

now, for people like John who really do miss having ~user
in a shell, and can't create some sort of centralized directory of
symbolic links to each user's home directory, what are the
alternatives?  i see 3 that i think are reasonable:

	- write a program named u that does a getpwnam() of its
	  arguments and prints out the home directories.  disadvantages:
	  costs more cycles than shell support does, `{u haahr} is
	  more to type than ~haahr, hard to integrate with command
	  completion support.

	- make a directory somewhere in your own directory tree
	  analogous to the /u Byron talked about, and have the
	  variable $u point to it.  $u/haahr is not bad to type.
	  disadvantages:  wastes inodes, can be out of date with
	  respect to the password file.  (though a cron script can
	  do a bit to resolve that issue.)

	- have a .rcrc that makes $user point to the appropriate
	  home directory.  disadvantages:  can be out of date,
	  wastes environment space.
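
[a minimal sketch of the first alternative, written in python rather
than c or rc.  the program name u comes from the list above; the
function names home_dirs and main, and the choice to report unknown
users on stderr, are assumptions made for illustration.  python's
pwd.getpwnam wraps getpwnam(3), so the lookup goes through whatever
the system's library consults rather than parsing /etc/passwd by hand.]

```python
#!/usr/bin/env python3
"""u: print the home directory of each user named on the command line."""
import pwd
import sys


def home_dirs(names):
    """Return (homes, unknown) for the given login names.

    homes holds the home directory of every name that was found, in
    argument order; unknown holds the names with no password entry.
    """
    homes, unknown = [], []
    for name in names:
        try:
            homes.append(pwd.getpwnam(name).pw_dir)
        except KeyError:  # the password database has no such user
            unknown.append(name)
    return homes, unknown


def main(names):
    homes, unknown = home_dirs(names)
    for home in homes:
        print(home)
    for name in unknown:
        print(f"u: {name}: no such user", file=sys.stderr)


if __name__ == "__main__":
    main(sys.argv[1:])
```

installed as u somewhere on the path, it would be invoked from rc as
`{u haahr}, with the costs already listed: a fork and exec per lookup,
and more typing than ~haahr.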

as far as readline knowing about ~ but not $home, it sounds to me
like someone should fix readline to look at environment variables
and send the change back to the fsf.  is there any reason $ expansion
shouldn't be in every version of readline, and an issue for the
authors of that routine, not for their clients?  [admittedly, this
would work particularly well with the $u scheme i suggested.]

--paul

!
mail -s 'feeping creaturism' \
	rc john@syd.dit.csiro.au boyd@prl.dec.com noel@cs.su.oz.au << '!'
[in my previous note, i argued vociferously against adding someone's pet
feature to rc.  proving myself as hypocritical as other human animals, i
now put forth my own wish list of extra features.]

(1) cartesian products

rc provides two forms of concatenation:  scalar and pairwise, by which i mean
	x^(1 2 3) -> x1 x2 x3
and
	(x y z)^(a b c) -> xa yb zc
respectively.  i miss the third obvious form, which would be a cross product:
	(x y z)^^(a b c) -> xa xb xc ya yb yc za zb zc
(this is how the csh {} operate.)

this functionality is by no means necessary, as doubly nested for loops can
be used to get the desired effect.  but no general ``cross'' function can be
written in rc because it is impossible to pass two separate lists to a function
and preserve the information of what came from each list, without some gross
hack of counting the number of elements in at least one of the lists.  what
i have done in the past is write a function that takes two variable names as
arguments and does the $ dereferencing itself.

thus, i propose allowing ^^ everywhere that ^ can appear.  (the necessary
productions are
	first	: first '^' '^' word	{ $$ = newnode(CROSS, $1, $4); }
	word	: word '^' '^' word	{ $$ = newnode(CROSS, $1, $4); }
and they do not introduce any syntactic conflicts.)  its meaning would be
that every element of the first list would be concatenated with every
element of the second list.

this can not break any existing rc code, since two '^' tokens in a row
currently raise a syntax error.
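
[to pin down the three forms, here is a small model of them in python,
with rc lists represented as python lists of strings.  the function
names concat and cross are made up for this sketch, the error messages
are placeholders, and returning an empty list when ^^ is given an empty
operand is an assumption; the proposal above does not say what should
happen in that case.]

```python
def concat(left, right):
    """Model of the existing ^: scalar if either side has one element,
    otherwise pairwise over two lists of equal length."""
    if not left or not right:
        raise ValueError("empty list in concatenation")
    if len(left) == 1:
        return [left[0] + r for r in right]
    if len(right) == 1:
        return [l + right[0] for l in left]
    if len(left) != len(right):
        raise ValueError("lists of different lengths in concatenation")
    return [l + r for l, r in zip(left, right)]


def cross(left, right):
    """Model of the proposed ^^: every element of left is concatenated
    with every element of right, varying the right-hand element fastest."""
    return [l + r for l in left for r in right]


if __name__ == "__main__":
    print(" ".join(concat(["x"], ["1", "2", "3"])))
    print(" ".join(concat(["x", "y", "z"], ["a", "b", "c"])))
    print(" ".join(cross(["x", "y", "z"], ["a", "b", "c"])))
```

the last line prints the nine words in the order given in the example
above, which is also the order a doubly nested for loop would produce
with the first list in the outer loop.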


(2) ranges in variable subscripts

there's no way currently entirely within rc to refer to a sequence
of elements in a list.  i would like ``...'' as a list subscript to
mean all the elements between the number before the ... and the number
after it.  if ... appears with no numbers before it, the range starts
at 1; with no following numbers, the range ends at $#.  a range from
a large number to a smaller one would be empty.  thus:

	; x=(a b c d e)
	; echo $x(2 ... 4)
	b c d
	; echo $x(4 ... 2)
	
	; echo $x(4 ...)
	d e
	; echo $x(6 ...)
	
	; echo $x(... 3)
	a b c
	; echo $x(...)
	a b c d e
	; 

... should not be recognized by the lexical analyzer.  it should be
a separate word from any of the numbers around it to make parsing
easier.  thus $x(1...3) would be a syntax error.  also, $x(1 ... 3 ... 5)
would be illegal in my book, but $x(1 ... 3 5 ... 9) would be ok.

... would not break existing code, as ... is currently illegal in
a subscript.  right now, to get this functionality i rely on an awk
script ``seq'' and do things like
	echo $x(`{seq 2 $#x})

with this feature, we could use
	*=$*(2 ...)
and get rid of the need for the shift builtin. (i guess proposal (2b)
would be to actually remove shift from the language.)
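
[a sketch of the subscript rules above as a python function, assuming
the subscript has already been split into words.  the name subscript
is hypothetical.  two details are assumptions rather than part of the
proposal: plain indices outside the list select nothing, and anything
that would chain two ranges together, such as 1 ... 3 ... 5, is
rejected with an error.]

```python
def subscript(values, words):
    """Select elements of values (1-based) using subscript words.

    words is the already-split subscript, e.g. ["2", "...", "4"].
    """
    selected = []
    i = 0
    while i < len(words):
        if words[i] == "...":                    # "... N" or a bare "..."
            start, dots = 1, i
        elif i + 1 < len(words) and words[i + 1] == "...":
            start, dots = int(words[i]), i + 1   # "M ... N" or "M ..."
        else:                                    # plain index
            index = int(words[i])
            if 1 <= index <= len(values):
                selected.append(values[index - 1])
            i += 1
            continue
        i = dots + 1
        if i < len(words) and words[i] != "...":
            end = int(words[i])                  # explicit upper bound
            i += 1
        else:
            end = len(values)                    # open-ended: stop at $#
        if i < len(words) and words[i] == "...":
            raise ValueError("chained ranges are not allowed")
        # the slice is empty when the range runs backwards or starts
        # past the end of the list
        lo = max(start, 1) - 1
        hi = max(0, min(end, len(values)))
        selected.extend(values[lo:hi])
    return selected


if __name__ == "__main__":
    x = ["a", "b", "c", "d", "e"]
    for sub in ("2 ... 4", "4 ... 2", "4 ...", "6 ...", "... 3", "..."):
        print(sub, "->", " ".join(subscript(x, sub.split())))
```

the shift replacement *=$*(2 ...) corresponds to
subscript(args, ["2", "..."]) in this model.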


(3) backslash escapes

i propose changing the way unquoted backslash is treated in rc.  currently,
\ (except when followed by a newline) is treated as any other character, and
i would like to change that to allow c-style \ escape sequences.  thus,
to get a newline in a string, i could just type
	echo foo\nbar
or, more clearly
	echo foo^\n^bar
i also propose that \ escape all special and meta characters in rc, it
being more convenient to type \* than '*'.

for newlines, this is far from necessary, because of rc's quoting rules.
\' is easier to read (imho) than '''', and it would be nice to have octal
constants, especially since my editor does not deal well with non-printing
characters.  (yes, i know this is not a good argument, and i should just
fix my editor)  but seeing \n or \r or \033 is probably clearer than
looking for some binary sequence, and typing it from the command line
is far easier.

anyway, this change would break some existing rc uses.  in particular,
	tex \\plain
and
	grep stdio\\.h *.c
are nicer if you don't have to quote \.  this also removes the nice claim
that rc has only one mechanism for quoting.
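
[a rough model of the proposed escape processing, in python, applied to
a single unquoted word.  the name unescape is made up, and the exact
set of letter escapes (\n, \r, \t here) and the limit of three octal
digits are assumptions borrowed from c; the proposal above only names
\n, \r, \033, \' and \* as examples.  backslash-newline continuation is
left out.]

```python
LETTER_ESCAPES = {"n": "\n", "r": "\r", "t": "\t"}
OCTAL_DIGITS = "01234567"


def unescape(word):
    """Expand c-style backslash escapes in one unquoted word.

    A backslash before any other character (including another
    backslash, a quote, or a glob character) yields that character
    literally.
    """
    out = []
    i = 0
    while i < len(word):
        c = word[i]
        if c != "\\":
            out.append(c)
            i += 1
            continue
        i += 1                                   # skip the backslash
        if i == len(word):
            raise ValueError("trailing backslash")
        c = word[i]
        if c in OCTAL_DIGITS:
            j = i
            while j < len(word) and j < i + 3 and word[j] in OCTAL_DIGITS:
                j += 1
            out.append(chr(int(word[i:j], 8)))   # e.g. \033 is escape
            i = j
        else:
            out.append(LETTER_ESCAPES.get(c, c))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(repr(unescape(r"foo\nbar")))
    print(repr(unescape(r"\\plain")))
    print(repr(unescape(r"stdio\\.h")))
```

the last two calls show the cost mentioned above: the tex and grep
arguments need a doubled backslash to get a single one through.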


please poke holes in these suggestions.  i'm especially wary about #3.
better syntax anywhere?  am i proposing inappropriate features?

paul



