rc-list - mailing list for the rc(1) shell
From: Carlo Strozzi <carlos@linux.it>
To: fosterd@hartwick.edu
Cc: rc@hawkwind.utcs.toronto.edu
Subject: Re: environment again
Date: Thu, 8 Jun 2000 03:19:16 -0400
Message-ID: <E12zway-0000Ko-00@localhost>

Decklin Foster wrote:

| > ; a=`{cat bigfile}
| > ; echo $a |wc -c
| Can you give a non-trivial example? This should be 'wc -c < bigfile',
| obviously; I'm wondering what the real problem you're working on is
| where you can't iron out the variable.

Oh, yes, of course that was just an example, but it is meant to
show that bloating the environment is bad. On the other hand, given
the fact that rc does not provide a 'read' builtin, I cannot really
see how to process a file line by line other than swallowing
it into memory all in one go and then using $*(n) to reference each
line in turn.

The point is that I use the shell to build the outer layer of
relatively large and complex Web applications. You may argue that I
should resort to a different language, but I would disagree. When it
comes to delivering applications in virtually no time, nothing beats
the shell. I have usually used Bourne-type shells for this, but now
that I know of rc I really like it and I would like to switch. Now,
this leads to the next, much more practical example of a real-world
problem that may lead to a bloated environment with rc:

Suppose you want to build a Web search engine like Altavista (not
that I built that particular one :-), that will show the results on
the output page in chunks of ten at a time. To render the results I
am using a web page template that contains a special tag marking
the point in the page where I want the final result to appear. This
is usually in the body of an html table, where each table row is one
search hit (like Altavista). Unlike Altavista though, suppose that
the output contains the complete hits, not just abstracts of them. As
you know, you can put such an html structure with all of its formatting
tags all on one single line, as the final rendering will not depend on
physical newlines but rather on the html tags themselves. Furthermore
each output hit can contain other hyperlinks, embedded images,
formatting tags and so on, so the shell variable that is going to
hold it may become pretty large. Then, if I want to use sed(1) to
substitute the special tag in the page template with such a result
string, I need to do something like:

  sed 's/__SPECIAL_TAG__/'^$my_big_var'/' page_template.html

which will send the final output to stdout, i.e. to the Web client.
Of course I must also provide for escaping any sed(1) special
characters in $my_big_var, so I need to make that variable known to
the shell for all this back-and-forth between different utilities
(that's what the shell is supposed to be: the "glue" between utilities).
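
A minimal sketch of that escaping step, written in POSIX sh rather than
rc so that it runs on any Bourne-type shell. The function names
escape_sed_replacement and fill_template are made up for illustration,
and the sketch assumes the value fits on a single line:

```shell
#!/bin/sh
# Hypothetical helper (not from any real tool): escape the characters that
# are special on the replacement side of sed's s/// command, namely the
# backslash, the '/' delimiter and '&'. Assumes a single-line value.
escape_sed_replacement() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# Hypothetical helper: read a template on stdin and replace every
# __SPECIAL_TAG__ with the (escaped) value given as the first argument.
fill_template() {
    sed "s/__SPECIAL_TAG__/$(escape_sed_replacement "$1")/g"
}

# Usage: a one-line template and a hit that contains '/', '&' and '\'.
printf '%s\n' '<td>__SPECIAL_TAG__</td>' |
    fill_template '<a href="/q?a=1&b=2">C:\tmp</a>'
```

The value still has to live in a shell variable while it is passed from
one utility to the next, which is exactly why its size matters.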

Although not easily, I could use an external file rather than
$my_big_var, and use 'sed -f', but that would require a few more i/o
operations in the CGI program. If the site handles a couple of million
hits a day I would really like to avoid that.
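
For comparison, a rough sketch of the 'sed -f' route, again in POSIX sh.
The function name fill_template_from_file is hypothetical, mktemp(1) is
assumed to be available, and the value file is assumed to hold a single
line. It creates, writes, reads and removes a temporary script file on
every request, which is the extra i/o I mean:

```shell
#!/bin/sh
# Hypothetical helper: the value lives in the file named by $1 instead of a
# shell variable; the template is read from stdin.
fill_template_from_file() {
    script=$(mktemp) || return 1
    # Turn the value into a one-line sed script: escape backslash, '/' and
    # '&', then wrap the result in s/__SPECIAL_TAG__/.../g.
    sed -e 's|[\\/&]|\\&|g' \
        -e 's|^|s/__SPECIAL_TAG__/|' \
        -e 's|$|/g|' "$1" > "$script"
    # Apply the generated script to the template on stdin.
    sed -f "$script"
    status=$?
    rm -f "$script"
    return $status
}
```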

Back to the possibility that you suggest I should resort to a different
language for such things, apart from fast go-to-market considerations
I can demonstrate that a lightweight shell + well-chosen utilities
can provide a faster application than other self-contained approaches
(there are books on that).  Unless you want to re-code the whole system
into your application, the very few times you will run an external
utility (like sendmail(8) for instance) you will need a system(3),
which will run a shell (possibly bash(1), which is ten times slower
than rc), so you had better just use the shell in the first place.

Sorry for the length, but I think that explaining things works
better than posting a few lines of obscure code :-)

As I said, I'm awful at C, but is it really that difficult to provide a
way of not exporting everything to the environment by default?
I think it would simply make rc a better interpreter.

Take care	--carlo
