rc-list - mailing list for the rc(1) shell
* Re: rc futures
@ 1999-12-15 16:32 Russ Cox
  2000-01-05 11:59 ` Dynamically loading readline on demand (was Re: rc futures) Tim Goodwin
  0 siblings, 1 reply; 8+ messages in thread
From: Russ Cox @ 1999-12-15 16:32 UTC (permalink / raw)
  To: rc

> 23. Dynamically load readline only when rc is about to read from a
> terminal device.  This would mean that a single rc binary would be
> lean and fast for scripts, but still do readline for interactive use.
> However, I suspect that the effort involved in making this happen
> portably would be considerable.
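
The idea in item 23 can be sketched in a few lines.  This is only an
illustration in Python, not rc's code: the function name
load_readline_if_tty is made up, and ctypes.CDLL stands in for a direct
dlopen() call.  The library is looked up and mapped only when the input
descriptor is a terminal, so a script run never pays for it.

```python
import ctypes
import ctypes.util
import os


def load_readline_if_tty(fd):
    """Return a handle to the shared readline library if fd is a terminal.

    Returns None for non-terminal input, or when no loadable shared
    readline is found (hypothetical helper, for illustration only).
    """
    if not os.isatty(fd):
        return None                  # script or pipe: never touch the library
    path = ctypes.util.find_library("readline")
    if path is None:
        return None                  # no shared readline installed
    try:
        return ctypes.CDLL(path)     # the dlopen() happens only on this path
    except OSError:
        return None                  # library present but not loadable


if __name__ == "__main__":
    handle = load_readline_if_tty(0)
    print("readline loaded" if handle else "plain input")
```

The portability worry raised above shows up even here: the library name,
its dependencies, and whether it can be opened at run time all differ
between systems, which is why every failure falls back to plain input.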

What operating system doesn't demand-load the binaries anyway?
If you're not using the readline code, (the large majority of) it won't
be resident in memory.

And furthermore surely your OS is sharing text pages, so as long
as there is one rc binary running on a terminal, you've already got
it loaded and there's no penalty for running more or for running
scripts.

As Byron said, the mind boggles.

Russ




* Re: Dynamically loading readline on demand (was Re: rc futures)
  2000-01-05 11:59 ` Dynamically loading readline on demand (was Re: rc futures) Tim Goodwin
@ 2000-01-01  0:22   ` Paul Haahr
  2000-01-14 16:26     ` Tim Goodwin
  2000-01-13  0:19   ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 8+ messages in thread
From: Paul Haahr @ 2000-01-01  0:22 UTC (permalink / raw)
  To: Tim Goodwin; +Cc: rc

Tim Goodwin wrote
> I took an rc script that does nothing (makes no system calls) except
> fork() and wait() 10000 times.

Are you sure it doesn't exec at all?  Most of the costs associated with
dynamic loading would correspond with execs.  My comments below work on
the assumption that an exec of rc is included in your timing tests.  If
it wasn't, I'd be curious to see your script.

> Here are the results I got on my 200MHz Linux PC (us = microseconds).
> 
>     rc.static      744us
>     rc.rl.static   865us
>     rc.dynamic    1071us
>     rc.rl.dynamic 1442us

I guess it depends on what your test measures.  A ~30% performance
penalty doesn't seem like that much if it's saving enough by sharing the
library with other clients.
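
How the ~30% figure was derived is not stated.  As a sketch, the quoted
numbers can be compared in two ways, and the result depends on which
variant is taken as the baseline.  The helper names below are
hypothetical; the timings are the four from the table.

```python
# Per-loop times in microseconds, copied from the quoted table.
TIMES_US = {
    "rc.static": 744,
    "rc.rl.static": 865,
    "rc.dynamic": 1071,
    "rc.rl.dynamic": 1442,
}


def slowdown_pct(slow, fast):
    """Extra time taken by `slow`, as a percentage of `fast`."""
    return (TIMES_US[slow] / TIMES_US[fast] - 1) * 100


def saving_pct(fast, slow):
    """Time saved by `fast`, as a percentage of `slow`."""
    return (1 - TIMES_US[fast] / TIMES_US[slow]) * 100


if __name__ == "__main__":
    # Dynamic measured against static: roughly 44% slower.
    print(f"{slowdown_pct('rc.dynamic', 'rc.static'):.1f}% slower")
    # Static measured against dynamic: roughly 31% less time.
    print(f"{saving_pct('rc.static', 'rc.dynamic'):.1f}% saved")
```

Read the second way, with the dynamic binary as the baseline, the
difference is close to the ~30% mentioned here.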

> ([...]  Why does dynamic linking increase the user time?  I have no
> idea, but it reliably does on both these platforms.)

(If you really care, see John Levine's excellent _Linkers and Loaders_
book, especially chapters 9 and 10.  Available at

  http://www.amazon.com/exec/obidos/ASIN/1558604960/paulhaahrA

and finer tech bookstores.  The short answer is ``because dynamic
linking is done with user-space code.'')

> My conclusion from this is that using readline is an insignificant
> (although measurable) performance hit.  But dynamic linking is always
> a bad idea.

No.  Dynamic linking slows startup time down with the gain of using
less memory overall.  It also makes systems more flexible, allowing the
upgrade of library code without explicit relinking of applications,
which is a big win.

> I therefore intend to drop my suggestion that rc should itself load
> readline on demand, and instead to make the following minor changes
> for rc-1.6b3.
> 
> 1. Modify the INSTALL documentation; we should always recommend linking
> statically.
> 
> 2. Modify configure.in, so that, if we are using gcc, `-static' is
> specified.

I disagree.  Overall system resource consumption still goes down with
dynamic linking and loading.  If a system is set up to use dynamic
linking by default, why should rc override that?

A number of us argued squarely against dynamic linking in the early '90s,
at least partially because dynamic linking encouraged software bloat.
The bloatists won because techniques like dynamic linking (along with
Moore's law) made the costs of bloat minimal.  (And, when it comes down
to it, bloat in libraries is probably a good thing.)

--p



* Dynamically loading readline on demand (was Re: rc futures)
  1999-12-15 16:32 rc futures Russ Cox
@ 2000-01-05 11:59 ` Tim Goodwin
  2000-01-01  0:22   ` Paul Haahr
  2000-01-13  0:19   ` Jeremy Fitzhardinge
  0 siblings, 2 replies; 8+ messages in thread
From: Tim Goodwin @ 2000-01-05 11:59 UTC (permalink / raw)
  To: rc

> > 23. Dynamically load readline only when rc is about to read from a
> > terminal device.

> What operating system doesn't demand-load the binaries anyway?
> If you're not using the readline code, (the large majority of) it won't
> be resident in memory.

Sure, but it still slows down fork().  My guess is that this is because
of the extra page table entries to be managed.  I swear I'm not making
this up, and I have data to prove it.  But first, let's examine the
status quo.

Specifying `--with-readline' to the rc configure script ends up simply
appending `-lreadline' to the link command line (and maybe `-ltermcap'
or similar).  This will link rc against readline either dynamically, if
the linker finds a libreadline.so, or statically, if the linker only
finds libreadline.a or you said `-static' or similar.

The INSTALL documentation has this advice to offer.

                              It is a good idea to specify static linking
    (unless you are linking against a dynamic readline library)---if you are
    using gcc, you can do this by setting `CFLAGS=-static'.

My assumption when I wrote this was that readline is big enough that
it's always worth linking against it dynamically if you can.  That was a
clear violation of the rule: "profile, don't speculate!", so now let me
make amends...

I took an rc script that does nothing (makes no system calls) except
fork() and wait() 10000 times.  Dividing the total CPU time to run the
script by 10000 gives an average loop time.

There are two variables of interest: static versus dynamic linking,
and no-readline versus readline; giving four variants of rc to try.
(Remember that rc always needs the C library.)

Here are the results I got on my 200MHz Linux PC (us = microseconds).

    rc.static      744us
    rc.rl.static   865us
    rc.dynamic    1071us
    rc.rl.dynamic 1442us

(The raw data, together with file sizes, are appended.  Lest you think
that this is a quirk of Linux, I have even more dramatic results from
SunOS 5.  Why does dynamic linking increase the user time?  I have no
idea, but it reliably does on both these platforms.)

My conclusion from this is that using readline is an insignificant
(although measurable) performance hit.  But dynamic linking is always a
bad idea.

I therefore intend to drop my suggestion that rc should itself load
readline on demand, and instead to make the following minor changes
for rc-1.6b3.

1. Modify the INSTALL documentation; we should always recommend linking
statically.

2. Modify configure.in, so that, if we are using gcc, `-static' is
specified.

Any comments?

Tim.


   text	   data	    bss	    dec	    hex	filename
 347773	  13204	  16008	 376985	  5c099	rc.static
12.58user 61.78system 1:14.52elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (100040major+2150702minor)pagefaults 0swaps

   text	   data	    bss	    dec	    hex	filename
 427989	  25748	  19500	 473237	  73895	rc.rl.static
13.07user 73.42system 1:26.66elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (200044major+2281366minor)pagefaults 0swaps

   text	   data	    bss	    dec	    hex	filename
  65580	   8492	  11360	  85432	  14db8	rc.dynamic
21.16user 85.95system 1:47.22elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (200100major+2449872minor)pagefaults 0swaps

   text	   data	    bss	    dec	    hex	filename
  67213	   8532	  11552	  87297	  15501	rc.rl.dynamic
30.07user 114.15system 2:24.65elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (200117major+2827443minor)pagefaults 0swaps
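
As a check on the arithmetic, the per-loop figures can be recomputed from
the user and system times above.  One caveat, which is an inference from
the data and not something stated in the message: the table is reproduced
by dividing by 100000 iterations, not 10000, which would also fit the
roughly 100000 major page faults reported for rc.static.  The names in
this sketch are made up for illustration.

```python
# (user, system) CPU seconds, copied from the time(1) output above.
RAW_SECONDS = {
    "rc.static": (12.58, 61.78),
    "rc.rl.static": (13.07, 73.42),
    "rc.dynamic": (21.16, 85.95),
    "rc.rl.dynamic": (30.07, 114.15),
}

# Assumption: inferred from the data, not stated in the message.
ITERATIONS = 100_000


def per_loop_us(user, system, iterations=ITERATIONS):
    """Average CPU time per loop, in microseconds."""
    return (user + system) / iterations * 1e6


if __name__ == "__main__":
    for name, (user, system) in RAW_SECONDS.items():
        print(f"{name:14s} {per_loop_us(user, system):5.0f}us")
```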



* RE: Dynamically loading readline on demand (was Re: rc futures)
  2000-01-05 11:59 ` Dynamically loading readline on demand (was Re: rc futures) Tim Goodwin
  2000-01-01  0:22   ` Paul Haahr
@ 2000-01-13  0:19   ` Jeremy Fitzhardinge
  1 sibling, 0 replies; 8+ messages in thread
From: Jeremy Fitzhardinge @ 2000-01-13  0:19 UTC (permalink / raw)
  To: Tim Goodwin; +Cc: rc


On 05-Jan-00 Tim Goodwin wrote:
> Here are the results I got on my 200MHz Linux PC (us = microseconds).
> 
>     rc.static      744us
>     rc.rl.static   865us
>     rc.dynamic    1071us
>     rc.rl.dynamic 1442us
> 
> (The raw data, together with file sizes, are appended.  Lest you think
> that this is a quirk of Linux, I have even more dramatic results from
> SunOS 5.  Why does dynamic linking increase the user time?  I have no
> idea, but it reliably does on both these platforms.)

I had noticed the same thing.  I tend to build two versions of rc: "rci" for
interactive use, which contains readline, and plain "rc" which is minimal and
statically linked, for scripts.  I've never noticed a real-world performance
difference between them.

        J



* Re: Dynamically loading readline on demand (was Re: rc futures)
  2000-01-01  0:22   ` Paul Haahr
@ 2000-01-14 16:26     ` Tim Goodwin
  2000-01-14 19:18       ` Paul Haahr
  0 siblings, 1 reply; 8+ messages in thread
From: Tim Goodwin @ 2000-01-14 16:26 UTC (permalink / raw)
  To: Paul Haahr; +Cc: rc

> > I took an rc script that does nothing (makes no system calls) except
> > fork() and wait() 10000 times.
> 
> Are you sure it doesn't exec at all?

You tell me.

    for (i in 0 1 2 3 4 5 6 7 8 9)
    ...
    for (m in 0 1 2 3 4 5 6 7 8 9)
    @{ ~ 0 0 }

According to strace on my Linux box, each loop calls fork(), close(),
rt_sigaction() twice, _exit(), and wait().
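
For anyone who wants to try a similar measurement without rc, here is a
rough Python analogue.  It is a sketch only: it is not the script above,
the function names are hypothetical, and the absolute numbers will
reflect the cost of forking a Python interpreter, not a shell.  Each
iteration forks a child that exits at once without an exec, the parent
reaps it, and the combined CPU time is averaged over the iterations.

```python
import os
import resource


def cpu_seconds():
    """User plus system CPU time of this process and its reaped children."""
    total = 0.0
    for who in (resource.RUSAGE_SELF, resource.RUSAGE_CHILDREN):
        usage = resource.getrusage(who)
        total += usage.ru_utime + usage.ru_stime
    return total


def time_fork_wait(iterations):
    """Fork and reap `iterations` children; return CPU seconds per loop."""
    start = cpu_seconds()
    for _ in range(iterations):
        pid = os.fork()
        if pid == 0:
            os._exit(0)         # child: exit immediately, no exec
        os.waitpid(pid, 0)      # parent: reap the child before continuing
    return (cpu_seconds() - start) / iterations


if __name__ == "__main__":
    per_loop = time_fork_wait(1000)
    print(f"{per_loop * 1e6:.0f}us per fork/wait loop")
```

Running it under strace is a quick way to confirm that no exec appears in
the loop.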

> > ([...]  Why does dynamic linking increase the user time?

>                             The short answer is ``because dynamic
> linking is done with user-space code.'')

Yeah: in crt0.  But (I'm sure) that isn't involved here.  I'd expect
fork() to take more *system* time (since there are more MAP_SHARED page
table entries to fiddle with), but I don't understand the increase in
user time.

> I disagree.  Overall system resource consumption still goes down with
> dynamic linking and loading.  If a system is set up to use dynamic
> linking by default, why should rc override that?

You're right.  I wish you'd stop doing that :-).

Forcing static linking down people's throats is Not Nice.  Particularly
because (at least with the present setup) it's dead easy to turn static
linking on, but it would be quite hard to turn it off.  (For comparison,
see how easy it isn't to turn off `-Wall' if you're using gcc.)

I still intend to advocate it in the INSTALL document, though.

Tim.



* Re: Dynamically loading readline on demand (was Re: rc futures)
  2000-01-14 16:26     ` Tim Goodwin
@ 2000-01-14 19:18       ` Paul Haahr
  0 siblings, 0 replies; 8+ messages in thread
From: Paul Haahr @ 2000-01-14 19:18 UTC (permalink / raw)
  To: Tim Goodwin; +Cc: rc

Tim Goodwin wrote, replying to me, replying to him:
> > > I took an rc script that does nothing (makes no system calls) except
> > > fork() and wait() 10000 times.
> > 
> > Are you sure it doesn't exec at all?
> 
> You tell me.
> 
>     for (i in 0 1 2 3 4 5 6 7 8 9)
>     ...
>     for (m in 0 1 2 3 4 5 6 7 8 9)
>     @{ ~ 0 0 }
> 
> According to strace on my Linux box, each loop calls fork(), close(),
> rt_sigaction() twice, _exit(), and wait().

Fascinating.  My guess was definitely wrong.

> > > ([...]  Why does dynamic linking increase the user time?
> > The short answer is ``because dynamic
> > linking is done with user-space code.'')
> 
> Yeah: in crt0.  But (I'm sure) that isn't involved here.  I'd expect
> fork() to take more *system* time (since there are more MAP_SHARED page
> table entries to fiddle with), but I don't understand the increase in
> user time.

The only thing I can guess is that using the PIC version of the code in
the shared libraries and the shared library calling sequences is hurting
much more than I would have expected.  (Chapter 8 of Levine's book goes
into the issues here.)  My guess would have been at most a 10% hit, not
the 30% you're reporting.  Detailed profiling might shed more light.

--p



* Re: Dynamically loading readline on demand (was Re: rc futures)
@ 2000-01-14 11:11 Bengt Kleberg
  0 siblings, 0 replies; 8+ messages in thread
From: Bengt Kleberg @ 2000-01-14 11:11 UTC (permalink / raw)
  To: haahr, tjg; +Cc: rc

> From: Paul Haahr <haahr@jivetech.com>

> > Tim Goodwin wrote

> > Here are the results I got on my 200MHz Linux PC (us = microseconds).
 
> >     rc.static      744us
> >     rc.rl.static   865us
> >     rc.dynamic    1071us
> >     rc.rl.dynamic 1442us

> A ~30% performance
> penalty doesn't seem like that much if it's saving enough by sharing the
> library with other clients.

I would disagree here.  I start rc scripts from wily all the time.  They
are really short, and my SS2 needs all the help it can get to make things
faster.

Of course, I do not use readline and I run the scripts in sequence, so the
memory savings would not be great.  (Or would they?  What sizes accompany
rc.static, rc.dynamic, and so on?)

> bloat in libraries is probably a good thing

Yes, I agree here.  But a shell is special (or my usage of rc is
special :-): it is started lots of times, finishes quickly, and is not run
many times in parallel.


Best Wishes, Bengt
===============================================================
Everything aforementioned should be regarded as totally private
opinions, and nothing else. bengt@softwell.se
``His great strength is that he is uncompromising. It would make
him physically ill to think of programming in C++.''



* Re: Dynamically loading readline on demand (was Re: rc futures)
@ 2000-01-07 10:38 Bengt Kleberg
  0 siblings, 0 replies; 8+ messages in thread
From: Bengt Kleberg @ 2000-01-07 10:38 UTC (permalink / raw)
  To: rc, tjg

> 1. Modify the INSTALL documentation; we should always recommend linking statically.

> 2. Modify configure.in, so that, if we are using gcc, `-static' is specified.

Yes. To both.



Thread overview: 8+ messages
1999-12-15 16:32 rc futures Russ Cox
2000-01-05 11:59 ` Dynamically loading readline on demand (was Re: rc futures) Tim Goodwin
2000-01-01  0:22   ` Paul Haahr
2000-01-14 16:26     ` Tim Goodwin
2000-01-14 19:18       ` Paul Haahr
2000-01-13  0:19   ` Jeremy Fitzhardinge
2000-01-07 10:38 Bengt Kleberg
2000-01-14 11:11 Bengt Kleberg
