9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] plan or side effect
@ 2002-03-06 10:24 geoff
  2002-03-07  9:56 ` Douglas A. Gwyn
  0 siblings, 1 reply; 31+ messages in thread
From: geoff @ 2002-03-06 10:24 UTC (permalink / raw)
  To: 9fans

Unfortunately
	#define StrEq(a,b) (*(a)==*(b) && strcmp((a)+1,(b)+1)==0)
is in general incorrect: *a can equal *b and they can both be NUL
bytes, unless you know for certain that one argument will always be a
(preferably constant) non-empty string.  A correct version is
	#define STREQ(a,b) (*(a)==*(b) && strcmp((a),(b))==0)
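
A small sketch of the difference in portable C, for illustration only; the helper names are made up here and do not come from the thread.  With two empty strings the first bytes match (both are NUL), so StrEq goes on to call strcmp one byte past each terminator, which lies outside both strings.  STREQ still starts strcmp at the beginning, so it stays in bounds.

```c
#include <string.h>

/* geoff's corrected macro: the first-byte test is only a fast path,
 * and strcmp still starts at the beginning of both strings. */
#define STREQ(a,b) (*(a)==*(b) && strcmp((a),(b))==0)

/* Hypothetical helper: would StrEq's strcmp((a)+1,(b)+1) start
 * outside the strings?  That happens exactly when both are empty. */
int
streq_overruns(const char *a, const char *b)
{
	return *a == '\0' && *b == '\0';
}

/* Function wrapper so the safe macro is easy to call and check. */
int
streq_safe(const char *a, const char *b)
{
	return STREQ(a, b);
}
```

Here streq_overruns("", "") is true while streq_overruns("a", "a") is false: in the second case StrEq would compare the two terminators, which is still inside the strings.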



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [9fans] plan or side effect
  2002-03-06 10:24 [9fans] plan or side effect geoff
@ 2002-03-07  9:56 ` Douglas A. Gwyn
  0 siblings, 0 replies; 31+ messages in thread
From: Douglas A. Gwyn @ 2002-03-07  9:56 UTC (permalink / raw)
  To: 9fans

geoff@collyer.net wrote:
>         #define StrEq(a,b) (*(a)==*(b) && strcmp((a)+1,(b)+1)==0)
> is in general incorrect: *a can equal *b and they can both be NUL

Well, yeah, it was taken from a context where non-null identifiers
were being matched.  For more general use one would indeed use your
version.

Note that under reasonable assumptions the macro saves at least a
dozen function calls for each redundant test of byte equality, so
the more conservative version is still a big performance win.



* Re: [9fans] plan or side effect
  2002-03-08 18:00           ` Dan Cross
  2002-03-11 10:04             ` Ralph Corderoy
@ 2002-03-11 10:04             ` Thomas Bushnell, BSG
  1 sibling, 0 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-11 10:04 UTC (permalink / raw)
  To: 9fans

cross@math.psu.edu (Dan Cross) writes:

> ...Tom...

If you make correctness such an ideal that you are willing to
sacrifice the speed of a correct program to make it "more likely to be
correct", then why do you insist on getting my name wrong?

Are you really *that* stupid?

Thomas



* Re: [9fans] plan or side effect
  2002-03-08 18:00           ` Dan Cross
@ 2002-03-11 10:04             ` Ralph Corderoy
  2002-03-11 10:04             ` Thomas Bushnell, BSG
  1 sibling, 0 replies; 31+ messages in thread
From: Ralph Corderoy @ 2002-03-11 10:04 UTC (permalink / raw)
  To: 9fans

Hi,

Dan Cross <9fans@cse.psu.edu> wrote:
> No Tom, the point is that it's easier to verify that something is
> ...
> the Plan 9 compilers are, Tom.

Even I didn't bother to read this seeing it started and ended with a
deliberate wind-up that *Thomas* has publicly asked you to stop
doing.  Please keep c.o.plan9/9fans civil.  Even if you vehemently
disagree.

Cheers,


Ralph.



* Re: [9fans] plan or side effect
@ 2002-03-08 19:22 forsyth
  0 siblings, 0 replies; 31+ messages in thread
From: forsyth @ 2002-03-08 19:22 UTC (permalink / raw)
  To: 9fans

>>The Plan 9 guys at the labs took these things to heart when they built
>>the compiler suite.  They saw themselves spending a lot of time
>>compiling, and not worrying too much about performance of the compiled
>>code.  The end result is what we see in 8[acl] et al; you get very
>>speedy compilation, and medium quality output with acceptable
>>performance.  You don't have the maintenance overhead of something
>>... The compiler is well suited to the unique Plan 9 environment. 

they are also structurally different from gcc, because they can be.
i liked the approach, and the distribution of effort,
and it also gives it some of the speed:
there is no separate assembler for instance, and although the papers
say it links slowly, it's by no means as slow as some conventional linkers.
it further allows literal pool generation, span-dependent instructions,
instruction scheduling, and several other things on peculiar architectures
to be handled in a good place, without one component second-guessing
another or duplicating the effort.

gcc has external constraints that pretty much force a certain
approach.  even the generation of assembly language is arguably
sensible for it because it allows porting to systems where the
binary formats are a complete mystery or a precious secret.

actually, one could do a reasonably good fancy optimiser for
the Plan 9 compilers too, if that were desired, but for what it usually compiles
that hasn't been a big demand.   it's a fairly well-studied problem,
and there are plenty of techniques that aren't ridiculously hard
but are compact and give good results.  even so,
if i were doing it i'd plump for the Digital (i think it was) `best-simple' approach,
which historically produced reasonable results with
reasonable effort, and with fewer surprises to programmers.
of course, i'd hire a friend to do it, because i've got a life to lead.




* Re: [9fans] plan or side effect
  2002-03-08 17:30         ` Thomas Bushnell, BSG
@ 2002-03-08 18:00           ` Dan Cross
  2002-03-11 10:04             ` Ralph Corderoy
  2002-03-11 10:04             ` Thomas Bushnell, BSG
  0 siblings, 2 replies; 31+ messages in thread
From: Dan Cross @ 2002-03-08 18:00 UTC (permalink / raw)
  To: 9fans

No Tom, the point is that it's easier to verify that something is
correct if it's simpler.  Performance is, to a lot of folks (myself
included) less important than correctness.

These are all pretty straightforward engineering tradeoffs; how much
time am I willing to invest in making something go faster?  Is it
worthwhile for me to do so?  What is the cost of maintaining that thing?  If
you're a resource constrained group like 1127, and performance isn't
such a big deal to you, you're not going to worry about making a
super-fast whiz bang optimizer, or about hand coding the universe's
most efficient implementation of strcpy.  It's just not important.

Another good engineering principle is ``optimize for the most common
case.'' Imagine that you build a system, and you find that you spend
90% of your time in one, rather small, part of that system.  If the
performance of the system is still acceptable, you probably aren't
going to worry about it.  But, if you do need to make it faster, where
do you get the biggest return on your investment in optimizing: in that
one part, or on the system overall?  Usually, it's in that one part.
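
The arithmetic behind that answer can be sketched as follows.  The formula is the usual Amdahl's-law estimate, which is this note's framing rather than anything named in the message, and the function name and the factor of 2 are assumptions chosen for illustration.

```c
/* Overall speedup when a fraction p of the run time (0 <= p <= 1)
 * is made s times faster and the rest is left alone. */
double
overall_speedup(double p, double s)
{
	return 1.0 / ((1.0 - p) + p / s);
}
```

Doubling the speed of the part that takes 90% of the time gives overall_speedup(0.9, 2.0), roughly 1.8; doubling the speed of everything else gives overall_speedup(0.1, 2.0), roughly 1.05.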

The Plan 9 guys at the labs took these things to heart when they built
the compiler suite.  They saw themselves spending a lot of time
compiling, and not worrying too much about performance of the compiled
code.  The end result is what we see in 8[acl] et al; you get very
speedy compilation, and medium quality output with acceptable
performance.  You don't have the maintenance overhead of something
really big like gcc.  Overall, the system is very well balanced in
terms of compiler output quality, compiler speed, and maintenance costs
associated with the compiler suite.  The compiler is well suited to the
unique Plan 9 environment.  Really, I don't see what your problem with
the Plan 9 compilers is, Tom.

	- Dan C.




* Re: [9fans] plan or side effect
  2002-03-05  9:43       ` Boyd Roberts
@ 2002-03-08 17:30         ` Thomas Bushnell, BSG
  2002-03-08 18:00           ` Dan Cross
  0 siblings, 1 reply; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-08 17:30 UTC (permalink / raw)
  To: 9fans

boyd@strakt.com (Boyd Roberts) writes:

> "Thomas Bushnell, BSG" wrote:
> > 
> > boyd@strakt.com (Boyd Roberts) writes:
> > 
> > > s/magic/stupidity/
> > 
> > The amazing thing is that you think performance is hugely important, ...
> 
> s/amazing //
> s/performance/correctness/

Um, so, is there a bug in the glibc implementation of strcpy?

I mean, correctness is a binary property--you've got it, or you don't.



* Re: [9fans] plan or side effect
  2002-03-06  9:52 ` Douglas A. Gwyn
@ 2002-03-08  9:59   ` ozan s. yigit
  0 siblings, 0 replies; 31+ messages in thread
From: ozan s. yigit @ 2002-03-08  9:59 UTC (permalink / raw)
  To: 9fans

"Fco.J.Ballesteros" wrote:

> If there are more efficient (but still correct) ways, just
> replace the implementation of strcpy. And let the programs using
> strcpy call strcpy without a preprocessor mess. cf. The Practice
> of Programming (A "must" read).

it is worth noting that the macro mess (from bits/string2.h) is not the
default; it takes quite a deliberate effort to activate, and is presumably
used only by those who desperately need the "optimal" inlined code and know what
they are doing. there are similar macros that inject assembler as well. :]

oz
---
red hat is the VHS of the linux world -- peter laws

* Re: [9fans] plan or side effect
  2002-03-07 13:45 rob pike
@ 2002-03-07 15:47 ` AMSRL-CI-C
  0 siblings, 0 replies; 31+ messages in thread
From: AMSRL-CI-C @ 2002-03-07 15:47 UTC (permalink / raw)
  To: 9fans

> Function calls are cheap.  They used to be expensive, but they're
> really not any more.

Of course that depends on the architecture.  I'm really pleased
with Motorola's M*CORE, where quite often no registers at all
need to be saved across the function call (except the return PC
which is automatically pushed on the stack).

> you'd have to have a pretty special program before this change could
> make a worthwhile improvement.

It is also possible, sometimes, to change the algorithm, e.g.
hashing has more up-front cost but searching is then cheaper.
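
One way to read the hashing remark, as a sketch only: compute a hash once per identifier, then let unequal hashes reject most mismatches before strcmp is ever called.  The struct, the function names and the choice of hash are all assumptions made for this example, not code from the thread.

```c
#include <string.h>

/* Hypothetical identifier record: the hash is computed once, up front. */
struct ident {
	unsigned long hash;
	const char *name;
};

/* Simple multiplicative string hash (an arbitrary choice for the sketch). */
unsigned long
hash_str(const char *s)
{
	unsigned long h = 5381;

	while (*s != '\0')
		h = h * 33 + (unsigned char)*s++;
	return h;
}

struct ident
make_ident(const char *name)
{
	struct ident id;

	id.hash = hash_str(name);
	id.name = name;
	return id;
}

/* Unequal hashes prove the names differ; equal hashes still need strcmp. */
int
ident_eq(const struct ident *a, const struct ident *b)
{
	return a->hash == b->hash && strcmp(a->name, b->name) == 0;
}
```

The up-front cost is one pass over each name in make_ident; each later comparison that fails on the hash costs a single integer test.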

> Surely inlining is safer than this sort of hack, if you can inline.

Yes, that particular hack was to avoid function linkage, which
is also what inlining does.  The C compilers shipped with the
5620 DMD/630 MTG SGS were very good about automatically
inlining small functions.





* Re: [9fans] plan or side effect
@ 2002-03-07 13:45 rob pike
  2002-03-07 15:47 ` AMSRL-CI-C
  0 siblings, 1 reply; 31+ messages in thread
From: rob pike @ 2002-03-07 13:45 UTC (permalink / raw)
  To: 9fans

> Note that under reasonable assumptions the macro saves at least a
> dozen function calls for each redundant test of byte equality, so
> the more conservative version is still a big performance win.

Function calls are cheap.  They used to be expensive, but they're
really not any more.  I tried the attached program and got a factor of
about 2.5 between the two programs (mips or x86).  Given that our
strcmp is not clever, one might have expected a bigger difference.

No denying the macro makes a difference, but given their dangers,
you'd have to have a pretty special program before this change could
make a worthwhile improvement.  I tried inlining (by hand) strcmp -
something gcc does just fine - and got exactly the same speedup.
Surely inlining is safer than this sort of hack, if you can inline.

-rob


#include <u.h>
#include <libc.h>

#define StrEq(a,b) (*(a)==*(b) && strcmp((a),(b))==0)

void
main(int argc, char *argv[])
{
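	char *a, *b;
	int i, j;

	a = "asdfadfdsf";
	b = "bsdfzxcvvx";
	j = 0;	/* match count */
	/* a and b differ in the first byte, so StrEq never reaches strcmp */
	for(i=0; i<1000*1000*100; i++)
//		if(StrEq(a,b))
		if(strcmp(a,b)==0)
			j++;
	exits(nil);
}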


our strcmp:

#include <u.h>
#include <libc.h>

int
strcmp(char *s1, char *s2)
{
	unsigned c1, c2;

	for(;;) {
		c1 = *s1++;
		c2 = *s2++;
		if(c1 != c2) {
			if(c1 > c2)
				return 1;
			return -1;
		}
		if(c1 == 0)
			return 0;
	}
}
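
The inlining alternative mentioned in the message can be sketched in portable C as below.  This is an illustration, not the hand-inlined code that was timed: `static inline` is C99, whether a given compiler actually inlines it is up to that compiler, and count_matches is a hypothetical loop in the spirit of the benchmark.

```c
#include <string.h>

/* Function counterpart of the macro: same first-byte fast path,
 * but each argument is evaluated exactly once and is type-checked. */
static inline int
streq(const char *a, const char *b)
{
	return *a == *b && strcmp(a, b) == 0;
}

/* Hypothetical search loop: count how many of the n words equal key. */
int
count_matches(const char *key, const char **words, int n)
{
	int i, hits = 0;

	for (i = 0; i < n; i++)
		if (streq(key, words[i]))
			hits++;
	return hits;
}
```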




* Re: [9fans] plan or side effect
  2002-03-05  9:54 Fco.J.Ballesteros
  2002-03-06  9:51 ` Thomas Bushnell, BSG
@ 2002-03-06  9:52 ` Douglas A. Gwyn
  2002-03-08  9:59   ` ozan s. yigit
  1 sibling, 1 reply; 31+ messages in thread
From: Douglas A. Gwyn @ 2002-03-06  9:52 UTC (permalink / raw)
  To: 9fans

"Fco.J.Ballesteros" wrote:
> If there are more efficient (but still correct) ways, just
> replace the implementation of strcpy. And let the programs using
> strcpy call strcpy without a preprocessor mess. cf. The Practice
> of Programming (A "must" read).

The usual reason for the "preprocessor mess" is that generation
of the function linkage itself needs to be avoided.  E.g.
	extern char *strcpy(char *dst, const char *src);
	#define strcpy(a,b) __builtin_strcpy(a,b)
(The first line is in case the user invokes the actual function;
there are a couple of ways he can do that.)
The "inline" facility may be able to replace some of this, but it's
still no substitute for intrinsics.

Here is another example that has really sped up some searching
applications, therefore was worth doing despite being "unclean":
	#define StrEq(a,b) (*(a)==*(b) && strcmp((a)+1,(b)+1)==0)
This has the drawback that the arguments mustn't have side effects.
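
A minimal sketch of that drawback, using the bounds-safe form of the macro (strcmp from the start of both strings) so the example is well defined.  next_word and the two calls_via_* functions are hypothetical names introduced only to count evaluations.

```c
#include <string.h>

#define STREQ(a,b) (*(a)==*(b) && strcmp((a),(b))==0)

static int calls;	/* how many times next_word() has run */

/* Hypothetical argument expression with a side effect. */
static const char *
next_word(void)
{
	calls++;
	return "plan9";
}

/* The macro expands its argument twice, so next_word() runs twice
 * whenever the first bytes match. */
int
calls_via_macro(void)
{
	calls = 0;
	(void)STREQ(next_word(), "plan9");
	return calls;
}

static int
streq(const char *a, const char *b)
{
	return STREQ(a, b);
}

/* A function evaluates each argument exactly once. */
int
calls_via_function(void)
{
	calls = 0;
	(void)streq(next_word(), "plan9");
	return calls;
}
```

With an argument such as *p++ the second evaluation would also compare a different string from the one whose first byte was tested.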



* Re: [9fans] plan or side effect
  2002-03-05  9:54 Fco.J.Ballesteros
@ 2002-03-06  9:51 ` Thomas Bushnell, BSG
  2002-03-06  9:52 ` Douglas A. Gwyn
  1 sibling, 0 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-06  9:51 UTC (permalink / raw)
  To: 9fans

nemo@plan9.escet.urjc.es (Fco.J.Ballesteros) writes:

> If there are more efficient (but still correct) ways, just
> replace the implementation of strcpy. And let the programs using
> strcpy call strcpy without a preprocessor mess. cf. The Practice
> of Programming (A "must" read).

Um, the implementation is correct.  I'm not sure why a "preprocessor
mess" is such a mess.  The beginning comment was somebody who thought
that the compiler was doing the magic; that shows that the
"preprocessor mess" is sufficiently well done that nobody ever notices
it unless they go looking for it.  (As, indeed, I did.)



* Re: [9fans] plan or side effect
  2002-03-05  9:41         ` Thomas Bushnell, BSG
@ 2002-03-05  9:56           ` Boyd Roberts
  0 siblings, 0 replies; 31+ messages in thread
From: Boyd Roberts @ 2002-03-05  9:56 UTC (permalink / raw)
  To: 9fans

"Thomas Bushnell, BSG" wrote:
> The referenced thread was about more efficient ways of implementing
> strcpy than the trivial one.  I'm sure there are more than that one.

If you had a PDP-11 or slower ...



* Re: [9fans] plan or side effect
@ 2002-03-05  9:54 Fco.J.Ballesteros
  2002-03-06  9:51 ` Thomas Bushnell, BSG
  2002-03-06  9:52 ` Douglas A. Gwyn
  0 siblings, 2 replies; 31+ messages in thread
From: Fco.J.Ballesteros @ 2002-03-05  9:54 UTC (permalink / raw)
  To: 9fans

If there are more efficient (but still correct) ways, just
replace the implementation of strcpy. And let the programs using
strcpy call strcpy without a preprocessor mess. cf. The Practice
of Programming (A "must" read).

hth



* Re: [9fans] plan or side effect
  2002-03-04 10:04     ` Thomas Bushnell, BSG
  2002-03-04 17:11       ` Sean Quinlan
  2002-03-04 18:23       ` ozan s yigit
@ 2002-03-05  9:43       ` Boyd Roberts
  2002-03-08 17:30         ` Thomas Bushnell, BSG
  2 siblings, 1 reply; 31+ messages in thread
From: Boyd Roberts @ 2002-03-05  9:43 UTC (permalink / raw)
  To: 9fans

"Thomas Bushnell, BSG" wrote:
> 
> boyd@strakt.com (Boyd Roberts) writes:
> 
> > s/magic/stupidity/
> 
> The amazing thing is that you think performance is hugely important, ...

s/amazing //
s/performance/correctness/



* Re: [9fans] plan or side effect
  2002-03-04 18:23       ` ozan s yigit
@ 2002-03-05  9:41         ` Thomas Bushnell, BSG
  2002-03-05  9:56           ` Boyd Roberts
  0 siblings, 1 reply; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-05  9:41 UTC (permalink / raw)
  To: 9fans

ozan s yigit <oz@blue.cs.yorku.ca> writes:

> so far as we can tell, the only thing you wish we would learn has to do
> with license paperwork. i've been using it since b2 release in 87, so
> i'm curious what else you have in mind.

The referenced thread was about more efficient ways of implementing
strcpy than the trivial one.  I'm sure there are more than that one.

Thomas



* Re: [9fans] plan or side effect
  2002-03-04 10:04     ` Thomas Bushnell, BSG
  2002-03-04 17:11       ` Sean Quinlan
@ 2002-03-04 18:23       ` ozan s yigit
  2002-03-05  9:41         ` Thomas Bushnell, BSG
  2002-03-05  9:43       ` Boyd Roberts
  2 siblings, 1 reply; 31+ messages in thread
From: ozan s yigit @ 2002-03-04 18:23 UTC (permalink / raw)
  To: 9fans

"Thomas Bushnell, BSG" <tb+usenet@becket.net> writes:

>					      ... but it's a real
> shame that many people here have a kind of allergic reaction to
> learning from the successes of anything else.

so far as we can tell, the only thing you wish we would learn has to do
with license paperwork. i've been using it since b2 release in 87, so
i'm curious what else you have in mind.

oz



* Re: [9fans] plan or side effect
  2002-03-04 10:04     ` Thomas Bushnell, BSG
@ 2002-03-04 17:11       ` Sean Quinlan
  2002-03-04 18:23       ` ozan s yigit
  2002-03-05  9:43       ` Boyd Roberts
  2 siblings, 0 replies; 31+ messages in thread
From: Sean Quinlan @ 2002-03-04 17:11 UTC (permalink / raw)
  To: 9fans

You are confusing

	We kept it simple and as a result it goes fast

with

	We made it massively more complex to squeeze out a little more performance.


"Thomas Bushnell, BSG" wrote:
> 
> boyd@strakt.com (Boyd Roberts) writes:
> 
> > s/magic/stupidity/
> 
> The amazing thing is that you think performance is hugely important,
> so much so, that you claim run time of compilation is the most
> important thing.  Then when another system does something faster, you
> call it "stupid".  Puhleez.  If your real opinion is just "Whatever
> Plan 9 does is brilliant, and anything different is stupid", then say
> it, instead of pretending to have reasoned opinions.
> 
> For my part, Plan 9 does some things very well, and other things less
> well.  I enjoy learning from the successes of Plan 9, but it's a real
> shame that many people here have a kind of allergic reaction to
> learning from the successes of anything else.
> 
> Thomas



* Re: [9fans] plan or side effect
  2002-03-01 12:07   ` Boyd Roberts
@ 2002-03-04 10:04     ` Thomas Bushnell, BSG
  2002-03-04 17:11       ` Sean Quinlan
                         ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-04 10:04 UTC (permalink / raw)
  To: 9fans

boyd@strakt.com (Boyd Roberts) writes:

> s/magic/stupidity/

The amazing thing is that you think performance is hugely important,
so much so, that you claim run time of compilation is the most
important thing.  Then when another system does something faster, you
call it "stupid".  Puhleez.  If your real opinion is just "Whatever
Plan 9 does is brilliant, and anything different is stupid", then say
it, instead of pretending to have reasoned opinions.

For my part, Plan 9 does some things very well, and other things less
well.  I enjoy learning from the successes of Plan 9, but it's a real
shame that many people here have a kind of allergic reaction to
learning from the successes of anything else.

Thomas



* Re: [9fans] plan or side effect
  2002-03-01 10:02 ` Thomas Bushnell, BSG
@ 2002-03-01 12:07   ` Boyd Roberts
  2002-03-04 10:04     ` Thomas Bushnell, BSG
  0 siblings, 1 reply; 31+ messages in thread
From: Boyd Roberts @ 2002-03-01 12:07 UTC (permalink / raw)
  To: 9fans

"Thomas Bushnell, BSG" wrote:
> If you look at <bits/string2.h> you can see the glibc magic for
> strcpy.  It boils down to the following:
> 
> # define strcpy(dest, src) \
>   (__extension__ (__builtin_constant_p (src)                                  \
>                   ? (__string2_1bptr_p (src) && strlen (src) + 1 <= 8         \
>                      ? __strcpy_small (dest, __strcpy_args (src),             \
>                                        strlen (src) + 1)                      \
>                      : (char *) memcpy (dest, src, strlen (src) + 1))         \
>                   : strcpy (dest, src)))
> 

s/magic/stupidity/



* Re: [9fans] plan or side effect
  2002-02-28 17:41 David Gordon Hogan
  2002-03-01 10:02 ` Thomas Bushnell, BSG
@ 2002-03-01 11:57 ` Boyd Roberts
  1 sibling, 0 replies; 31+ messages in thread
From: Boyd Roberts @ 2002-03-01 11:57 UTC (permalink / raw)
  To: 9fans

David Gordon Hogan wrote:
> I'm just reporting, I don't think it's a particularly good thing.
> Like, do we really need that extra .1% speed improvement,
> at the expense of code size, compile speed, and transparent
> behaviour?

I'm with dhog [I can already smell the napalm burning ...]



* Re: [9fans] plan or side effect
@ 2002-03-01 11:35 forsyth
  0 siblings, 0 replies; 31+ messages in thread
From: forsyth @ 2002-03-01 11:35 UTC (permalink / raw)
  To: 9fans

Given the results of (say) that integer benchmark run that
was shown here earlier, my inclination would have been
to compare the object code to find out where and how
the difference arose, rather than speculating that -O999 had
a lot to do with it.  Certainly I've found in the past, especially
with such microbenchmarks, that small things often count
for a lot.  As a simple example, one code generator I wrote
didn't do as well on a particular sequence as another compiler,
even though in some ways the code I generated was better;
it turned out that the difference was whether parts of the
loop were aligned on cache boundaries.   Sometimes particular
instructions or instruction sequences that are logically identical
differ markedly in speed.

As to the repeated assertion that 8c (actually there's a larger family than that)
isn't used because of the licence or the code quality, I can observe
that I do not use 8c to compile hosted Inferno on anything other
than Plan 9, and to my certain knowledge it is for neither reason.
It's simply that [8qk]c does not produce code that is easily compatible
with the hosting system.   Unlike GCC, the compiler does not
produce assembly language that can be assembled by the host assembler.
The compiler is part of a larger suite that includes a linker that is designed
to work with the compiler.  The compiler produces a binary file
but that is not object code, let alone object code in the obscure
Windows and Unix formats.  The linker converts only that binary representation
to an executable format.  You can't link the compiler's output
binary with a host linker.   Conversely, the linker
does not know how to read host object files.   It would be very hard
work to make it do so (in general), especially with ghastly, subtle
instruction encodings as on x86 and 680x0.   It operates with
a binary representation from the compiler that's set at a higher level
than raw machine code with relocation bits.   Indeed, there is
no relocation information in it.   Consequently, the linker cannot
link with the system's existing libraries.

Of course, the linker could take its current input and produce
executable images for the hosting operating system.  This is limiting
unless you've got the source for all the libraries (and they are written
in reasonable C), since you will probably not be able to do graphics
(eg, under Windows).   The linker does this in a small way
on every platform, where it produces an executable image that
is sufficient disguise for this or that bootstrap program to accept it
(and even that's not easy).

I took a stab at using the suite years ago on AIX and RS6000, partly because
I wanted a platform to test the powerpc compiler suite.   At the time,
there was no documentation I could find that adequately described
the AIX COFF format and dynamic linking conventions
so that I could generate it.  This is, by the way, often true
on other platforms.  IBM (at the time)
was not particularly helpful; not obstructive, just unhelpful.
I wasted time guessing and finally gave up, wrote a powerpc interpreter,
and used that until my BeBox turned up.
I was able to use AIX as a cross-compiling environment, though,
for another target system,
because the compilers require no special configuration for that:
just compile them on the new host.

Thus, the best reason not to use the Plan 9 suite has nothing to do
with licences or code quality: it's that it's typically quite impractical.
If I could use them more generally, I certainly would.




* Re: [9fans] plan or side effect
  2002-02-28 17:41 David Gordon Hogan
@ 2002-03-01 10:02 ` Thomas Bushnell, BSG
  2002-03-01 12:07   ` Boyd Roberts
  2002-03-01 11:57 ` Boyd Roberts
  1 sibling, 1 reply; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-03-01 10:02 UTC (permalink / raw)
  To: 9fans

dhog@plan9.bell-labs.com (David Gordon Hogan) writes:

> It certainly knows about strcpy() and memmove() (or
> whatever they're #defined to in the headers).  So for
> instance,
> 
> 	strcpy(s, "xyzzy");
> 
> will get replaced with a bunch of instructions to store the
> appropriate constant values in s.

Actually, that's glibc that's doing the trick, not gcc.

As I said, since GCC and glibc are maintained separately, gcc is very
careful to stay away from magic related to things like "strcpy".
(There is an exception for "main"; some magic happens there, and GCC
and glibc worked out carefully how it should happen.)

If you look at <bits/string2.h> you can see the glibc magic for
strcpy.  It boils down to the following:

# define strcpy(dest, src) \
  (__extension__ (__builtin_constant_p (src)				      \
		  ? (__string2_1bptr_p (src) && strlen (src) + 1 <= 8	      \
		     ? __strcpy_small (dest, __strcpy_args (src),	      \
				       strlen (src) + 1)		      \
		     : (char *) memcpy (dest, src, strlen (src) + 1))	      \
		  : strcpy (dest, src)))

The only compiler support for this is the __builtin_constant_p
function, which is a GCC builtin.  (There is a GCC __builtin_memcpy,
as well, which does get used for larger strings inside the guts of
memcpy, and expands to an inline block memory copy instruction.)

The function __strcpy_small (which is invoked in the case where src is
the constant "xyzzy") is an inline function that moves the bytes one
word at a time, and then the compiler simply optimizes those
assignments in the usual way to produce:

foo:
	pushl %ebp
	movl %esp,%ebp
	movl s,%eax
	movl $2054846840,(%eax)
	movw $121,4(%eax)
	leave
	ret
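
The two immediate operands in that listing are just the bytes of "xyzzy" packed in little-endian order, which a short sketch can confirm.  The function names are made up for this example; only the two constants come from the listing.

```c
#include <stdint.h>

/* Pack four bytes as a little-endian 32-bit value, the order in
 * which x86 lays them out in memory. */
uint32_t
pack_le32(const char *s)
{
	return (uint32_t)(unsigned char)s[0]
	     | (uint32_t)(unsigned char)s[1] << 8
	     | (uint32_t)(unsigned char)s[2] << 16
	     | (uint32_t)(unsigned char)s[3] << 24;
}

/* Same for two bytes. */
uint16_t
pack_le16(const char *s)
{
	return (uint16_t)((unsigned char)s[0] | (unsigned char)s[1] << 8);
}
```

pack_le32("xyzzy") is 0x7a7a7978, that is 2054846840, covering 'x' 'y' 'z' 'z'; pack_le16("xyzzy" + 4) is 0x0079, that is 121, covering the final 'y' and the terminating NUL.  Together they account for all six bytes the strcpy has to store.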



* Re: [9fans] plan or side effect
@ 2002-02-28 17:41 David Gordon Hogan
  2002-03-01 10:02 ` Thomas Bushnell, BSG
  2002-03-01 11:57 ` Boyd Roberts
  0 siblings, 2 replies; 31+ messages in thread
From: David Gordon Hogan @ 2002-02-28 17:41 UTC (permalink / raw)
  To: 9fans

> No, in general GCC does not have such knowledge.  
> 
> If it is able to inline the function, then of course it can do the
> optimization, but an inlined function isn't a function call at all, so
> that's really a different case.
> 
> Also, GCC has some builtin functions; it knows the behavior of those.
> But not (in general) library functions.

It certainly knows about strcpy() and memmove() (or
whatever they're #defined to in the headers).  So for
instance,

	strcpy(s, "xyzzy");

will get replaced with a bunch of instructions to store the
appropriate constant values in s.

I'm just reporting, I don't think it's a particularly good thing.
Like, do we really need that extra .1% speed improvement,
at the expense of code size, compile speed, and transparent
behaviour?

When I say .1%, I'm just pulling a number out of the air.
Clearly, if your program is composed entirely out of
strcpy's of constants, the improvement could be much
larger(!).  But, I claim that this is a pathological case,
and the time wasted on such `improvements' is generally
better spent elsewhere (like, maybe, some day, someone
will simplify the morass of #ifdefs that GCC and Binutils
are afflicted with...).




* Re: [9fans] plan or side effect
  2002-02-28 12:51     ` Ralph Corderoy
@ 2002-02-28 16:57       ` Thomas Bushnell, BSG
  0 siblings, 0 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-02-28 16:57 UTC (permalink / raw)
  To: 9fans

Ralph Corderoy <ralph@inputplus.demon.co.uk> writes:

> It doesn't have special knowledge of some functions' behaviour then,
> like AIX's xlc compiler?  For example, <string.h> #defines strlen(s) to
> be __strlen(s) and the compiler knows that calls to __strlen can be
> optimised in various ways because it knows more about strlen's
> behaviour than can be expressed in <string.h>.

No, in general GCC does not have such knowledge.  

If it is able to inline the function, then of course it can do the
optimization, but an inlined function isn't a function call at all, so
that's really a different case.

Also, GCC has some builtin functions; it knows the behavior of those.
But not (in general) library functions.

This is the kind of inter-function optimization that the MIPS compiler
does quite generally, and GCC doesn't really attempt at all.  

(One reason, certainly, is that it's a little fragile, especially when
the C library and GCC are developed separately.)

Thomas



* Re: [9fans] plan or side effect
  2002-02-28 16:01     ` AMSRL-CI-CN
@ 2002-02-28 16:52       ` Thomas Bushnell, BSG
  0 siblings, 0 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-02-28 16:52 UTC (permalink / raw)
  To: 9fans

"AMSRL-CI-CN" <gwyn@arl.army.mil> writes:

> "Thomas Bushnell, BSG" <tb+usenet@becket.net> wrote in message
> news:87u1s2y1uu.fsf@becket.becket.net...
> > In the presence of concurrence, even this is not sufficient, because a
> > different thread could clobber the value.  However, C does not
> > guarantee synchronization in this case unless the variable is marked
> > "volatile".
> 
> Actually the C standard does not address threads at all.
> It is nice that "volatile" helps, but I'm sure it doesn't totally
> solve the concurrent data access problem for threads.

No, certainly not!  I misspoke.  I shouldn't have said
"synchronization", which does imply more, and certainly you still need
mutexes or semaphores or something.

"volatile" is a declaration to the compiler that the value could be
changing at any time, unbeknownst to the compiler, and so values can't
be cached in registers.  

If "acquire mutex" is a function call, then you don't even need to
declare the variable volatile, since the compiler knows that the
function call could clobber all of memory.  So it isn't even necessary
for threads, if you *are* using mutexes, and for other reasons you
certainly do need to.  So the reference to threads was a needless
confusion; sorry.

Thomas
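
A small sketch of the "changing unbeknownst to the compiler" case,
using a signal handler rather than a thread, since the C standard
does not address threads.  The names got_signal, on_signal and
wait_for_flag are hypothetical.  Here raise() delivers the signal
synchronously, so the loop exits at once; the volatile qualifier is
what obliges the compiler to reload the flag from memory on each test
instead of keeping it in a register.

```c
#include <signal.h>

/* Set by the handler, outside the normal flow of control. */
static volatile sig_atomic_t got_signal = 0;

static void
on_signal(int sig)
{
	(void)sig;
	got_signal = 1;
}

/* Returns 1 once the flag is seen set, or -1 if the handler
 * could not be installed. */
int
wait_for_flag(void)
{
	if (signal(SIGINT, on_signal) == SIG_ERR)
		return -1;
	got_signal = 0;
	raise(SIGINT);		/* handler runs before raise returns */
	while (!got_signal)	/* volatile: reloaded on every test */
		;
	return got_signal;
}
```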



* Re: [9fans] plan or side effect
  2002-02-28  9:58   ` Thomas Bushnell, BSG
  2002-02-28 12:51     ` Ralph Corderoy
@ 2002-02-28 16:01     ` AMSRL-CI-CN
  2002-02-28 16:52       ` Thomas Bushnell, BSG
  1 sibling, 1 reply; 31+ messages in thread
From: AMSRL-CI-CN @ 2002-02-28 16:01 UTC (permalink / raw)
  To: 9fans

"Thomas Bushnell, BSG" <tb+usenet@becket.net> wrote in message
news:87u1s2y1uu.fsf@becket.becket.net...
> In the presence of concurrence, even this is not sufficient, because a
> different thread could clobber the value.  However, C does not
> guarantee synchronization in this case unless the variable is marked
> "volatile".

Actually the C standard does not address threads at all.
It is nice that "volatile" helps, but I'm sure it doesn't totally
solve the concurrent data access problem for threads.



* Re: [9fans] plan or side effect
  2002-02-28  9:58   ` Thomas Bushnell, BSG
@ 2002-02-28 12:51     ` Ralph Corderoy
  2002-02-28 16:57       ` Thomas Bushnell, BSG
  2002-02-28 16:01     ` AMSRL-CI-CN
  1 sibling, 1 reply; 31+ messages in thread
From: Ralph Corderoy @ 2002-02-28 12:51 UTC (permalink / raw)
  To: 9fans

Hi Thomas,

> Well, that's a bug, certainly.  GCC does not make such assumptions.
> 
> More specifically, the canonical copy of a global is stored in
> memory, and all function calls are assumed to dirty all of memory.

It doesn't have special knowledge of some functions' behaviour then,
like AIX's xlc compiler?  For example, <string.h> #defines strlen(s) to
be __strlen(s) and the compiler knows that calls to __strlen can be
optimised in various ways because it knows more about strlen's
behaviour than can be expressed in <string.h>.


Ralph.



* Re: [9fans] plan or side effect
  2002-02-27 15:27 ` Sean Quinlan
@ 2002-02-28  9:58   ` Thomas Bushnell, BSG
  2002-02-28 12:51     ` Ralph Corderoy
  2002-02-28 16:01     ` AMSRL-CI-CN
  0 siblings, 2 replies; 31+ messages in thread
From: Thomas Bushnell, BSG @ 2002-02-28  9:58 UTC (permalink / raw)
  To: 9fans

seanq@research.bell-labs.com (Sean Quinlan) writes:

> The compiler will registerise
> such variables within a function even though it is possible
> that the variable is aliased via a pointer.  

Well, that's a bug, certainly.  GCC does not make such assumptions.

More specifically, the canonical copy of a global is stored in memory,
and all function calls are assumed to dirty all of memory.  As a
result, a global variable can be registerized, but after any function
call, it must be assumed that the value has changed, and the register
copy must therefore be synced before and after the function call.

In the presence of concurrence, even this is not sufficient, because a
different thread could clobber the value.  However, C does not
guarantee synchronization in this case unless the variable is marked
"volatile".  (And if it is so marked, GCC doesn't do any
registerization at all.)

Thomas
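
The rule above can be sketched as follows.  The names counter, bump
and sum_with_calls are hypothetical, and bump stands in for a function
the compiler cannot see into (in practice it would live in another
file).  Between calls the compiler may keep counter in a register,
but it must write the value back before each call and reload it
afterwards.

```c
int counter;	/* global: the canonical copy lives in memory */

/* Stand-in for an opaque function that may modify any global. */
void
bump(void)
{
	counter += 10;
}

int
sum_with_calls(int n)
{
	int i, total = 0;

	for (i = 0; i < n; i++) {
		total += counter;	/* a register copy is fine here */
		bump();			/* after this, counter must be reloaded */
	}
	return total;
}
```

If the compiler wrongly kept a stale register copy across the call,
every iteration would add the same value of counter.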



* Re: [9fans] plan or side effect
  2002-02-27 15:08 presotto
@ 2002-02-27 15:27 ` Sean Quinlan
  2002-02-28  9:58   ` Thomas Bushnell, BSG
  0 siblings, 1 reply; 31+ messages in thread
From: Sean Quinlan @ 2002-02-27 15:27 UTC (permalink / raw)
  To: 9fans

My experience is that most, if not all, optimizing C
compilers will make optimistic assumptions about aliasing.
Given that C allows arbitrary casts and application of the &
operator, it is pretty much impossible to exactly determine
what can be aliased.  The compiler takes a shot at figuring this
out, but in general, it has to be optimistic if it is going
to find many variables that can be registerised.
For example, I believe the Plan 9 compiler is a little
optimistic with globals.  The compiler will registerise
such variables within a function even though it is possible
that the variable is aliased via a pointer.  I recall
Ken saying he never gets caught, but obviously it is not
hard to contrive examples where he will get the "wrong"
answer.
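
One such contrived example, as a sketch; the names g and read_twice
are hypothetical.  If the compiler keeps g in a register for the whole
function and ignores the possibility that p points at g, the second
read returns the stale value.

```c
int g;	/* global that may also be reached through a pointer */

int
read_twice(int *p)
{
	int a, b;

	a = g;
	*p = a + 1;	/* p may alias g */
	b = g;		/* must be reloaded if p == &g */
	return a + b;
}
```

With g set to 5, read_twice(&g) should return 11 (5 + 6).  A compiler
that optimistically registerised g across the store through p would
return 10 instead.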


presotto@plan9.bell-labs.com wrote:
> 
> Bushnell et al,
> 
> By the way, I'm not ignoring Thomas' question.
> I'm querying compiler designers I know about my
> statement that they know that they might be breaking
> programs, i.e. generating incorrect code,
> with some optimizations and find that acceptable.
> It was definitely a true statement when I was at
> DG in the compiler group, but that was a long
> time ago.  It could be that I'm assuming a plan when
> all I'm seeing is the inevitable bugs.



* [9fans] plan or side effect
@ 2002-02-27 15:08 presotto
  2002-02-27 15:27 ` Sean Quinlan
  0 siblings, 1 reply; 31+ messages in thread
From: presotto @ 2002-02-27 15:08 UTC (permalink / raw)
  To: 9fans

Bushnell et al,

By the way, I'm not ignoring Thomas' question.
I'm querying compiler designers I know about my
statement that they know that they might be breaking
programs, i.e. generating incorrect code,
with some optimizations and find that acceptable.
It was definitely a true statement when I was at
DG in the compiler group, but that was a long
time ago.  It could be that I'm assuming a plan when
all I'm seeing is the inevitable bugs.



Thread overview: 31+ messages
2002-03-06 10:24 [9fans] plan or side effect geoff
2002-03-07  9:56 ` Douglas A. Gwyn
  -- strict thread matches above, loose matches on Subject: below --
2002-03-08 19:22 forsyth
2002-03-07 13:45 rob pike
2002-03-07 15:47 ` AMSRL-CI-C
2002-03-05  9:54 Fco.J.Ballesteros
2002-03-06  9:51 ` Thomas Bushnell, BSG
2002-03-06  9:52 ` Douglas A. Gwyn
2002-03-08  9:59   ` ozan s. yigit
2002-03-01 11:35 forsyth
2002-02-28 17:41 David Gordon Hogan
2002-03-01 10:02 ` Thomas Bushnell, BSG
2002-03-01 12:07   ` Boyd Roberts
2002-03-04 10:04     ` Thomas Bushnell, BSG
2002-03-04 17:11       ` Sean Quinlan
2002-03-04 18:23       ` ozan s yigit
2002-03-05  9:41         ` Thomas Bushnell, BSG
2002-03-05  9:56           ` Boyd Roberts
2002-03-05  9:43       ` Boyd Roberts
2002-03-08 17:30         ` Thomas Bushnell, BSG
2002-03-08 18:00           ` Dan Cross
2002-03-11 10:04             ` Ralph Corderoy
2002-03-11 10:04             ` Thomas Bushnell, BSG
2002-03-01 11:57 ` Boyd Roberts
2002-02-27 15:08 presotto
2002-02-27 15:27 ` Sean Quinlan
2002-02-28  9:58   ` Thomas Bushnell, BSG
2002-02-28 12:51     ` Ralph Corderoy
2002-02-28 16:57       ` Thomas Bushnell, BSG
2002-02-28 16:01     ` AMSRL-CI-CN
2002-02-28 16:52       ` Thomas Bushnell, BSG
