To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-reply-to: Your message of "Thu, 22 Apr 2010 17:29:53 +0200." <20100422152953.GA616@polynum.com>
References: <20100422152953.GA616@polynum.com>
From: Bakul Shah
Date: Thu, 22 Apr 2010 10:03:58 -0700
Message-Id: <20100422170358.80B9D5B73@mail.bitblocks.com>
Subject: Re: [9fans] BUG!!! in Plan9 compiler!
Topicbox-Message-UUID: 0b9f49dc-ead6-11e9-9d60-3106f5b1d025

On Thu, 22 Apr 2010 17:29:53 +0200 tlaronde@polynum.com wrote:
> Data:
> Under NetBSD/gcc, I have the following values:
>
> before: x1:=5440, x2:=-5843, x3:=78909
> after: x1:=5440, x2:=-201, x3:=18166, r:=6827 t:=30232
>
> Under Plan9/gcc, I have the following values:
>
> before: x1:=5440, x2:=-5843, x3:=78909
> after: x1:=5440, x2:=2147483447, x3:=1073759990, r:=6827 t:=-1073711592
>
> Uhm... seems to have a `slight' divergence...
>
> In fact, all wrong values depend upon x2, that has the "correct"
> value... with 2^31 complement. A positive when it should be negative,
> since the offending code is the following:
>
> x2 = half ( x1 + x2 + xicorr ) ;
>
> that is :
> x2 = (5440 - 5843 + 1) / 2;
>
> Not exactly pushing things to the limit! And yes, the expected result is
> indeed -201.

You would get 2147483447 if x1 and x2 were treated as unsigned
numbers but -201 if treated as signed. Try this:

    cat > x.c <<'EOF'
    #include <stdio.h>
    #include <stdlib.h>
    NUM f(NUM x, NUM y) { return (x + y + 1) / 2; }
    int main(int c, char**v) { printf("%d\n", f(atoi(v[1]), atoi(v[2]))); }
    EOF
    cc -DNUM=signed x.c && ./a.out 5440 -5843
    cc -DNUM=unsigned x.c && ./a.out 5440 -5843

What is the type of x1 and x2? Can you show an actual C code
fragment? Don't worry about it being complete. Just the half()
function (or macro), the header of the function where it is called,
the declarations for x1 and x2, and a couple of lines around the
call to half.

I am still wondering if this is due to a different interpretation of
language semantics by the two compilers.

> Since the problem arises in this context, but not if you just add
> this isolated in a test program, and call it with these very 3
> values (5440, -5843, 1), it is clear that's the way the computation
> is handled with huge number of parameters and auto variables
> that wreaks havoc.

You *suspect* this but you need to prove it. An isolated test case
that doesn't trigger this problem simply means you have not created
the right conditions for the bug. Creating a simple test can be
tricky and may be more work than debugging your program.

> If I declare all the auto volatile, this does nothing: same result.
>
> If I do the addition, and afterwards take the half, that works:
>
> x2 += x1 + xicorr;
> x2 = half(x2); /* works! */

I wouldn't bother changing anything. You already have a smoking gun
(at least you know in which neighbourhood it has gone off). You can
try a binary search to narrow down the area, but in the end you will
have to look at the assembly output of the relevant code fragment.