9fans - fans of the OS Plan 9 from Bell Labs
From: adr <adr@SDF.ORG>
To: 9fans@9fans.net
Subject: [9fans] Conversion of constants in C compiler
Date: Wed, 20 Apr 2022 10:19:58 +0000 (UTC)
Message-ID: <16a447c9-e11-c379-598-7430d2ed39d4@SDF.ORG>

Hi.

I've been tinkering again with the dist I shared before at
http://adr.freeshell.org/plan9, basically 9legacy + Miller's 9pi
+ 9front's libseq. While importing ori's git9 I noticed that the
compiler was truncating a constant that lacked the corresponding
suffix.

According to C99, the type of an integer constant is the first type
in the corresponding list below in which its value can be represented:

Suffix       Decimal                 Octal or hexadecimal
--------------------------------------------------------------
none         int                     int
             long int                unsigned int
             long long int           long int
                                     unsigned long int
                                     long long int
                                     unsigned long long int
--------------------------------------------------------------
u|U          unsigned int            unsigned int
             unsigned long int       unsigned long int
             unsigned long long int  unsigned long long int
--------------------------------------------------------------
l|L          long int                long int
             long long int           unsigned long int
                                     long long int
                                     unsigned long long int
--------------------------------------------------------------
u|U & l|L    unsigned long int       unsigned long int
             unsigned long long int  unsigned long long int
--------------------------------------------------------------
ll|LL        long long int           long long int
                                     unsigned long long int
--------------------------------------------------------------
u|U & ll|LL  unsigned long long int  unsigned long long int
--------------------------------------------------------------

This follows the K&R description in "A.2.5.1 Integer Constants", with
the addition of LL.
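
To make the table concrete, here is a minimal sketch of the selection
rule as a C function. It is not taken from any compiler: the names
(ctype, tmax, c99type) are made up for illustration, and it assumes
32-bit int and long and a 64-bit long long.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical names for this sketch. The order matters: types are
 * listed by rank, signed first, so odd values are the unsigned types.
 */
enum ctype { T_INT, T_UINT, T_LONG, T_ULONG, T_LLONG, T_ULLONG, T_NONE };

/* Largest value of each type, assuming int = long = 32 bits, long long = 64. */
static const uint64_t tmax[] = {
	[T_INT]    = 0x7fffffffULL,
	[T_UINT]   = 0xffffffffULL,
	[T_LONG]   = 0x7fffffffULL,
	[T_ULONG]  = 0xffffffffULL,
	[T_LLONG]  = 0x7fffffffffffffffULL,
	[T_ULLONG] = 0xffffffffffffffffULL,
};

/*
 * v: value of the constant; decimal: nonzero for a decimal constant,
 * zero for octal or hex; uns: u/U suffix present;
 * nlong: 0 for no length suffix, 1 for l/L, 2 for ll/LL.
 * Returns the first type in the table's list that can hold v,
 * or T_NONE if none of the standard types can.
 */
enum ctype
c99type(uint64_t v, int decimal, int uns, int nlong)
{
	int t;

	assert(nlong >= 0 && nlong <= 2);
	/* the length suffix sets the first candidate */
	for(t = 2*nlong; t <= T_ULLONG; t++){
		int isunsigned = t & 1;

		if(uns && !isunsigned)
			continue;	/* u suffix: unsigned types only */
		if(!uns && decimal && isunsigned)
			continue;	/* decimal without u: signed types only */
		if(v <= tmax[t])
			return t;
	}
	return T_NONE;
}
```

Under these assumptions, c99type(2147483648ULL, 1, 0, 0) gives T_LLONG,
while the hex spelling c99type(0x80000000ULL, 0, 0, 0) gives T_UINT.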

Now, in Plan 9 a constant has type int if there is no suffix, or the
type specified by the suffix. The only adjustment made is from a
signed type to the unsigned type of the same width, when that is
necessary to fit the constant's value. Anything that still doesn't
fit is truncated:

/sys/src/cmd/cc/lex.c:
[...]
        vv = yylval.vval;
        if(c1 & Numvlong) {
                if((c1 & Numuns) || convvtox(vv, TVLONG) < 0) {
                        c = LUVLCONST;
                        t = TUVLONG;
                        goto nret;
                }
                c = LVLCONST;
                t = TVLONG;
                goto nret;
        }
        if(c1 & Numlong) {
                if((c1 & Numuns) || convvtox(vv, TLONG) < 0) {
                        c = LULCONST;
                        t = TULONG;
                        goto nret;
                }
                c = LLCONST;
                t = TLONG;
                goto nret;
        }
        if((c1 & Numuns) || convvtox(vv, TINT) < 0) {
                c = LUCONST;
                t = TUINT;
                goto nret;
        }
        c = LCONST;
        t = TINT;
        goto nret;

nret:
        yylval.vval = convvtox(vv, t);
        if(yylval.vval != vv){
                nearln = lineno;
                warn(Z, "truncated constant: %T %s", types[t], symb);
        }
        return c;
[...]
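
The behaviour of that fragment can be modelled outside the compiler.
The following is a rough sketch, not the compiler's code: the kinds,
conv, plan9kind and plan9truncated are hypothetical names, conv is
only an assumed stand-in for convvtox (keep the low bits, sign-extend
for signed types), and int and long are assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kinds for this sketch; odd values are the unsigned ones. */
enum { KINT, KUINT, KLONG, KULONG, KVLONG, KUVLONG };

/* Assumed widths in bits, indexed by kind. */
static const int width[] = { 32, 32, 32, 32, 64, 64 };

/*
 * Simplified stand-in for convvtox (an assumption about its behaviour,
 * not a copy): keep the low bits, sign-extend for signed kinds.
 */
static int64_t
conv(int64_t v, int k)
{
	uint64_t u, mask;

	if(width[k] == 64)
		return v;
	mask = (1ULL << width[k]) - 1;
	u = (uint64_t)v & mask;
	if((k & 1) == 0 && (u >> (width[k]-1)) != 0)
		u |= ~mask;	/* sign bit set in a signed kind */
	return (int64_t)u;
}

/* uns: u suffix present; nlong: 0 for none, 1 for l, 2 for ll. */
int
plan9kind(int64_t v, int uns, int nlong)
{
	int k;

	assert(nlong >= 0 && nlong <= 2);
	k = 2*nlong;		/* the suffix alone picks the width */
	if(uns || conv(v, k) < 0)
		k++;		/* same width, unsigned */
	return k;
}

/* Nonzero if the value does not survive conversion to the chosen kind. */
int
plan9truncated(int64_t v, int uns, int nlong)
{
	return conv(v, plan9kind(v, uns, nlong)) != v;
}
```

In this model 0x80000000 becomes an unsigned int and keeps its value,
0x100000000 stays an int and is truncated, and only the LL spelling
of 0x100000000 is kept intact.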

9front introduces some widening:
[...]
        vv = yylval.vval;
        /*
         * c99 is silly: decimal constants stay signed,
         * hex and octal go unsigned before widening.
         */
        w = 32;
        if((c1 & (Numdec|Numuns)) == Numdec)
                w = 31;
        if(c1 & Numvlong || (c1 & Numlong) == 0 && (uvlong)vv >= 1ULL<<w){
                if((c1&(Numdec|Numvlong)) == Numdec && vv < 1ULL<<32)
                        warn(Z, "int constant widened to vlong: %s", symb);
                if((c1 & Numuns) || convvtox(vv, TVLONG) < 0) {
                        c = LUVLCONST;
                        t = TUVLONG;
                        goto nret;
                }
                c = LVLCONST;
                t = TVLONG;
                goto nret;
        }
        if(c1 & Numlong) {
[...]

It doesn't follow C99. Constants with an explicit L suffix that
don't fit in a ulong will not be promoted to vlong but truncated
as ulong. Removing '(c1 & Numlong) == 0' should do the trick, I
think, but I don't like the code. The promotion from int straight
to vlong only works because all the compilers define int and long
with the same width, and because that width is exactly 4 bytes.
The rest of the original code clearly does not rely on that fact,
which in my opinion is the correct approach.
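
The effect of the Numlong test can be seen by isolating the condition
that selects the vlong branch. The sketch below models only that
condition: the flag names follow the quoted lex.c but their values
are invented here, and widens and its skiplongtest parameter are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Flag names as in the quoted code; the values are made up for this sketch. */
enum { Numdec = 1, Numuns = 2, Numlong = 4, Numvlong = 8 };

/*
 * Returns nonzero if the constant would take the vlong branch.
 * skiplongtest = 0 models the quoted 9front condition;
 * skiplongtest = 1 models it with '(c1 & Numlong) == 0' removed.
 */
int
widens(uint64_t vv, int c1, int skiplongtest)
{
	int w;

	assert((c1 & ~(Numdec|Numuns|Numlong|Numvlong)) == 0);
	w = 32;
	if((c1 & (Numdec|Numuns)) == Numdec)
		w = 31;		/* decimal without u stays signed */
	if(c1 & Numvlong)
		return 1;	/* explicit LL suffix */
	if(!skiplongtest && (c1 & Numlong))
		return 0;	/* explicit L suffix is never widened */
	return vv >= 1ULL<<w;
}
```

In this model widens(0x100000000ULL, Numlong, 0) is 0, so 0x100000000L
stays in the long branch, while widens(0x100000000ULL, Numlong, 1) is 1.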

Forcing the programmer to specify the type explicitly when something
other than int is wanted is an interesting idea, but the automatic
conversion to an unsigned type already breaks it, doesn't it?

Any thoughts to help me decide what to do in my dist?

Regards,
adr.

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T22754f10b241991c-Mb38696f1b928920ec2c4d16d
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
