From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mimir.eigenstate.org ([206.124.132.107]) by ewsd; Fri Jul 24 00:09:22 EDT 2020
Received: from abbatoir.fios-router.home (pool-74-101-2-6.nycmny.fios.verizon.net [74.101.2.6])
	by mimir.eigenstate.org (OpenSMTPD) with ESMTPSA id edb6d54d (TLSv1.2:ECDHE-RSA-AES256-SHA:256:NO)
	for <9front@9front.org>;
	Thu, 23 Jul 2020 21:09:07 -0700 (PDT)
Message-ID:
To: 9front@9front.org
Subject: cc: fix c99 integer conversions
Date: Thu, 23 Jul 2020 21:09:06 -0700
From: ori@eigenstate.org
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
List-ID: <9front.9front.org>
List-Help:
X-Glyph: ➈
X-Bullshit: extended open firewall

The C99 standard, section 6.4.4.1 paragraph 5 says that
integer constants should be converted as follows:

If they fit in an int, they should be an int.

If they're written in decimal, they should be converted
to the smallest signed type that can hold the value.

If they're written as oct or hex, they should be
converted to the smallest signed *or* unsigned type
that can hold the value.

Right now, we don't widen to vlong when appropriate.
This fixes the issue.

This bug/quirk was discovered by Amavect, and they
wrote the test code:

/* integer constant type test
 * reference: C standard 6.4.4.1
 * not really compliant lol
 * ori & Amavect
 */

#include <u.h>
#include <libc.h>

void
main(void)
{
	print("%ullX ", 0xFF66554433221100); /* uvlong */
	print("%llX ", 0x0000000180000000); /* vlong */
	print("\n");

	print("%X ", 0x7FFFFFFF); /* int */
	print("%uX ", 0x80000000); /* uint */
	print("%uX ", 0xFFFFFFFF); /* uint */
	print("%llX ", 0x100000000); /* vlong */
	print("\n");

	/* vlong (C standard)
	 * if it parses as uint, it's technically wrong to the standard
	 * even though it works just fine
	 * ideally, - is part of an integer constant,
	 * but that's just not in the standard.
	 */
	print("%lld ", -2147483648);
	print("%d", -0x80000000); /* uint, no warning for int format */
	print("\n");

	print("%llX ", 0x7FFFFFFFFFFFFFFF); /* vlong */
	print("%ullX ", 0x8000000000000000); /* uvlong */
	print("%ullX ", 0xFFFFFFFFFFFFFFFF); /* uvlong */
	print("%ullX", ~1ULL); /* uvlong */
	print("\n");

	/* uvlong (C standard)
	 * C standard specifies an extended integer type
	 * uvlong is our extended vlong :)
	 */
	print("%lld ", -9223372036854775808); /* no warning for vlong format */
	print("%lld", -0x8000000000000000); /* uvlong, no warning for vlong format */
	print("\n");
}

And here's a patch that makes our integer conversions
comply with c99:

diff -r 639ad985a75b sys/src/cmd/cc/lex.c
--- a/sys/src/cmd/cc/lex.c	Mon Jul 20 18:58:52 2020 -0700
+++ b/sys/src/cmd/cc/lex.c	Thu Jul 23 21:04:15 2020 -0700
@@ -444,7 +444,7 @@
 yylex(void)
 {
 	vlong vv;
-	long c, c1, t;
+	long c, c1, t, w;
 	char *cp;
 	Rune rune;
 	Sym *s;
@@ -844,7 +844,8 @@
 		yyerror("overflow in constant");
 	vv = yylval.vval;
-	if(c1 & Numvlong) {
+	w = (c1 & Numdec) ? 31 : 32;
+	if(c1 & Numvlong || (uvlong)vv >= 1ULL<
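
For reference, the conversion rules quoted from C99 6.4.4.1 at the
top of this message can be written down as a small standalone
function. This is only a sketch in portable C, not code from cc:
the names ctype, CT_* and consttype are made up for illustration,
and it assumes a 32-bit int and long and a 64-bit vlong. Treating
a decimal constant that does not fit in a vlong as uvlong follows
the comment in the test program above, not the letter of the
standard.

```c
#include <stdint.h>

/* Hypothetical names for illustration; these are not identifiers from cc. */
enum ctype { CT_INT, CT_UINT, CT_VLONG, CT_UVLONG };

/*
 * Pick the type of an unsuffixed integer constant with value v.
 * Assumes int and long are 32 bits and vlong (long long) is 64 bits.
 * isdecimal is nonzero for decimal constants, zero for octal or hex.
 */
enum ctype
consttype(uint64_t v, int isdecimal)
{
	/* anything that fits in 31 bits is an int, whatever the base */
	if(v < (1ULL<<31))
		return CT_INT;
	/* only octal and hex constants may become unsigned 32-bit */
	if(!isdecimal && v < (1ULL<<32))
		return CT_UINT;
	/* next candidate is the signed 64-bit type */
	if(v < (1ULL<<63))
		return CT_VLONG;
	/*
	 * Octal and hex constants fall through to unsigned 64-bit.
	 * Decimal constants this large have no standard type; like the
	 * test program above, this sketch treats them as uvlong.
	 */
	return CT_UVLONG;
}
```

With these assumptions, consttype(2147483648ULL, 1) gives CT_VLONG
while consttype(0x80000000, 0) gives CT_UINT. That is the same
31-bit versus 32-bit distinction the patch expresses with w.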