From mboxrd@z Thu Jan 1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level:
X-Spam-Status: No, score=-1.7 required=5.0 tests=MAILING_LIST_MULTI,
	RCVD_IN_DNSWL_LOW,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no
	version=3.4.4
Received: (qmail 26416 invoked from network); 29 May 2023 15:46:58 -0000
Received: from second.openwall.net (193.110.157.125)
  by inbox.vuxu.org with ESMTPUTF8; 29 May 2023 15:46:58 -0000
Received: (qmail 11417 invoked by uid 550); 29 May 2023 15:46:54 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-ID:
Reply-To: musl@lists.openwall.com
Received: (qmail 11385 invoked from network); 29 May 2023 15:46:53 -0000
Date: Mon, 29 May 2023 11:46:40 -0400
From: Rich Felker
To: =?utf-8?B?SuKCkeKCmeKCmw==?= Gustedt
Cc: musl@lists.openwall.com
Message-ID: <20230529154640.GT4163@brightrain.aerifal.cx>
References: <20230526203107.GN4163@brightrain.aerifal.cx>
 <20230526225119.4daa2815@inria.fr>
 <20230526210358.GQ4163@brightrain.aerifal.cx>
 <20230529091413.04bc8d85@inria.fr>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="JI+G0+mN8WmwPnOn"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20230529091413.04bc8d85@inria.fr>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] [C23 printf 2/3] C23: implement the wN length specifiers
 for printf

--JI+G0+mN8WmwPnOn
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Mon, May 29, 2023 at 09:14:13AM +0200, Jₑₙₛ Gustedt wrote:
> Rich,
>
> on Fri, 26 May 2023 17:03:58 -0400 you (Rich Felker )
> wrote:
>
> > I think you need an extra state that's "plain but not bare" that
> > duplicates only the integer transitions out of it, like the l, ll,
> > etc. prefix states do.
>
> Hm, the problem is that for the other prefixes the table entries then
> encode the concrete type that is to be expected. We could not do this
> here because the type depends on the requested width. So we would then
> need to "repair" that type after the loop. A `switch` to do that would
> look substantially similar to what is there, now. Do you think that
> would be better?

OK I think I can communicate better with code than natural language
text, so here's a diff, completely untested, of what I had in mind.

Rich

--JI+G0+mN8WmwPnOn
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="printf-wprefix.diff"

diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
index a712d80f..a751fbdf 100644
--- a/src/stdio/vfprintf.c
+++ b/src/stdio/vfprintf.c
@@ -33,7 +33,7 @@
 
 enum {
 	BARE, LPRE, LLPRE, HPRE, HHPRE, BIGLPRE,
-	ZTPRE, JPRE,
+	ZTPRE, JPRE, PLAIN, WPRE, W1PRE, W3PRE, W6PRE,
 	STOP,
 	PTR, INT, UINT, ULLONG,
 	LONG, ULONG,
@@ -44,9 +44,10 @@ enum {
 	MAXSTATE
 };
 
-#define S(x) [(x)-'A']
+#define ST_BASE '1'
+#define S(x) [(x)-ST_BASE]
 
-static const unsigned char states[]['z'-'A'+1] = {
+static const unsigned char states[]['z'-ST_BASE+1] = {
 	{ /* 0: bare types */
 		S('d') = INT, S('i') = INT,
 		S('o') = UINT, S('u') = UINT, S('x') = UINT, S('X') = UINT,
@@ -94,10 +95,24 @@ static const unsigned char states[]['z'-'A'+1] = {
 		S('o') = UMAX, S('u') = UMAX,
 		S('x') = UMAX, S('X') = UMAX,
 		S('n') = PTR,
+	}, { /* 8: explicit-width-prefixed bare equivalent */
+		S('d') = INT, S('i') = INT,
+		S('o') = UINT, S('u') = UINT,
+		S('x') = UINT, S('X') = UINT,
+		S('n') = PTR,
+	}, { /* 9: w-prefixed */
+		S('1') = W1PRE, S('3') = W3PRE,
+		S('6') = W6PRE, S('8') = HHPRE,
+	}, { /* 10: w1-prefixed */
+		S('6') = HPRE,
+	}, { /* 11: w3-prefixed */
+		S('2') = PLAIN,
+	}, { /* 12: w6-prefixed */
+		S('4') = LLONG_MAX > LONG_MAX ? LLPRE : LPRE,
 	}
 };
 
-#define OOB(x) ((unsigned)(x)-'A' > 'z'-'A')
+#define OOB(x) ((unsigned)(x)-ST_BASE > 'z'-ST_BASE)
 
 union arg
 {
@@ -547,6 +562,7 @@ static int printf_core(FILE *f, const char *fmt, va_list *ap, union arg *nl_arg,
 		switch(t) {
 		case 'n':
 			switch(ps) {
+			case PLAIN:
 			case BARE: *(int *)arg.p = cnt; break;
 			case LPRE: *(long *)arg.p = cnt; break;
 			case LLPRE: *(long long *)arg.p = cnt; break;

--JI+G0+mN8WmwPnOn--