From: Jₑₙₛ Gustedt
To: Rich Felker
Cc: musl@lists.openwall.com
Date: Tue, 30 May 2023 08:46:36 +0200
Subject: Re: [musl] [C23 printf 2/3] C23: implement the wN length specifiers for printf
Message-ID: <20230530084636.41a14b89@inria.fr>
In-Reply-To: <20230530014822.GW4163@brightrain.aerifal.cx>

Rich,

on Mon, 29 May 2023 21:48:22 -0400 you (Rich Felker) wrote:

> On Mon, May 29, 2023 at 09:21:55PM +0200, Jₑₙₛ Gustedt wrote:
> > Rich,
> >
> > on Mon, 29 May 2023 11:46:40 -0400 you (Rich Felker) wrote:
> >
> > > On Mon, May 29, 2023 at 09:14:13AM +0200, Jₑₙₛ Gustedt wrote:
> [...]
> [...]
> [...]
> > >
> > > OK I think I can communicate better with code than natural
> > > language text, so here's a diff, completely untested, of what I
> > > had in mind.
> >
> > that's ... ugh ... not so pretty, I think
> >
> > In my current version I track the desired width, if there is w
> > specifier, and then do the adjustments after the loop. That takes
> > indeed care of undefined character sequences.
> >
> > I find that much better readable, and also easier to extend (later
> > there comes the `wf` case and the `128`, and perhaps some day
> > `256`)
>
> It sounds like the core issue is that you don't like the state machine
> approach to how musl's printf processes format specifiers.

It is well suited for simple grammars, I agree with that, but here the
grammar is becoming more complex, if only because you'd have to
enlarge the set of possible values to match decimal digits.
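To make the idea of tracking the width concrete, here is a minimal
sketch in C. It is neither the patch from this series nor musl's code:
the names `parse_w_prefix`, `width_to_class` and the `LEN_*` classes
are made up for illustration, and the mapping assumes an ABI with an
8-bit char, 16-bit short and 32-bit int. The digits are consumed by a
small helper, the width is merely recorded, and the translation to a
conventional length class happens afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical length classes; these are not musl's state names. */
enum len_class { LEN_NONE, LEN_HH, LEN_H, LEN_INT, LEN_L, LEN_LL, LEN_BAD };

/*
 * Parse an optional "wN" prefix at s. On success, *width holds N
 * (0 if no prefix was present) and the result points just past the
 * prefix. Returns NULL if 'w' is not followed by a decimal number
 * without a leading zero, or if the number is absurdly large.
 */
const char *parse_w_prefix(const char *s, unsigned *width)
{
    assert(s && width);
    *width = 0;
    if (*s != 'w') return s;                 /* no prefix, nothing consumed */
    s++;
    if (*s < '1' || *s > '9') return NULL;   /* need a digit, no leading zero */
    while (*s >= '0' && *s <= '9') {
        if (*width > 1000) return NULL;      /* bail out long before overflow */
        *width = *width * 10 + (unsigned)(*s - '0');
        s++;
    }
    return s;
}

/* Translate the recorded width once scanning is done. */
enum len_class width_to_class(unsigned width)
{
    switch (width) {
    case 0:  return LEN_NONE;                /* no wN prefix was given */
    case 8:  return LEN_HH;
    case 16: return LEN_H;
    case 32: return LEN_INT;
    case 64: return sizeof(long) == 8 ? LEN_L : LEN_LL;
    default: return LEN_BAD;                 /* width not supported here */
    }
}
```

With this shape the prefix is consumed at most once, so an input such
as `w32w32d` leaves a stray `w` behind for the caller to reject, and
supporting another width would only touch the switch.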
> Personally,
> I like it because there's an obvious structured way to validate that
> it's accepting exactly the right things and nothing else, vs an
> approach like what you tried where you ended up accepting a lot of
> bogus specifiers.
>
> One alternative I would consider is doing something like what you did,
> but moving it outside of/before the state machine loop, so it's not
> mixing the w* processing with the state machine. This avoids accepting
> bogus repeated w32 prefixes and similar (because there is no loop) and
> lets you get by with just adding one PLAIN state to have it start in
> (rather than BARE) after w32. I expect the overall size would be
> similar. Concept attached.

I'll post what I have in a minute. Compared to yours, it has the
advantage that it doesn't do the switch on the width inside the
automaton, and also that it doesn't have to increase the number of
rows of the matrix.

Thanks
Jₑₙₛ

-- 
:: ICube :::::::::::::::::::::::::::::: deputy director ::
:: Université de Strasbourg :::::::::::::::::::::: ICPS ::
:: INRIA Nancy Grand Est :::::::::::::::::::::::: Camus ::
:: :::::::::::::::::::::::::::::::::::: ☎ +33 368854536 ::
:: https://icube-icps.unistra.fr/index.php/Jens_Gustedt ::