Date: Tue, 30 May 2023 14:00:32 -0400
From: Rich Felker
To: Jₑₙₛ Gustedt
Cc: musl@lists.openwall.com
Message-ID: <20230530180032.GY4163@brightrain.aerifal.cx>
In-Reply-To: <20230530172832.GX4163@brightrain.aerifal.cx>
Subject: Re: [musl] [C23 printf 2/3] C23: implement the wN length specifiers for printf

On Tue, May 30, 2023 at 01:28:33PM -0400, Rich Felker wrote:
> On Tue, May 30, 2023 at 08:46:36AM +0200, Jₑₙₛ Gustedt wrote:
> > Rich,
> >
> > on Mon, 29 May 2023 21:48:22 -0400 you (Rich Felker)
> > wrote:
> >
> > > On Mon, May 29, 2023 at 09:21:55PM +0200, Jₑₙₛ Gustedt wrote:
> > > > Rich,
> > > >
> > > > on Mon, 29 May 2023 11:46:40 -0400 you (Rich Felker)
> > > > wrote:
> > > >
> > > > > On Mon, May 29, 2023 at 09:14:13AM +0200, Jₑₙₛ Gustedt wrote:
> > > [...]
> > > [...]
> > > [...]
> > > > >
> > > > > OK I think I can communicate better with code than natural
> > > > > language text, so here's a diff, completely untested, of what I
> > > > > had in mind.
> > > >
> > > > that's ... ugh ... not so pretty, I think
> > > >
> > > > In my current version I track the desired width, if there is w
> > > > specifier, and then do the adjustments after the loop. That takes
> > > > indeed care of undefined character sequences.
> > > >
> > > > I find that much better readable, and also easier to extend (later
> > > > there comes the `wf` case and the `128`, and perhaps some day
> > > > `256`)
> > >
> > > It sounds like the core issue is that you don't like the state machine
> > > approach to how musl's printf processes format specifiers.
> >
> > It is well suited for simple grammars, I agree with that, but here the
> > grammar is becoming more complex. Be it just for the fact that you'd
> > have to enlarge the set of possible values to match decimal digits.
>
> I don't think it's really any more complex. It's just a few gratuitous
> aliases that have a very small number of edge paths. The wf ones
> almost entirely collapse with the w ones, and if we wanted to get rid
> of the gratuitous separate hh/h loading, they'd entirely collapse. But
> the version I posted as code is probably enough smaller to be
> preferable. I guess I should take a look at that and see...

Ah, now I remember why we handle h/hh despite them seeming useless.
They're needed for %n, and once you distinguish them for that, there's
hardly any point in trying to treat them the same as bare elsewhere.

Rich
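
A minimal sketch of the %n point, not taken from the musl source. For
integer output conversions, h and hh arguments arrive promoted to int, so a
printf implementation could in principle load them like a bare int. With
%n the argument is a pointer, and the store has to go through a pointee of
exactly the right width. The struct and function names below are
hypothetical; the guard bytes exist only to show that %hhn and %hn leave
their neighbours untouched.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define FILL 0xA5

/* Hypothetical probes: a target of each width followed by guard bytes. */
struct probe_hh { signed char n; unsigned char guard[7]; };
struct probe_h  { short n;       unsigned char guard[6]; };
struct probe_i  { int n;         unsigned char guard[4]; };

/* Returns 1 if every guard byte still holds the fill pattern. */
int guard_intact(const unsigned char *g, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (g[i] != FILL) return 0;
    return 1;
}

/* %hhn stores the count through a signed char *. */
int stored_via_hhn(void)
{
    struct probe_hh p;
    char buf[16];
    memset(&p, FILL, sizeof p);
    snprintf(buf, sizeof buf, "abcde%hhn", &p.n);
    /* A store wider than one byte would have clobbered the guard. */
    assert(guard_intact(p.guard, sizeof p.guard));
    return p.n;
}

/* %hn stores the count through a short *. */
int stored_via_hn(void)
{
    struct probe_h p;
    char buf[16];
    memset(&p, FILL, sizeof p);
    snprintf(buf, sizeof buf, "abcde%hn", &p.n);
    assert(guard_intact(p.guard, sizeof p.guard));
    return p.n;
}

/* Bare %n stores the count through an int *. */
int stored_via_n(void)
{
    struct probe_i p;
    char buf[16];
    memset(&p, FILL, sizeof p);
    snprintf(buf, sizeof buf, "abcde%n", &p.n);
    assert(guard_intact(p.guard, sizeof p.guard));
    return p.n;
}
```

Each function reports the count of characters produced before the %n
directive. If an implementation ignored the length modifier and stored an
int in every case, the guard assertions in the first two functions are
where it would show up.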