Date: Thu, 22 Jun 2023 19:51:53 -0400
From: Rich Felker
To: "Alex Xu (Hello71)"
Cc: musl@lists.openwall.com, Markus Wichmann
Message-ID: <20230622235152.GO4163@brightrain.aerifal.cx>
References: <20230621213746.GM4163@brightrain.aerifal.cx> <20230622144550.GN4163@brightrain.aerifal.cx> <1687470871.bzv5u1iuij.none@localhost>
In-Reply-To: <1687470871.bzv5u1iuij.none@localhost>
Subject: Re: [musl] [PATCH] [RFC] trap on invalid printf formats

On Thu, Jun 22, 2023 at 07:37:22PM -0400, Alex Xu (Hello71) wrote:
> Excerpts from Rich Felker's message of June 22, 2023 10:45 am:
> > FWIW I don't think there are a lot of these cases left in the wild at
> > all, but I'm not sure. it might be nice to do some distro-wide testing
> > with this patch applied (which is what I had in mind posting it) and
> > see if any problems are caught before really considering whether to
> > pursue upstreaming it.
>
> Unfortunately, it seems fairly widespread:
> https://codesearch.debian.net/search?q=printf.*%5B%25%5DL%5Bud%5D+-package%3Agcc+-package%3Allvm*+path%3A.*%5C.c%24&literal=0
>
> The most painful example:
>
> #if defined(__OpenBSD__) || defined(__FreeBSD__) || defined(__APPLE_CC__) || defined(__APPLE__) || defined(ARGUS_SOLARIS)
>          sprintf (pbuf, "%llu", value);
> #else
>          sprintf (pbuf, "%Lu", value);
> #endif
>
> (copied and pasted 17 times in the same file, of course)

Which file are you looking at?

Some of the results I see are Linux (the kernel) using its own
very-nonstandard printf format specifier system. Those aren't relevant
at all. All of the rest are bugs where the software is silently
malfunctioning now. Some of these are junk code like examples/* etc.
but presumably a lot are actual bugs that need to be fixed, where
something is malfunctioning now.

> I did some research and the most likely source of %Lu is the Linux
> man-pages, which, before 1999 or thereabouts, said:
>
> > • The optional character l (ell) specifying that a following d, i, o,
> > u, x, or X conversion applies to a pointer to a long int or unsigned
> > long int argument, or that a following n conversion corresponds to a
> > pointer to a long int argument. Linux provides a non ANSI compliant
> > use of two l flags as a synonym to q or L. Thus ll can be used in
> > combination with float conversions. *This usage is, however, strongly
> > discouraged.*
> >
> > • The character L specifying that a following e, E, f, g, or G
> > conversion corresponds to a long double argument, or a following d, i,
> > o, u, x, or X conversion corresponds to a long long argument. Note
> > that long long is not specified in ANSI C and therefore not portable
> > to all architectures.
>
> Emphasis added. So, pre-C99, L was in fact the recommended modifier for
> long long.

As I read it, the usage that was discouraged was using ll as an alias
for L with float conversions. Not use of ll where it was the right
form.
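To illustrate the distinction under discussion, here is a minimal C
sketch. It is not taken from the thread or from the file quoted above,
and the helper name format_ull is hypothetical. It shows only the
standardized form: in ISO C the L modifier is defined for
floating-point conversions, so "%Lu" has undefined behavior, while
"%llu" is the specified way to print an unsigned long long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only: format an unsigned long long
 * with the standardized "ll" length modifier instead of the nonstandard
 * "%Lu" spelling. */
int format_ull(char *buf, size_t size, unsigned long long value)
{
    assert(buf != NULL && size > 0);
    /* snprintf returns the number of characters that would have been
     * written, not counting the terminating null byte. */
    return snprintf(buf, size, "%llu", value);
}
```

Assuming value is an unsigned long long and the real size of the
buffer is known, a call such as format_ull(pbuf, pbuf_size, value)
could stand in for both branches of the #if in the quoted snippet;
pbuf_size is likewise an assumed name.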

In any case, I'm not sure what we can take away from the history. ll
was the form that ended up getting standardized (probably because it
didn't "steal" any additional letter that might be used for other
things on existing implementations or in the future), and both q and
the overloading of L got rejected.

Rich