From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12869 Path: news.gmane.org!.POSTED!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: Re: [PATCH] scanf: handle the L modifier for integers Date: Thu, 31 May 2018 19:44:36 -0400 Message-ID: <20180531234436.GN1392@brightrain.aerifal.cx> References: <20180531064719.6805-1-avagin@virtuozzo.com> <20180531190021.GA9758@outlook.office365.com> <20180531224442.3bc8dcdc@ncopa-desktop.copa.dup.pw> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: blaine.gmane.org 1527810166 3675 195.159.176.226 (31 May 2018 23:42:46 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Thu, 31 May 2018 23:42:46 +0000 (UTC) User-Agent: Mutt/1.5.21 (2010-09-15) To: musl@lists.openwall.com Original-X-From: musl-return-12885-gllmg-musl=m.gmane.org@lists.openwall.com Fri Jun 01 01:42:42 2018 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from ) id 1fOXDN-0000sc-U7 for gllmg-musl@m.gmane.org; Fri, 01 Jun 2018 01:42:41 +0200 Original-Received: (qmail 3135 invoked by uid 550); 31 May 2018 23:44:49 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Original-Received: (qmail 3113 invoked from network); 31 May 2018 23:44:48 -0000 Content-Disposition: inline In-Reply-To: <20180531224442.3bc8dcdc@ncopa-desktop.copa.dup.pw> Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:12869 Archived-At: On Thu, May 31, 2018 at 10:44:42PM +0200, Natanael Copa wrote: > On Thu, 31 May 2018 12:00:22 -0700 > Andrei Vagin wrote: > > > >>Without this patch, ret will be 1 and mask will be 0. It is obviously > > >>incorrect. According to the man page, L should work like ll: > > >> > > >>L Indicates that the conversion will be either e, f, or g and the > > >> next pointer is a pointer to long double or the conversion will > > >> be d, i, o, u, or x and the next pointer is a pointer to long > > >> long. > > > > > > This is a GNU extension. POSIX states that L is only valid before > > >a floating-point conversion specifier: > > > > > >L > > > Specifies that a following a, A, e, E, f, F, g, or G conversion > > >specifier > > > applies to an argument with type pointer to long double. > > > > > > from > > >http://pubs.opengroup.org/onlinepubs/9699919799/functions/scanf.html > > > > > > So, it is valid for musl not to accept %Lx. > > > Now, the argument that it's a good idea to align musl's behaviour to > > >glibc's whenever possible is a sensible one. But it's a decision for > > >the musl authors to make, and the pros and cons need to be carefully > > >balanced; musl's current behaviour is not _incorrect_. > > > > It is incorrect, because scanf() has to return 0, or it has to handle the > > L modifier. Currently it doesn't handle L and return 1, so the > > application can't detect this issue. > > That sounds like a bug in musl libc. > > > I would prefer a case when musl works like glibc, if there are not any > > reason to not to do that. For example, now Alpine Linux is very popular > > and there are a lot of packages. In many cases, a maintainer, who adds a > > new package, fixes compile-time errors and doesn't run any tests. > > A target application can work differently with musl comparing with glibc > > due to this sort of issues. > > FreeBSD man page says: > > L Indicates that the conversion will be one of a, e, f, or g and > the next pointer is a pointer to long double. > > NetBSD man page says: > > L Indicates that the conversion will be efg and the next pointer is > a pointer to long double. > > OpenBSD man page says: > > L > Indicates that the conversion will be one of efg and the next pointer is a pointer to long double. > > So the application will break on most (every) system that is not GNU > libc. It would be better to fix the application in this case: > > > char str[] = "sigmask: 0x200"; > long long mask = 0; > int ret; > > #if defined(__GLIBC__) > ret = sscanf(str, "sigmask: %Lx", &mask)); > #else > ret = sscanf(str, "sigmask: %llx", &mask)); > #endif > printf("%d %llx\n", ret, mask); > > > > Or just use %llx which is POSIX and should work everywhere. Indeed, there is no reason to use %Lx anywhere. It's simply wrong. > That said, those things are tricky to detect at compile time as you > mentioned and they are tricky to detect with configure scripts that > works with cross compilation. If gcc does not catch this with -Wformat, it's a gcc bug that we should report and try to get fixed. It's possible that they're making an exception for the invalidity of L with integer formats since some libcs support that, but I don't see any good reason for this; gcc should still be warning about the incorrect and nonportable usage. I can't imagine they'd be opposed to a patch to fix it. > Also many developers seems to think that > Linux == glibc so they only read the GNU manuals, so yeah, implement > glibc behavior here seems like a good idea, unless someone else has a > brilliant idea how to catch this at compile time. Aside from fixing gcc at compile time, this has come up before (with regard to printf, not scanf), and my leaning then and now was to detect the UB at runtime by crashing rather than reporting an error as we do now, since (1) it's UB, so an application can't reasonably expect an error, and (2) applications seem to be ignoring errors anyway. We should also get the man page fixed. The printf man page is clear that L with integer specifiers is a nonstandard extension and should not be used (they're not documented under L, only as a note at the end) but it seems whoever fixed this overlooked changing scanf at the same time. Rich