From: Andrei Vagin
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Re: [PATCH] scanf: handle the L modifier for integers
Date: Fri, 1 Jun 2018 00:36:07 -0700
Message-ID: <20180601073606.GA10581@gmail.com>
References: <20180531064719.6805-1-avagin@virtuozzo.com> <20180531190021.GA9758@outlook.office365.com> <20180531224442.3bc8dcdc@ncopa-desktop.copa.dup.pw> <20180531234436.GN1392@brightrain.aerifal.cx>
In-Reply-To: <20180531234436.GN1392@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: Rich Felker
Cc: musl@lists.openwall.com
User-Agent: Mutt/1.9.3 (2018-01-21)

On Thu, May 31, 2018 at 07:44:36PM -0400, Rich Felker wrote:
> On Thu, May 31, 2018 at 10:44:42PM +0200, Natanael Copa wrote:
> > On Thu, 31 May 2018 12:00:22 -0700
> > Andrei Vagin wrote:
> >
> > > >>Without this patch, ret will be 1 and mask will be 0. It is obviously
> > > >>incorrect. According to the man page, L should work like ll:
> > > >>
> > > >>L      Indicates that the conversion will be either e, f, or g and the
> > > >>       next pointer is a pointer to long double or the conversion will
> > > >>       be d, i, o, u, or x and the next pointer is a pointer to long
> > > >>       long.
> > > >
> > > > This is a GNU extension.
> > > > POSIX states that L is only valid before
> > > >a floating-point conversion specifier:
> > > >
> > > >L
> > > >    Specifies that a following a, A, e, E, f, F, g, or G conversion
> > > >specifier
> > > >    applies to an argument with type pointer to long double.
> > > >
> > > > from
> > > >http://pubs.opengroup.org/onlinepubs/9699919799/functions/scanf.html
> > > >
> > > > So, it is valid for musl not to accept %Lx.
> > > > Now, the argument that it's a good idea to align musl's behaviour to
> > > >glibc's whenever possible is a sensible one. But it's a decision for
> > > >the musl authors to make, and the pros and cons need to be carefully
> > > >balanced; musl's current behaviour is not _incorrect_.
> > >
> > > It is incorrect, because scanf() has to return 0, or it has to handle the
> > > L modifier. Currently it doesn't handle L and return 1, so the
> > > application can't detect this issue.
> >
> > That sounds like a bug in musl libc.
> >
> > > I would prefer a case when musl works like glibc, if there are not any
> > > reason to not to do that. For example, now Alpine Linux is very popular
> > > and there are a lot of packages. In many cases, a maintainer, who adds a
> > > new package, fixes compile-time errors and doesn't run any tests.
> > > A target application can work differently with musl comparing with glibc
> > > due to this sort of issues.
> >
> > FreeBSD man page says:
> >
> >      L       Indicates that the conversion will be one of a, e, f, or g and
> >              the next pointer is a pointer to long double.
> >
> > NetBSD man page says:
> >
> >      L       Indicates that the conversion will be efg and the next pointer is
> >              a pointer to long double.
> >
> > OpenBSD man page says:
> >
> >      L
> >          Indicates that the conversion will be one of efg and the next pointer is a pointer to long double.
> >
> > So the application will break on most (every) system that is not GNU
> > libc.
> > It would be better to fix the application in this case:
> >
> >
> >         char str[] = "sigmask: 0x200";
> >         long long mask = 0;
> >         int ret;
> >
> > #if defined(__GLIBC__)
> >         ret = sscanf(str, "sigmask: %Lx", &mask);
> > #else
> >         ret = sscanf(str, "sigmask: %llx", &mask);
> > #endif
> >         printf("%d %llx\n", ret, mask);
> >
> >
> >
> > Or just use %llx which is POSIX and should work everywhere.
>
> Indeed, there is no reason to use %Lx anywhere. It's simply wrong.
>
> > That said, those things are tricky to detect at compile time as you
> > mentioned and they are tricky to detect with configure scripts that
> > works with cross compilation.
>
> If gcc does not catch this with -Wformat, it's a gcc bug that we
> should report and try to get fixed. It's possible that they're making
> an exception for the invalidity of L with integer formats since some
> libcs support that, but I don't see any good reason for this; gcc
> should still be warning about the incorrect and nonportable usage. I
> can't imagine they'd be opposed to a patch to fix it.

I found that gcc catches this with -pedantic -Wformat:

/musl # gcc -Wall -pedantic test.c
test.c: In function 'main':
test.c:9:28: warning: ISO C does not support the '%Lx' gnu_scanf format [-Wformat=]
  ret = sscanf(str, "%llx %Lx", &a, &b);

>
> > Also many developers seems to think that
> > Linux == glibc so they only read the GNU manuals, so yeah, implement
> > glibc behavior here seems like a good idea, unless someone else has a
> > brilliant idea how to catch this at compile time.
>
> Aside from fixing gcc at compile time, this has come up before (with
> regard to printf, not scanf), and my leaning then and now was to
> detect the UB at runtime by crashing rather than reporting an error as
> we do now, since (1) it's UB, so an application can't reasonably
> expect an error, and (2) applications seem to be ignoring errors
> anyway.
>
> We should also get the man page fixed.
> The printf man page is clear
> that L with integer specifiers is a nonstandard extension and should
> not be used (they're not documented under L, only as a note at the
> end) but it seems whoever fixed this overlooked changing scanf at the
> same time.
>
> Rich
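
For reference, a minimal sketch of the portable form suggested in the quoted
text: parse the value with %llx and treat any sscanf() result other than 1 as
a failure. The helper name parse_sigmask is hypothetical and does not come
from the thread or from any application discussed here. The unsigned long long
destination is an assumption of the sketch, chosen because that is the type
the standard pairs with %llx.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the thread): parse a "sigmask: 0x..." line
 * using only the standard ll length modifier.
 * Returns 0 on success and -1 if the line does not match.
 */
int parse_sigmask(const char *s, unsigned long long *mask)
{
    assert(s != NULL && mask != NULL);

    /*
     * sscanf() returns the number of successful conversions (or EOF),
     * so exactly 1 means the hex value was read. %x-style conversions
     * accept an optional 0x prefix.
     */
    return sscanf(s, "sigmask: %llx", mask) == 1 ? 0 : -1;
}
```

With the input from the example above, parse_sigmask("sigmask: 0x200", &mask)
returns 0 and stores 0x200 in mask. Checking for a result of exactly 1 only
helps when the format string itself is valid; it would not have caught the %Lx
case on musl, where the call returned 1 anyway.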