From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: segfault on sscanf
Date: Thu, 14 Mar 2019 18:40:12 -0400
Message-ID: <20190314224012.GI23599@brightrain.aerifal.cx>
References: <20190314104617.711ac7d8@faultier2go> <20190314162814.GI28106@voyager>
In-Reply-To: <20190314162814.GI28106@voyager>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Thu, Mar 14, 2019 at 05:28:14PM +0100, Markus Wichmann wrote:
> On Thu, Mar 14, 2019 at 10:46:17AM +0100, Marian Buschsieweke wrote:
> > Hi,
> >
> > running pdflatex on Alpine Linux for a specific document resulted in a
> > segfault, which I could trace down to a specific call to sscanf. This is a
> > minimum example to reproduce that segfault:
> >
> >     #include <stdio.h>
> >
> >     int main(void) {
> >         const char *too_parse = "0 1 -1 0";
> >         double f1,f2,f3,f4;
> >         char dummy;
> >         sscanf(too_parse, " %lf %lf %lf %lf %c", &f1, &f2, &f3, &f4, &dummy);
> >
> >         printf("f1=%f, f2=%f, f3=%f, f4=%f, dummy=\"%c\"\n", f1, f2, f3, f4, dummy);
> >
> >         return 0;
> >     }
> >
> > This is the backtrace:
> >
> >     #0  0x00007ffff7fb7eba in vfscanf (f=f@entry=0x7fffffffe6f8,
> >         fmt=, ap=ap@entry=0x7fffffffe7f8) at src/stdio/vfscanf.c:262
> >     #1  0x00007ffff7fb971a in vsscanf (s=, fmt=,
> >         ap=ap@entry=0x7fffffffe7f8) at src/stdio/vsscanf.c:14
> >     #2  0x00007ffff7fb594d in sscanf (s=, fmt=)
> >         at src/stdio/sscanf.c:9
> >     #3  0x0000555555555213 in main () at test.c:7
> >
> > I have the package Alpine Linux package musl-1.1.21-r0 installed, which is musl
> > version 1.1.21 with minimal changes.
> >
> > Kind regards,
> > Marian
>
> OK, so here's the crashing line:
>
>     while (scanset[(c=shgetc(f))+1])
>         s[i++] = c;
>
> It is (unsurprisingly) inside the %c parsing case. At the end of input,
> shgetc() returns EOF, which is -1. EOF+1 is therefore 0. And scanset[0]
> should be set to 0 (that happens a few lines further up). So the
> crashing line should never occur (the line number of the crash is for
> the loop body itself).
>
> The error is reproducible whenever sscanf() runs out of input within a
> %f conversion, and another conversion happens after it. I would not be
> surprised if __floatscan() manages to set the file state wrong on EOF.
>
> The above isn't actually minimal. Here's an even shorter segfault.
>
>     #include <stdio.h>
>
>     int main(void) {
>         const char *too_parse = "0";
>         double f1;
>         char dummy;
>         sscanf(too_parse, "%f%c", &f1, &dummy);
>
>         printf("f1=%f, dummy=\"%c\"\n", f1, dummy);
>
>         return 0;
>     }
>
> So, I'm off to read __floatscan(). As I recall, it was complicated, so
> expect me back in about 10 years or so...

The above test is invalid due to UB: with %f, f1 should have type float,
not double, and dummy should be initialized so that the test is not
trying to print an indeterminate value on success.

With those aspects fixed, my proposed fix (setting f->shend = f->rpos
instead of 0 on EOF) seems to work, as long as it doesn't break anything
else.

Rich