From: Markus Wichmann
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: segfault on sscanf
Date: Thu, 14 Mar 2019 17:53:35 +0100
Message-ID: <20190314165335.GJ28106@voyager>
References: <20190314104617.711ac7d8@faultier2go> <20190314162814.GI28106@voyager>
In-Reply-To: <20190314162814.GI28106@voyager>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Thu, Mar 14, 2019 at 05:28:14PM +0100, Markus Wichmann wrote:
> #include <stdio.h>
>
> int main(void) {
>     const char *too_parse = "0";
>     double f1;
>     char dummy;
>     sscanf(too_parse, "%f%c", &f1, &dummy);
>
>     printf("f1=%f, dummy=\"%c\"\n", f1, dummy);
>
>     return 0;
> }
>
> So, I'm off to read __floatscan(). As I recall, it was complicated, so
> expect me back in about 10 years or so...
>
> Ciao,
> Markus

Actually, strike that, I think I have it. From __floatscan():

[among other things c = shgetc(f)]
    if (c=='0') {
        c = shgetc(f);
        if ((c|32) == 'x')
            return hexfloat(f, bits, emin, sign, pok);
        shunget(f);
        c = '0';
    }

The input is just "0". So inside this if-clause, shgetc() will return EOF
and set the FILE's shend to 0. The shunget() therefore does nothing. Then
we continue on to decfloat().

decfloat() will call shgetc() at least once. Unfortunately, this is
shgetc()'s definition:

#define shgetc(f) (((f)->rpos != (f)->shend) ? *(f)->rpos++ : __shgetc(f))

Since f->shend == 0, but f->rpos == "0"+1, the two never compare equal,
so this will start dereferencing uncharted territory. But it will
probably not crash immediately. That's what the %c parser is for. For %c
it will keep parsing forever, eventually reaching unmapped memory and
segfaulting.
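To make that sequence easier to follow, here is a small stand-alone model of it. This is only a sketch: struct toy_file, toy_slow_getc() and the toy_ macros are made-up stand-ins, not musl's real FILE or __shgetc(), and the shunget() shape is my assumption based on the "does nothing when shend is 0" behaviour described above. The input byte sits in a padded array so the over-read stays inside valid memory.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the few FILE fields involved. */
struct toy_file {
    const unsigned char *rpos;   /* next byte to hand out */
    const unsigned char *rend;   /* logical end of the input */
    const unsigned char *shend;  /* scan limit; 0 once EOF was hit */
};

/* Slow path, modelled on the described EOF behaviour of __shgetc(). */
static int toy_slow_getc(struct toy_file *f)
{
    if (f->rpos >= f->rend) {
        f->shend = 0;            /* record EOF by clearing shend */
        return EOF;
    }
    return *f->rpos++;
}

/* Same shape as the shgetc() macro quoted above. */
#define toy_shgetc(f)  (((f)->rpos != (f)->shend) ? *(f)->rpos++ : toy_slow_getc(f))
/* Assumed shape of shunget(): a no-op when shend is 0. */
#define toy_shunget(f) ((f)->shend ? (void)(f)->rpos-- : (void)0)

/* Replays the calls made for input "0" and returns how many bytes were
 * handed out from beyond the logical end of the input. */
static int toy_overread_count(void)
{
    /* One real input byte plus padding, so the demo itself stays in bounds. */
    static const unsigned char mem[8] = { '0', '#', '#', '#', '#', '#', '#', '#' };
    struct toy_file f = { mem, mem + 1, mem + 1 };
    int past = 0;
    int c;

    c = toy_shgetc(&f);          /* first read: the '0' */
    assert(c == '0');
    c = toy_shgetc(&f);          /* look for 'x': EOF, shend becomes 0 */
    assert(c == EOF);
    toy_shunget(&f);             /* does nothing because shend == 0 */
    assert(f.rpos == mem + 1);

    for (int i = 0; i < 3; i++) {            /* what decfloat() would do next */
        assert(f.rpos < mem + sizeof mem);   /* keep the demo inside mem[] */
        c = toy_shgetc(&f);                  /* rpos != 0, so fast path again */
        if (c != EOF && f.rpos > f.rend)
            past++;
    }
    return past;
}
```

In this model toy_overread_count() returns 3: every call after the EOF hands out a padding byte instead of EOF, which is the same walk off the end that the real scanner does on the string "0".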
Bonus: Since now f->rpos > f->rend, __shlim() does nothing to prevent
this issue.

Maybe the EOF status should be sticky. Like this? (Line break because
e-mail).

#define shgetc(f) (!(f)->shend ? EOF : \
    (f)->rpos != (f)->shend ? *(f)->rpos++ : __shgetc(f))

That way shgetc() keeps returning EOF until the next call to shlim(), at
which point shgetc() will revert to __shgetc(), which might load more
data, or might go back to EOFing everywhere...

Ciao,
Markus
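P.S. A quick way to convince oneself that the sticky variant behaves as intended is to run it against a toy model. Again only a sketch under assumptions: struct toy_file and toy_slow_getc() are invented stand-ins for the real FILE fields and __shgetc(), and only the macro body follows the proposal above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the few FILE fields involved. */
struct toy_file {
    const unsigned char *rpos;   /* next byte to hand out */
    const unsigned char *rend;   /* logical end of the input */
    const unsigned char *shend;  /* scan limit; 0 once EOF was hit */
};

/* Slow path, modelled on the described EOF behaviour of __shgetc(). */
static int toy_slow_getc(struct toy_file *f)
{
    if (f->rpos >= f->rend) {
        f->shend = 0;            /* record EOF by clearing shend */
        return EOF;
    }
    return *f->rpos++;
}

/* Proposed sticky form: once shend is 0, keep answering EOF. */
#define toy_shgetc_sticky(f) (!(f)->shend ? EOF : \
    (f)->rpos != (f)->shend ? *(f)->rpos++ : toy_slow_getc(f))

/* Replays the calls made for input "0" and returns how many of the
 * follow-up reads gave EOF without moving rpos. */
static int toy_sticky_eof_count(void)
{
    static const unsigned char mem[8] = { '0', '#', '#', '#', '#', '#', '#', '#' };
    struct toy_file f = { mem, mem + 1, mem + 1 };
    int eofs = 0;
    int c;

    c = toy_shgetc_sticky(&f);   /* first read: the '0' */
    assert(c == '0');
    c = toy_shgetc_sticky(&f);   /* hits EOF, shend becomes 0 */
    assert(c == EOF);

    for (int i = 0; i < 3; i++) {
        c = toy_shgetc_sticky(&f);           /* short-circuits on !shend */
        if (c == EOF && f.rpos == mem + 1)
            eofs++;
    }
    return eofs;
}
```

Here toy_sticky_eof_count() returns 3: all follow-up reads report EOF and rpos stays put at the end of the input. What the model does not cover is the "next call to shlim()" part, since that depends on how the real __shlim() resets shend.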