From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] fix fgetwc when decoding a character that crosses buffer boundary
Date: Sat, 18 Nov 2017 20:14:33 -0500
Message-ID: <20171119011433.GH1627@brightrain.aerifal.cx>
References: <20171118165148.GF15263@port70.net> <20171119011257.GG1627@brightrain.aerifal.cx>
In-Reply-To: <20171119011257.GG1627@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sat, Nov 18, 2017 at 08:12:57PM -0500, Rich Felker wrote:
> On Sat, Nov 18, 2017 at 05:51:48PM +0100, Szabolcs Nagy wrote:
> > Update the buffer position according to the bytes consumed into st when
> > decoding an incomplete character at the end of the buffer.
> > ---
> >  src/stdio/fgetwc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/src/stdio/fgetwc.c b/src/stdio/fgetwc.c
> > index e455cfec..a00c1a86 100644
> > --- a/src/stdio/fgetwc.c
> > +++ b/src/stdio/fgetwc.c
> > @@ -15,20 +15,21 @@ static wint_t __fgetwc_unlocked_internal(FILE *f)
> >  	if (f->rpos < f->rend) {
> >  		l = mbrtowc(&wc, (void *)f->rpos, f->rend - f->rpos, &st);
> >  		if (l+2 >= 2) {
> >  			f->rpos += l + !l; /* l==0 means 1 byte, null */
> >  			return wc;
> >  		}
> >  		if (l == -1) {
> >  			f->rpos++;
> >  			return WEOF;
> >  		}
> > +		f->rpos = f->rend;
> >  	} else l = -2;
>
> Thanks, applying! Here is a test case that demonstrates the bug
> reproducibly; feel free to adapt (it should probably use the framework
> functions for getting a utf-8 locale and error reporting) & include it
> in libc-test.
>
> Rich

> #include
> #include
  ^^^^^^^^
Ooops, this is spurious/leftover from when I thought I was going to
need to do something fancier to reliably trigger it.

> #include
> #include
> #include
>
> int main()
> {
> 	setlocale(LC_CTYPE, "");
> 	int p[2];
> 	pipe(p);
> 	write(p[1], "x\340\240", 3);
> 	dup2(p[0], 0);
> 	wchar_t wc;
> 	wc = fgetwc(stdin);
> 	write(p[1], "\200", 1);
> 	close(p[1]);
> 	wc = fgetwc(stdin);
> 	printf("got %x\n", wc);
> }
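
For anyone adapting the test, here is a rough, self-contained sketch of
the same idea. It is not the version from the mail above: the header
list is my guess at what the calls need, the helper name
read_split_char is made up for illustration, it reads from an
fdopen()ed stream instead of redirecting stdin, and it assumes a locale
named "C.UTF-8" is available rather than relying on the environment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for pipe() and fdopen() */
#endif
#include <locale.h>
#include <stdio.h>
#include <unistd.h>
#include <wchar.h>

/* Hypothetical helper (name not from the original mail).
 * Returns 0 and stores the second wide character read in *out,
 * or -1 if the setup (locale, pipe, stream, writes) fails. */
static int read_split_char(wint_t *out)
{
	int p[2];
	FILE *f;

	/* Assumption: a UTF-8 locale named "C.UTF-8" exists here. */
	if (!setlocale(LC_CTYPE, "C.UTF-8"))
		return -1;
	if (pipe(p) != 0)
		return -1;
	f = fdopen(p[0], "r");
	if (!f) {
		close(p[0]);
		close(p[1]);
		return -1;
	}

	/* 'x' followed by the first two bytes of U+0800 (E0 A0 80). */
	if (write(p[1], "x\340\240", 3) != 3)
		goto fail;

	/* Consumes 'x'; the buffer now ends in an incomplete sequence. */
	if (fgetwc(f) != L'x')
		goto fail;

	/* Supply the final byte, then close the write end of the pipe. */
	if (write(p[1], "\200", 1) != 1)
		goto fail;
	close(p[1]);

	/* This read has to finish the character across the refill. */
	*out = fgetwc(f);
	fclose(f);
	return 0;

fail:
	fclose(f);
	close(p[1]);
	return -1;
}
```

Calling read_split_char(&wc) from main() and printing wc with "%x", as
the original test does, should show 800 once the buffer position is
advanced past the bytes that mbrtowc already consumed into st. A return
of -1 only means the setup failed (for example, no such locale), not
that fgetwc misbehaved.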