From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12112
Path: news.gmane.org!.POSTED!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] fix fgetwc when decoding a character that crosses buffer boundary
Date: Sat, 18 Nov 2017 20:12:57 -0500
Message-ID: <20171119011257.GG1627@brightrain.aerifal.cx>
References: <20171118165148.GF15263@port70.net>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="ed/6oDxOLijJh8b0"
X-Trace: blaine.gmane.org 1511053991 2420 195.159.176.226 (19 Nov 2017 01:13:11 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sun, 19 Nov 2017 01:13:11 +0000 (UTC)
User-Agent: Mutt/1.5.21 (2010-09-15)
To: musl@lists.openwall.com
Original-X-From: musl-return-12128-gllmg-musl=m.gmane.org@lists.openwall.com Sun Nov 19 02:13:07 2017
Return-path:
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from )
	id 1eGEAT-0000Gx-Qg
	for gllmg-musl@m.gmane.org; Sun, 19 Nov 2017 02:13:05 +0100
Original-Received: (qmail 9734 invoked by uid 550); 19 Nov 2017 01:13:10 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-ID:
Original-Received: (qmail 9715 invoked from network); 19 Nov 2017 01:13:09 -0000
Content-Disposition: inline
In-Reply-To: <20171118165148.GF15263@port70.net>
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:12112
Archived-At:

--ed/6oDxOLijJh8b0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Nov 18, 2017 at 05:51:48PM +0100, Szabolcs Nagy wrote:
> Update the buffer position according to the bytes consumed into st when
> decoding an incomplete character at the end of the buffer.
> ---
>  src/stdio/fgetwc.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/stdio/fgetwc.c b/src/stdio/fgetwc.c
> index e455cfec..a00c1a86 100644
> --- a/src/stdio/fgetwc.c
> +++ b/src/stdio/fgetwc.c
> @@ -15,20 +15,21 @@ static wint_t __fgetwc_unlocked_internal(FILE *f)
>  	if (f->rpos < f->rend) {
>  		l = mbrtowc(&wc, (void *)f->rpos, f->rend - f->rpos, &st);
>  		if (l+2 >= 2) {
>  			f->rpos += l + !l; /* l==0 means 1 byte, null */
>  			return wc;
>  		}
>  		if (l == -1) {
>  			f->rpos++;
>  			return WEOF;
>  		}
> +		f->rpos = f->rend;
>  	} else l = -2;

Thanks, applying!

Here is a test case that demonstrates the bug reproducibly; feel free
to adapt it (it should probably use the framework functions for getting
a UTF-8 locale and for error reporting) and include it in libc-test.

Rich

--ed/6oDxOLijJh8b0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="fgetwc2.c"

#include <stdio.h>
#include <wchar.h>
#include <locale.h>
#include <unistd.h>

int main()
{
	/* Assumes the environment selects a UTF-8 locale. */
	setlocale(LC_CTYPE, "");
	int p[2];
	pipe(p);
	/* 'x' followed by the first two bytes of a 3-byte sequence. */
	write(p[1], "x\340\240", 3);
	dup2(p[0], 0);
	wchar_t wc;
	wc = fgetwc(stdin);
	/* The final byte arrives only after the buffer has been filled once. */
	write(p[1], "\200", 1);
	close(p[1]);
	wc = fgetwc(stdin);
	printf("got %x\n", wc);
}

--ed/6oDxOLijJh8b0--
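
To make the failure mode concrete without depending on a UTF-8 locale,
here is a rough, self-contained model of the logic. It is only a sketch
and not musl code: toy_state, toy_decode, toy_stream, toy_getc,
toy_fgetwc and second_char are all hypothetical names, and toy_decode is
a simplified stand-in for mbrtowc that does no overlong or range
checking. The stream delivers the same bytes as the test case in two
buffer fills, "x\340\240" and then "\200". The flag advance_on_partial
models the one-line patch: when it is zero, the bytes already absorbed
into the state are read a second time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_WEOF 0xffffffffu

/* Hypothetical stand-in for mbstate_t: partial value and bytes still needed. */
struct toy_state { uint32_t acc; int need; };

/* Simplified stand-in for mbrtowc. Returns the number of bytes consumed,
 * (size_t)-2 if all n bytes were absorbed into *st without completing a
 * character, or (size_t)-1 on an invalid byte. */
static size_t toy_decode(uint32_t *out, const unsigned char *s, size_t n,
                         struct toy_state *st)
{
	size_t i = 0;
	assert(st->need >= 0 && st->need <= 3);
	if (n == 0) return (size_t)-2;
	if (st->need == 0) {
		unsigned char c = s[i++];
		if (c < 0x80) { *out = c; return 1; }
		else if ((c & 0xe0) == 0xc0) { st->acc = c & 0x1f; st->need = 1; }
		else if ((c & 0xf0) == 0xe0) { st->acc = c & 0x0f; st->need = 2; }
		else if ((c & 0xf8) == 0xf0) { st->acc = c & 0x07; st->need = 3; }
		else return (size_t)-1;
	}
	while (i < n) {
		unsigned char c = s[i];
		if ((c & 0xc0) != 0x80) { st->need = 0; return (size_t)-1; }
		i++;
		st->acc = st->acc << 6 | (c & 0x3f);
		if (--st->need == 0) { *out = st->acc; return i; }
	}
	return (size_t)-2;
}

/* Toy input stream: two fixed chunks play the role of two buffer fills. */
struct toy_stream {
	const unsigned char *chunk[2];
	size_t len[2];
	int next;
	const unsigned char *rpos, *rend;
};

/* Return the next byte, refilling from the next chunk when the buffer is empty. */
static int toy_getc(struct toy_stream *f)
{
	if (f->rpos == f->rend) {
		if (f->next == 2) return -1;
		f->rpos = f->chunk[f->next];
		f->rend = f->rpos + f->len[f->next];
		f->next++;
	}
	return *f->rpos++;
}

/* Same shape as the patched function: try the buffer first, then go byte by byte. */
static uint32_t toy_fgetwc(struct toy_stream *f, int advance_on_partial)
{
	struct toy_state st = { 0, 0 };
	uint32_t wc = 0;
	size_t l;
	if (f->rpos < f->rend) {
		l = toy_decode(&wc, f->rpos, (size_t)(f->rend - f->rpos), &st);
		if (l != (size_t)-1 && l != (size_t)-2) { f->rpos += l; return wc; }
		if (l == (size_t)-1) { f->rpos++; return TOY_WEOF; }
		/* The fix: these bytes now live in st, so skip past them. */
		if (advance_on_partial) f->rpos = f->rend;
	}
	for (;;) {
		int c = toy_getc(f);
		unsigned char b;
		if (c < 0) return TOY_WEOF;
		b = (unsigned char)c;
		l = toy_decode(&wc, &b, 1, &st);
		if (l == (size_t)-1) return TOY_WEOF;
		if (l != (size_t)-2) return wc;
	}
}

/* Read two characters from the two-chunk stream and return the second one. */
uint32_t second_char(int advance_on_partial)
{
	static const unsigned char c1[] = { 'x', 0xe0, 0xa0 }, c2[] = { 0x80 };
	struct toy_stream f = { { c1, c2 }, { sizeof c1, sizeof c2 }, 0, c1, c1 };
	toy_fgetwc(&f, advance_on_partial);
	return toy_fgetwc(&f, advance_on_partial);
}
```

In this model second_char(1) yields 0x800 (U+0800, the character encoded
by \340\240\200), while second_char(0) yields TOY_WEOF because the byte
\340 is fed to the decoder a second time and rejected as a continuation
byte. The real pre-patch behaviour may differ in detail; the model only
shows why rpos has to move once the bytes are held in the state.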