From: Rich Felker
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Subject: Re: Issues in mbsnrtowcs and wcsnrtombs
Date: Thu, 31 Aug 2017 14:28:15 -0400
Message-ID: <20170831182815.GA7054@brightrain.aerifal.cx>

On Tue, Jul 18, 2017 at 11:05:29PM +0300, Mikhail Kremnyov wrote:
> Hi,
>
> It looks like there are some bugs in the implementations of mbsnrtowcs
> and wcsnrtombs.
> E.g.
> inside mbsnrtowcs there is this code:
>
> 	while ( s && wn && ( (n2=n/4)>=wn || n2>32 ) ) {
> 		if (n2>=wn) n2=wn;
> 		n -= n2;
> 		l = mbsrtowcs(ws, &s, n2, st);
>
> Here "n" is the number of source bytes to convert and "n2" is the number
> of wide chars that may be put to the destination, so it's incorrect to
> subtract one from another. And indeed a simple test shows that the
> function doesn't work correctly if long enough non-ascii string is
> passed to it. E.g.:

Failure to understand what you mean here kept me from making sense of
the problem right away, but I think I understand now. While derived
from a number of bytes (n), n2 is a bound on the number of output wide
characters to ensure that no more than 4*n2 (<=n) bytes of input may
be read. It will only be equal to the number of bytes of input read in
the case where each wide character was converted from a single byte
(i.e. ASCII input), and thus adjusting the remaining n by subtracting
n2 is incorrect. Instead, as your follow-up patch does, the number of
input bytes processed must be measured by taking the difference of old
and new values of s.

As long as I don't see any problems I'll go ahead and apply now.
Thanks and sorry for the delay getting back to this.

Rich
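
For illustration, the accounting described above can be sketched roughly
as follows. This is not the patch under discussion and not musl's code:
toy_mbsrtowcs and convert_chunked are hypothetical names, the converter
is a minimal UTF-8-only stand-in for mbsrtowcs (so the example does not
depend on the locale), the loop condition is simplified, and the
byte-at-a-time handling of the last few input bytes is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for mbsrtowcs(): decodes at most wn code points of
 * well-formed UTF-8 and advances *src past the bytes it read. On reaching
 * the terminating NUL it stores L'\0' and sets *src to NULL.
 * It performs no validation; illustration only. */
static size_t toy_mbsrtowcs(wchar_t *ws, const char **src, size_t wn)
{
	const unsigned char *s = (const unsigned char *)*src;
	size_t cnt = 0;

	while (cnt < wn) {
		unsigned c = *s;
		if (!c) {
			ws[cnt] = L'\0';
			*src = NULL;
			return cnt;
		}
		/* Sequence length from the lead byte, then gather payload bits. */
		int len = c < 0x80 ? 1 : c < 0xe0 ? 2 : c < 0xf0 ? 3 : 4;
		unsigned cp = len == 1 ? c : c & (0x3fu >> (len - 1));
		for (int i = 1; i < len; i++)
			cp = cp << 6 | (s[i] & 0x3fu);
		ws[cnt++] = (wchar_t)cp;
		s += len;
	}
	*src = (const char *)s;
	return cnt;
}

/* Converts in chunks of at most n/4 wide characters, so that one chunk can
 * never read more than n bytes. Returns the number of wide characters
 * written and stores the unconsumed byte budget in *n_left. */
size_t convert_chunked(wchar_t *ws, size_t wn, const char *s, size_t n,
                       size_t *n_left)
{
	size_t cnt = 0;

	while (s && wn && n >= 4) {
		size_t n2 = n / 4;      /* a wide char needs at most 4 bytes */
		if (n2 > wn) n2 = wn;
		const char *before = s;
		size_t l = toy_mbsrtowcs(ws, &s, n2);
		cnt += l;
		if (!s) {               /* terminating NUL reached */
			n = 0;
			break;
		}
		/* Bytes actually consumed: the difference of new and old s.
		 * This equals l (or n2) only for pure ASCII input. */
		size_t used = (size_t)(s - before);
		assert(used <= n);
		n -= used;
		ws += l;
		wn -= l;
	}
	*n_left = n;
	return cnt;
}
```

With 16 bytes of two-byte characters, the first chunk has n2 = 4 and
consumes 8 bytes. Subtracting n2 would leave n at 12 although only 8
bytes remain, so later chunks could read past the end of the input;
subtracting the pointer difference leaves n at 8.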