List-Id: Zsh Workers List
X-Seq: 36978
Date: Tue, 27 Oct 2015 10:50:21 +0000
From: Peter Stephenson
To: zsh-workers@zsh.org
Subject: Re: Question about mb_metastrlen
Message-id: <20151027105021.20c5e67c@pwslap01u.europe.root.pri>
References: <20151027091026.5197b79f@pwslap01u.europe.root.pri>
Organization: Samsung Cambridge Solution Centre

On Tue, 27 Oct 2015 11:34:35 +0100
Sebastian Gniazdowski wrote:
> On 27 October 2015 at 10:10, Peter Stephenson wrote:
> > On Tue, 27 Oct 2015 09:31:02 +0100
> > The function you're talking about is for a string length, not a
> > character length. num_in_char counts the number of trailing bytes that
> > didn't form a wide character. Each will be treated as a single byte.
> > So each counts 1 for the length of the string.
>
> There is the condition:
> if (ret == MB_INVALID) {
>
> Isn't it that if there are many trailing bytes that do not form a
> character, they will be catched into MB_INVALID, and only last
> "character" can stay as not yet complete?

Only the last multibyte character can consist of multiple individual
bytes that look like part of an incomplete character rather than simply
as invalid, that's correct. Hence the note at the end of the function
about use of num_in_char, and hence we reset num_in_char to 0 any time
we get a full multibyte character or mark a byte as invalid rather than
incomplete.

pws
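
For anyone following the thread, here is a rough sketch of the counting rule described above. It is not the code from the zsh source: the real function feeds bytes to mbrtowc() and also deals with metafied bytes, which this ignores. The names sk_decode(), sk_strlen(), SK_INVALID and SK_INCOMPLETE are made up for illustration, and sk_decode() is a deliberately simplified UTF-8 classifier standing in for mbrtowc() so that the result does not depend on the locale.

```c
#include <assert.h>
#include <stddef.h>

#define SK_INVALID    (-1)  /* hypothetical stand-in for MB_INVALID */
#define SK_INCOMPLETE (-2)  /* hypothetical stand-in for an "incomplete" result */

/*
 * Simplified UTF-8 classifier (an assumption, not mbrtowc()):
 * looks at the first `avail` bytes starting at s and returns the
 * length of a complete character, SK_INVALID or SK_INCOMPLETE.
 */
static int sk_decode(const unsigned char *s, size_t avail)
{
    int need, i;

    if (s[0] < 0x80)
        return 1;                       /* plain ASCII */
    else if ((s[0] & 0xE0) == 0xC0)
        need = 2;
    else if ((s[0] & 0xF0) == 0xE0)
        need = 3;
    else if ((s[0] & 0xF8) == 0xF0)
        need = 4;
    else
        return SK_INVALID;              /* stray continuation or bad lead byte */

    for (i = 1; i < need; i++) {
        if ((size_t)i >= avail)
            return SK_INCOMPLETE;       /* ran out of bytes seen so far */
        if ((s[i] & 0xC0) != 0x80)
            return SK_INVALID;          /* not a continuation byte */
    }
    return need;
}

/* String length: complete characters and invalid bytes count 1 each,
 * and so does every byte of an incomplete character at the very end. */
static size_t sk_strlen(const char *str)
{
    const unsigned char *start, *p;
    size_t num = 0;          /* characters (or lone bytes) counted so far */
    size_t num_in_char = 0;  /* bytes of a character still being assembled */

    assert(str != NULL);
    start = p = (const unsigned char *)str;

    while (*p) {
        int ret = sk_decode(start, (size_t)(p - start) + 1);

        if (ret == SK_INVALID) {
            /* Count the first byte on its own and rescan from the next one. */
            num++;
            start++;
            p = start;
            num_in_char = 0;
        } else if (ret == SK_INCOMPLETE) {
            /* Might still turn into a character; remember the byte. */
            num_in_char++;
            p++;
        } else {
            /* Full multibyte character. */
            num++;
            p++;
            start = p;
            num_in_char = 0;
        }
    }
    /* Only the last character can be left incomplete here. */
    return num + num_in_char;
}
```

Under these assumptions sk_strlen("a\xE2\x82") gives 3: one for the "a", plus one each for the two trailing bytes that never completed a character. A stray byte in the middle, as in "\xE2x", is caught as invalid as soon as the "x" arrives, so it counts 1 and num_in_char goes back to 0.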