From: Erik Bernstein
To: zsh-users@zsh.org
Date: Thu, 10 Sep 2015 13:35:51 +0200
Subject: Match length and multibyte characters
List-Id: Zsh Users List
X-Seq: 20537

Hello everybody,

while playing around with zsh expansions, I've stumbled across a small annoyance that I think is worth asking the list about. Suppose I have an array and want to know the length of the longest string it contains.
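For reference, a plain loop over the elements gives a baseline to compare the one-line expansions against. This is only a sketch: the names `max` and `s` are made up for illustration, and it assumes the MULTIBYTE option (on by default) and a UTF-8 locale, so that `${#s}` counts characters rather than bytes.

```shell
# Baseline sketch: longest element length via an explicit loop.
# `max` and `s` are illustrative names, not part of the original question.
array=(a bbb cc)
max=0
for s in "${array[@]}"; do
  # ${#s} is the length of the current element
  if (( ${#s} > max )); then
    max=${#s}
  fi
done
echo "$max"
```

With the sample array this prints 3, which is the result the shorter expansions are expected to reproduce.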
After going through zshexpn(1), the first thing I came up with was:

  % array=(a bbb cc)
  % print ${${(O)array//(#m)*/${#MATCH}}[1]}
  3

which is perfectly fine and seems to do the job. Later I found that the same thing can be accomplished with:

  % print ${${(ON)array%%*}[1]}
  3

However, the second version seems to break on multibyte characters, while the first one works just fine:

  % array=(a ä a)
  % print ${${(O)array//(#m)*/${#MATCH}}[1]} ${${(ON)array%%*}[1]}
  1 2

Can someone shed some light on whether the second version is supposed to work with multibyte characters and, if so, what has to be done to make it count each multibyte character only once, just like the first version does?

regards
erik