From mboxrd@z Thu Jan 1 00:00:00 1970
List-Id: Zsh Workers List
X-Seq: 39908
Message-Id:
<1478789821.2424071.783592081.17FE98D4@webmail.messagingengine.com>
From: Sebastian Gniazdowski
To: zsh-workers@zsh.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"
X-Mailer: MessagingEngine.com Webmail Interface - ajax-d68eb56e
In-Reply-To: <20161110134722.06e6dc51@pwslap01u.europe.root.pri>
Subject: Re: multibyte optimisations
Date: Thu, 10 Nov 2016 06:57:01 -0800
References: <1478774232.2371010.783342705.69C81F52@webmail.messagingengine.com> <20161110134722.06e6dc51@pwslap01u.europe.root.pri>

On Thu, Nov 10, 2016, at 05:47 AM, Peter Stephenson wrote:
> On Thu, 10 Nov 2016 02:37:12 -0800
> Sebastian Gniazdowski wrote:
> > Other pointed functions seem to be very valid / expected – multibyte
> > functions. They can be optimized if a courageous decision will be made –
> > to do what charnext / pattern.c does:
> >
> >     if (!(patglobflags & GF_MULTIBYTE) || !(STOUC(*x) & 0x80))
> >         return x + 1;
> >
> > I.e. to optimize for ASCII as subset of UTF-8 also when calling
> > MB_METACHARLEN, not only for MB_METASTRLEN (recent change).
>
> These look straightforward and along the same lines as what we already
> do.

I was worried that the multibyte state might be unclear at the point
where the length of a character is requested, but that cannot really
happen. If it could, the loop that advances character by character would
already have a problem, because it would be left in an unclear state
after the previous advance.

With this patch the parser runs in 1493 ms instead of 2148 ms :)

-- 
Sebastian Gniazdowski
psprint@fastmail.com
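
For readers without the zsh sources at hand, the idea discussed above can be sketched in standalone C. This is only an illustration, not the patch itself and not zsh's MB_METACHARLEN: the names char_len_fast and count_chars are hypothetical, and the slow path uses the standard mbrlen() rather than zsh's own metafied-string helpers. The fast path assumes, as in the message, that the conversion state is initial whenever a new character starts, which always holds for UTF-8.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not zsh code): byte length of the character that
 * starts at s, with at most avail bytes readable.
 */
static size_t char_len_fast(const char *s, size_t avail, mbstate_t *st)
{
    size_t n;

    if (avail == 0)
        return 0;

    /* Fast path: a byte without the high bit set is a complete
     * one-byte character (ASCII as a subset of UTF-8), so the
     * multibyte decoder is skipped entirely. */
    if (!((unsigned char)*s & 0x80))
        return 1;

    /* Slow path: let the C library work out the length. */
    n = mbrlen(s, avail, st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
        /* Invalid or incomplete sequence: reset the state and
         * treat the byte as a single character. */
        memset(st, 0, sizeof *st);
        return 1;
    }
    return n;
}

/* Hypothetical caller: count characters by advancing one at a time,
 * the kind of loop the message refers to. */
static size_t count_chars(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s), count = 0;

    memset(&st, 0, sizeof st);
    while (left > 0) {
        size_t n = char_len_fast(s, left, &st);
        s += n;
        left -= n;
        count++;
    }
    return count;
}
```

For pure ASCII input the loop never calls mbrlen() at all, which is where a saving of the kind reported above would come from. Results for non-ASCII input depend on the locale selected with setlocale().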