From: Marcus Vinicius Mesquita via ntg-context
To: Hans Hagen
Newsgroups: gmane.comp.tex.context
Subject: Re: lpeg pattern in function
Date: Sat, 7 Aug 2021 06:40:46 -0300

Thank you, Hans. Your solutions are very nice indeed.

Marcus Vinicius

On Sat, Aug 7, 2021 at 5:37 AM Hans Hagen <j.hagen@xs4all.nl> wrote:
On 8/6/2021 10:58 PM, Marcus Vinicius Mesquita via ntg-context wrote:
> Dear list,
> in the mwe below, the expected result is ok for most entries but fails
> when the word contains the letters ó or ô.
> We get zoolco instead of zoológico, and termtro instead of termômetro.
> What am I doing wrong?
>
> mwe:
>
> \def\stripnumber#1%
>     {\cldcontext{lpeg.match(lpeg.stripper("[¹²³⁴⁵⁶⁷⁸⁹⁰]"),
> [==[#1]==])}}
>
> \starttext
>
> \stripnumber{árbitro⁶}
> \stripnumber{ébano¹}
> \stripnumber{ícone⁸}
> \stripnumber{zoológico⁰}
> \stripnumber{eletroacústico⁹}
> \stripnumber{trânsfuga⁷}
> \stripnumber{farmacêutico¹}
> \stripnumber{maître²}
> \stripnumber{termômetro³}
> \stripnumber{noûs⁴}
>
> \stoptext


\def\stripnumber#1%
    {\cldcontext{lpeg.match(lpeg.stripper(lpeg.US("¹²³⁴⁵⁶⁷⁸⁹⁰")),
[==[#1]==])}}

(US -> utf set)
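Why only ó and ô break can be checked in plain Lua. The sketch below assumes that the string form of lpeg.stripper builds a byte-wise set (as lpeg.S does), so the superscripts are seen as loose bytes rather than characters. The helper name shared_bytes is made up for this illustration, and the ASCII brackets of the original pattern are left out because they do not matter here.

```lua
-- Superscript digits used in the character class of the original macro.
local superscripts = "¹²³⁴⁵⁶⁷⁸⁹⁰"

-- Collect every byte that occurs in the UTF-8 encoding of the superscripts.
-- A byte-oriented set matches each of these bytes on its own.
local byteset = {}
for i = 1, #superscripts do
    byteset[superscripts:byte(i)] = true
end

-- Hypothetical helper: list (as hex) the bytes of a character that are
-- also members of the byte set.
local function shared_bytes(char)
    local hits = {}
    for i = 1, #char do
        local b = char:byte(i)
        if byteset[b] then
            hits[#hits + 1] = string.format("%02X", b)
        end
    end
    return hits
end

-- The accented letters that occur in the test words of the mwe.
for _, char in ipairs { "á", "é", "í", "ó", "ú", "â", "ê", "î", "ô", "û" } do
    local hits = shared_bytes(char)
    print(char, #hits > 0 and table.concat(hits, " ") or "no overlap")
end
```

Only ó (C3 B3) and ô (C3 B4) report an overlap: B3 is also the last byte of ³ (C2 B3) and B4 the last byte of ⁴ (E2 81 B4). A byte-wise stripper would therefore cut those letters in half and leave a lone C3 byte behind, which is presumably why the typeset words come out damaged.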

or

\startluacode
    local s = lpeg.stripper(lpeg.US("¹²³⁴⁵⁶⁷⁸⁹⁰"))
    function document.StripNumber(str)
        context(lpeg.match(s, str))
    end
\stopluacode

\def\stripnumber#1{\ctxlua{document.StripNumber([==[#1]==])}}

or you can go fancy:

\startluacode
    local p_strip = lpeg.stripper(lpeg.US("¹²³⁴⁵⁶⁷⁸⁹⁰"))

    interfaces.implement {
        name      = "StripNumber",
        public    = true,
        arguments = "string",
        actions   = function(str)
            context(lpeg.match(p_strip, str))
        end
    }
\stopluacode

\StripNumber{zoológico⁰}

There are more efficient variants, but I guess it's good enough.
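For comparison, the same character-wise stripping can be sketched in plain Lua without lpeg, which may be handy for testing outside ConTeXt (lpeg.US appears to be a ConTeXt helper rather than part of stock LPeg). This is not how lpeg.US works internally; it is a swapped-in technique that assumes Lua 5.3 or later for the utf8 library, and the names utf_set and strip_superscripts are made up for the example.

```lua
-- Hypothetical helper: build a lookup table with one entry per UTF-8
-- character (not per byte) of the given string.
local function utf_set(chars)
    local set = {}
    for ch in chars:gmatch(utf8.charpattern) do
        set[ch] = true
    end
    return set
end

local superscripts = utf_set("¹²³⁴⁵⁶⁷⁸⁹⁰")

-- Hypothetical helper: remove every character that is a member of the set.
local function strip_superscripts(str)
    -- gsub calls the function once per UTF-8 character; returning nil keeps
    -- the character, returning "" removes it. The outer parentheses drop
    -- gsub's second return value (the match count).
    return (str:gsub(utf8.charpattern, function(ch)
        if superscripts[ch] then
            return ""
        end
    end))
end

print(strip_superscripts("zoológico⁰"))   --> zoológico
print(strip_superscripts("termômetro³"))  --> termômetro
```

Because the set is keyed by whole characters, the B3 and B4 bytes inside ó and ô are never looked at in isolation.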

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------


--
All things tire the body, except music, which tires neither the body nor its limbs, for it is rest for the soul, springtime of the heart, distraction for the afflicted, entertainment for the lonely, and provision for the traveler.

Kunnâsh al-Hâ'ik (Songbook of al-Hâ'ik)