From: Jeff
Newsgroups: gmane.text.pandoc
Subject: Re: simple question about using manyTill
Date: Wed, 24 May 2017 19:18:22 -0700 (PDT)
Message-ID: <9d30a290-c9e1-477e-baf1-112ed0ca7fc4@googlegroups.com>
References: <20170524202650.GA20212@Johns-MBP.home>
To: pandoc-discuss

Yes, great explanation - solved my problem. Thanks!

Jeff

On Wednesday, 24 May 2017 16:27:06 UTC-4, John MacFarlane wrote:
+++ Jeff [May 24 17 08:35 ]:
>   I am new to Haskell and I am trying to write a new reader for Pandoc.
>   There is something strange about manyTill and I'm a bit stuck.
>   I have written a minimal example to present the problem. Suppose I want to
>   write a parser that succeeds if the string ends in three equal signs
>   "===", and returns the string preceding the "===". I have written three
>   versions. One with Parsec, and two with Pandoc:
>
>   import Text.Parsec.String (Parser)
>   import Text.Parsec (parse, ParseError)
>   import Text.Parsec.Combinator (many1, manyTill)
>   import Text.Parsec.Char (anyChar, string, noneOf)
>   import qualified Text.Pandoc.Builder as B (str)
>   import Text.Pandoc.Builder (Inlines)
>   import Text.Parsec.Prim (try)
>
>   simpleParse :: Parser a -> String -> Either ParseError a
>   simpleParse p = parse p ""
>
>   header' :: Parser String
>   header' = manyTill anyChar (string "===")
>
>   header'' :: Parser [Inlines]
>   header'' = manyTill (B.str <$> (many1 anyChar)) (string "===")
>
>   header''' :: Parser [Inlines]
>   header''' = manyTill (B.str <$> (many1 (noneOf "="))) (string "===")
>
>   I thought header' and header'' are the same parser except the type is a
>   bit different.

There's an important difference. header' will parse one
character at a time (anyChar), each time stopping to see
if the end condition (string "===") is met.
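
The check-before-each-item behaviour described above can be sketched as follows. This is only an illustration, assuming nothing beyond the parsec package: manyTill' and headerChars are hypothetical names introduced here, and manyTill' is written for explanation rather than copied from the library.

```haskell
import Text.Parsec (parse, (<|>))
import Text.Parsec.Char (anyChar, string)
import Text.Parsec.String (Parser)

-- Hypothetical re-implementation of manyTill, for illustration only:
-- try the end parser first; if it fails without consuming input,
-- run the item parser once and repeat.
manyTill' :: Parser a -> Parser end -> Parser [a]
manyTill' item end = scan
  where
    scan = ([] <$ end) <|> ((:) <$> item <*> scan)

-- Same shape as header' from the question, built on the sketch above.
-- The item parser is a single anyChar, so the end condition is
-- re-checked before every character.
headerChars :: Parser String
headerChars = manyTill' anyChar (string "===")
```

In GHCi, parse headerChars "" "title===" gives Right "title": the end parser fails on each letter of "title", one character is consumed each time, and the loop stops as soon as "===" matches.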

header'' will parse CHUNKS of one or more characters at
a time (many1 anyChar), only checking after each chunk
to see if the end condition is met. Does that explain it?
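
A small sketch of the chunked case, under stated assumptions: the chunks are plain Strings rather than Inlines so that only the parsec package is needed, and chunkedAny and chunkedNoEq are hypothetical stand-ins for header'' and header''' from the question.

```haskell
import Text.Parsec (parse)
import Text.Parsec.Char (anyChar, noneOf, string)
import Text.Parsec.Combinator (many1, manyTill)
import Text.Parsec.String (Parser)

-- Stand-in for header'': each chunk is many1 anyChar, which keeps
-- consuming until the end of input, including any "===".
chunkedAny :: Parser [String]
chunkedAny = manyTill (many1 anyChar) (string "===")

-- Stand-in for header''': each chunk stops at the first '=',
-- so the end condition gets a chance to match right after it.
chunkedNoEq :: Parser [String]
chunkedNoEq = manyTill (many1 (noneOf "=")) (string "===")
```

With the input "title===", parse chunkedAny "" "title===" is a Left: the first chunk swallows the whole input, so nothing is left for string "===" to match. parse chunkedNoEq "" "title===" gives Right ["title"]. Note that this second variant would still reject a title containing a lone "=", since neither the chunk parser nor the end parser can get past it.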
