public inbox archive for pandoc-discuss@googlegroups.com
* simple question about using manyTill
From: Jeff @ 2017-05-24 15:35 UTC
  To: pandoc-discuss



I am new to Haskell and I am trying to write a new reader for Pandoc.

There is something strange about manyTill and I'm a bit stuck.

I have written a minimal example to show the problem. Suppose I want to 
write a parser that succeeds if the string ends in three equal signs "===" 
and returns the string preceding the "===". I have written three versions: 
one that returns a plain String, and two that build Pandoc Inlines:

import Text.Parsec.String (Parser)
import Text.Parsec (parse, ParseError)
import Text.Parsec.Combinator (many1, manyTill)
import Text.Parsec.Char (anyChar, string, noneOf)
import qualified Text.Pandoc.Builder as B (str)
import Text.Pandoc.Builder (Inlines)
import Text.Parsec.Prim (try)

simpleParse :: Parser a -> String -> Either ParseError a
simpleParse p = parse p ""

header' :: Parser String
header' = manyTill anyChar (string "===")

header'' :: Parser [Inlines]
header'' = manyTill (B.str <$> (many1 anyChar)) (string "===")

header''' :: Parser [Inlines]
header''' = manyTill (B.str <$> (many1 (noneOf "="))) (string "===")

I thought header' and header'' were the same parser apart from the result 
type, but I was wrong:

*Main> simpleParse header' "a==="
Right "a"
*Main> simpleParse header'' "a==="
Left (line 1, column 5):
unexpected end of input
expecting "==="
*Main> simpleParse header''' "a==="
Right [Many {unMany = fromList [Str "a"]}]

header''' works, but it seems like a hacky solution to me, and I also don't 
understand *why I have to exclude '=' in the Pandoc implementation*.
Another problem with header''' is that I can't parse, say, "=a===" with it.

Of course one may try to use 'try', as follows:

headerWithTry' :: Parser String
headerWithTry' = manyTill anyChar (try (string "==="))

headerWithTry'' :: Parser [Inlines]
headerWithTry'' = manyTill (B.str <$> (many1 anyChar)) (try (string "==="))

headerWithTry''' :: Parser [Inlines]
headerWithTry''' = manyTill (B.str <$> (many1 (noneOf "="))) (try (string "==="))

But it does not work:
*Main> simpleParse headerWithTry' "=a==="
Right "=a"
*Main> simpleParse headerWithTry'' "=a==="
Left (line 1, column 6):
unexpected end of input
expecting "==="
*Main> simpleParse headerWithTry''' "=a==="
Left (line 1, column 1):
unexpected "a"
expecting "==="

*Why?*

Thanks,
Jeff

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/d879451a-e34c-495b-b72d-414e752d1e51%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: simple question about using manyTill
From: John MacFarlane @ 2017-05-24 20:26 UTC
  To: pandoc-discuss

+++ Jeff [May 24 17 08:35 ]:
>   I am new to Haskell and I am trying to write a new reader for Pandoc.
>   There is something strange about manyTill and I'm a bit stuck.
>   I have written a minimal code to present the problem. Suppose I want to
>   write a parser that succeeds if the string ends in three equal signs
>   "===", and return the string preceeding the "===". I have written three
>   versions. One with Parsec, and two with Pandoc:
>   import Text.Parsec.String (Parser)
>   import Text.Parsec (parse, ParseError)
>   import Text.Parsec.Combinator (many1, manyTill)
>   import Text.Parsec.Char (anyChar, string, noneOf)
>   import qualified Text.Pandoc.Builder as B (str)
>   import Text.Pandoc.Builder (Inlines)
>   import Text.Parsec.Prim (try)
>   simpleParse :: Parser a -> String -> Either ParseError a
>   simpleParse p = parse p ""
>   header' :: Parser String
>   header' = manyTill anyChar (string "===")
>   header'' :: Parser [Inlines]
>   header'' = manyTill (B.str <$> (many1 anyChar)) (string "===")
>   header''' :: Parser [Inlines]
>   header''' = manyTill (B.str <$> (many1 (noneOf "="))) (string "===")
>   I thought header' and header'' are the same parser except the type is a
>   bit different.

There's an important difference.  header' will parse one
character at a time (anyChar), each time stopping to see
if the end condition (string "===") is met.
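
As a minimal sketch of that behaviour (a hand-written equivalent for
illustration, not the library's source; the names manyTill' and
headerPerChar are made up here), manyTill can be read as a loop that
tries the end parser first and runs the item parser only when the end
parser fails:

```haskell
import Text.Parsec ((<|>), parse, ParseError)
import Text.Parsec.String (Parser)
import Text.Parsec.Char (anyChar, string)

-- Hypothetical re-implementation of manyTill, for illustration only.
-- At every step: try `end`; if it fails without consuming input,
-- run `p` once and loop.
manyTill' :: Parser a -> Parser e -> Parser [a]
manyTill' p end = scan
  where
    scan = (end >> return []) <|> do
      x  <- p
      xs <- scan
      return (x : xs)

-- Same shape as header': one character per step, so "===" is
-- checked before every single character.
headerPerChar :: Parser String
headerPerChar = manyTill' anyChar (string "===")

runHeaderPerChar :: String -> Either ParseError String
runHeaderPerChar = parse headerPerChar ""
```

With this definition, runHeaderPerChar "a===" gives Right "a", the same
result as header' in the original message.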

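header'' will parse CHUNKS of one or more characters at
a time (many1 anyChar), only checking after each chunk
to see if the end condition is met.  Because many1 anyChar
is greedy, the first chunk runs to the end of the input,
"===" included, so the end condition can never match.
Does that explain it?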
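
The difference can be tried out with a small sketch that uses only
Parsec and plain Strings. The names perChar, perChunk, chunked and run
are invented for this example. The chunked parser is one possible way to
keep chunk-sized results while still accepting a stray '=', and is an
assumption of this sketch rather than something taken from Pandoc's
readers:

```haskell
import Text.Parsec ((<|>), parse, try, ParseError)
import Text.Parsec.String (Parser)
import Text.Parsec.Combinator (many1, manyTill)
import Text.Parsec.Char (anyChar, noneOf, string)

-- One character per step: the end condition is tested before each one.
-- `try` lets a partial match such as "=a" backtrack instead of failing.
perChar :: Parser String
perChar = manyTill anyChar (try (string "==="))

-- One greedy chunk per step: many1 anyChar consumes everything up to
-- end of input, so there is no "===" left for the end condition.
perChunk :: Parser [String]
perChunk = manyTill (many1 anyChar) (try (string "==="))

-- Chunks that stop at '=', plus a single '=' as its own chunk when the
-- end condition has just failed at that position.
chunked :: Parser [String]
chunked = manyTill (many1 (noneOf "=") <|> string "=") (try (string "==="))

run :: Parser a -> String -> Either ParseError a
run p = parse p ""
```

On the input "=a===", run perChar returns Right "=a", run perChunk
fails at end of input, and run chunked returns Right ["=","a"]. To get
Inlines, the collected result can be passed to B.str afterwards (the
argument type B.str expects depends on the pandoc-types version).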



* Re: simple question about using manyTill
From: Jeff @ 2017-05-25  2:18 UTC
  To: pandoc-discuss




Yes, great explanation. It solved my problem. Thanks!

Jeff

