caml-list - the Caml user's mailing list
From: Spiros Eliopoulos <seliopou@gmail.com>
To: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
Cc: OCaml <caml-list@inria.fr>
Subject: Re: [Caml-list] ANN: angstrom
Date: Fri, 22 Jul 2016 17:46:33 -0400
Message-ID: <CAEkQQgJrOFD=oe-ZaCgiTA69ZAeZJDa6PJmj-H2FHL5KtGUXGA@mail.gmail.com>
In-Reply-To: <F989DEB4A94D49D7962B03033C4A3038@erratique.ch>

On Fri, Jul 22, 2016 at 5:38 PM, Daniel Bünzli <daniel.buenzli@erratique.ch>
wrote:

> > For a high-level comparison of Angstrom's features to other
> > parser-combinator libraries, see the table included in the README:
> >
> > https://github.com/inhabitedtype/angstrom#comparison-to-other-libraries
> Do you have a story for precise line-column and byte count tracking ? It's
> quite important in practice to be able to give good error reports.
>

Angstrom targets network protocols and serialization formats, a use case
where line/column numbers are of dubious value and don't really make sense
for binary formats. So it will likely never support line/column tracking.
It will, however, at some point report character position on failure. I
have a local branch that implements it, though it's untested.


> Also does the API easily allow to do best-effort decoding (after reporting
> an error allow to resync input by discarding and restart the parsing
> process) ?


From what I understand, this would require users to modify input rather
than putting any correction logic into the parser itself. Angstrom does not
support this functionality, and likely won't. In principle the only change
necessary would be to simply surface the success continuation on failure.
Everything else is accessible to the user (except for failure position, see
above). Why it's valuable to do this outside of the parser is unclear to me
though.
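
To make the question concrete, here is a minimal sketch of what caller-side
resynchronization could look like. It uses a toy, stdlib-only parser type,
not Angstrom's API; the newline-terminated record format, the `record`
parser, and `decode_all` are all hypothetical names chosen for illustration.
The toy parser reports the failure offset, which is the piece described
above as not yet available.

```ocaml
(* Toy parser, NOT Angstrom's API: Ok (value, next_offset) or Error offset. *)
type 'a parser = string -> int -> ('a * int, int) result

let is_digit c = c >= '0' && c <= '9'

(* Hypothetical record format: one decimal integer terminated by '\n'. *)
let record : int parser =
  fun s i ->
    let n = String.length s in
    let rec stop j = if j < n && is_digit s.[j] then stop (j + 1) else j in
    let j = stop i in
    if j = i then Error i                     (* no digits at all *)
    else if j < n && s.[j] = '\n' then
      Ok (int_of_string (String.sub s i (j - i)), j + 1)
    else Error j                              (* junk or EOF before newline *)

(* Caller-side recovery: on failure, note the offset, discard input through
   the next '\n' (the resync point), and restart the parser from there. *)
let decode_all (s : string) : int list * int list =
  let n = String.length s in
  let rec go i values errors =
    if i >= n then (List.rev values, List.rev errors)
    else match record s i with
      | Ok (v, j) -> go j (v :: values) errors
      | Error pos ->
        let next =
          match String.index_from_opt s pos '\n' with
          | Some k -> k + 1
          | None -> n
        in
        go next values (pos :: errors)
  in
  go 0 [] []

(* Two good records, one malformed record, and one empty record. *)
let values, errors = decode_all "10\n2x\n\n30\n"
```

Here all recovery logic lives in the driver loop, and the parser itself
stays unaware that it is being restarted.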

> > Yojson wins hands down (it benefits greatly from not having to support
> > non-blocking incremental input),
>
> I guess it also benefits of not implementing the standard at all, e.g. it
> won't check the validity of the underlying character stream.
>
> Also regarding benchmarks it would be more interesting to have benchmarks
> on real world examples where you convert json input to concrete ocaml data
> types. E.g. it would be cool if you could provide a jsonm-like streaming
> interface with angstrom and then use angstrom's combinators to parse the
> stream of json lexeme into OCaml data structures.


Doing that would seem to muddle application and library performance
measurements within the benchmark. Arguably, constructing a generic
in-memory representation is doing the same in essence. At least this way
it's an "application" that's easy for benchmarks to standardize on (more or
less) and implement, so that one can use the benchmark results to compare
different libraries.

But anyways, there's nothing in principle preventing it from happening.
The parser would look something like this:

  skip_many (token >>| (handler : json_token -> unit))
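
For readers who want to run that shape without the library, below is a
self-contained sketch under loud assumptions: the parser type, `>>|`,
`skip_many`, and `token` are toy stdlib-only stand-ins written for this
example, not Angstrom's implementation, and `json_token` covers only a
hypothetical subset of JSON lexemes (brackets, commas, unsigned integers).

```ocaml
(* A toy, stdlib-only parser type: NOT Angstrom's implementation or API.
   A parser takes the input and an offset and returns a value plus the
   new offset, or None on failure. *)
type 'a parser = string -> int -> ('a * int) option

(* Map a function over a parser's result, like the >>| in the message. *)
let ( >>| ) (p : 'a parser) (f : 'a -> 'b) : 'b parser =
  fun s i -> match p s i with
    | Some (x, j) -> Some (f x, j)
    | None -> None

(* Run p repeatedly, discarding results, until it fails or stops consuming. *)
let skip_many (p : unit parser) : unit parser =
  fun s i ->
    let rec go i =
      match p s i with
      | Some ((), j) when j > i -> go j
      | _ -> Some ((), i)
    in
    go i

(* Hypothetical token type: a tiny subset of JSON lexemes. *)
type json_token = LBracket | RBracket | Comma | Number of int

let is_digit c = c >= '0' && c <= '9'
let is_space c = c = ' ' || c = '\n' || c = '\t'

(* Lex one token, skipping leading whitespace. *)
let token : json_token parser =
  fun s i ->
    let n = String.length s in
    let rec skip i = if i < n && is_space s.[i] then skip (i + 1) else i in
    let i = skip i in
    if i >= n then None
    else match s.[i] with
      | '[' -> Some (LBracket, i + 1)
      | ']' -> Some (RBracket, i + 1)
      | ',' -> Some (Comma, i + 1)
      | c when is_digit c ->
        let rec stop j = if j < n && is_digit s.[j] then stop (j + 1) else j in
        let j = stop i in
        Some (Number (int_of_string (String.sub s i (j - i))), j)
      | _ -> None

(* The handler receives tokens one at a time and records them by side effect. *)
let seen : json_token list ref = ref []
let handler (t : json_token) : unit = seen := t :: !seen

(* Same shape as the one-liner in the message. *)
let stream : unit parser = skip_many (token >>| handler)

let result = stream "[1, 22,3]" 0
let tokens = List.rev !seen
```

The handler is where an application would build its own data structures, so
no generic in-memory JSON value is ever constructed.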

-Spiros E.
