On Fri, Jul 22, 2016 at 5:38 PM, Daniel Bünzli wrote:

> > For a high-level comparison of Angstrom's features to other
> > parser-combinator libraries, see the table included in the README:
> >
> > https://github.com/inhabitedtype/angstrom#comparison-to-other-libraries
>
> Do you have a story for precise line-column and byte-count tracking? It's
> quite important in practice to be able to give good error reports.

Angstrom targets network protocols and serialization formats, a use case
where line/column numbers are of dubious value and don't really make sense
for binary formats. So it will likely never support line/column tracking.
It will, however, at some point report the character position on failure.
I have a local branch that implements it, though it's untested.

> Also, does the API easily allow best-effort decoding (after reporting an
> error, allow resyncing the input by discarding and restarting the parsing
> process)?

From what I understand, this would require users to modify the input rather
than putting any correction logic into the parser itself. Angstrom does not
support this functionality, and likely won't. In principle, the only change
necessary would be to surface the success continuation on failure;
everything else is already accessible to the user (except for the failure
position, see above). Why it's valuable to do this outside of the parser is
unclear to me, though.

> Yojson wins hands down (it benefits greatly from not having to support
> non-blocking incremental input),
>
> I guess it also benefits from not implementing the standard at all, e.g.
> it won't check the validity of the underlying character stream.
>
> Also, regarding benchmarks, it would be more interesting to have
> benchmarks on real-world examples where you convert JSON input to
> concrete OCaml data types. E.g. it would be cool if you could provide a
> jsonm-like streaming interface with Angstrom and then use Angstrom's
> combinators to parse the stream of JSON lexemes into OCaml data
> structures.

Doing that would seem to muddle application and library performance
measurements within the benchmark. Arguably, constructing a generic
in-memory representation does the same thing in essence, but at least that
way the "application" is easy for benchmarks to standardize on (more or
less) and implement, so that the results can be used to compare different
libraries. But anyway, there's nothing in principle preventing it from
happening. The parser would look something like this:

    skip_many (token >>| (handler : json_token -> unit))

-Spiros E.
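
For concreteness, below is a rough sketch of what such a jsonm-like lexeme
stream could look like when built from Angstrom's public combinators. The
json_token type, the ws helper, lex_stream, and the simplified string/number
lexers are hypothetical illustrations, not part of the library, and a real
JSON lexer would also need escape sequences and the full number grammar;
only the combinators themselves (char, string, take_while, take_while1,
skip_while, choice, skip_many, >>|, *>, <*) are actual Angstrom functions.

    (* Hypothetical sketch only: a simplified, jsonm-like stream of JSON
       lexemes built from Angstrom's combinators.  String escapes, the full
       number grammar, and error handling are deliberately omitted. *)
    open Angstrom

    type json_token =
      | Obj_start | Obj_end
      | Arr_start | Arr_end
      | Colon | Comma
      | Bool of bool
      | Null
      | String of string
      | Number of string

    (* Skip insignificant whitespace between lexemes. *)
    let ws =
      skip_while (function ' ' | '\t' | '\n' | '\r' -> true | _ -> false)

    (* One JSON lexeme, with leading whitespace skipped. *)
    let token : json_token t =
      ws *> choice
        [ char '{' *> return Obj_start
        ; char '}' *> return Obj_end
        ; char '[' *> return Arr_start
        ; char ']' *> return Arr_end
        ; char ':' *> return Colon
        ; char ',' *> return Comma
        ; string "true"  *> return (Bool true)
        ; string "false" *> return (Bool false)
        ; string "null"  *> return Null
        ; (char '"' *> take_while (fun c -> c <> '"') <* char '"'
           >>| fun s -> String s)
        ; (take_while1 (function
             | '0' .. '9' | '-' | '+' | '.' | 'e' | 'E' -> true
             | _ -> false)
           >>| fun n -> Number n)
        ]

    (* Feed every lexeme to a user-supplied handler, as in the one-liner
       in the message above. *)
    let lex_stream (handler : json_token -> unit) : unit t =
      skip_many (token >>| handler) <* ws

Running it is then just a matter of handing lex_stream handler to whatever
top-level entry point the installed Angstrom version provides (parse_string
in current releases, which also takes a ~consume argument).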