caml-list - the Caml user's mailing list
From: Spiros Eliopoulos <seliopou@gmail.com>
To: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
Cc: OCaml <caml-list@inria.fr>
Subject: Re: [Caml-list] ANN: angstrom
Date: Mon, 25 Jul 2016 10:15:59 -0400
Message-ID: <CAEkQQgJt7XMczL8-umxXowJq4UcvPv9VVgo7ePR=_Pf-hJQUxw@mail.gmail.com>
In-Reply-To: <8B3345BC17954C9F8DC59E1C0AFBE09D@erratique.ch>

> On Friday, 22 July 2016 at 23:46, Spiros Eliopoulos wrote:
> > Angstrom's targeting the use case of network protocols and serialization
> > formats, a use case where line/column numbers are of dubious value,
>
> Well when you are dealing with large malformed json streams it's nice to
> know where they error… But if you target binary data only a byte count
> should suffice.
>

Most text editors give users the ability to seek to a byte position. That
should be sufficient to debug parse errors in both binary data and data
intended for human consumption.
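
If line and column numbers are wanted after all, a client can derive them
from the byte offset outside the parser. A minimal sketch, assuming the input
is available as a string (the helper name is made up for illustration, and
columns are counted in bytes, not characters):

```ocaml
(* Hypothetical helper: map a byte offset to a 1-based (line, column) pair.
   Columns are counted in bytes; offsets past the end are clamped. *)
let line_col_of_offset s offset =
  let stop = min offset (String.length s) in
  let rec go i line col =
    if i >= stop then (line, col)
    else if s.[i] = '\n' then go (i + 1) (line + 1) 1
    else go (i + 1) line (col + 1)
  in
  go 0 1 1
```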


> > From what I understand, this would require users to modify input rather
> > than putting any correction logic into the parser itself.
>
> No, the parser itself is in charge of performing this. A very simple
> example of this is when you get a UTF-8 decode error. You want to be able
> to report the error and let the client restart the decode which is easy to
> do by finding a synchronization byte in the input. But again this may not
> be useful for binary protocols, it is however useful for decoding text and
> parsing languages.


Seems like this would be just as easily accomplished by writing a parser
that does the appropriate error recovery while accumulating descriptions of
the error. The value of handing restart control to the client seems dubious
to me, though. If the client will always abandon a parse if there's a
correctable parse error, then the parser should just fail with the first
error (no accumulation). If the client will always accept any parse
correction from the parser, then it should accumulate all errors and return
them along with the parse result. If the client selectively picks which
parse errors it will accept corrections for, then the parser can be
parameterized on those choices. Inverting control in the failure case
doesn't really seem to offer any benefits, especially if the client is just
going to invoke the continuation (or not) without modification.
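
To make the three cases concrete, here is a minimal sketch of a decoder
parameterized on the client's choice. This is not Angstrom's API: the
`policy`, `error`, and `decode` names are invented for illustration, and the
UTF-8 check is simplified (it looks only at lead and continuation bytes and
ignores overlong encodings and surrogates). It counts well-formed sequences
and records the byte offset of each malformed one:

```ocaml
(* Hypothetical error description: byte offset plus a message. *)
type error = { offset : int; message : string }

(* Hypothetical policy: how the parser reacts to a correctable error. *)
type policy =
  | Fail_fast                      (* abandon on the first error *)
  | Recover_all                    (* always correct and accumulate *)
  | Recover_if of (error -> bool)  (* correct only errors the client accepts *)

(* Length implied by a lead byte, or 0 if the byte cannot start a sequence. *)
let expected_length c =
  let b = Char.code c in
  if b land 0x80 = 0 then 1
  else if b land 0xE0 = 0xC0 then 2
  else if b land 0xF0 = 0xE0 then 3
  else if b land 0xF8 = 0xF0 then 4
  else 0

let is_continuation c = Char.code c land 0xC0 = 0x80

(* Returns the number of well-formed sequences together with the errors that
   were recovered from, or the first error the policy refused to recover. *)
let decode policy s =
  let n = String.length s in
  (* Resynchronize: skip continuation bytes until a possible lead byte. *)
  let rec resync i =
    if i < n && is_continuation s.[i] then resync (i + 1) else i
  in
  let rec go i count errors =
    if i >= n then Ok (count, List.rev errors)
    else
      let len = expected_length s.[i] in
      let rec check k =
        k >= len || (is_continuation s.[i + k] && check (k + 1))
      in
      if len > 0 && i + len <= n && check 1 then go (i + len) (count + 1) errors
      else
        let e = { offset = i; message = "malformed UTF-8 sequence" } in
        let recover =
          match policy with
          | Fail_fast -> false
          | Recover_all -> true
          | Recover_if accept -> accept e
        in
        if recover then go (resync (i + 1)) count (e :: errors) else Error e
  in
  go 0 0 []
```

With `Fail_fast` the client gets only the first error, with `Recover_all` it
gets the result plus every error, and `Recover_if` covers the selective case,
all without the parser handing a continuation back to the client.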


> > Doing that would seem to muddle application and library performance
> > measurements within the benchmark. Arguably, constructing a generic
> > in-memory representation is doing the same in essence.
>
> Not really, it can completely change the outcome of your benchmarks. For
> example jsonm allows you to completely eschew going through a generic
> in-memory representation before being able to extract the data.


The point of a benchmark is to offer some comparison between different
systems by subjecting them to test loads that are as nearly identical as
possible. The more numerous and varied the test loads in a benchmark, the
better. But the absence of a multitude does not invalidate the one,
especially when the one is representative of how most web applications deal
with JSON data.

-Spiros E.
