From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 546D77EF5E for ; Mon, 25 Jul 2016 16:16:20 +0200 (CEST) IronPort-PHdr: 9a23:iB1U5Rb/RwfeQtwbuCO+shT/LSx+4OfEezUN459isYplN5qZpcmzbnLW6fgltlLVR4KTs6sC0LuO9f65EjJYqb+681k6OKRWUBEEjchE1ycBO+WiTXPBEfjxciYhF95DXlI2t1uyMExSBdqsLwaK+i760zceF13FOBZvIaytQ8iJ3pzxibn5pcWbSj4LrQL1Wal1IhSyoFeZnegtqqwmFJwMzADUqGBDYeVcyDAgD1uSmxHh+pX4p8Y7oGwD884mouJJV6T3e5MS2bpKCDVuZ2w84szmsV/JUAaJ9H8demgMiBNUAhHY4VfxXsGinDH9s79GwCiAOta+YLQ1Xiyl8qNsU1e8kyoDNjkh93z/hcl5jaYdqxWk8U8si7XIaZ2YYaItNpjWeskXEDJM Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=seliopou@gmail.com; spf=Pass smtp.mailfrom=seliopou@gmail.com; spf=None smtp.helo=postmaster@mail-lf0-f43.google.com Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of seliopou@gmail.com) identity=pra; client-ip=209.85.215.43; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="seliopou@gmail.com"; x-sender="seliopou@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of seliopou@gmail.com designates 209.85.215.43 as permitted sender) identity=mailfrom; client-ip=209.85.215.43; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="seliopou@gmail.com"; x-sender="seliopou@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-lf0-f43.google.com) identity=helo; client-ip=209.85.215.43; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="seliopou@gmail.com"; x-sender="postmaster@mail-lf0-f43.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0CnAwAPHpZXhivXVdFdhREGrSCGOoUFgXyGHQKBNAc6EgEBAQEBAQEBEQEBAQgLCwkZL4IyFYIWAQQBEhEdARsdAQMBCwYFCzcCAiIBEQEFARwGNYdzAQMPCJlEgTI+MYs7gWqCWgWEIAoZJw1UgyMBAQEBAQEEAQEBAQEBGQIGEIYahE2HQYJaBZkojm+PP45iEh6BDyUEgkeBcyAyiQ0BAQE X-IPAS-Result: A0CnAwAPHpZXhivXVdFdhREGrSCGOoUFgXyGHQKBNAc6EgEBAQEBAQEBEQEBAQgLCwkZL4IyFYIWAQQBEhEdARsdAQMBCwYFCzcCAiIBEQEFARwGNYdzAQMPCJlEgTI+MYs7gWqCWgWEIAoZJw1UgyMBAQEBAQEEAQEBAQEBGQIGEIYahE2HQYJaBZkojm+PP45iEh6BDyUEgkeBcyAyiQ0BAQE X-IronPort-AV: E=Sophos;i="5.28,419,1464645600"; d="scan'208,217";a="227880176" Received: from mail-lf0-f43.google.com ([209.85.215.43]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 25 Jul 2016 16:16:19 +0200 Received: by mail-lf0-f43.google.com with SMTP id b199so129656039lfe.0 for ; Mon, 25 Jul 2016 07:16:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=XAq/tJavN6GghGlmV4GnhlWKWrPdzrBjkcoC6x+pweo=; b=APzzohT42N2i8Wkn1+PA/kLdx1hm0ltmZ/POdPznBW/AsFaLP4lWoArASO96m8EaWx bBDvdSsJqMby7D+lQUZwJEonmldD69QO36qMq58Ufa0ZsrTXiJ2JL6NVSRHaqvhUqegP fQEG3Y56CMIuggWJUCE3WSH/UWW2k+5lgX7R4UPNz+EIDroiHg6pZFBlLJvWgkA+9tWJ 67tjgj62kki3HdFq71m6EByLCt2nu0dnu+fh8g/1ELvjeSknFt229L/Zyfxvo/SY2CQa PzvwPwRZhWB8tV2NOk27lTz2be0Qo6O+3tmtHjrMhzKej7LOBzDyu6NlloExcUAR9KKt jyVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=XAq/tJavN6GghGlmV4GnhlWKWrPdzrBjkcoC6x+pweo=; b=MVN4k2Ba/lKxmfT3aSePvKsd3z4w4nx8Q0O/hAuKlRB1ZryvBUJeA/wr6kEO7wcp8O /P93LNy8ukKRIlGEp+KfB3kz4I5oaISTTWsmlHbbWmDfg+2ghWsEtepGJmYoV/BCSf9M jPkHWZKvZkUEztA9tHsibhwXsxbq4/td3xk2gVxjvr6f4u9nbjMFaTdPjuNyANL+ogiz us7kTrp6Lv/uxN+x0PyXC/OTlk9iRedb/CTj/T8tHRqN7kQ4uJ0PukB89xd+uGcv96/G 93m2hBHle/jquWEynsblZw88J3bm/JEuGp1FottaO9tlA8+DSc4mAK7QZOuzzrbVWrT8 4ztw== X-Gm-Message-State: AEkooutazHbxuBnx8WI6NTbIR/58QVsUqur6dkuCjD33kPj0t9etoXrcoJaphBD2aT5zoEWeTE038yRYWZAnIA== X-Received: by 10.25.35.19 with SMTP id j19mr6754805lfj.106.1469456178670; Mon, 25 Jul 2016 07:16:18 -0700 (PDT) MIME-Version: 1.0 Received: by 10.25.87.136 with HTTP; Mon, 25 Jul 2016 07:15:59 -0700 (PDT) In-Reply-To: <8B3345BC17954C9F8DC59E1C0AFBE09D@erratique.ch> References: <8B3345BC17954C9F8DC59E1C0AFBE09D@erratique.ch> From: Spiros Eliopoulos Date: Mon, 25 Jul 2016 10:15:59 -0400 Message-ID: To: =?UTF-8?Q?Daniel_B=C3=BCnzli?= Cc: OCaml Content-Type: multipart/alternative; boundary=001a113c72f28af4490538766d7c Subject: Re: [Caml-list] ANN: angstrom --001a113c72f28af4490538766d7c Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable > Le vendredi, 22 juillet 2016 =C3=A0 23:46, Spiros Eliopoulos a =C3=A9crit= : > > Angstrom's targeting the use case of network protocols and serialization > formats, a use case where line/column numbers are of dubious value, > > Well when you are dealing with large malformed json streams it's nice to > know where they error=E2=80=A6 But if you target binary data only a byte = count > should suffice. > Most text editors give users the ability to seek to a byte position. That should be sufficient to debug parse errors for both binary data and those intended for human consumption. > > From what I understand, this would require users to modify input rather > than putting any correction logic into the parser itself. > > No the parser itself is in charge of performing this. A very simple > example of this is when you get an UTF-8 decode error. You want to be able > to report the error and let the client restart the decode which is easy to > do by finding a synchronization byte in the input. But again this may not > be useful for binary protocols, it is however useful for decoding text and > parsing languages. Seems like this would be just as easily accomplished by writing a parser that does the appropriate error recovery while accumulating descriptions of the error. The value of handing restart control to the client seems dubious to me, though. If the client will always abandon a parse if there's a correctable parse error, then the parser should just fail with the first error (no accumulation). If the client will always accept any parse correction from the parser, then it should accumulate all errors and return them along with the parse result. If the client selectively picks which parse errors it will accept corrections for, then the parser can be parameterized on those choices. Inverting control in the failure case doesn't really seem to offer any benefits, especially if the client is just going invoke the continuation (or not) without modification. > > Doing that would seem to muddle application and library performance > measurements within the benchmark. Arguably, constructing a generic > in-memory representation is doing the same in essence. > > Not really, it can completely change the outcome of your benchmarks. For > example jsonm allows you to completely eschew going through a generic > in-memory representation before being able to extract the data. The point of a benchmark is to offer some comparison between different systems by subjecting them to identical (as possible) test loads. The more and varied test loads involved in a benchmark, the better. But absence of a multitude does not invalidate the one, especially when the one is representative of how most web applications deal with JSON data. -Spiros E. --001a113c72f28af4490538766d7c Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

=
Le vendredi, 22 juillet 201= 6 =C3=A0 23:46, Spiros Eliopoulos a =C3=A9crit :
> Angstrom's targeting the use case of network protocols and seriali= zation formats, a use case where line/column numbers are of dubious value,<= br>
Well when you are dealing with large malformed json streams it's= nice to know where they error=E2=80=A6 But if you target binary data only = a byte count should suffice.

Most text = editors give users the ability to seek to a byte position. That should be s= ufficient to debug parse errors for both binary data and those intended for= human consumption.
=C2=A0
> From what I understand, this would require users to modify input rathe= r than putting any correction logic into the parser itself.

No the parser itself is in charge of performing this. A very simple = example of this is when you get an UTF-8 decode error. You want to be able = to report the error and let the client restart the decode which is easy to = do by finding a synchronization byte in the input. But again this may not b= e useful for binary protocols, it is however useful for decoding text and p= arsing languages.

Seems like this would be = just as easily accomplished by writing a parser that does the appropriate e= rror recovery while accumulating descriptions of the error. The value of ha= nding restart control to the client seems dubious to me, though. If the cli= ent will always abandon a parse if there's a correctable parse error, t= hen the parser should just fail with the first error (no accumulation). If = the client will always accept any parse correction from the parser, then it= should accumulate all errors and return them along with the parse result. = If the client selectively picks which parse errors it will accept correctio= ns for, then the parser can be parameterized on those choices. Inverting co= ntrol in the failure case doesn't really seem to offer any benefits, es= pecially if the client is just going invoke the continuation (or not) witho= ut modification.
=C2=A0
> Doing that would seem to muddle application and library performance me= asurements within the benchmark. Arguably, constructing a generic in-memory= representation is doing the same in essence.

Not really, it can completely change the outcome of your benchmarks.= For example jsonm allows you to completely eschew going through a generic = in-memory representation before being able to extract the data.

The = point of a benchmark is to offer some comparison between different systems = by subjecting them to identical (as possible) test loads. The more and vari= ed test loads involved in a benchmark, the better. But absence of a multitu= de does not invalidate the one, especially when the one is representative o= f how most web applications deal with JSON data.

-Spiros E.
--001a113c72f28af4490538766d7c--