> On Friday, 22 July 2016 at 23:46, Spiros Eliopoulos wrote:
> > Angstrom's targeting the use case of network protocols and serialization
> > formats, a use case where line/column numbers are of dubious value,
>
> Well when you are dealing with large malformed json streams it's nice to
> know where they error… But if you target binary data only a byte count
> should suffice.

Most text editors give users the ability to seek to a byte position. That should be sufficient to debug parse errors both for binary data and for formats intended for human consumption.

> > From what I understand, this would require users to modify input rather
> > than putting any correction logic into the parser itself.
>
> No the parser itself is in charge of performing this. A very simple
> example of this is when you get an UTF-8 decode error. You want to be able
> to report the error and let the client restart the decode which is easy to
> do by finding a synchronization byte in the input. But again this may not
> be useful for binary protocols, it is however useful for decoding text and
> parsing languages.

Seems like this would be just as easily accomplished by writing a parser that does the appropriate error recovery while accumulating descriptions of the errors.

The value of handing restart control to the client seems dubious to me, though. If the client will always abandon a parse when there's a correctable parse error, then the parser should just fail with the first error (no accumulation). If the client will always accept any parse correction from the parser, then it should accumulate all errors and return them along with the parse result. If the client selectively picks which parse errors it will accept corrections for, then the parser can be parameterized on those choices (a rough sketch of what I mean is in the postscript below). Inverting control in the failure case doesn't really seem to offer any benefits, especially if the client is just going to invoke the continuation (or not) without modification.

> > Doing that would seem to muddle application and library performance
> > measurements within the benchmark. Arguably, constructing a generic
> > in-memory representation is doing the same in essence.
>
> Not really, it can completely change the outcome of your benchmarks. For
> example jsonm allows you to completely eschew going through a generic
> in-memory representation before being able to extract the data.

The point of a benchmark is to offer a comparison between different systems by subjecting them to test loads that are as close to identical as possible. The more numerous and varied the test loads in a benchmark, the better. But the absence of a multitude does not invalidate the one, especially when the one is representative of how most web applications deal with JSON data.

-Spiros E.
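
P.S. To make the "parameterize the parser on those choices" idea concrete, here's a rough sketch. Everything in it is made up for illustration: the names and types have nothing to do with Angstrom's actual API, and the "parser" is just a toy that reads comma-separated integers.

(* Hypothetical sketch -- not Angstrom's API. The parser accumulates
   descriptions of correctable errors instead of handing restart control
   back to the caller; the caller states its recovery policy up front
   via [on_error]. *)

type error = { position : int; message : string }

type 'a outcome =
  | Parsed of 'a * error list   (* result plus any corrected errors *)
  | Failed of error             (* first error the policy rejected *)

(* Toy parser: a comma-separated list of integers, substituting 0 for a
   malformed field whenever the policy accepts the correction. *)
let parse_ints ~(on_error : error -> bool) (input : string) : int list outcome =
  let fields = String.split_on_char ',' input in
  let rec go pos parsed errs = function
    | [] -> Parsed (List.rev parsed, List.rev errs)
    | f :: rest ->
      let next = pos + String.length f + 1 in
      (match int_of_string_opt (String.trim f) with
       | Some n -> go next (n :: parsed) errs rest
       | None ->
         let e = { position = pos; message = "not an integer: " ^ f } in
         if on_error e
         then go next (0 :: parsed) (e :: errs) rest  (* correct and continue *)
         else Failed e)                                (* fail fast *)
  in
  go 0 [] [] fields

let () =
  match parse_ints ~on_error:(fun _ -> true) "1,two,3" with
  | Parsed (ns, errs) ->
    Printf.printf "parsed %d ints with %d correction(s)\n"
      (List.length ns) (List.length errs)
  | Failed e ->
    Printf.printf "failed at byte %d: %s\n" e.position e.message

With this shape, on_error = (fun _ -> false) is the fail-fast case, (fun _ -> true) accepts every correction and gets the full error list back with the result, and anything in between is the selective case, all without inverting control.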
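
P.P.S. For anyone following along who hasn't used jsonm, the streaming extraction Daniel describes looks roughly like this: fold over the lexeme stream and pull out what you need, with no generic in-memory representation in between. This is only a minimal sketch with the error handling stripped out; a load shaped like this would indeed make a good additional benchmark.

(* Minimal jsonm sketch: sum every number in a JSON document straight off
   the lexeme stream, never building an intermediate tree.
   Error handling is omitted for brevity. *)

let sum_floats (json : string) : float =
  let d = Jsonm.decoder (`String json) in
  let rec loop acc =
    match Jsonm.decode d with
    | `Lexeme (`Float f) -> loop (acc +. f)
    | `Lexeme _ -> loop acc
    | `End -> acc
    | `Error _ -> acc   (* a real caller would report the error *)
    | `Await -> acc     (* not reached with a `String source *)
  in
  loop 0.

let () =
  Printf.printf "%g\n" (sum_floats {|{"a": [1, 2, 3], "b": {"c": 4}}|})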