From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 55978BBB8 for ; Fri, 28 Jul 2006 00:58:08 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6RMw7xR032160 for ; Fri, 28 Jul 2006 00:58:07 +0200 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id AAA17877 for ; Fri, 28 Jul 2006 00:58:07 +0200 (MET DST) Received: from out2.smtp.messagingengine.com (out2.smtp.messagingengine.com [66.111.4.26]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6RMw6Ip032004 for ; Fri, 28 Jul 2006 00:58:07 +0200 Received: from frontend3.internal (frontend3.internal [10.202.2.152]) by frontend1.messagingengine.com (Postfix) with ESMTP id A761ED91E47; Thu, 27 Jul 2006 18:57:58 -0400 (EDT) Received: from heartbeat2.messagingengine.com ([10.202.2.161]) by frontend3.internal (MEProxy); Thu, 27 Jul 2006 18:58:02 -0400 X-Sasl-enc: IDbsQLoDNcW7qBJlgo9+z2vOlO394iOJYLvw1sDwZcFj 1154041082 Received: from [127.0.0.1] (ip-81-1-119-195.cust.homechoice.net [81.1.119.195]) by mail.messagingengine.com (Postfix) with ESMTP id 1DFC712C8; Thu, 27 Jul 2006 18:58:01 -0400 (EDT) Message-ID: <44C944F5.6050006@acm.org> Date: Thu, 27 Jul 2006 23:57:57 +0100 From: Fermin Reig User-Agent: Thunderbird 1.5.0.4 (Windows/20060516) MIME-Version: 1.0 To: ocaml Cc: Markus Mottl Subject: Re: [Caml-list] Partial parsing References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 44C944FF.002 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 44C944FE.002 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; reig:01 reig:01 parsing:01 markus:01 mottl:01 lexing:01 parsing:01 ocaml:01 lexing:01 buffer:01 lexer:01 parser:01 buffer:01 parser:01 combinator:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 Markus Mottl wrote: > Hi, > > one feature that I'd love to see with our lexing/parsing tools for > OCaml would be the possibility to perform partial lexing / parsing. > > For example, imagine that you read a message from a socket into a > string buffer, but it is not yet complete. You may not be able to > know that in advance, because that would require parsing it. > > What I'd like to be able to do is to e.g. call a lexer or parser > function on this buffer, and it will recognize as much as possible > with the option to continue parsing in another buffer. For this we > could e.g. define a type "parse_result": > > type parse_result = > | Done of result * int > | Cont of parse_fun > and parse_fun = pos : int -> len : int -> string -> parse_result > > If a full parse was detected, it would return e.g. "Done (result, > next_pos)", where "result" is the result of the successful parse, and > "next_pos" is the position in the buffer that contains the character > following the successfully parsed text. > [...] This sounds like a parsing problem suitable for the parser combinator approach. There, the result of a parse is a pair where one of the two components is the remaining of the input string. Many of the papers and implementations are from the Haskell folk, but I think there are some in Ocaml as well. HTH, Fermin