From: Alain Frisch
To: Richard Jones
Cc: Caml list
Date: Mon, 21 Jun 2004 18:08:49 +0200 (MET DST)
Subject: Re: [Caml-list] Parse crazy HTML, output XML
In-Reply-To: <20040621160328.GA28952@redhat.com>

On Mon, 21 Jun 2004, Richard Jones wrote:

> The problem is the parsing phase. Both PXP and XmlLight will only
> parse valid XML (as far as I can see). Is there any simple pure OCaml
> library for parsing HTML and producing a DOM?

There is an HTML parser in the ocamlnet library.

-- Alain
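
A minimal sketch of the "output XML" half of the problem, assuming the HTML has already been parsed into an element/text tree. The `node` type and the `escape` and `to_xml` functions are hypothetical names local to this sketch; ocamlnet's Nethtml module exposes a tree of a similar shape, but nothing below calls into ocamlnet.

```ocaml
(* Hypothetical tree type for this sketch: an element carries a name,
   its attributes and its children; text is kept as plain data. *)
type node =
  | Element of string * (string * string) list * node list
  | Data of string

(* Escape the characters that are special in XML text and attribute values. *)
let escape s =
  let b = Buffer.create (String.length s) in
  String.iter
    (function
      | '&' -> Buffer.add_string b "&amp;"
      | '<' -> Buffer.add_string b "&lt;"
      | '>' -> Buffer.add_string b "&gt;"
      | '"' -> Buffer.add_string b "&quot;"
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

(* Serialize a tree so that every element is explicitly closed;
   childless elements are written as <name/>. *)
let rec to_xml = function
  | Data s -> escape s
  | Element (name, attrs, children) ->
      let attrs =
        String.concat ""
          (List.map
             (fun (k, v) -> Printf.sprintf " %s=\"%s\"" k (escape v))
             attrs)
      in
      (match children with
       | [] -> Printf.sprintf "<%s%s/>" name attrs
       | _ ->
           Printf.sprintf "<%s%s>%s</%s>" name attrs
             (String.concat "" (List.map to_xml children))
             name)

let () =
  let tree =
    Element ("p", [ ("class", "a&b") ],
             [ Data "1 < 2"; Element ("br", [], []) ])
  in
  (* prints: <p class="a&amp;b">1 &lt; 2<br/></p> *)
  print_endline (to_xml tree)
```

The sketch does not check that element and attribute names are legal XML names, which a tree built from messy HTML may not guarantee.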