From: Paul Snively
To: Shawn Wagner
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Parse crazy HTML, output XML
Date: Fri, 25 Jun 2004 00:16:41 -0700
Message-Id: <9B391FCE-C677-11D8-92BF-000A27DEEC20@mac.com>
In-Reply-To: <20040621161923.GZ595@speakeasy.org>
References: <20040621160328.GA28952@redhat.com> <20040621161923.GZ595@speakeasy.org>

On Jun 21, 2004, at 9:19 AM, Shawn Wagner wrote:

> On Mon, Jun 21, 2004 at 05:03:28PM +0100, Richard Jones wrote:
>>
>> The problem is the parsing phase. Both PXP and XmlLight will only
>> parse valid XML (as far as I can see). Is there any simple pure OCaml
>> library for parsing HTML and producing a DOM?
>>
>
> There's a html parser in the ocamlnet library.
>

I've recently found the OCamlNet HTML parser also. Does anyone know if
Alain Frisch's XPath implementation, which is modularized and
functorized, has been/can be used on the resulting tree from Nethtml?

> --
> Shawn Wagner
> shawnw@speakeasy.org
>

Many thanks and best regards,
Paul Snively