From: Richard Jones
Date: Tue, 11 Nov 2003 09:39:54 +0000
To: Sven Luther
Cc: Pierre LAFFITTE, caml-list
Subject: Re: [Caml-list] access to the internet
Message-ID: <20031111093954.GA10795@redhat.com>
In-Reply-To: <20031111092126.GA4903@iliana>

On Tue, Nov 11, 2003 at 10:21:26AM +0100, Sven Luther wrote:
> On Mon, Nov 10, 2003 at 06:56:25PM +0100, Pierre LAFFITTE wrote:
> > Is it possible from a caml program, to give an internet address, to
> > get the result in a file or in a set of character to analyse it.
>
> I have been searching for exactly that some time ago, but I think it is
> not possible. Your best guess currently is to call the external wget
> program, save it to a temporary file, and then read it in.
>
> I agree that a full url-reading module would be a good addition to the
> ocaml library though, java has it for example.

Actually there are two (at least) ways of doing this:

http://sourceforge.net/projects/ocurl/

which is an OCaml wrapper around the Curl library.

Or, you could use some Perl-fu with:

http://www.merjis.com/developers/perl4caml/

which includes a wrapper around the Perl LWP and HTML::TreeBuilder
libraries, so you could not only download the page, but also parse it
into an HTML tree (the HTML::TreeBuilder parser is about the best
parser ever written for parsing fuzzy, incorrect HTML, and there's
really no way you would want to reinvent this in OCaml).

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
MONOLITH is an advanced framework for writing web applications in C, easier
than using Perl & Java, much faster and smaller, reusable widget-based arch,
database-backed, discussion, chat, calendaring:
http://www.annexia.org/freeware/monolith/
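
For reference, the wget approach described in the quoted text (run the
external program, save to a temporary file, read the file back in) might
look roughly like the sketch below. It is not taken from the thread: it
assumes a wget binary on the PATH and a reasonably recent OCaml (4.08 or
later, for Fun.protect), and the names read_file, capture_to_string and
fetch_url are made up for illustration.

```ocaml
(* Sketch only: fetch a URL by shelling out to wget and reading the
   result back from a temporary file. All helper names are hypothetical. *)

(* Read an entire file into a string. *)
let read_file path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* Create a temporary file, run the shell command built by [make_cmd]
   (which receives the temporary file's path), and return the file's
   contents if the command exits with status 0. The temporary file is
   removed in every case. *)
let capture_to_string make_cmd =
  let tmp = Filename.temp_file "fetch" ".tmp" in
  Fun.protect
    ~finally:(fun () -> try Sys.remove tmp with Sys_error _ -> ())
    (fun () ->
      if Sys.command (make_cmd tmp) = 0 then Some (read_file tmp) else None)

(* Assumption: wget is installed. -q silences progress output and
   -O names the output file. Both arguments are shell-quoted. *)
let fetch_url url =
  capture_to_string (fun tmp ->
    Printf.sprintf "wget -q -O %s %s" (Filename.quote tmp) (Filename.quote url))
```

Calling fetch_url "http://caml.inria.fr/" would then yield Some page on
success and None if wget exits with a non-zero status. Keeping the
command builder separate from the temporary-file handling means another
downloader could be swapped in without touching the cleanup logic.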