caml-list - the Caml user's mailing list
From: Richard Jones <rich@annexia.org>
To: Sven Luther <sven.luther@wanadoo.fr>
Cc: Pierre LAFFITTE <pierre.laffitte@wanadoo.fr>,
	caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] access to the internet
Date: Tue, 11 Nov 2003 09:39:54 +0000
Message-ID: <20031111093954.GA10795@redhat.com>
In-Reply-To: <20031111092126.GA4903@iliana>

On Tue, Nov 11, 2003 at 10:21:26AM +0100, Sven Luther wrote:
> On Mon, Nov 10, 2003 at 06:56:25PM +0100, Pierre LAFFITTE wrote:
> > Is it possible, from a Caml program, to give an Internet address and get the result in a file or as a string of characters to analyse?
> 
> I was searching for exactly that some time ago, but I think it is
> not possible. Your best bet currently is to call the external wget
> program, save the result to a temporary file, and then read it in.
> 
> I agree that a full URL-reading module would be a good addition to the
> OCaml library though; Java has one, for example.
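
The wget workaround quoted above could be sketched roughly as follows,
using only the standard library. This is an illustration under stated
assumptions, not a tested recipe: the helper names (read_file,
capture_via_tempfile, fetch_url) are made up for this example, wget is
assumed to be on the PATH, a Unix-like shell is assumed, and
Fun.protect needs OCaml 4.08 or later.

```ocaml
(* Read the whole contents of a file into a string. *)
let read_file path =
  let ic = open_in_bin path in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)
    (fun () -> really_input_string ic (in_channel_length ic))

(* Create a temporary file, run the shell command built by [make_cmd tmp]
   (which is expected to write its result into [tmp]), and return the
   file contents if the command exits with status 0.  The temporary file
   is removed afterwards in every case. *)
let capture_via_tempfile make_cmd =
  let tmp = Filename.temp_file "fetch" ".out" in
  Fun.protect
    ~finally:(fun () -> try Sys.remove tmp with Sys_error _ -> ())
    (fun () ->
      if Sys.command (make_cmd tmp) = 0 then Some (read_file tmp) else None)

(* Hypothetical helper: download [url] with the external wget program.
   Assumes wget is installed; -q silences progress output and -O names
   the output file.  Arguments are shell-quoted with Filename.quote. *)
let fetch_url url =
  capture_via_tempfile (fun tmp ->
    Printf.sprintf "wget -q -O %s %s" (Filename.quote tmp) (Filename.quote url))
```

A caller would match on the result, for instance treating
fetch_url "http://caml.inria.fr/" returning None as a failed download.
Shelling out like this costs a process and a temporary file per request,
which is why a library binding is usually the nicer option.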

Actually there are two (at least) ways of doing this:

http://sourceforge.net/projects/ocurl/

which is an OCaml wrapper around the Curl library.

Or, you could use some Perl-fu with:

http://www.merjis.com/developers/perl4caml/

which includes a wrapper around the Perl LWP and HTML::TreeBuilder
libraries, so you could not only download the page, but also parse it
into an HTML tree (the HTML::TreeBuilder parser is about the best
parser ever written for parsing fuzzy, incorrect HTML, and there's
really no way you would want to reinvent this in OCaml).

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
MONOLITH is an advanced framework for writing web applications in C, easier
than using Perl & Java, much faster and smaller, reusable widget-based arch,
database-backed, discussion, chat, calendaring:
http://www.annexia.org/freeware/monolith/


