caml-list - the Caml user's mailing list
From: "José Romildo Malaquias" <j.romildo@gmail.com>
To: caml-list@inria.fr
Subject: [Caml-list] Extracting information from HTML documents
Date: Wed, 23 Jan 2013 18:52:29 -0200
Message-ID: <20130123205229.GA2673@jrm.no-ip.org>

Hello.

tagsoup[1][2] is a Haskell library for parsing and extracting
information from (possibly malformed) HTML/XML documents.

tagsoup provides a basic data type for a list of unstructured tags, a
parser to convert HTML into this tag type, and useful functions and
combinators for finding and extracting information.
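
To make that data model concrete, here is a minimal OCaml sketch of what
such an interface might look like. Everything in it is an assumption for
illustration: the type and function names (tag, sections, inner_text,
attrib, is_open) are hypothetical, only loosely modeled on tagsoup, and
not taken from any existing OCaml library. The tag list is written by
hand in place of real parser output.

```ocaml
(* Hypothetical OCaml analogue of a flat, unstructured tag list.
   Not an existing library; all names here are illustrative. *)
type tag =
  | Tag_open of string * (string * string) list  (* element name, attributes *)
  | Tag_close of string                          (* element name *)
  | Tag_text of string                           (* text between tags *)

(* All suffixes of the list whose first tag satisfies [p]. *)
let rec sections p = function
  | [] -> []
  | (t :: rest) as l ->
    if p t then l :: sections p rest else sections p rest

(* Concatenate the text nodes of a tag list, ignoring markup. *)
let inner_text tags =
  String.concat ""
    (List.filter_map (function Tag_text s -> Some s | _ -> None) tags)

(* Value of attribute [name] on an opening tag, if present. *)
let attrib name = function
  | Tag_open (_, attrs) -> List.assoc_opt name attrs
  | _ -> None

(* True for an opening tag with the given element name. *)
let is_open name = function
  | Tag_open (n, _) -> n = name
  | _ -> false

(* Hand-built stand-in for the parse of:
   <p>See <a href="http://caml.inria.fr">Caml</a></p> *)
let doc =
  [ Tag_open ("p", []);
    Tag_text "See ";
    Tag_open ("a", [ ("href", "http://caml.inria.fr") ]);
    Tag_text "Caml";
    Tag_close "a";
    Tag_close "p" ]

(* Collect the href of every <a> tag. Each section is non-empty by
   construction, so List.hd is safe here. *)
let links =
  List.filter_map
    (fun section -> attrib "href" (List.hd section))
    (sections (is_open "a") doc)

let () = List.iter print_endline links
```

Because the tags stay in a flat list rather than a tree, unbalanced or
malformed markup needs no special handling at this level; extraction is
ordinary list processing.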

Is there a similar library for OCaml?

I want to write an application that extracts information from HTML
documents on the web. tagsoup helps a lot in the Haskell version of my
program. Which OCaml libraries could help with this when porting the
application to OCaml?

[1] http://community.haskell.org/~ndm/tagsoup/
[2] http://hackage.haskell.org/package/tagsoup


Romildo

Thread overview: 4+ messages
2013-01-23 20:52 José Romildo Malaquias [this message]
2013-02-22  8:43 ` AW: " Gerd Stolpmann
2013-02-23 12:40   ` Florent Monnier
2013-02-23 13:23     ` AW: " Gerd Stolpmann
