From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 19D847EE4A for ; Thu, 21 Feb 2013 19:12:35 +0100 (CET) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of j.romildo@gmail.com) identity=pra; client-ip=209.85.161.170; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="j.romildo@gmail.com"; x-sender="j.romildo@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of j.romildo@gmail.com designates 209.85.161.170 as permitted sender) identity=mailfrom; client-ip=209.85.161.170; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="j.romildo@gmail.com"; x-sender="j.romildo@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-gg0-f170.google.com) identity=helo; client-ip=209.85.161.170; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="j.romildo@gmail.com"; x-sender="postmaster@mail-gg0-f170.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AowDAMNiJlHRVaGqemdsb2JhbABFFoVxvAIWDgEBCwkMCBQEI4JgBgEbHgMSJjQFDwkIAQUBSYdrAQMPAgqeP4JwjDKCe4QyChknDVmIWgEFDI8fgklhA5Y1gR6NXz+EQg X-IPAS-Result: AowDAMNiJlHRVaGqemdsb2JhbABFFoVxvAIWDgEBCwkMCBQEI4JgBgEbHgMSJjQFDwkIAQUBSYdrAQMPAgqeP4JwjDKCe4QyChknDVmIWgEFDI8fgklhA5Y1gR6NXz+EQg X-IronPort-AV: E=Sophos;i="4.84,710,1355094000"; d="scan'208";a="3952177" Received: from mail-gg0-f170.google.com ([209.85.161.170]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 21 Feb 2013 19:12:34 +0100 Received: by mail-gg0-f170.google.com with SMTP id k4so1618003ggn.1 for ; Thu, 21 Feb 2013 10:12:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:resent-from:resent-date:resent-message-id:resent-to:date :from:to:subject:message-id:mail-followup-to:mime-version :content-type:content-disposition:user-agent; bh=FZljUICvyY8X/zvz9UTiP1KzlNz8/Noz1a2aKHIL/Cw=; b=FMyLgE/EEtABWGYO4rSxbEQG01yoMeB3Aj5Wzglt548UYbXlC8Y15UgL+So5ffc9I5 +l1qeISUcbLSUb7tNeSIZ81noNAySf5Ll2UbtbnqE5wIqoS1ipk2A9kiEes79APTyG4T MFWONAmGKb6IyvGVsLWJJ7pZX4UdeR+0S41DXaeR/px5Phoh1brTYotleqS7iCmKVqBa 6QGT6Qr9WPxIXiUQSr7Z22t3aExNNAtpn+tBxVwIQwWzGdc7pLgsud/peclWxJhZZnyN Y/fBAgoMMk9mzNJG1Z4cMKAsv+BKyU0WqjZQZcnW3ECn4tV6nzfigKMTRKEyBVMhG+g8 +AlQ== X-Received: by 10.101.40.2 with SMTP id s2mr13891921anj.88.1361470353428; Thu, 21 Feb 2013 10:12:33 -0800 (PST) Received: from jrm.localdomain ([200.239.133.59]) by mx.google.com with ESMTPS id w7sm116559701yhj.0.2013.02.21.10.12.32 (version=TLSv1.2 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 21 Feb 2013 10:12:32 -0800 (PST) Received: by jrm.localdomain (Postfix, from userid 1000) id CD77BE0584; Thu, 21 Feb 2013 15:13:25 -0300 (BRT) Resent-From: =?iso-8859-1?Q?Jos=E9?= Romildo Malaquias Resent-Date: Thu, 21 Feb 2013 15:13:25 -0300 Resent-Message-ID: <20130221181325.GD29074@jrm.DHCP-GERAL> Resent-To: caml-list@inria.fr Date: Wed, 23 Jan 2013 18:52:29 -0200 From: =?iso-8859-1?Q?Jos=E9?= Romildo Malaquias To: caml-list@inria.fr Message-ID: <20130123205229.GA2673@jrm.no-ip.org> Mail-Followup-To: caml-list@inria.fr MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Subject: [Caml-list] Extracting information from HTML documents Hello. tagsoup[1][2] is a Haskell library for parsing and extracting information from (possibly malformed) HTML/XML documents. tagsoup provides a basic data type for a list of unstructured tags, a parser to convert HTML into this tag type, and useful functions and combinators for finding and extracting information. Is there a similar library for OCaml? I want to write an application which will need to extract some information from HTML documents from the web. tagsoup helps a lot in the Haskell version of my program. Which OCaml libraries can help me with that when porting the application to OCaml? [1] http://community.haskell.org/~ndm/tagsoup/ [2] http://hackage.haskell.org/package/tagsoup Romildo