From: Jacques du Preez
Date: Mon, 11 Aug 2014 08:57:46 +0200
To: OCaml Mailing List
Subject: Re: [Caml-list] OCaml HTML parsing & manipulation
In-Reply-To: <20140810.224256.1353397051109538039.Christophe.Troestler@umons.ac.be>

Thanks. I eventually discovered ocamlnet, but I'm hoping there is more than one option?

==============================
Jacques du Preez

Web: OpenLandscape.net
Twitter: @jacquesdp


On Sun, Aug 10, 2014 at 10:42 PM, Christophe Troestler <Christophe.Troestler@umons.ac.be> wrote:
Hi,

On Sun, 10 Aug 2014 19:38:39 +0200, Jacques du Preez wrote:
>
> I've been searching for an OCaml library to parse HTML, and then be able to
> query and manipulate it similar to jQuery.
>
> The JSoup Java library, http://jsoup.org, allows me to do this. Is there
> something like this for OCaml?

Nethtml in ocamlnet partly does what you need (you can easily write
recursive functions to extract the desired data from the HTML tree).

Best,
C.
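
The recursive extraction suggested above might look roughly like the sketch below. To keep it runnable without ocamlnet installed, it declares a local `document` type that is assumed to mirror the shape of Nethtml's tree (an element with a name, an attribute list, and children, or a text node). The function name `collect_attr` and the sample tree are hypothetical, made up for illustration.

```ocaml
(* Local stand-in for the tree type provided by Nethtml. Assumption: the
   real type has this shape; swap in Nethtml.document when using ocamlnet. *)
type document =
  | Element of string * (string * string) list * document list
  | Data of string

(* Walk the tree recursively and return the value of [attr] on every
   element named [tag], in document order. *)
let rec collect_attr tag attr (nodes : document list) : string list =
  List.concat
    (List.map
       (function
         | Data _ -> []
         | Element (name, attrs, children) ->
           let here =
             if name = tag then
               match List.assoc_opt attr attrs with
               | Some v -> [ v ]
               | None -> []
             else []
           in
           (* Matches on this node come before matches in its subtree. *)
           here @ collect_attr tag attr children)
       nodes)

(* Hand-built sample tree (made-up data) standing in for parser output. *)
let sample =
  [ Element ("html", [],
      [ Element ("body", [],
          [ Element ("a", [ ("href", "http://jsoup.org") ], [ Data "jsoup" ]);
            Element ("p", [],
              [ Data "See ";
                Element ("a", [ ("href", "http://ocaml.org") ],
                  [ Data "OCaml" ]) ]) ]) ]) ]

let links = collect_attr "a" "href" sample
```

With ocamlnet itself, the tree would come from the library's parser instead of being built by hand. The same traversal pattern extends to other queries, such as collecting text nodes or filtering on a class attribute, though it stays well short of jQuery-style selectors.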
