From mboxrd@z Thu Jan  1 00:00:00 1970
Mime-Version: 1.0 (Apple Message framework v752.2)
Content-Transfer-Encoding: 7bit
Message-Id: <FA4F65EE-B014-4BDB-9245-0E63C81543ED@gmail.com>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
To: caml-list <caml-list@yquem.inria.fr>
From: Joel Reymont <joelr1@gmail.com>
Subject: Web page scraping packages
Date: Tue, 1 Aug 2006 01:06:52 +0100

Folks,

Are there any screen-scraping packages for OCaml?

I'm looking for something that would let me parse the contents of a
web page and extract, for example, all the image tags.

I'm using Ruby for this at work, and hpricot [1] is very neat but
also somewhat slow.

	Thanks, Joel

[1] http://code.whytheluckystiff.net/hpricot/

--
http://wagerlabs.com/