Joel Reymont wrote:
> Are there any screen-scraping packages for OCaml?
>
> I'm looking for something that would let me analyze the contents of a
> web page and extract, for example, all the image tags.

I don't think of this as screen scraping; spidering might be a better word.

I've done a good bit of this in OCaml. I use the curl package for downloading web pages and the netstring package for parsing them. I'm going to attach a couple of files that I use for this sort of stuff. The file htmltreeutils.ml has a bunch of functions for working with the parse tree that nethtml produces.

So your program would look something like this (untested):

    open Htmltreeutils

    let () =
      (* Buffer that accumulates the response body. *)
      let result = Buffer.create 2000 in
      let connection = Curl.init () in
      Curl.set_httpget connection true;
      Curl.set_url connection "http://www.yahoo.com/randompage.html";
      (* Current ocurl releases expect write callbacks to return the
         number of bytes they handled. *)
      Curl.set_writefunction connection
        (fun s -> Buffer.add_string result s; String.length s);
      (* Discard the response headers. *)
      Curl.set_headerfunction connection (fun s -> String.length s);
      Curl.perform connection;
      Curl.cleanup connection;
      (* Assuming, as its name suggests, that the helper takes a string,
         so pass the buffer's contents rather than the buffer itself. *)
      let dom = get_parsed_html_from_string (Buffer.contents result) in
      let img_tags = list_tags "img" dom in
      (* ... do something with img_tags here, like pull out their src
         attributes *)
      ignore img_tags

Here are the two helper files: