From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 35473BC6B for ; Thu, 24 Jan 2008 22:50:29 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAANeWmEfAXQInh2dsb2JhbACQJAEBAQgKKZ5Z X-IronPort-AV: E=Sophos;i="4.25,246,1199660400"; d="scan'208";a="7189991" Received: from concorde.inria.fr ([192.93.2.39]) by mail1-smtp-roc.national.inria.fr with ESMTP; 24 Jan 2008 22:50:29 +0100 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id m0OLoSE8026215 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Thu, 24 Jan 2008 22:50:28 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAANeWmEeDrhCRh2dsb2JhbACQJAEBAQgKKZ5Z X-IronPort-AV: E=Sophos;i="4.25,246,1199660400"; d="scan'208";a="6541712" Received: from smeltpunt.science.ru.nl ([131.174.16.145]) by mail2-smtp-roc.national.inria.fr with ESMTP; 24 Jan 2008 22:50:28 +0100 Received: from tandem.cs.ru.nl (tandem.cs.ru.nl [131.174.142.18]) by smeltpunt.science.ru.nl (8.13.7/5.23) with ESMTP id m0OLoR10002822 for ; Thu, 24 Jan 2008 22:50:27 +0100 (MET) Received: from tews by tandem.cs.ru.nl with local (Exim 4.63) (envelope-from ) id 1JI9xr-00017k-EN for caml-list@inria.fr; Thu, 24 Jan 2008 22:50:27 +0100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18329.2083.415857.934232@tandem.cs.ru.nl> Date: Thu, 24 Jan 2008 22:50:27 +0100 To: caml-list@inria.fr Subject: ocamlnet mini tutorial: netclient and cookie based authentication X-Mailer: VM 7.19 under Emacs 21.4.1 From: Hendrik Tews X-Scanned-By: MIMEDefang 2.63 on 131.174.16.145 X-Miltered: at concorde with ID 47990824.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; ocamlnet:01 netclient:01 hendrik:01 tews:01 tews:01 ocamlnet:01 wikis:01 ocaml:01 wiki:01 netstring:01 netstring:01 defaults:01 compilation:01 findlib:01 -package:01 Hi, for ocamlnet novices I document here what you need to access web sites that use cookies for authentication (eg, wikis or mailman). With this you can let ocaml perform your mailing-list administrator tasks or write a wiki bot. Cookie authentication works as follows: When you first visit the site you have to authenticate with username and password that are usually sent (in cleartext) in a post request to the server. The server responses with a cookie that will be used for authentication in all following requests. Because you have to retrieve and set cookies in the headers you cannot use the Http_client.Convenience module. You really have to dive into ocamlnet. Here comes the solution: Netstring extension: - You need get_set_cookie, which will retrieve cookies from a response header but which is not in ocamlnets netstring library currently. Download from [1] or with two additional helper functions from [2]. Prerequisites: - Create a pipeline with "new Http_client.pipeline", it will process all http requests. (Don't ask me about timeouts and retries, the defaults work fine for me.) Get going (download some page without authentication): - Create a get call "new Http_client.get url" - Process the call: Add it to the pipeline (method #add) and run the pipeline (#run). By this the call will be changed and contain the response. - Check the response status (#response_status), which should usually be `OK and get the page (#response_body#value). - In case of error convert the status into something readable with Nethttp.string_of_http_status or consult the response status message (#response_status_text). Login: - Create post call with "new Http_client.post url parameters", the parameters are a string * string list, eq [("wpName", "mywikiname"); ("wpPassword", "mywikipasswd"); ("wpLoginattempt", "Log in")] for a MediaWiki. [In order to get the parameters save the login page, change the method of the right from into "GET", load the page into a browser, press the login button and parse the new URL you get.] - Process the call. - Check the status (#response_status): For MediaWiki it should be `Found, for mailman and others `Ok (don't ask me why). - Retrieve the response header (#response_header) and extract the cookies with get_set_cookies from [2]. Make an authenticated request: - Create your call "new get url" for getting a page or "new post url" for performing an action. - Access the request header (#request_header `Base) and add your cookies there (set_cookies from [2]). This is all done via side-effects, no need to write the modified header back into the call. - Process the call and get the response. Compilation: - with findlib and -package netclient -linkpkg: ocamlfind {ocamlc,ocamlopt} -package netclient -linkpkg your files ... Acknowledgements: - Praise Gerd Stolpmann for ocamlnet and findlib. That's it. For an example go to [3], which contains a MediaWiki bot that updates structural information in the Mozilla wiki [4]. The needed utility functions are in wiki_http: wiki_login to login, download to download a web page, get_page to get a wiki page and write_wiki_page to write one. Resources: [1] https://godirepo.camlcity.org/wwwsvn/trunk/code/get-set-cookie.ml?rev=1145&root=lib-ocamlnet2&view=auto [2] http://www.sos.cs.ru.nl/cgi-bin/~tews/olmar/viewvc-patch.cgi/elsa/olmar/wiki_bot/netstring_ext.ml?revision=HEAD&view=markup [3] http://www.sos.cs.ru.nl/cgi-bin/~tews/olmar/viewvc-patch.cgi/elsa/olmar/wiki_bot/ [4] http://wiki.mozilla.org/Elsa_ast_nodes Bye, Hendrik