Message-ID: <5652EDFC.5070105@cea.fr>
Date: Mon, 23 Nov 2015 11:44:12 +0100
From: François Bobot
To: Anton Bachin
CC: caml-list@inria.fr
References: <4824377F-4045-4D47-9BAB-E06B0C939988@yahoo.com> <564AF405.10400@cea.fr> <1A569326-8749-4332-A88C-719165728F20@yahoo.com>
In-Reply-To: <1A569326-8749-4332-A88C-719165728F20@yahoo.com>
Subject: Re: [Caml-list] [ANN] Lambda Soup - HTML scraping and
rewriting with CSS selectors

On 22/11/2015 08:58, Anton Bachin wrote:
>> Are the types as powerful as the ones in tyxml? Is it possible to have
>> this kind of selectors on top of tyxml?
>
> The types are not as powerful as the ones in tyxml, in the sense of being
> precise. Lambda Soup only distinguishes between "definitely elements",
> "definitely documents", and "anything, including an element, or document, or any
> other node". It doesn't know about element types or any constraints on their
> composition.
>
> It does, however, seem to be possible to have the selectors on top of tyxml. I
> am not very familiar with tyxml, but I do see types Xml.elt and Xml.econtent,
> and value content. It may be possible to use those directly as the
> internal DOM representation of Lambda Soup and build traversals over them, and
> it is almost certainly possible to convert them to Lambda Soup's current
> internal representation, the same way Lambda Soup currently converts Nethtml’s
> parse trees.

Nice to know.

> If there is interest in this, perhaps with some study Lambda Soup can be
> modified to use tyxml – perhaps functorized.

My motivation is simply to avoid creating yet another XML type and to make
OCaml libraries easier to compose. The functor approach used by cohttp (to
support either lwt or async), for example, is a nice one, and flambda should
eliminate most of its cost. (On a side note, does @inline work for functor
application?)

However, in this case the difference is about typing and not just the
implementation, so I'm not sure you can define a generic functor that could be
applied to each instance (NetHttp, tyxml, yours) when each one restricts
function applications differently.

-- 
François
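
PS: To make the functor idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: the signature DOM, the functor Select, the function by_tag and the Untyped instance are names made up for illustration, not the API of Lambda Soup, tyxml or Nethtml.

```ocaml
(* Hypothetical sketch: none of these names come from Lambda Soup,
   tyxml or Nethtml. *)

(* The minimal view of a tree that a selector engine needs. *)
module type DOM = sig
  type node
  val name : node -> string option  (* None for non-element nodes *)
  val children : node -> node list
end

(* A traversal written once, against the DOM signature only. *)
module Select (D : DOM) = struct
  (* All strict descendants of [root] whose tag is [tag],
     in document order. *)
  let rec by_tag tag root =
    List.concat
      (List.map
         (fun child ->
           let rest = by_tag tag child in
           if D.name child = Some tag then child :: rest else rest)
         (D.children root))
end

(* One possible instance: an untyped tree, as a parser might produce. *)
module Untyped = struct
  type node = Element of string * node list | Text of string
  let name = function Element (n, _) -> Some n | Text _ -> None
  let children = function Element (_, cs) -> cs | Text _ -> []
end

module S = Select (Untyped)

let doc =
  Untyped.(
    Element
      ("div",
       [ Element ("p", [ Text "a" ]);
         Element ("span", [ Element ("p", []) ]) ]))

(* Both <p> elements, the nested one included. *)
let paragraphs = S.by_tag "p" doc
```

Note that by_tag can only return plain D.node values: whatever finer typing an instance has for its elements does not show up in the result, which is exactly the limitation I am worried about.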