From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id BCEA97F0F9 for ; Mon, 23 Nov 2015 18:35:46 +0100 (CET) IronPort-PHdr: 9a23:ooEiSBCoiinPQw55hm3iUyQJP3N1i/DPJgcQr6AfoPdwSP78p8bcNUDSrc9gkEXOFd2CrakU1qyJ6+uxCSQp2tWojjMrSNR0TRgLiMEbzUQLIfWuLgnFFsPsdDEwB89YVVVorDmROElRH9viNRWJ+iXhpQAbFhi3DwdpPOO9QteU1JTqkb/qsMyDKyxzxxODIppKZC2sqgvQssREyaBDEY0WjiXzn31TZu5NznlpL1/A1zz158O34YIxu38I46Fp34d6XK77Z6U1S6BDRHRjajhtpZ6jiR6WYRGS/jNIXn8LigtPDEvO5RT+doX2siy8ve14jnq0J8rzGJkyRTOkp41iQx/pjm9TPjgl92fdg8dwjaRzsRuhoBs5yInRNtLGfMFid7/QKItJDVFKWdxcAmkYWtux Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=antonbachin@yahoo.com; spf=Pass smtp.mailfrom=antonbachin@yahoo.com; spf=None smtp.helo=postmaster@nm18-vm4.bullet.mail.ne1.yahoo.com Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of antonbachin@yahoo.com) identity=pra; client-ip=98.138.91.178; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="antonbachin@yahoo.com"; x-sender="antonbachin@yahoo.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of antonbachin@yahoo.com designates 98.138.91.178 as permitted sender) identity=mailfrom; client-ip=98.138.91.178; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="antonbachin@yahoo.com"; x-sender="antonbachin@yahoo.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@nm18-vm4.bullet.mail.ne1.yahoo.com) identity=helo; client-ip=98.138.91.178; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="antonbachin@yahoo.com"; x-sender="postmaster@nm18-vm4.bullet.mail.ne1.yahoo.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0B9AAClTFNWnLJbimJegi6CTxCvL4U4iA2EDYYPAoFFPBABAQEBAQEBARABAQEBAQYNCQkhLoItgggBAQMBIx0BOAEECwYFGgImAgJFEgaIKwEDCgivIoFqgnmFZAEjJwOEdwEBAQEBAQEDAQEBAQEBAQEBAQESBgwCAXKFU4IQgm6EWYMcL4EVBY4UiDyIFoUbFIFHh0IgjxyDcjiCLyMdgV9ThSsBAQE X-IPAS-Result: A0B9AAClTFNWnLJbimJegi6CTxCvL4U4iA2EDYYPAoFFPBABAQEBAQEBARABAQEBAQYNCQkhLoItgggBAQMBIx0BOAEECwYFGgImAgJFEgaIKwEDCgivIoFqgnmFZAEjJwOEdwEBAQEBAQEDAQEBAQEBAQEBAQESBgwCAXKFU4IQgm6EWYMcL4EVBY4UiDyIFoUbFIFHh0IgjxyDcjiCLyMdgV9ThSsBAQE X-IronPort-AV: E=Sophos;i="5.20,338,1444687200"; d="scan'208";a="154851564" Received: from nm18-vm4.bullet.mail.ne1.yahoo.com ([98.138.91.178]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 23 Nov 2015 18:35:45 +0100 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s2048; t=1448300142; bh=sJVC5HHnSiMBdOX6L29w2QMbfDWAfN0PUthHqY4aDJ0=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From:Subject; b=XCYss2uIru/+yrJGrnWAA7cvRR3yQ97Y2v11scBeKyPU3oLALpgFAL+z5iUtwZjbDZ8UNfuS/5d0PN67WHckcbh8RGiAOzieY32npf0bwoHj2ZPI5gawbDM0QdIvE3EmU/a9up/1H7DC0650k+54dVWYzBS2jyFxhZAHbVYLUwXUBC6BuKLNRg040KhGnFB5LXi8YX54UD+g7OVBCm1ObYQwpvT/tu/niBTsn+A6Qg2TAGj4ZDKPCQXcPRhNYnxa/YbOa4ghBoi7WjwPh+hhWbGvKEV/OrBQKrfX/La4aeSbNgzIfXN2BhE0+Pv/R67IRhH4sF1X9UtdJoaJv+5qfA== Received: from [98.138.100.118] by nm18.bullet.mail.ne1.yahoo.com with NNFMP; 23 Nov 2015 17:35:42 -0000 Received: from [98.138.226.61] by tm109.bullet.mail.ne1.yahoo.com with NNFMP; 23 Nov 2015 17:35:42 -0000 Received: from [127.0.0.1] by smtp212.mail.ne1.yahoo.com with NNFMP; 23 Nov 2015 17:35:42 -0000 X-Yahoo-Newman-Id: 439799.44815.bm@smtp212.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: bgRBkXIVM1mDRtMz3nAEZT4hlWbbOKWGgioAXFuobkcH0B0 ycFv8xVls8Au9qt1pwCCFYRHnPLee384X3Kvg3dZB06qP27VJBA52u5rh18_ n4HyCKdJe4RvHb7jGjCLUwi4Blohs_iCY56ZPmjLf9W_4uNLZEwC.UxOJRya UgyFtxwju7DRZ3YGGJut42kRIjwG.71cA3bPco81xIrJ7vW16opSTRDQI_Kb qpYolTHWk5iB48ihZVdUjI3WxR_ruRQHRsczOosvshS7uoOg2lodrmthg9gn y1lxkOq2sWYyb.19nTCX978bPmrx1nZ.aqXTYL0VglhpyD3sDxdjvUop.3h1 1l37cmjISZCeIJe8En90581A8vhoj2u3h8jQtfXvEP0aSapz42zPQmdS1wZY B9HJWR.7lu4zF1GGKNtc100sKA0x11VKr7yPVwjjtFHBplRmLgiQ4kD0iXRg Jme5GH9nYL0azfAvb0vEPWZ2kMuTfCnqSKC2VjlIpPInbs0r20wCnkoYJpjH dMfqWNZlnLwbl2rnPpN0qC.HwrTQ7Kks- X-Yahoo-SMTP: ddtAESaswBBsjSthz_dVP91gr2NDfymF Content-Type: text/plain; charset=utf-8 Mime-Version: 1.0 (Mac OS X Mail 8.2 \(2070.6\)) From: Anton Bachin In-Reply-To: <565349F9.6020405@zoho.com> Date: Mon, 23 Nov 2015 11:35:41 -0600 Cc: =?utf-8?Q?Fran=C3=A7ois_Bobot?= , caml-list@inria.fr X-Mao-Original-Outgoing-Id: 469992941.564861-95658d635a3c38b147f8fd2cd3265bb3 Content-Transfer-Encoding: quoted-printable Message-Id: <5D7BF541-7E63-4A21-842A-8C34F36B550B@yahoo.com> References: <4824377F-4045-4D47-9BAB-E06B0C939988@yahoo.com> <564AF405.10400@cea.fr> <1A569326-8749-4332-A88C-719165728F20@yahoo.com> <5652EDFC.5070105@cea.fr> <98E819C0-76A2-4038-A5E6-DFBDC08DF7FA@yahoo.com> <565349F9.6020405@zoho.com> To: Drup X-Mailer: Apple Mail (2.2070.6) Subject: Re: [Caml-list] [ANN] Lambda Soup - HTML scraping and rewriting with CSS selectors > There seems to be a slight misunderstanding about how tyxml is constructe= d, so let me clarify things a bit. Thanks. I will still have to look at tyxml, though. > Now, in order to type lambda_soup using tyxml's types: It's going to be a= bit of work. You can perfectly reuse all tyxml's type, but you need typefu= l combinators instead of strings, otherwise you have no way to know what yo= ur selection is going to return. You may be able to cheat your way through = by creating a fake xml module and instantiate tyxml's functors on it to cre= ate all the combinators (that would be fun :p) Does tyxml have checked coercions? I was thinking of something like filtering a traversal by a checked coercion. This is how Lambda Soup currently does it for traversing elements. While traversing nodes, it filters by a checked coercion to elements. Typed selection, as you suggest, is another possibility, but my guess is that it would take a quite a while to design something that is easily learnable and not very challenging to type, if that is possible at all =E2=80=93 as you seem to ag= ree. > In any case, you will pay typesafety by a significant increase in verbosi= ty and awkwardness. I'm not sure it's worth the effort, since a lot of real= world html trees are not correct and that you never really need to select = tyxml-constructed trees anyway. Simple compatibility with tyxml is much eas= ier: you just have to agree with tyxml's signatures (which would deserve a = bit of a cleanup). This is what I would be going for by default, since without resorting to coercions, that is the best, in terms of typing, that you could hope for when parsing. My main concern beyond that, as expressed in my previous message, is how Lambda Soup could best interact at the type level with trees constructed by tyxml, not what types it (or any other library in OCaml) could assign to a tree constructed from arbitrary input. I suppose that if people really never need to select on tyxml trees, as you say, then Lambda Soup and tyxml are simply addressing different usages that don=E2=80=99t interact very much, which is what I suspected from the beginning. Having no experience with tyxml, however, I would like more feedback :) Regards, Anton=