From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id F18257FCCB for ; Fri, 1 May 2015 22:42:10 +0200 (CEST) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of daniel.buenzli@erratique.ch) identity=pra; client-ip=74.55.86.74; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="daniel.buenzli@erratique.ch"; x-sender="daniel.buenzli@erratique.ch"; x-conformance=sidf_compatible Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of daniel.buenzli@erratique.ch) identity=mailfrom; client-ip=74.55.86.74; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="daniel.buenzli@erratique.ch"; x-sender="daniel.buenzli@erratique.ch"; x-conformance=sidf_compatible Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@smtp.webfaction.com) identity=helo; client-ip=74.55.86.74; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="daniel.buenzli@erratique.ch"; x-sender="postmaster@smtp.webfaction.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0CtAQCu40NVnEpWN0pch1jCNQGHXwKCEA8BAQEBAQEBEQEBAQEBBg0JCSEuhCEBAQQjVhALGgImAgJHEAYbiCMEtAOTJAEKAQEBHoEhiheEUjMHgmgvgRYBBJ1chgwPim+DUIEDgQWCEYMyAQEB X-IPAS-Result: A0CtAQCu40NVnEpWN0pch1jCNQGHXwKCEA8BAQEBAQEBEQEBAQEBBg0JCSEuhCEBAQQjVhALGgImAgJHEAYbiCMEtAOTJAEKAQEBHoEhiheEUjMHgmgvgRYBBJ1chgwPim+DUIEDgQWCEYMyAQEB X-IronPort-AV: E=Sophos;i="5.13,352,1427752800"; d="scan'208";a="138563867" Received: from mail6.webfaction.com (HELO smtp.webfaction.com) ([74.55.86.74]) by mail2-smtp-roc.national.inria.fr with ESMTP; 01 May 2015 22:42:10 +0200 Received: from [172.20.10.2] (5.229.197.178.dynamic.wless.zhbmb00p-cgnat.res.cust.swisscom.ch [178.197.229.5]) by smtp.webfaction.com (Postfix) with ESMTP id CD6995A100D8; Fri, 1 May 2015 20:42:08 +0000 (UTC) Date: Fri, 1 May 2015 22:42:04 +0200 From: =?utf-8?Q?Daniel_B=C3=BCnzli?= To: Gabriel Scherer Cc: Ocaml Mailing List Message-ID: In-Reply-To: References: <33523412539540019A672C80600DED12@erratique.ch> X-Mailer: sparrow 1.6.4 (build 1178) MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline Subject: Re: [Caml-list] Safe UTF-8 string literals and pattern matching for OCaml Le vendredi, 1 mai 2015 =C3=A0 22:13, Gabriel Scherer a =C3=A9crit : > I wonder whether a system based on extensions [%u "..."] rather than attr= ibutes "..." [@u] could be easier to extend in the future. For example, you= might want to introduce a different annotation `u16` that generates an int= eger array representing an utf16-encoded literal (or an abstract type of yo= ur liking, but then not in pattern position). Having an annotation change t= he type of the code would not be very nice. I don't think that multiplying *representations* is a good idea and a singl= e canonical one should be eventually chosen for this in OCaml. So personall= y I'm not interested in that kind of extension. In any case desugaring the same notation to integer arrays is part of the d= esign space I consider (though if we imagine a compiler integration it woul= d be much less convenient). But I would never use UTF-16 for that. Using Un= icode scalar values directly is less costly and leads to a conceptually cor= rect type from a Unicode processing point of view. Best, Daniel