From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 69A8F7FA1F for ; Mon, 21 Jul 2014 17:06:35 +0200 (CEST) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of alain@frisch.fr) identity=pra; client-ip=193.252.23.212; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="alain@frisch.fr"; x-sender="alain@frisch.fr"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of alain@frisch.fr) identity=mailfrom; client-ip=193.252.23.212; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="alain@frisch.fr"; x-sender="alain@frisch.fr"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@msa.smtpout.orange.fr) identity=helo; client-ip=193.252.23.212; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="alain@frisch.fr"; x-sender="postmaster@msa.smtpout.orange.fr"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgsBABMrzVPB/BfUnGdsb2JhbABZg2DGJ4dFAYEuDwEBAQEBCAsJCRQphAQBBScRQBELDgoJFg8JAwIBAgFFBgEMCAEBiEIJvkwXjSiCKoRGAQSbJYFNhUiQYGoB X-IPAS-Result: AgsBABMrzVPB/BfUnGdsb2JhbABZg2DGJ4dFAYEuDwEBAQEBCAsJCRQphAQBBScRQBELDgoJFg8JAwIBAgFFBgEMCAEBiEIJvkwXjSiCKoRGAQSbJYFNhUiQYGoB X-IronPort-AV: E=Sophos;i="5.01,701,1400018400"; d="scan'208";a="72282716" Received: from msa03.smtpout.orange.fr (HELO msa.smtpout.orange.fr) ([193.252.23.212]) by mail3-smtp-sop.national.inria.fr with ESMTP; 21 Jul 2014 17:06:34 +0200 Received: from [192.168.1.132] ([92.151.89.71]) by mwinf5d09 with ME id V36U1o00u1YMuwU0336VZQ; Mon, 21 Jul 2014 17:06:32 +0200 X-ME-Helo: [192.168.1.132] X-ME-Auth: bGV4aWZpQHdhbmFkb28uZnI= X-ME-Date: Mon, 21 Jul 2014 17:06:32 +0200 X-ME-IP: 92.151.89.71 Message-ID: <53CD2C75.8090208@frisch.fr> Date: Mon, 21 Jul 2014 17:06:29 +0200 From: Alain Frisch User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Gerd Stolpmann , caml-list References: <1404501528.4384.4.camel@e130> In-Reply-To: <1404501528.4384.4.camel@e130> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Caml-list] Immutable strings On 07/04/2014 09:18 PM, Gerd Stolpmann wrote: > http://blog.camlcity.org/blog/bytes1.html Coming back to motivating example of this post. Lexing provides: val from_channel : in_channel -> lexbuf val from_string : string -> lexbuf val from_function : (bytes -> int -> int) -> lexbuf In particular, from_function expects you to write to a buffer, so it's pretty clear that its callback must accept a "bytes", not a "string". There is no place for a (string -> int -> int) -> lexbuf function. Concerning from_string: this function copies the string to an internal buffer. This is purely implemented on the OCaml side without any unsafe features. We could avoid this copy because we know that the generated lexers won't actually modify the buffer in that case, but it would be very difficult to do this without using an unsafe feature, even if we had some sort of generalization of bytes and string. We would instead need a completely different implementation (which would not use "stringable" to make the "source" (string or "stream") explicit in the lexbuf datastructure. We could also provide an extra from_bytes function, but it can currently be implemented by composing Bytes.to_string and Lexing.from_string. Are you concerned only by the performance overhead of this approach (two copies)? If so, the same argument would apply to the current implementation of from_string, and we would need to switch to a different approach, for which it's not clear that "stringable" would be a big help (see above). Before doing anything like that, it would be interesting to evaluate the exact overhead. It could very well be negligible/acceptable for most cases compared to the cost of actual lexing. -- Alain