From: Markus Mottl
To: Goswin von Brederlow
Cc: caml-list@inria.fr
Date: Mon, 28 Jul 2014 11:51:05 -0400
Subject: Re: [Caml-list] Immutable strings
In-Reply-To: <20140728111452.GA26816@frosties>
References: <1404501528.4384.4.camel@e130> <20140728111452.GA26816@frosties>

On Mon, Jul 28, 2014 at 7:14 AM, Goswin von Brederlow wrote:
> Why is that? A bigarray allocates a small block on the ocaml heap and
> the buffer outside the ocaml heap. Is that normal malloc() call just
> so much slower? Or are there other factors involved?

If you look at the runtime code, you'll see that there is quite a lot
going on to create a bigarray value. Allocating small OCaml-strings on
the minor heap only costs a handful of cheap instructions, which is
obviously way more efficient. There is some threshold at which malloc
will perform more expensive system calls to obtain memory whereas OCaml
may still be able to get some larger chunks from the major heap.
Unless Bigarrays become really large, standard OCaml strings can be
obtained much more cheaply.

> On the other hand if your app is IO heavy then you should allocate a
> few buffers and reuse them. In that case the allocation overhead is
> constant and the time saved for not copying in the I/O will more than
> make up for it.

Exactly. Bigarrays are my buffer of choice for I/O.

> Or read/mmap the file into a huge bigarray and then slice it into
> smaller chunks.

This can improve performance for certain operations, but beware of page
faults when accessing ranges that only reside on disk. Unless this
access is done outside of the OCaml-lock, your application could freeze
longer than allowed for realtime applications.

Regards,
Markus

>> On Fri, Jul 4, 2014 at 3:18 PM, Gerd Stolpmann wrote:
>> > Hi list,
>> >
>> > I've just posted a blog article where I criticize the new concept of
>> > immutable strings that will be available in OCaml 4.02 (as option):
>> >
>> > http://blog.camlcity.org/blog/bytes1.html
>> >
>> > In short my point is that the new concept is not far-reaching enough,
>> > and will even have negative impact on the code quality when it is not
>> > improved. I also present three ideas how to improve it.
>> >
>> > Gerd
>
> You have a few more points:
>
> 1) there are 3 kinds of strings:
>
> - string literal / constant strings [which never change ever]
> - read-only strings [which YOU are not allowed to change but might change]
> - mutable strings [which you are allowed to change]
>
> There is one other thing you didn't mention here. While it is nice to
> pass a mutable string to the lexer (or similar) one has to realize
> that that is not thread safe. Another thread might be mutating the
> string while it is being used.
>
> So I would suggest there is a 4th kind of string:
>
> - frozen strings [which are mutable but won't be changed anymore]
>
> That is basically like read-only strings but with the added promise
> that they won't be changed.
> Nothing in the type system guarantees that,
> it is just a promise from the programmer.
>
> 2) there are lots of functions that just need any kind of string and
> should accept all 3
>
> This kind of asks for type classes. There should be a read-from-string
> type class that all 3 string types would fit. Then one could have one
> function accepting a read-from-string type class and all 3 string
> types could be passed. But unfortunately ocaml doesn't have type
> classes.
>
> The next best thing would be enumerations (not in stdlib). Make
> enumerations accept all 3 string types and then have everything else
> accept enumerations. This would also mean you could pass a char list
> or rope or any other type that gives you an enumeration of chars.
>
> 3) I/O code
>
> That the stdlib uses strings for I/O and needs to copy the data around
> all the time has been nagging me for years. There certainly should be
> read/write functions dealing with bigarrays.
>
> There also should be a function to create a bigarray with special
> alignment (e.g. PAGESIZE) to get the best I/O performance (or in case
> of linux async IO make it work at all).
>
> As for mutable/immutable strings there should be a read function
> returning an immutable string, which it creates internally. The string
> can't be passed as argument so creating a fresh one is the only way.
>
>
> Here is a completely new point:
>
> 4) What is good for strings is also good for bigarray
>
> The same arguments concerning strings apply to bigarrays. Say you
> pass a bigarray to the lexer. Can it just use it as is for its lexbuf
> or does it need to copy it because it might mutate? An immutable
> bigarray could be used safely as is.
>
>
> And this doesn't really stop at bigarray. Even references could be
> read-only, in the sense of "this might change but YOU aren't allowed
> to change it".
> And I think the only way to solve the
> const/immutable/mutable/frozen sub-types that will be applicable to
> more than just string is to use phantom types.
>
> MfG
> Goswin
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

-- 
Markus Mottl http://www.ocaml.info markus.mottl@gmail.com
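
The phantom-type idea from the quoted message can be illustrated with a
minimal sketch. It wraps Bytes.t in a type whose extra parameter is
never used at runtime and only records read/write permission. The
module name Pstring and every function in it are hypothetical and exist
in no library; the sketch also assumes OCaml 4.02 or later for the
Bytes module.

```ocaml
(* Hypothetical module: Pstring and its functions are illustrative only. *)
module Pstring : sig
  type 'perm t                      (* 'perm is a phantom parameter *)

  (* Read-write view; shares the underlying buffer, no copy. *)
  val of_bytes : Bytes.t -> [ `R | `W ] t

  (* Read-only view of a fresh copy of the string. *)
  val of_string : string -> [ `R ] t

  (* Drop the write permission; still shares the buffer. *)
  val read_only : [> `R ] t -> [ `R ] t

  val length : [> `R ] t -> int
  val get : [> `R ] t -> int -> char
  val set : [> `W ] t -> int -> char -> unit
end = struct
  type 'perm t = Bytes.t
  let of_bytes b = b
  let of_string s = Bytes.of_string s
  let read_only t = t
  let length = Bytes.length
  let get = Bytes.get
  let set = Bytes.set
end

let () =
  let buf = Pstring.of_bytes (Bytes.of_string "hello") in
  Pstring.set buf 0 'J';                  (* allowed: buf carries `W *)
  let ro = Pstring.read_only buf in
  assert (Pstring.get ro 0 = 'J');
  (* Pstring.set ro 0 'x' is rejected by the type checker: ro lacks `W *)
  ()
```

Because read_only shares the buffer, the result is "read-only" in the
sense used above: the holder cannot change it, but whoever still has
the read-write view can. A frozen or constant variant would need a copy
or a programmer's promise, which this sketch does not attempt to model.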