From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id D2FFFBC6B for ; Mon, 8 Oct 2007 23:37:16 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAIE/CkfAXQImh2dsb2JhbACORQEBAQgKKQ X-IronPort-AV: E=Sophos;i="4.21,244,1188770400"; d="scan'208";a="2467338" Received: from discorde.inria.fr ([192.93.2.38]) by mail1-smtp-roc.national.inria.fr with ESMTP; 08 Oct 2007 23:37:16 +0200 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l98LbFZQ022554 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Mon, 8 Oct 2007 23:37:16 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAIE/CkfLENaMnmdsb2JhbACORQEBAQEHBAYRGA X-IronPort-AV: E=Sophos;i="4.21,244,1188770400"; d="scan'208";a="4071490" Received: from ipmail01.adl2.internode.on.net ([203.16.214.140]) by mail3-smtp-sop.national.inria.fr with ESMTP; 08 Oct 2007 23:37:14 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAL89Ckd5LCRsVWdsb2JhbAAMji8BIA X-IronPort-AV: E=Sophos;i="4.21,244,1188743400"; d="scan'208";a="206539246" Received: from ppp121-44-36-108.lns10.syd7.internode.on.net (HELO [192.168.1.201]) ([121.44.36.108]) by ipmail01.adl2.internode.on.net with ESMTP; 09 Oct 2007 07:07:11 +0930 Subject: Re: [Caml-list] Correct way of programming a CGI script From: skaller To: Gerd Stolpmann Cc: Tom , Caml-list List In-Reply-To: <1191859489.10162.16.camel@localhost.localdomain> References: <1191859489.10162.16.camel@localhost.localdomain> Content-Type: text/plain Date: Tue, 09 Oct 2007 07:37:09 +1000 Message-Id: <1191879429.28011.27.camel@rosella.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.10.1 Content-Transfer-Encoding: 7bit X-Miltered: at discorde with ID 470AA30B.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; 0200,:01 gerd:01 stolpmann:01 ocaml:01 gerd:01 recursive:01 ocaml:01 mutable:01 bug:01 xavier's:01 compiler:01 buffer:01 buffer:01 compiler:01 sourceforge:01 On Mon, 2007-10-08 at 18:04 +0200, Gerd Stolpmann wrote: > > I heard that OCaml is particularly slow (and probably > > memory-inefficient) when it comes to string manipulation. > > No, this is nonsense. Of course, you can slow everything down by using > strings in an inappropriate way, like > > let rec concat_list l = > match l with > [] -> "" > | s :: l' -> s ^ concat_list l' Now Gerd, I would not call the claim nonsense. If you can't use a data structure in a natural way, I'd say the claim indeed has some weight. The example above is ugly because it isn't tail recursive. If you consider an imperative loop to concat the strings in an array let s = ref "" in for i = 0 to Array.length a do s := !s ^ a.[i] done; then Ocaml is likely to do this slowly. C++ on the other hand will probably do this faster, especially if you reserve enough storage first. The poor performance of Ocaml in such situations is a result of two factors. The first is the worst-possible choice for a data representation: mutable characters and immutable length. The mutability of characters has limited utility and prevents easy functional optimisations, the useful mutability would have to include both the content and the length (as in C++). The second issue would probably make a functional string have poor performance: Ocaml doesn't do any serious optimisations, so it probably wouldn't recognize an optimisation opportunity anyhow. Note this is by design policy, it isn't a bug or laziness. [I'm sure someone can quote a ref to Xavier's comments on this] The effect is that you do have to make fairly low level choices in Ocaml to get good performance. The good thing about this is that the optimisation techniques are manifest in the source code so you have control over them. Felix does high level optimisations and sometimes a tiny change in the code can cause orders of magnitude performance differences, and when I notice it can take me (the author) quite some time to track down what triggered the difference in the generated code. Now, back to the issue: in the Felix compiler itself, the code generator is printing out C++ code. This is primarily done with Ocaml string concatenation of exactly the kind which one might call 'inappropriate'. Buffer is used too, but only for aggregating large strings. The reason, trivially, is that it is easier and clearer to write "class " ^ name ^ " {\n" ^ catmap "\n" string_of_member members ^ "\n};\n" than to use the imperative Buffer everywhere. The above gives more clue to what the output looks like. Despite the cost of using strings this way .. the compiler backend code generator isn't a performance bottleneck. -- John Skaller Felix, successor to C++: http://felix.sf.net