Subject: Re: [Caml-list] Correct way of programming a CGI script
From: Gerd Stolpmann
To: skaller
Cc: Caml-list List
In-Reply-To: <1191879429.28011.27.camel@rosella.wigram>
References:
 <1191859489.10162.16.camel@localhost.localdomain> <1191879429.28011.27.camel@rosella.wigram>
Date: Tue, 09 Oct 2007 12:26:13 +0200
Message-Id: <1191925574.10162.54.camel@localhost.localdomain>

On Tuesday, 09.10.2007, at 07:37 +1000, skaller wrote:
> On Mon, 2007-10-08 at 18:04 +0200, Gerd Stolpmann wrote:
>
> > > I heard that OCaml is particularly slow (and probably
> > > memory-inefficient) when it comes to string manipulation.
> >
> > No, this is nonsense. Of course, you can slow everything down by using
> > strings in an inappropriate way, like
> >
> > let rec concat_list l =
> >   match l with
> >     [] -> ""
> >   | s :: l' -> s ^ concat_list l'
>
> Now Gerd, I would not call the claim nonsense. If you can't
> use a data structure in a natural way, I'd say the claim indeed
> has some weight.

I don't know what the "nature" of strings is. I rather believe they are
artifacts, and that there are several ways of defining strings, most of
which result in different runtime behaviour.

The point here is that ^ always copies both of its operands into a new
string, and this is generally expensive, especially in this example,
because the same bytes are copied again every time the result string is
extended. I'm fully aware that you can get rid of this copying by
defining strings differently, but this has a price for some other
operations. As I said, you can implement strings in various ways.
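To make the cost concrete, here is a minimal sketch that puts the naive
function next to two ways of avoiding the repeated copying. The names
concat_list_buf and concat_list_std are made up for this illustration,
and the initial buffer size of 256 is an arbitrary choice:

```ocaml
(* Naive version from the quoted example: every ^ allocates a fresh
   string and copies both operands into it, so the tail of the result
   is copied again at each level of the recursion. *)
let rec concat_list l =
  match l with
  | [] -> ""
  | s :: l' -> s ^ concat_list l'

(* Buffer-based version (hypothetical name): each input string is
   appended to a growable buffer, and the result is extracted once at
   the end. The initial size 256 is only a hint; the buffer grows as
   needed. *)
let concat_list_buf l =
  let b = Buffer.create 256 in
  List.iter (Buffer.add_string b) l;
  Buffer.contents b

(* Standard library version (hypothetical name): String.concat with an
   empty separator joins the whole list in one call. *)
let concat_list_std l = String.concat "" l
```

All three return the same string for the same input; they differ only
in how often the bytes are copied on the way there.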
> If you consider an imperative loop to concat the strings
> in an array
>
>   let s = ref "" in
>   for i = 0 to Array.length a - 1 do
>     s := !s ^ a.(i)
>   done;

I would call this version even uglier... But tastes differ. The point is
that neither the O'Caml runtime representation of strings nor the
compiler (which could recognize this specific use of ^ and implicitly
convert the code to use a buffer) does anything to avoid this trap.

But we have to be fair. It is simply nonsense to call the whole of
O'Caml string manipulation slow. You have access to all the operations
you need to do it fast. You just have to know how to code it.

> then Ocaml is likely to do this slowly. C++ on the other
> hand will probably do this faster, especially if you reserve
> enough storage first.

> Now, back to the issue: in the Felix compiler itself, the
> code generator is printing out C++ code. This is primarily
> done with Ocaml string concatenation of exactly the kind which
> one might call 'inappropriate'. Buffer is used too, but only
> for aggregating large strings.
>
> The reason, trivially, is that it is easier and clearer to write
>
>   "class " ^ name ^ " {\n" ^
>   catmap "\n" string_of_member members ^
>   "\n};\n"
>
> than to use the imperative Buffer everywhere. The above gives more
> clue to what the output looks like.

Well, if you only concatenate a few strings, it isn't going to be a
problem, and it is probably as fast as using a buffer (which also has
some cost).

Gerd
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------