From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=AWL,DNS_FROM_RFC_ABUSE autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 8891CBC6B for ; Tue, 30 Oct 2007 06:47:34 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAG5iJkfLENaHn2dsb2JhbACOXgIBAQcEBgkIGA X-IronPort-AV: E=Sophos;i="4.21,345,1188770400"; d="scan'208";a="3816863" Received: from ipmail03.adl2.internode.on.net ([203.16.214.135]) by mail1-smtp-roc.national.inria.fr with ESMTP; 30 Oct 2007 06:47:32 +0100 X-IronPort-AV: E=Sophos;i="4.21,345,1188743400"; d="scan'208";a="177496227" Received: from ppp121-44-98-190.lns10.syd6.internode.on.net (HELO [192.168.1.201]) ([121.44.98.190]) by ipmail03.adl2.internode.on.net with ESMTP; 30 Oct 2007 16:16:24 +1030 Subject: Re: [Caml-list] Preferred Way to Split a List From: skaller To: Julien Moutinho Cc: caml-list@yquem.inria.fr In-Reply-To: <20071030012012.GA29836@localhost> References: <47266DB7.1020009@SmokejumperIT.com> <20071030012012.GA29836@localhost> Content-Type: text/plain Date: Tue, 30 Oct 2007 16:46:22 +1100 Message-Id: <1193723182.6129.66.camel@rosella.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.12.0 Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; 0100,:01 pointer:01 mutable:01 'list:01 dependencies:01 sourceforge:01 wrote:01 wrote:01 rec:01 rec:01 caml-list:01 tail:01 tail:01 unsafe:01 unsafe:01 On Tue, 2007-10-30 at 02:20 +0100, Julien Moutinho wrote: > On Mon, Oct 29, 2007 at 06:33:11PM -0500, Robert Fischer wrote: > > What is the preferred way to split a list into two, at an arbitrary point? > > There's lots of ways you could do it, but I'm not sure if there's a > > standard best practice for this. > > Two answers for what they're worth: Ouch. A simpler way is: let split ls n = let rec aux inp out n = if n = 0 then unsafe_rev_cat out inp else match inp with | [] -> unsafe_rev_cat out [] | h :: t -> aux t (h::out) (n-1) in aux ls [] n Here 'unsafe_rev_cat' uses magic to reverse a list in place and tack the tail onto the end. However this isn't the fastest way. The fastest way is: let split ls n = let out = magic_list() in let rec aux inp n = if n = 0 then magic_cat out inp else match inp with | [] -> () | h :: t -> magic_append h out; aux t (n-1) in aux ls n; list_of out Here the magic list is the usual list data structure PLUS a pointer to the last element allowing O(1) append. This is of course a mutable data structure. All the operations here can be implemented without Obj.magic, however it's much faster WITH Obj.magic: by using a magic list structure identical to an actual list, the 'list_of' function becomes a typecast. A similar code uses a reversed ordinary list for the temporary, and then reverses it inplace (safe because it is a temporary) and tacks the tail on. Felix uses that algorithm in its standard library, however the one actually coded above is optimal (it isn't possible to do it any faster). Note the function above is purely functional (mutations are isolated to the implementation) and does not itself use any unsafe features (but the magic implementation is unsafe, so it should be put in a library to limit bugs and implementation dependencies). -- John Skaller Felix, successor to C++: http://felix.sf.net