From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 83BF67F891 for ; Fri, 14 Mar 2014 11:21:33 +0100 (CET) X-IronPort-AV: E=Sophos;i="4.97,653,1389740400"; d="scan'208";a="62774268" Received: from yquem.inria.fr ([128.93.8.37]) by mail2-relais-roc.national.inria.fr with ESMTP; 14 Mar 2014 11:21:32 +0100 Received: by yquem.inria.fr (Postfix, from userid 18965) id DD18FE199F; Fri, 14 Mar 2014 11:21:32 +0100 (CET) Date: Fri, 14 Mar 2014 11:21:32 +0100 From: Francois Pottier To: oleg@okmij.org Cc: bob.atkey@gmail.com, caml-list@inria.fr Message-ID: <20140314102132.GA12133@yquem.inria.fr> Reply-To: Francois.Pottier@inria.fr References: <20140311093235.92383.qmail@www1.g3.pair.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20140311093235.92383.qmail@www1.g3.pair.com> User-Agent: Mutt/1.5.21 (2010-09-15) Subject: Re: [Caml-list] I never succeeded in using Format Dear all, Interesting discussion, which prompted me to come out of my retreat and publish a new implementation of PPrint, available now :-) Oleg is right: the now-old PPrint implementation uses backtracking and its time complexity is dependent on the window width. Oleg's message caused me to think, and I realized that I had missed a major opportunity for simplification and optimization, which in fact was shortly thereafter mentioned by Bob in his reply to Oleg's message. The idea is simple. The PPrint API requires the document to be constructed in an eager, bottom-up manner. (One reason for this decision was syntax: I did not want the user to have to write "lazy" everywhere, and I did not want to impose the use of a syntax extension.) So, we are paying a high cost in space (linear space overhead), but in return, it is easy to compute the required width of every (sub-)document as it is constructed. This in turn means that the rendering process can be performed in linear time, without dependency on the window width, without any backtracking or buffering. I have implemented this idea in the new version of PPrint and compared it with the old version. In short, the results are as follows, for a set of randomly-generated documents: - the construction of the document is roughly just as fast as it was (but documents occupy slightly more space in memory) - rendering is faster than before, between 2x and 3x faster - the code is much simpler than before (this is the key benefit) - two features are lost (namely, the primitive operators [nesting] and [column], which were used to get access to the engine's state during rendering; they are no longer supported because they prevent the width pre-computation). I should point out that rendering is now between 10x and 20x faster than the construction of the document, which (I believe) means that the current implementation is essentially unbeatable in terms of throughput. So, in my view, the bottom line is, if one is willing to live with linear space overhead, then this approach is preferable for its simplicity and efficiency; if one must work in constant space, then Oleg's approach is preferable. The new release of PPrint is available here: http://gallium.inria.fr/~fpottier/pprint/pprint.tar.gz and should reach opam pretty soon ("opam install pprint"). Best regards, -- François Pottier Francois.Pottier@inria.fr http://gallium.inria.fr/~fpottier/