From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <francois.pottier@inria.fr>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by sympa.inria.fr (Postfix) with ESMTPS id 83BF67F891
	for <caml-list@sympa.inria.fr>; Fri, 14 Mar 2014 11:21:33 +0100 (CET)
X-IronPort-AV: E=Sophos;i="4.97,653,1389740400"; 
   d="scan'208";a="62774268"
Received: from yquem.inria.fr ([128.93.8.37])
  by mail2-relais-roc.national.inria.fr with ESMTP; 14 Mar 2014 11:21:32 +0100
Received: by yquem.inria.fr (Postfix, from userid 18965)
	id DD18FE199F; Fri, 14 Mar 2014 11:21:32 +0100 (CET)
Date: Fri, 14 Mar 2014 11:21:32 +0100
From: Francois Pottier <Francois.Pottier@inria.fr>
To: oleg@okmij.org
Cc: bob.atkey@gmail.com, caml-list@inria.fr
Message-ID: <20140314102132.GA12133@yquem.inria.fr>
Reply-To: Francois.Pottier@inria.fr
References: <CA+aevF_n0T1FQeka029scoXKy2dd+j3RWVeQUp988+UVgL6kmQ@mail.gmail.com>
 <20140311093235.92383.qmail@www1.g3.pair.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20140311093235.92383.qmail@www1.g3.pair.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [Caml-list] I never succeeded in using Format


Dear all,

Interesting discussion, which prompted me to come out of my retreat and
publish a new implementation of PPrint, available now :-)

Oleg is right: the now-old PPrint implementation uses backtracking and its
time complexity is dependent on the window width.

Oleg's message caused me to think, and I realized that I had missed a major
opportunity for simplification and optimization, which in fact was shortly
thereafter mentioned by Bob in his reply to Oleg's message. The idea is
simple. The PPrint API requires the document to be constructed in an eager,
bottom-up manner. (One reason for this decision was syntax: I did not want the
user to have to write "lazy" everywhere, and I did not want to impose the use
of a syntax extension.) So, we are paying a high cost in space (linear space
overhead), but in return, it is easy to compute the required width of every
(sub-)document as it is constructed. This in turn means that the rendering
process can be performed in linear time, without dependency on the window
width, without any backtracking or buffering.

I have implemented this idea in the new version of PPrint and compared it with
the old version. In short, the results are as follows, for a set of
randomly-generated documents:

  - the construction of the document is roughly just as fast as it was
    (but documents occupy slightly more space in memory)

  - rendering is faster than before, between 2x and 3x faster

  - the code is much simpler than before (this is the key benefit)

  - two features are lost (namely, the primitive operators [nesting] and
    [column], which were used to get access to the engine's state during
    rendering; they are no longer supported because they prevent the width
    pre-computation).

I should point out that rendering is now between 10x and 20x faster than the
construction of the document, which (I believe) means that the current
implementation is essentially unbeatable in terms of throughput. So, in my
view, the bottom line is, if one is willing to live with linear space
overhead, then this approach is preferable for its simplicity and efficiency;
if one must work in constant space, then Oleg's approach is preferable.

The new release of PPrint is available here:

  http://gallium.inria.fr/~fpottier/pprint/pprint.tar.gz

and should reach opam pretty soon ("opam install pprint").

Best regards,

-- 
François Pottier
Francois.Pottier@inria.fr
http://gallium.inria.fr/~fpottier/