Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
@ 2000-10-05 10:33 David McClain
  2000-10-05 20:20 ` Stefan Monnier
  2000-10-06  8:07 ` WWW Page of Team PLClub (Re: ICFP programming contest: results) Xavier Leroy
  0 siblings, 2 replies; 17+ messages in thread
From: David McClain @ 2000-10-05 10:33 UTC (permalink / raw)
  To: eijiro_sumii, caml-list

Hi, Congratulations!!!!

Could you provide a reference to de Bruijin indexing? I recall reading about
it some time ago, and after having searched my books here, I cannot find any
references to it. So I probably saw it in a paper some time ago... Anyway, I
would be interested since I have an interpreter name-lookup problem here
too.

Thanks in advance,

Congrats!!!

- David McClain

-----Original Message-----
From: eijiro_sumii@anet.ne.jp <eijiro_sumii@anet.ne.jp>
To: caml-list@inria.fr <caml-list@inria.fr>
Cc: jgm@cs.cornell.edu <jgm@cs.cornell.edu>; vouillon@saul.cis.upenn.edu
<vouillon@saul.cis.upenn.edu>; vgapeyev@seas.upenn.edu
<vgapeyev@seas.upenn.edu>; hahosoya@saul.cis.upenn.edu
<hahosoya@saul.cis.upenn.edu>; sumii@saul.cis.upenn.edu
<sumii@saul.cis.upenn.edu>
Date: Wednesday, October 04, 2000 10:26 PM
Subject: WWW Page of Team PLClub (Re: ICFP programming contest: results)


>Dear Camlers,
>
>> By the way, we are going to make a web page to describe what we did in
>> detail.  We'll announce it in this list when it's ready.
>
>Here it is.
>
>  http://www.cis.upenn.edu/~sumii/icfp/
>
>Enjoy,
>
>Team PLClub
>
>P.S.  Could you please add a link to the web page above in the contest
>result page (http://www.cs.cornell.edu/icfp/contest_results.htm), Greg?
>



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-05 10:33 WWW Page of Team PLClub (Re: ICFP programming contest: results) David McClain
@ 2000-10-05 20:20 ` Stefan Monnier
  2000-10-06 19:26   ` eijiro_sumii
  2000-10-06  8:07 ` WWW Page of Team PLClub (Re: ICFP programming contest: results) Xavier Leroy
  1 sibling, 1 reply; 17+ messages in thread
From: Stefan Monnier @ 2000-10-05 20:20 UTC (permalink / raw)
  To: caml-list

>>>>> "David" == David McClain <dmcclain@azstarnet.com> writes:
> Could you provide a reference to de Bruijin indexing? I recall reading about

On a related note, how would that compare (performancewise) to an
approach like "abstract syntax" (represent a function not
as (<id>, <exp>) and neither as <exp> (as in the case of deBruijn)
but as fn x => <exp>) ?


        Stefan



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-05 10:33 WWW Page of Team PLClub (Re: ICFP programming contest: results) David McClain
  2000-10-05 20:20 ` Stefan Monnier
@ 2000-10-06  8:07 ` Xavier Leroy
  2000-10-07  8:35   ` Pierre Weis
  1 sibling, 1 reply; 17+ messages in thread
From: Xavier Leroy @ 2000-10-06  8:07 UTC (permalink / raw)
  To: David McClain, eijiro_sumii, caml-list, monnier+lists/caml/news/

David McClain asks:

> Could you provide a reference to de Bruijin indexing? I recall reading about
> it some time ago, and after having searched my books here, I cannot find any
> references to it.

The idea behind de Bruijn indices is to represent variable references
not by name, but by the number of binders to cross to get at the
binder for the variable in question.  An example shoud make this
clearer:

        fun x -> fun y -> x + y

is represented in de Bruijn notation as

        fun -> fun -> var1 + var0

"var1" means "go up one binder", so this refers to the variable bound
by the first "fun".
"var0" means "go to the closest binder", so this refers to the
variable bound by the second "fun".

Notice that a de Bruijn index is relative to the position of the
variable reference.  Hence the same variable can be referenced by
different de Bruijn indices depending on context, e.g.

        fun x -> print_int x; fun y -> x + y

becomes
        fun -> print_int var0; fun -> var1 + var0
                         ^^^^         ^^^^
                       this is x     this is x too!

De Bruijn indices are interesting for several reasons.  One is
that they totally eliminate any alpha-conversion problems: expressions
have unique representations, without any need to work modulo renaming of
bound variables; moreover, substitution of variables by non-closed
expressions works very well, without any risk of name clashes as in
the normal, name-based notation.

Another reason is that if you represent the evaluation environment as
a stack, with each new binding simply pushing the bound value on top,
then the de Bruijn index for a variable is simply the position of the
variable's value in the stack, relative to the top of the stack.
(Substitute "list" for "stack" to deal with persistent, heap-allocated
environments, e.g. for closures.)  This leads to very simple
translations into abstract machine code.

Stefan Monnier asks:

> On a related note, how would that compare (performancewise) to an
> approach like "abstract syntax" (represent a function not
> as (<id>, <exp>) and neither as <exp> (as in the case of deBruijn)
> but as fn x => <exp>) ?

My feeling is that higher-order abstract syntax isn't applicable to
the GML language, because the language has variable assignment, which
I don't see how to fit in the h.o.a.s. representation.

- Xavier Leroy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-05 20:20 ` Stefan Monnier
@ 2000-10-06 19:26   ` eijiro_sumii
  2000-10-07  8:19     ` automatic translation in Team PLClub ICFP'2000 entry Julian Assange
  0 siblings, 1 reply; 17+ messages in thread
From: eijiro_sumii @ 2000-10-06 19:26 UTC (permalink / raw)
  To: caml-list; +Cc: sumii

> > Could you provide a reference to de Bruijin indexing?

Xavier kindly gave a nice explanation.:) Thanks!

> On a related note, how would that compare (performancewise) to an
> approach like "abstract syntax" (represent a function not
> as (<id>, <exp>) and neither as <exp> (as in the case of deBruijn)
> but as fn x => <exp>) ?

As far as I know, to use that "higher-order abstract syntax" approach,
we have to _translate_ a GML program into a Caml code and evaluate it
at runtime.  That is exactly what I did in the "compiling" version
(PLClubCN) of our entry, but it didn't work well because the
compilation itself (including ocamlopt.opt) took some time, which
didn't pay.

Eijiro

^ permalink raw reply	[flat|nested] 17+ messages in thread

* automatic translation in Team PLClub ICFP'2000 entry
  2000-10-06 19:26   ` eijiro_sumii
@ 2000-10-07  8:19     ` Julian Assange
  2000-10-07 16:30       ` eijiro_sumii
  0 siblings, 1 reply; 17+ messages in thread
From: Julian Assange @ 2000-10-07  8:19 UTC (permalink / raw)
  To: sumii; +Cc: caml-list

> compilation itself (including ocamlopt.opt) took some time, which
> didn't pay.

There must be an adapative break even point though. e.g if you're
n/m pixils done in s seconds, and the average cost of compilation
is c times the simple rendering speed per line and the average speedup
is q, then subtracting the pixils already done, work out whether

	(m-n) * p < c*lines*p' + (m-n)*p*q
	where
		p = (m-n)/s 		rendering speed
		p' =			simple rendering speed (speed of
					a delay loop or uncomplicated pixel etc)

And if so, compile.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-06  8:07 ` WWW Page of Team PLClub (Re: ICFP programming contest: results) Xavier Leroy
@ 2000-10-07  8:35   ` Pierre Weis
  2000-10-07  9:55     ` Markus Mottl
                       ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Pierre Weis @ 2000-10-07  8:35 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

Xavier explained de Bruijn indices

> Notice that a de Bruijn index is relative to the position of the
> variable reference.  Hence the same variable can be referenced by
> different de Bruijn indices depending on context, e.g.
> 
>         fun x -> print_int x; fun y -> x + y
> 
> becomes
>         fun -> print_int var0; fun -> var1 + var0
>                          ^^^^         ^^^^
>                        this is x     this is x too!
> 
> De Bruijn indices are interesting for several reasons.  One is
> that they totally eliminate any alpha-conversion problems: expressions
> have unique representations, without any need to work modulo renaming of
> bound variables; moreover, substitution of variables by non-closed
> expressions works very well, without any risk of name clashes as in
> the normal, name-based notation.
> 
> Another reason is that if you represent the evaluation environment as
> a stack, with each new binding simply pushing the bound value on top,
> then the de Bruijn index for a variable is simply the position of the
> variable's value in the stack, relative to the top of the stack.
> (Substitute "list" for "stack" to deal with persistent, heap-allocated
> environments, e.g. for closures.)  This leads to very simple
> translations into abstract machine code.

All this is perfectly true and correct, but in my humble opinion you
forgot to mention the dark side of De Bruijn indices:

- they are perfect for machines but not convenient for human beings!
(We cannot easily consider that the same variable have plenty of
different names in an expression.)

- you have to manipulate indices anyway to mimmick alpha-conversion
when performing beta-reduction (the operation is named the ``lifting''
of De Bruijn indices), and you know that this is at least as difficult
and error prone as performing alpha-renaming. As an example, consider
the parallel (or multiple) beta-reduction: the problem is to calculate
f e1 e2 ... en by substituting e1 e2 ... en, in the body of f,
supposed to be defined by lambda expression with n binders fun -> fun
-> ... -> body. The problem is to perform the reduction in one pass on
the body of f. This operation is trivial if you use a conventional
beta reducer, but it is surprisingly difficult if you use De Bruijn
indices.

- as a consequence of these two properties, debugging code that
manipulates and transforms lambda code in De Bruijn form is just a
nightmare (each time I had to do it, I ended by writing a
pretty-printer that reverts De Bruijn notation to good old names for
variables!)

That's why I consider a ``tour de force'' the Caml Light compiler that
uses De Bruijn indices for the whole language. However, remember the
famous ``De Bruijn wins again!'' leitmotiv we used to ear in the
building when someone was desesperately fighting with wrong indices
generated by the compiler!

Also consider that the Caml Light compiler does not optimize and
symbolycally manipulate programs: that's devoted to the next ``tour de
force'', the Objective Caml optimizing compiler, which does not use
the De Bruijn indices notation for lambda terms...

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-07  8:35   ` Pierre Weis
@ 2000-10-07  9:55     ` Markus Mottl
  2000-10-07 10:24       ` Pierre Weis
  2000-10-08 21:26     ` John Max Skaller
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Markus Mottl @ 2000-10-07  9:55 UTC (permalink / raw)
  To: Pierre Weis; +Cc: caml-list

On Sat, 07 Oct 2000, Pierre Weis wrote:
> Also consider that the Caml Light compiler does not optimize and
> symbolycally manipulate programs: that's devoted to the next ``tour de
> force'', the Objective Caml optimizing compiler, which does not use
> the De Bruijn indices notation for lambda terms...

This sounds interesting! - May I ask what optimisations are considered?
Will there be documentation for the (new) intermediate representation on
which optimisations operate? It would be really important to have
information about side effects there, too.

I could imagine playing around with such representations in LambdaProlog
(extremely convenient for prototyping transformation systems). I don't know
whether anything useful will come out, but it may be fun - for others as
well.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-07  9:55     ` Markus Mottl
@ 2000-10-07 10:24       ` Pierre Weis
  0 siblings, 0 replies; 17+ messages in thread
From: Pierre Weis @ 2000-10-07 10:24 UTC (permalink / raw)
  To: Markus Mottl; +Cc: caml-list

> Markus Mottl wrote:

> On Sat, 07 Oct 2000, Pierre Weis wrote:
> > Also consider that the Caml Light compiler does not optimize and
> > symbolycally manipulate programs: that's devoted to the next ``tour de
> > force'', the Objective Caml optimizing compiler, which does not use
> > the De Bruijn indices notation for lambda terms...
> 
> This sounds interesting! - May I ask what optimisations are considered?

Classical: inlining, type-directed optimizations for primitives,
compile time beta-reductions, constant folding, ...

A bit more hairy: direct calls to functions (including curried
functions and functions whose argument is a tuple), 0-cfa like
analysis to access globals in modules, ...

Xavier is the right man to tell you what's going on inside the compiler,
but you can also read the compiler's source files ...

> Will there be documentation for the (new) intermediate representation on
> which optimisations operate? It would be really important to have
> information about side effects there, too.

Yes, you need a purity analysis for some of those optimizations...

> I could imagine playing around with such representations in LambdaProlog
> (extremely convenient for prototyping transformation systems). I don't know
> whether anything useful will come out, but it may be fun - for others as
> well.

I'm sure many people will be interested at reading your code and
conclusions about implementing transformations in Lambdaprolog ...

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: automatic translation in Team PLClub ICFP'2000 entry
  2000-10-07  8:19     ` automatic translation in Team PLClub ICFP'2000 entry Julian Assange
@ 2000-10-07 16:30       ` eijiro_sumii
  0 siblings, 0 replies; 17+ messages in thread
From: eijiro_sumii @ 2000-10-07 16:30 UTC (permalink / raw)
  To: proff; +Cc: caml-list, sumii

> > compilation itself (including ocamlopt.opt) took some time, which
> > didn't pay.
> 
> There must be an adapative break even point though. e.g if you're
...
> And if so, compile.

Right, I actually tried that, but the "even point" was too high -- as
I wrote in our web page, my translator produced too complex Caml
programs, and ocaml{c,opt}{,.opt} often aborted after several minutes,
saying "out of registers" or "out of memory" or something like that...
I should probably have modified the translator to produce simpler
programs, but I didn't have the time to do that.

Eijiro

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-07  8:35   ` Pierre Weis
  2000-10-07  9:55     ` Markus Mottl
@ 2000-10-08 21:26     ` John Max Skaller
  2000-10-10 10:23       ` Pierre Weis
  2000-10-09  5:51     ` Benjamin Pierce
  2000-10-09  7:19     ` de Bruijn indices (Re: WWW Page of Team PLClub) Eijiro Sumii
  3 siblings, 1 reply; 17+ messages in thread
From: John Max Skaller @ 2000-10-08 21:26 UTC (permalink / raw)
  To: Pierre Weis; +Cc: Xavier Leroy, caml-list

Pierre Weis wrote:

> the body of f. This operation is trivial if you use a conventional
> beta reducer, but it is surprisingly difficult if you use De Bruijn
> indices.

	Just out of curiousity, what do you mean
by a 'difficult' algorithm?

	To explain my question in slightly more depth: given
some fixed problems with known algorithms, all these algorithms,
in the first instance, have equal 'difficulty', namely,  'trivial':
if the algorithm is known, it can be implemented. (In general,
coding a known algorithm is so easy compared with other programming
tasks that I would classify coding by how laborious it is: the only
'difficulty' involved is staying awake long enough to finish the job :-)

	It is sometimes difficult to _find_ an algorithm for a problem,
and one may say that some algorithms are 'inflexible' in the sense
that small variations in the problem make finding a solution
by considering the 'original' algorithm difficult.

	It may also be hard to tranform a correct algorithm into
a more efficient version.

	Also, it is clear that some algorithms are difficult to
understand. And, some algorithms, coded incorrectly, may be difficult
to debug.
 
-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-07  8:35   ` Pierre Weis
  2000-10-07  9:55     ` Markus Mottl
  2000-10-08 21:26     ` John Max Skaller
@ 2000-10-09  5:51     ` Benjamin Pierce
  2000-10-09  7:19     ` de Bruijn indices (Re: WWW Page of Team PLClub) Eijiro Sumii
  3 siblings, 0 replies; 17+ messages in thread
From: Benjamin Pierce @ 2000-10-09  5:51 UTC (permalink / raw)
  To: Pierre Weis; +Cc: caml-list

Fanning the fire on deBruijn indices...

> - you have to manipulate indices anyway to mimmick alpha-conversion
> when performing beta-reduction (the operation is named the ``lifting''
> of De Bruijn indices), and you know that this is at least as difficult
> and error prone as performing alpha-renaming. As an example, consider
> ...

There's an important difference, though, in the ways that these errors
show up: bugs in deBruijn-indexing schemes tend to manifest immediately
and catastrophically -- if you miss a 'shift' everything quickly gets all
screwed up and you can see that you did something wrong and fix it right
away -- while name-based implementations of substitution tend to fail
only six months later, when someone makes an unlucky choice of variable
names. 

There's a story that a substitution bug in the original LCS theorem
prover persisted for six *years* before anybody noticed...

> - as a consequence of these two properties, debugging code that
> manipulates and transforms lambda code in De Bruijn form is just a
> nightmare (each time I had to do it, I ended by writing a
> pretty-printer that reverts De Bruijn notation to good old names for
> variables!)

I usually do this too.  The pretty-printer can also do a little
consistency checking on the deBruijn indices: each variable is
represented as a pair of its index and the expected size of the context;
if the printing routine sees that the actual context does not have this
size, it complains.  

One drawback of deBruijn indices that you did not mention is that, at
least when implemented naively, they are pretty expensive in terms of
the amount of consing that's required to perform substitutions. 

At the end of the day, I'm not that big a fan of *either* deBruijn or
named presentations of terms... both are pretty ugly to work with, in
different ways.  In fact, I find both distressing and fascinating that,
after so many years of thought by so many smart people, we're still in
such an unsatisfactory state wrt. binding.

    B

^ permalink raw reply	[flat|nested] 17+ messages in thread

* de Bruijn indices (Re: WWW Page of Team PLClub)
  2000-10-07  8:35   ` Pierre Weis
                       ` (2 preceding siblings ...)
  2000-10-09  5:51     ` Benjamin Pierce
@ 2000-10-09  7:19     ` Eijiro Sumii
  2000-10-10 10:36       ` Pierre Weis
  2000-10-10 14:04       ` de Bruijn indices Gerard Huet
  3 siblings, 2 replies; 17+ messages in thread
From: Eijiro Sumii @ 2000-10-09  7:19 UTC (permalink / raw)
  To: Pierre.Weis; +Cc: Xavier.Leroy, caml-list, sumii

I don't actually use De Bruijn indices so often (because I usually use
higher-order abstract syntax whenever appropriate), but anyway...

> All this is perfectly true and correct, but in my humble opinion you
> forgot to mention the dark side of De Bruijn indices:

In my experience of writing some interpreters, compilers, type
checkers, partial evaluators, etc.,

> - they are perfect for machines but not convenient for human beings!
> (We cannot easily consider that the same variable have plenty of
> different names in an expression.)

this inconvenience can be mitigated by including _both_ a name (for
human use) and an index (for machine use) in each variable, and

> - you have to manipulate indices anyway to mimmick alpha-conversion
> when performing beta-reduction (the operation is named the ``lifting''
> of De Bruijn indices), and you know that this is at least as difficult
> and error prone as performing alpha-renaming.

such errors in lifting, shifting, etc. of De Bruijn indices are much
easier to find than those in alpha-conversion, because they lead to
immediate problems (which occurs in simple programs) rather than
subtle ones (which occurs in only programs that "shadow" variables in
particular ways).

Eijiro



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: WWW Page of Team PLClub (Re: ICFP programming contest: results)
  2000-10-08 21:26     ` John Max Skaller
@ 2000-10-10 10:23       ` Pierre Weis
  0 siblings, 0 replies; 17+ messages in thread
From: Pierre Weis @ 2000-10-10 10:23 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list

> Pierre Weis wrote:
> 
> > the body of f. This operation is trivial if you use a conventional
> > beta reducer, but it is surprisingly difficult if you use De Bruijn
> > indices.
> 
> 	Just out of curiousity, what do you mean
> by a 'difficult' algorithm?

I did not mention the word algorithm. I meant here that implementing
the parallel beta-reduction is not a trivial kind of iteration (map or
fold) on the basic one step beta-reducer (if you do want to obtain a
one pass reduction).

BTW (joking):

Definition: an algorithm is said to be difficult iff it it not
trivial to implement in Caml !

> 	To explain my question in slightly more depth: given
> some fixed problems with known algorithms, all these algorithms,
> in the first instance, have equal 'difficulty', namely,  'trivial':
> if the algorithm is known, it can be implemented. (In general,
> coding a known algorithm is so easy compared with other programming
> tasks that I would classify coding by how laborious it is: the only
> 'difficulty' involved is staying awake long enough to finish the job :-)

So, let's me tell a story about this ``difficult'' problem: having
problems to implement the parallel beta-reduction using the De Bruijn
indices, I looked in the litterature and found a thesis that claimed
to specify this transformation and used it in the rest of the
thesis. So, I turned the specification into a piece of Caml program;
it gave wrong answers. Fortunately, I had the thesis's author at hand;
hence, we sat together at the terminal and double-checked the
implementation wrt the specification; we were not able to find any
discrepancy in the program; then we changed the specification; then we
changed the program accordingly; it was still giving wrong answers on
some examples!

I gave up, and revert to multiple calls to the beta-reducer (and
accordingly to inefficient multiple rewritings of the function body).

I do not claim this problem is impossible to solve; I just claimed it
is ``surprisingly difficult'' compared to the trivial solution you
give to the same problem when you use a conventional beta-reducer.  It
is at least so difficult that a carefully written thesis may give a
wrong specification of the solution, even if it has been reviewed by
experts of the domain.

I think the De Bruijn indices solution to this problem may not be
worth the efforts it needs.

> 	It is sometimes difficult to _find_ an algorithm for a problem,
> and one may say that some algorithms are 'inflexible' in the sense
> that small variations in the problem make finding a solution
> by considering the 'original' algorithm difficult.

That's exactly what I observed for parallel beta-reduction in one pass.

> 	It may also be hard to tranform a correct algorithm into
> a more efficient version.

That's exactly the intention in using parallel beta-reduction in one pass.

> 	Also, it is clear that some algorithms are difficult to
> understand. And, some algorithms, coded incorrectly, may be difficult
> to debug.

Also true with De Bruijn indices transformations.

So, this problem meets all your criteria: that's why I think we can
say ``it is a surprisingly difficult problem''.

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: de Bruijn indices (Re: WWW Page of Team PLClub)
  2000-10-09  7:19     ` de Bruijn indices (Re: WWW Page of Team PLClub) Eijiro Sumii
@ 2000-10-10 10:36       ` Pierre Weis
  2000-10-10 14:04       ` de Bruijn indices Gerard Huet
  1 sibling, 0 replies; 17+ messages in thread
From: Pierre Weis @ 2000-10-10 10:36 UTC (permalink / raw)
  To: sumii; +Cc: caml-list

> this inconvenience can be mitigated by including _both_ a name (for
> human use) and an index (for machine use) in each variable, and

Hope you will not get the dark sides of both worlds!

> > - you have to manipulate indices anyway to mimmick alpha-conversion
> > when performing beta-reduction (the operation is named the ``lifting''
> > of De Bruijn indices), and you know that this is at least as difficult
> > and error prone as performing alpha-renaming.
> 
> such errors in lifting, shifting, etc. of De Bruijn indices are much
> easier to find than those in alpha-conversion, because they lead to
> immediate problems (which occurs in simple programs) rather than
> subtle ones (which occurs in only programs that "shadow" variables in
> particular ways).

You're right.

Still, I fooled a supposed ``perfectly debugged'' partial evaluator
with my first higher-order example (admittedly not a simple but a
rather subttle program, that forces you to recursively insert
lambda-terms and beta-reduce them, and could exercise your De Bruijn
indices lifter in a non standard way ...).

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/




^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: de Bruijn indices
  2000-10-09  7:19     ` de Bruijn indices (Re: WWW Page of Team PLClub) Eijiro Sumii
  2000-10-10 10:36       ` Pierre Weis
@ 2000-10-10 14:04       ` Gerard Huet
  2000-10-10 17:29         ` Chet Murthy
  1 sibling, 1 reply; 17+ messages in thread
From: Gerard Huet @ 2000-10-10 14:04 UTC (permalink / raw)
  To: sumii; +Cc: caml-list

In defense of de Bruijn indices :

1. They lead to efficient implementations. My original implementation of
the constructive engine in Coq's early versions used de Bruijn indices, and
performed reasonably well. It leads to a completely applicative
implementation of lambda-calculus, and the cost of lifting may be reduced
by Okasaki-like data structures. I still remember the day when this
super-hacker declared that he was going to replace all this inefficient
stuff by slick hashtables - only to undo everything 2 months later because
he was 30% slower.

2. The apparent complexity of lifting is to my view an inherent complexity
of substitution, which you have better face up rather than put under the
rug of name management of symbol tables, for which you will have to pay the
price in unexpected places. And when the going gets rough, at least your DB
indices behave well in natural recursions. When I set to understand fully
Bohm's theorem, down to programming a running Bohm discriminator, I used DB
indices, which generalise well to Bohm trees (ie potentially infinite
lambda-terms). I do not think I would have succeded in programming this
stuff without this explicit and concrete control on variable references.
And I could even publish the code itself, which is available as commented
Caml code in the paper:
G. Huet An Analysis of Bohm's theorem, Theoretical Computer Science 121
(1993) 145-167.

3. You can do proofs by induction on the corresponding structure, and thus
prove meta-theoretical results in a direct and clean way. You can even hope
to actually mechanise the formal development. See for instance:
G. Huet "Residual theory in lambda calculus: a formal development". J.
Functional programming 4,3 (1994) 371-394.
where a fair portion of the syntactic theory of beta-reduction is done
completely formally, using DB indices again. 

Gérard

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: de Bruijn indices
  2000-10-10 14:04       ` de Bruijn indices Gerard Huet
@ 2000-10-10 17:29         ` Chet Murthy
  2000-10-11 22:35           ` John Max Skaller
  0 siblings, 1 reply; 17+ messages in thread
From: Chet Murthy @ 2000-10-10 17:29 UTC (permalink / raw)
  To: caml-list

>>>>> "GH" == Gerard Huet <Gerard.Huet@inria.fr> writes:

    GH> 1. They lead to efficient implementations. My original
    GH> implementation of the constructive engine in Coq's early
    GH> versions used de Bruijn indices, and performed reasonably
    GH> well. It leads to a completely applicative implementation of
    GH> lambda-calculus, and the cost of lifting may be reduced by
    GH> Okasaki-like data structures. I still remember the day when
    GH> this super-hacker declared that he was going to replace all
    GH> this inefficient stuff by slick hashtables - only to undo
    GH> everything 2 months later because he was 30% slower.

It turns out that deBruijn indices are *significantly* more efficient
than explicit names.  Moreover, the renaming operations that must be
introduced to use explicit names are essentially the same as the
"lifting", etc, operations used in deBruijn indices.

I once modified Coq to use explicit names, and after spending a month
or so, trying to make it as fast as Coq with deBruijn indices,
switched back.

That experiment, and finding that every _single_ avenue for improved
performance, with explicit names, was a dead end, pretty much
convinced me that deBruijn indices are much faster.

Also, regarding human-readability, I think it is a red herring to ask
whether the term Lambda("x",Ref 2) (which corresponds to something
like ("lambda x. y") is more readable than Lambda("x",Var "y").

In both cases, without the "environment" that binds free variables to
values, types, or "meanings" of some sort, the meaning of the term is
undefined.  So we need to keep these environments around, and once we
have them, well, we can always just lookup the name, if want it.

Moreover, it is only during debugging that having the "explicit name"
actually buys us anything in readability.  At runtime, if the program
ever prints out something, it does so with respect to this
"environment", so there's always a good name to use.  If it is
manipulating terms, well, it couldn't very well use the name for
anything other than a lookup, since even in explicit names, the name
can be modified, e.g. by alpha-conversion.

Again, I started out as an "explicit name fanatic", and was converted
to the "creed of deBruijn indices", by hard experience.  I would
recommend that before anyone write off deBruijn indices, they attempt
to build a _large_ system (ok, so Coq isn't large by commercial
standards, but by the standards of academic systems it ain't small)
both ways.

--chet--

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: de Bruijn indices
  2000-10-10 17:29         ` Chet Murthy
@ 2000-10-11 22:35           ` John Max Skaller
  0 siblings, 0 replies; 17+ messages in thread
From: John Max Skaller @ 2000-10-11 22:35 UTC (permalink / raw)
  To: Chet Murthy; +Cc: caml-list

Chet Murthy wrote:
> to build a _large_ system (ok, so Coq isn't large by commercial
> standards, but by the standards of academic systems it ain't small)

Actually, if you multiply the size of Coq by 10, to get the equivalent
size for a C/C++ program, it isn't that small by commercial standands
either. :-)

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2000-10-12  6:25 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-10-05 10:33 WWW Page of Team PLClub (Re: ICFP programming contest: results) David McClain
2000-10-05 20:20 ` Stefan Monnier
2000-10-06 19:26   ` eijiro_sumii
2000-10-07  8:19     ` automatic translation in Team PLClub ICFP'2000 entry Julian Assange
2000-10-07 16:30       ` eijiro_sumii
2000-10-06  8:07 ` WWW Page of Team PLClub (Re: ICFP programming contest: results) Xavier Leroy
2000-10-07  8:35   ` Pierre Weis
2000-10-07  9:55     ` Markus Mottl
2000-10-07 10:24       ` Pierre Weis
2000-10-08 21:26     ` John Max Skaller
2000-10-10 10:23       ` Pierre Weis
2000-10-09  5:51     ` Benjamin Pierce
2000-10-09  7:19     ` de Bruijn indices (Re: WWW Page of Team PLClub) Eijiro Sumii
2000-10-10 10:36       ` Pierre Weis
2000-10-10 14:04       ` de Bruijn indices Gerard Huet
2000-10-10 17:29         ` Chet Murthy
2000-10-11 22:35           ` John Max Skaller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).