From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id JAA15658 for caml-red; Sat, 21 Oct 2000 09:51:04 +0200 (MET DST) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA09338 for ; Fri, 20 Oct 2000 17:00:57 +0200 (MET DST) Received: from HUET-GERA1P (huet-gera2p.inria.fr [128.93.20.113]) by concorde.inria.fr (8.11.1/8.10.0) with SMTP id e9KF0u521516 for ; Fri, 20 Oct 2000 17:00:56 +0200 (MET DST) Message-Id: <200010201500.e9KF0u521516@concorde.inria.fr> X-Sender: huet@pop-rocq.inria.fr X-Mailer: QUALCOMM Windows Eudora Pro Version 4.0.2 Date: Fri, 20 Oct 2000 16:59:16 +0200 To: caml-list@inria.fr From: Gerard Huet (via Gerard Huet ) Subject: Re: Undefined evaluation order Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: weis@pauillac.inria.fr At 22:35 11/10/00 +0200, Pierre Weis wrote: >I would like to report the most difficult example I know to confuse students > [...] >map add [1; 2; 3];; >argument 3 >argument 2 >argument 1 >- : int list = [2; 3; 4] Your students are not happy. > [...] >Pierre Weis So, maybe it is time to write a course on programming in ML that will put things up front so that people don't get bitten unexpectedly later on. I'll be glad to contribute to the first course. The intended style is like FAQ, code interspersed with comments. It is intended to teachers of ML beginners. ______________________________________________________________________________ Programming in ML - Course 1 1. Hello to the class. You start the class by greeting everyone, telling them that it is a programming language class, that there is no prerequisite except knowing how to access a computer account and using an editor, and you give them 3 informations : a - how to login on their course account b - what is the calling sequence to the ML top-level c - what is the calling sequence to emacs configured with tuareg and stuff and then you tell them a little bit of history, in fairy tale style (once upon a time, there was LISP, there was LCF, there was (Edinburgh LCF) ML, blabla... In this course we shall use as backend a modern implementation of ML called Objective Caml V3.0, abbreviated Ocaml, and as front end a better syntax defined with the Camlp4 preprocessor. 2. Hello to the world. This is the first programming example in any programming language : how to print stuff on your terminal. If you do not know how to do that, you are condemned to stay within a shallow shell, some read-eval loop interpreter, but you do not have access to the wonderful world of your file system, operating system, and the world wide web on Internet. So here it is : value hello_world_in_Ocaml = let the_inverting_demon _ = () in the_inverting_demon(print_string "World!", print_string "Hello "); At this point students should react with various questions: Question 1. Why are the two string "World!" and "Hello " written in the wrong order ? Answer. Look at the answer computed by ML for the value of hello_world_in_Ocaml and you will see that "Hello World!" gets printed, and thus the strings are written in the right order, it is just that calling the function the_inverting_demon executes the two print statements in reverse. Question 2. I do not understand how the_inverting_demon works. What is this crazy notation ? If I type in "();" it tells me "- : unit = ()", and if I type in "_" it tells me "Parse error", I am confused. Answer. "()" is the unique value of data type unit, like "True" and "False" are the two values of data type bool. We use the dummy value "()" as the value of statements, that is expressions which are used only for their effect, but this way statements may be accommodated within the syntax as expressions. Thus if you type in "print_string;" you see that print_string is a function of type "string -> unit", and thus "print_string (print_string "Hello");" does not print anything, since the type checker verifies that the inner expression has type unit instead of the expected type string, and so you get a typing error and the compilation aborts. Whereas "_" is not the representation of a value, but of a pattern, it is the pattern that matches any value. Patterns can appear only in the contexts where formal parameters are expected, so you get a parsing error. Question 3. If _ matches any value, what is its type ? Answer. If you bring the inside let expression as a top value declaration: value the_inverting_demon _ = (); then you see its type : the_inverting_demon : 'a -> unit = indicating that it takes a value of any type, since 'a is a type variable (read as alpha) which may be instanciated by any concrete type, and thus you may call the_inverting_demon (); the_inverting_demon True; the_inverting_demon "Hello" and it always just returns the dummy value (). Question 4. If it does nothing, why do you put it in the code of hello_world ? Answer. If I remove it, I get a bug: value buggy_hello_world_in_Ocaml = (print_string "World!" print_string "Hello "); I get "World!Hello " printed, this does not obey the spec of hello_world. Question 5 (smart student). Why don't you write: value hello_world = (print_string "Hello " print_string "World!"); Answer. You may indeed. This is correct, and it will work not just in Ocaml, but in other dialects of ML, modulo adjustement of syntax. A program specification admits many different solutions, we should not expect to have non-redundancy in programming; indeed there are many ways to program a task, some are more efficient, some are more elegant, some have code that is better to understand and maintain. Question 6 (dumb student). Why did you not give us hello_world in the first place ? Answer. Because I wanted to teach you the inverting effect of calling this function on its two arguments, so that you understand the difference between the_inverting_demon (print_string "World!", print_string "Hello "); which prints "Hello World!" and the_inverting_demon (print_string "World!" print_string "Hello "); which prints "World!Hello ". [you may expect some students to leave the room at this point and look for alternative programming courses in Java or C++]. A few brave and/or curious students will keep asking questions as to why "," means right to left while "" means left to right, until either a very smart student asks the following question, or you yourself declare that the proper question to ask first is: Question 7. Why does the_inverting_demon (print_string "Foo Bar"); print anything in the first place, since the_inverting_demon does not even look at its argument ? Answer. Ah Ah ! This is because ML uses a strict evaluation regime: when you call a function foo with an argument expression bar, which may be written in ML "foo bar" or "foo(bar)" or "(foo)bar" or "(foo bar)" or "(foo)(bar)" etc, what happens is that first "foo" is evaluated to its (functional) value, that is a so-called closure which encapsulates a piece of code with a representation of the environment in which its global variables are bound to their values, then bar is evaluated to a value, and finally we execute the code in the extended environment. This is in contrast with other dialects of ML, such as Haskell, which use a lazy evaluation regime, where the_inverting_demon (print_string "Foo Bar"); would return imediately value () without evaluating the argument to the_inverting_demon since it is not needed, and thus without any printing effect. Another terminology tells that Ocaml uses "call by value" whereas Haskell uses "call by need". Question 8 (bright student). I still do not see why the arguments to the_inverting_demon get evaluated from the right to the left. Answer. Ah Ah! This is because Ocaml happens to evaluate multiple arguments to a function in a right to left manner. This has nothing to do with the_inverting_demon, I just fooled you all along with a crazy name, but names don't matter in a program, calling you functions "bug_free" or "spec_certified" does not help, only the program text matters. There are other dialects of ML, such as SML, which use strict evaluation, but with left to right evaluation of their arguments. Actually, you may test the evaluation regime of ML dialects with the help of the following ML_tester, where we see another control structure for programming exceptional effects: value ML_tester () = let null _ = () in try (null(raise Left, raise Right); print_string "Haskell!") with [ Left -> print_string "SML!" | Right -> print_string "Ocaml!" ]; Question 9. Why did the implementers of Ocaml make this choice ? Answer. Actually, they did not commit themselves to any particular order. It is more convenient to evaluate arguments from right to left in the bytecode interpreter, as we shall see in a more advanced lesson, but the Ocaml manual says that the order of evaluation of arguments is not specified, so that your programs should not depend on it. And so maybe one day, after enough debate proves conclusively that right to left has more inconvenients than advantages, the Ocaml implementers will change their minds, and we shall change slightly this introductory lesson. Many questions from excited students speaking all together: * How do you program Hello World in Haskell ? * I typed in "(0,1);" to Ocaml and it returned the pair "(0,1)" and not the pair "(1,0)", is this a bug ? OK, now the course is over, see you to-morrow.