Re: [Caml-list] Purity in ocaml - Goswin von Brederlow

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

From: Goswin von Brederlow <goswin-v-b@web.de>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Purity in ocaml
Date: Thu, 23 Jan 2014 10:20:09 +0100	[thread overview]
Message-ID: <20140123092009.GA20624@frosties> (raw)
In-Reply-To: <CAN6ygO=J3SQeyHJyw+eBfLtrZ1W2d59Nc7XW0o8N8GDGwLzz5Q@mail.gmail.com>

On Tue, Jan 21, 2014 at 10:27:13AM -0500, Yotam Barnoy wrote:
> So that's the rationale for pure functions. Since ocaml allows for mutable
> state, it's useful to create another class of purity called st, which
> corresponds to the state monad in haskell. In haskell, having even local
> mutable state requires a state monad, which is both cumbersome and verbose.
> In ocaml, you get the state monad for free by allowing for a class of
> functions that can access local state. By local state I mean any mutable
> variables that are in some local scope. The reason for this is that local
> variables are not accessible from external code. A pure function can also
> include local mutable state. So long as that state is not accessible from
> another, external scope, it's not considered a side effect. The local state
> is thus contained within the pure function and the function maintains its
> purity. So a pure function could, for example, create some mutable state,
> and pass it by refs or by closure to an st function. This is the kind of
> state I already use on a regular basis: for example, computing something
> expensive using an internal hashtable that has only local scope and is thus
> destroyed once the function returns.
> 
> It also makes sense to define a half-pure annotation to enable reuse of
> library functions. The classic map and fold functions don't do anything
> impure themselves, but they can't vouch for the higher-order functions they
> call. By declaring half-purity (hpure), you can now use these functions in
> both pure and impure context (something haskell cannot do). Any pure
> function that takes a higher function without restricting its purity is
> essentially a half-pure function.

Obviously the purity of a higher order function can be dependent on
the purity of its arguments. You need a syntax for the type of a
function that can reflect that.
 
> The nice thing is, this can all be done mostly transparently. You never
> need to annotate anything, since the compiler will just figure out which
> parts of your code are pure and impure and add annotations to your type
> signatures.

Unfortunately in ocaml you have to annotate a lot if done right. You
need to create a lot of .mli files, module types or class types. Not
to mention GADT need type annoation for nearly every function.

So I'm not too sure about keeping this transparent.
 
> In response to your question, a function that doesn't modify its mutable
> argument can be considered pure so long as it also doesn't read from said
> argument. This is very useful when you have a large record with some
> mutable fields. So long as a function reads only from the immutable fields,
> it's still pure. If it accesses the immutable fields, it becomes an st
> function, though that means it can also be called with a global reference,
> in which case it functions as impure in that context. If it calls an impure
> function directly, it becomes impure, and can no longer be called from a
> pure function.

It might be better to talk about the scope of impurity there. The
impurity dies with the death of the mutable. Like when you create a
local hashtbl to speed up computations, get the result and then return
the result (destroying the hashtbl). The impurity ends there. Global
mutables are just a special case of having the lifetime of the
programm.
 
> 2) I've been debating syntax for a long time. I've settled on pure, st,
> hpure, hst and unsafe in function types for now. In most cases, the
> programmer should not have to think of purity. The way I envision it, you
> write your function signatures as you do now, and the compiler would try to
> infer if it can make the signatures any stricter. It would save the
> stricter signatures into another signature file. If an extra signature file
> is present, the compiler will use that when compiling modules. Otherwise,
> it will use the regular signature file and assume every function in the
> module is impure (which is the default).

That won't work. You need to be able to declare purity in the source.
Otherwise, for example, how would you pass a module to a function that
must have a pure function? (as you say below)
 
> Sometime you will want to control the purity of functions. For example, a
> parallel monadic library may want to make sure the functions it calls are
> pure. In this case you can put in type annotations for purity. The compiler
> will never make functions less strict than their purity annotations.

So what is your syntax idea for annotating purity?

If you can annotate that in the source (.ml or .mli files) then that
should end up in the .cmi file as well, right. Then why would the
compiler generated purity, in case you don't have a .mli file, not end
up in there too?

Note that if a library does not annotate purity but the function
happens to be pure then you can't have the compiler fill that in. The
library makes no promise that the function will remain pure in the
future and by having the compiler adding that promise you would change
the API of the library. This would break dynamic linking.

So I realy think purity belongs in the .mli and .cmi files. Not
something seperate. And the compiler can only fill in the purity when
there is no .mli file. With .mli file it can only check that the given
type can be unified with the code regarding its purity.
 
> So many times I've come across libraries where the author says,
> unfortunately ocaml doesn't side effects so we had to do X and hope for the
> best. Well, this changes that. It allows for ocaml to be as pure as haskell
> is when there's a need for it. Nevertheless, I don't expect pure code to
> become the dominant paradigm in ocaml.
> 
> 3) Exceptions are considered a side effect. See this answer
> http://stackoverflow.com/a/10720037. I can't say I understand all the
> specifics, but one can reason that unless every exception thrown by a
> function is caught right at its invocation (like different variants), the
> behavior of said function is unpredictable. Haskell's solution is to only
> allow exceptions to be caught by impure code, and I'm mimicking this
> solution, with an additional allowance for catching exceptions raised
> within the same function (and serving as gotos). An alternative solution is
> to annotate the types of functions with all the exceptions they may raise,
> but this becomes extremely hairy when you consider higher order functions
> (functions become incompatible with each other's type because they raise
> different exceptions), so I'd rather not go that way.

I think that would be a mistake. Exceptions are wide spread in ocaml.
By not annotating types with their exceptions you MAKE them side
effects because they become unpredictable.

On the other hand if types are annotated with the exceptions they
throw then they become just another return value and become pure.
Same input, same exception every time.

True this would complicate the type of functions because the return
type would be a pair of successfull type and exceptions. You need a
syntax for the type that allows the return type to be a combination of
the argument types, like:

val map : ('a -> 'b ['c] ) -> 'a list -> 'b list ['c]
val mapmap : ('a -> 'b ['c] ) ('b -> 'd ['e] ) -> 'a list -> 'd list ['c | 'e]

This will be hairy for purity, too, or even more so. You need ways to
say that the combined purity is the least pure of the inputs. Or with
to function arguments the purity is the result of passing one function
to the other. And so on.

Then again consider this code:

let f () = raise Not_found
val f : unit -> 'a [Not_found]

let g fn x = try fn x with Not_found -> Empty
val g : ('a -> 'b [< Not_found as 'c | 'd]) -> 'b [ Empty if 'c | 'd]

g takes a function that may raise Not_found or 'd. G then may itself
raise Empty (only if the function can raise Not_found though) or 'd.

let h fn1 fn2 x = fn2 fn1 x
val h : ('a -> 'b [< Not_found as 'c | 'd]) ->
        (('a -> 'b [< Not_found as 'c | 'd]) -> 'b [ Empty if 'c | 'd]) ->
        'a -> 'b [ Empty if 'c | 'd])

let h2 = h f g
val h2: 'a -> 'b [Empty]

> Pure code in general doesn't rely much on exceptions except to signal a
> catastrophic failure, and I don't see a reason to stray from that. For the
> ...

Much of the stdlib uses exceptions. So you eigther have to rewrite it
to be pure or it will be too hard to write pure code.

> most part, I don't see people programming exclusively in pure code anyway.
> Rather, an impure function can have pure, st and impure parts within it.
> For example, suppose an impure function writes to some global state,
> catches an exception, reads from some local state, and maps and folds pure
> lambdas over a collection. The function as a whole is clearly impure.
> However, the parts of the function that are pure can be rearranged and
> optimized -- the map/fold may be turned into a for loop if there are no
> intermediate dependencies between them and the st/impure parts. The st
> parts cannot be reordered - state modification to the same reference must
> retain its order - but they can be moved around to different areas of the
> function for optimization. Thus, with good cross-module information about
> purity, you can still do a lot within impure functions to improve
> performance. On the other hand, libraries for parallel computation probably
> want to enforce purity, and that means eschewing exceptions as well.
> 
> One issue I haven't discussed is the matter of strings and arrays. Strings
> are mutable by default, which means that any read of a string amounts to
> reading mutable state, causing at least an st annotation. One way of
> dealing with this is by adding an immutable string type, which could be
> constructed by something like "hello world"p, where the p stands for pure.
> Another option is that since mutation of strings is so rare, a flag could
> exist per type scope to signal if strings are modified anywhere in the
> local scope. If they're not, they can be treated as immutable within the
> local type scope. Type scope can be defined as the extent up to which a
> type is kept abstract -- if I have a module with a type t that contains a
> string and writes to it, but t is not fully exported, then outside of this
> module, nobody can write to strings. It would also be useful to have
> immutable arrays, which can be turned into mutable arrays only after
> optimization.

I like phantom types for that. But that gets hairy with containers.
E.g. a Hashtbl of strings can be itself mutable or contain mutable
strings or both. So the type paramters of the hashtbl need phantom
types too. And thats where my ocaml magic broke down when I tried to
generalize that.
 
> I hope this clarifies things.

You haven't suggested a possible annotation syntax.
 
> -Yotam
> 
> 
> On Tue, Jan 21, 2014 at 4:49 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:
> 
> > On Mon, Jan 20, 2014 at 03:45:15PM -0500, Yotam Barnoy wrote:
> > > I wanted to gauge the interest of people on the list in adding purity
> > > annotations to ocaml. Purity is one of those things that could really
> > help
> > > with reducing memory allocations through deforestation and decreasing the
> > > running time of programs written in the functional paradigm, and it could
> > > be very useful for parallelism as well. The basic scheme I have in mind
> > is
> > > this:
> > >
> > > - Functions that do not access mutable structures would be marked pure.
> > > - Functions that access only local mutable structures would be marked as
> > st
> > > (a la state monad)
> >
> > Does local include the arguments passed to the function?
> >
> > > - Functions that access global mutable data would be unmarked (as they
> > are
> > > now).
> > > - Pure functions can call st functions/code so long as all of the state
> > > referred to by the st code is contained within said pure functions.
> >
> > Because if arguments don't count this makes no sense.
> >
> > But then shouldn't there be another level for functions that don't
> > alter its arguments?
> >
> > > - Functions that call higher order functions, but do not modify mutable
> > > state would be marked hpure (half-pure). These functions would be pure so
> > > long as the functions they call remain pure. This allows List.map,
> > > List.fold etc to work for both pure and impure code.
> > > - The same thing exists for st code: hst represents functions that take
> > > higher order functions but only performs local state mutation.
> > > - In order to take advantage of this mechanism, there's no need to
> > annotate
> > > functions. The type inference algorithm will figure out the strictest
> > type
> > > that can be applied to a function and will save the annotation to an
> > > external, saved annotation file. This means that non-annotated code can
> > > take advantage of purity without doing any extra work, and the programmer
> > > never has to think about purity.
> > > - Having the purity annotations as an option is useful to force certain
> > > parts of the code, such as monads, to be pure.
> > > - An edge case: local state can be made to refer to global state by some
> > > external function call. Therefore, local state is considered 'polluted'
> > > (and global) if it is passed to an impure function.
> > > - Exceptions: not sure how to handle them yet. The easiest solution is to
> > > forbid them in st/pure code. Another easy alternative is to only allow
> > > catching them in impure code, as haskell does.
> > >
> > > Thoughts?
> > >
> > > -Yotam
> >
> > 1) What does pure mean? What does it gain you? How do you want to use it?
> >
> > 2) What syntax do you suggest for annotating functions?
> >
> > 3) Why are exceptions a problem?
> >
> > 4) Will this allow to annotate exceptions too so the compiler can
> > track which exception could be thrown and when all exceptions are
> > caught? If no exception can escape an function then it can be pure
> > again, right?
> >
> > MfG
> >         Goswin

MfG
	Goswin

next prev parent reply	other threads:[~2014-01-23  9:20 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-20 20:45 Yotam Barnoy
2014-01-20 21:08 ` Siraaj Khandkar
2014-01-20 21:16   ` Yotam Barnoy
2014-01-20 21:31     ` Siraaj Khandkar
2014-01-20 21:43       ` Yotam Barnoy
2014-01-20 22:55         ` Siraaj Khandkar
2014-01-21  1:37           ` Yotam Barnoy
2014-01-21  9:49 ` Goswin von Brederlow
2014-01-21 15:27   ` Yotam Barnoy
2014-01-23  9:20     ` Goswin von Brederlow [this message]
2014-01-23  9:35       ` Arnaud Spiwack
2014-01-27  9:32         ` Goswin von Brederlow
2014-01-28  9:21           ` Arnaud Spiwack
2014-02-01 14:52             ` Goswin von Brederlow
2014-02-03  9:20               ` Arnaud Spiwack
2014-01-23 18:18       ` Yotam Barnoy
2014-01-27  9:46         ` Goswin von Brederlow
2014-01-29 17:16           ` Yotam Barnoy
2014-02-01 15:03             ` Goswin von Brederlow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140123092009.GA20624@frosties \
    --to=goswin-v-b@web.de \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).