* Stdlib regularity @ 1999-10-06 13:25 Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl ` (3 more replies) 0 siblings, 4 replies; 36+ messages in thread From: Ohad Rodeh @ 1999-10-06 13:25 UTC (permalink / raw) To: caml-list Caml list, I have used OCaml extensively in the past few years, and I've had some misgivings about the CAML standard library argument ordering. It is a little bit confusing and not standard. For example: val Queue.add: 'a -> 'a t -> unit val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit My general suggestion is to always make the first argument the <'a t> type and the second the <'a> type. The only exception to this rule should be functionals, for example, in Queue: val iter: ('a -> unit) -> 'a t -> unit I've summed up the proposed changes in an order of importance, please remember that this is suggestion based on my personal taste alone. The changes I'm most interested are: Module Queue: switch: val add: 'a -> 'a t -> unit to: val add: 'a t -> 'a -> unit Module Stack: switch: val push: 'a -> 'a t -> unit to: val push: 'a t -> 'a -> unit Module Stream: switch: npeek : int -> 'a t -> 'a list;; to: npeek : 'a t -> int -> 'a list;; This make the data-structure modules (Hashtbl,Queue,Stack,Stream) behave the same. If this is possible, I'd like this to apply to the Map and Set modules. For module Map, this is the current signature: module type OrderedType = sig type t val compare: t -> t -> int end module type S = sig type key type 'a t val empty: 'a t val add: key -> 'a -> 'a t -> 'a t val find: key -> 'a t -> 'a val remove: key -> 'a t -> 'a t val mem: key -> 'a t -> bool val iter: (key -> 'a -> unit) -> 'a t -> unit val map: ('a -> 'b) -> 'a t -> 'b t val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b end end module Make(Ord: OrderedType): (S with type key = Ord.t) I'd rather make it: module type S = sig type key type 'a t val empty: 'a t val add: 'a t -> key -> 'a -> 'a t val find: 'a t -> key -> 'a val remove: 'a t -> key -> 'a t val mem: 'a t -> key -> bool val iter: (key -> 'a -> unit) -> 'a t -> unit val map: ('a -> 'b) -> 'a t -> 'b t val fold: (key -> 'a -> 'b -> 'b) -> 'a t -> 'b -> 'b end For module Set, this is the current signature: module type OrderedType = sig type t val compare: t -> t -> int end module type S = sig type elt type t val empty: t val is_empty: t -> bool val mem: elt -> t -> bool val add: elt -> t -> t val singleton: elt -> t val remove: elt -> t -> t val union: t -> t -> t val inter: t -> t -> t val diff: t -> t -> t val compare: t -> t -> int val equal: t -> t -> bool val subset: t -> t -> bool val iter: (elt -> unit) -> t -> unit val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a val cardinal: t -> int val elements: t -> elt list val min_elt: t -> elt val max_elt: t -> elt val choose: t -> elt end module Make(Ord: OrderedType): (S with type elt = Ord.t) Id rather switch S to: module type S = sig type elt type t val empty: t val is_empty: t -> bool val mem: t -> elt -> bool val add: t -> elt -> t val singleton: elt -> t val remove: t -> elt -> t val union: t -> t -> t val inter: t -> t -> t val diff: t -> t -> t val compare: t -> t -> int val equal: t -> t -> bool val subset: t -> t -> bool val iter: (elt -> unit) -> t -> unit val fold: (elt -> 'a -> 'a) -> t -> 'a -> 'a val cardinal: t -> int val elements: t -> elt list val min_elt: t -> elt val max_elt: t -> elt val choose: t -> elt end Module List has some of the same functions (mem,remove), so my suggestion is: switch: val mem : 'a -> 'a list -> bool val memq : 'a -> 'a list -> bool to: val mem : 'a list -> 'a -> bool val memq : 'a list -> 'a -> bool switch: val assoc : 'a -> ('a * 'b) list -> 'b val assq : 'a -> ('a * 'b) list -> 'b val mem_assoc : 'a -> ('a * 'b) list -> bool val mem_assq : 'a -> ('a * 'b) list -> bool val remove_assoc : 'a -> ('a * 'b) list -> ('a * 'b) list val remove_assq : 'a -> ('a * 'b) list -> ('a * 'b) list to: val assoc : ('a * 'b) list -> 'a -> 'b val assq : ('a * 'b) list -> 'a-> 'b val mem_assoc : ('a * 'b) list -> 'a -> bool val mem_assq : ('a * 'b) list -> 'a-> bool val remove_assoc : ('a * 'b) list -> 'a -> ('a * 'b) list val remove_assq : ('a * 'b) list -> 'a -> ('a * 'b) list The more important changes are the first minor 3, the rest are optional. What do you think? Ohad. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Stdlib regularity Ohad Rodeh @ 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini 1999-10-08 14:10 ` skaller 1999-10-06 18:50 ` John Prevost ` (2 subsequent siblings) 3 siblings, 2 replies; 36+ messages in thread From: Markus Mottl @ 1999-10-06 16:18 UTC (permalink / raw) To: Ohad Rodeh; +Cc: OCAML > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: Although the standard library is quite ok, there are some (minor) inconsistencies. What concerns my wishes for it, I'd love to see more features (= functions or even modules). At the moment I am not always linking against the standard library, but I use own modules, which I have extended a bit, because I need some important features all of the time (e.g. why is there no "partition"-function in the set-module?). What do you think about this proposal: why not put a version of the standard library on the CVS-server of INRIA, where volunteers can contribute extensions, replacements, new modules, etc.? >From time to time, the maintainers of OCaml might want to take a look at the diffs to the original library and merge some (all?) of the goodies into the main branch. I can imagine that you have a lot of patches which are just waiting to be uploaded... Best regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 16:18 ` Markus Mottl @ 1999-10-08 14:06 ` Matías Giovannini 1999-10-10 20:09 ` Pierre Weis 1999-10-08 14:10 ` skaller 1 sibling, 1 reply; 36+ messages in thread From: Matías Giovannini @ 1999-10-08 14:06 UTC (permalink / raw) To: OCAML Markus Mottl wrote: > > > I have used OCaml extensively in the past few years, and I've had > > some misgivings about the CAML standard library argument ordering. It > > is a little bit confusing and not standard. For example: > > Although the standard library is quite ok, there are some (minor) > inconsistencies. What concerns my wishes for it, I'd love to see more > features (= functions or even modules). Yes! Yes! I always begin my Caml code by writing iota, and I wish it were included in the standard library. It's silly simple, and imprescindible. let iota n = let rec aux l n = if n > 0 then aux (n::l) (n-1) else l in aux [] n And then a "functional for" loop looks like List.map (fun i -> ...) (iota n) (Incidentaly, the name "iota" comes from APL, and stands for the "Initial natural Interval".) -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:06 ` Matías Giovannini @ 1999-10-10 20:09 ` Pierre Weis 1999-10-10 20:12 ` Matías Giovannini 0 siblings, 1 reply; 36+ messages in thread From: Pierre Weis @ 1999-10-10 20:09 UTC (permalink / raw) To: matias; +Cc: caml-list > Yes! Yes! I always begin my Caml code by writing iota, and I wish it > were included in the standard library. It's silly simple, and imprescindible. > > let iota n = > let rec aux l n = > if n > 0 then aux (n::l) (n-1) else l > in aux [] n We may reuse this ``prelude'' code that slightly generalizes iota (named range in this version of the standard library): (*\ \subsection{Lists of consecutive integers} \index{Lists of consecutive integers} \begin{caml_primitive} interval range \end{caml_primitive} \begin{itemize} \item \verb"interval n m" returns the list of all integers in increasing order, from \verb"n" to \verb"m". \item \verb"range n" gives the first n integers. \end{itemize} \*) let interval n m = let rec loop l m = if n > m then l else loop (m :: l) (pred m) in loop [] m;; let range = interval 1;; Pierre Weis INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://cristal.inria.fr/~weis/ ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:09 ` Pierre Weis @ 1999-10-10 20:12 ` Matías Giovannini 0 siblings, 0 replies; 36+ messages in thread From: Matías Giovannini @ 1999-10-10 20:12 UTC (permalink / raw) Cc: caml-list Pierre Weis wrote: > > > Yes! Yes! I always begin my Caml code by writing iota, and I wish it > > were included in the standard library. It's silly simple, and imprescindible. > We may reuse this ``prelude'' code that slightly generalizes iota (named > range in this version of the standard library): There is no way to pre-load a prelude file in the interpreter without relinking a custom runtime, is it? -- I got your message. I couldn't read it. It was a cryptogram. -- Laurie Anderson ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini @ 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1 sibling, 2 replies; 36+ messages in thread From: skaller @ 1999-10-08 14:10 UTC (permalink / raw) To: Markus Mottl; +Cc: Ohad Rodeh, OCAML Markus Mottl wrote: > What do you think about this proposal: why not put a version of the > standard library on the CVS-server of INRIA, where volunteers can > contribute extensions, replacements, new modules, etc.? I think this is a good idea, but I think that implementing extensions and new modules is relatively easy (in most cases) but deciding on the best interfaces is not. Even the best library designer cannot make decision without user feedback. As a member of two ISO Working Groups, my experience is that there is something to be said for proposing changes in a slightly formal manner, followed by discussion of the proposal. There is another advantage of this approach, which is that the 'offical' ocaml developers can indicate tentative support for some changes, allowing users to try them out with modules with the same interface but potentially less efficient implementations, to gain experience with these interfaces, and to write code that will work well with the next official release. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:10 ` skaller @ 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1 sibling, 0 replies; 36+ messages in thread From: Markus Mottl @ 1999-10-08 19:21 UTC (permalink / raw) To: skaller; +Cc: OCAML > I think this is a good idea, but I think that implementing > extensions and new modules is relatively easy (in most cases) > but deciding on the best interfaces is not. Deciding on interfaces is probably the most important (and difficult) problem in library design. Finding suitable (=intuitive) names for functions alone can be a difficult task. No sensible contributor would check in changes that effect the interface without discussing this with other developers/contributors before. So if there is a good means of communication (e.g. a specialized mailing list) and if people adhere to strict rules concerning check-in, I am quite convinced that this would work well. Regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl @ 1999-10-09 21:14 ` Dave Mason 1 sibling, 0 replies; 36+ messages in thread From: Dave Mason @ 1999-10-09 21:14 UTC (permalink / raw) To: OCAML >>>>> On Sat, 09 Oct 1999 00:10:48 +1000, skaller <skaller@maxtal.com.au> said: > Markus Mottl wrote: >> What do you think about this proposal: why not put a version of the >> standard library on the CVS-server of INRIA, where volunteers can >> contribute extensions, replacements, new modules, etc.? > I think this is a good idea, but I think that implementing > extensions and new modules is relatively easy (in most cases) but > deciding on the best interfaces is not. > Even the best library designer cannot make decision without user > feedback. As a member of two ISO Working Groups, my experience is > that there is something to be said for proposing changes in a > slightly formal manner, followed by discussion of the proposal. For another take on this, see the SRFI process that has been in use for the last year in the Scheme community. http://srfi.schemers.org/ Things are a bit different in caml since there is really only one implementation organization (although I guess there is ocaml and caml). ../Dave ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Stdlib regularity Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl @ 1999-10-06 18:50 ` John Prevost 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 3 siblings, 0 replies; 36+ messages in thread From: John Prevost @ 1999-10-06 18:50 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh <orodeh@cs.cornell.edu> writes: > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: > > val Queue.add: 'a -> 'a t -> unit > val Hashtbl.add: ('a,'b) t -> 'a -> 'b -> unit > > My general suggestion is to always make the first argument the <'a t> > type and the second the <'a> type. The only exception to this rule > should be functionals, for example, in Queue: > > val iter: ('a -> unit) -> 'a t -> unit I have a slightly different proposal, but one which is along the same lines: Standard ordering is: val ho_func : ('a -> unit) -> 'a t -> unit val ho_func : ('a -> 'b) -> 'a t -> 'b t val imp_func : 'a t -> 'a -> unit val func : 'a -> 'a t -> 'a t Rationale: The basic rationale is that the most often-repeated item should be at the front to make it easier to curry. Along with this, there's the desire to have a consistent style for ordering arguments. The basic style I propose is that for functions acting on some sort of agregate data type the "importance" of the arguments is as follows: Any higher-order function gets its function arguments first. (Idea: the function argument determines the meaning of the function. Hence it should be closer to the function than the other arguments. Corollary: non-function arguments that determine the meaning of a function should also bind closely.) In an imperative case, the agregate argument should come next, after any behavior-determining arguments, but before any single values. (Idea: in an imperative case, the value "acted upon" is more important than the value used in the action--sort of like direct object vs. indirect object.) i.e. give john pizza ==> john is the "aggregate", pizza is the value used in the action. In the non-imperative case, the "value" place should come before the aggregate case--this is because we're no longer "acting on" something. Now that we're not, the value determines the meaning of the function which is applied to the aggregate, returning a new aggregate. In essence, the function should be thought of in this case as taking an argument and returning a new function, like map does, rather than acting on something after receiving multiple arguments. So, the basic ordering: val func : determiners -> arguments -> result val Queue.add : 'a t -> 'a -> unit + val Hashtbl.add : ('a,'b) t -> 'a -> 'b -> unit + val Queue.iter : ('a -> unit) -> 'a t -> unit val Stream.npeek : int -> 'a t -> 'a list * val S.add : key -> 'a -> 'a t -> 'a t * val S.find : 'a t -> key -> 'a * val S.remove : key -> 'a t -> 'a t * val S.mem : key -> 'a t -> bool * ... The * is where I disagree with Ohad's strategy. The + is where I disagree with O'Caml's. As a note, O'Caml's strategy makes imperative functions order arguments more like pure functions. This may make the default currying order less useful, but is probably better than my strategy. Hence, O'Caml's order is pretty good. :) John. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Stdlib regularity Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl 1999-10-06 18:50 ` John Prevost @ 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 3 siblings, 0 replies; 36+ messages in thread From: skaller @ 1999-10-07 7:33 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh wrote: > > Caml list, > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. > What do you think? While I tend to agree with the sentiment (and, apart from just remembering the ordering, the most likely Currying isn't possible when the data structure type isn't the first argument), it would create a compatibility problem to just change the existing library, and a mess to add a new set of modules just to 'fix' the argument order to be slightly more intuitive. I guess the original reasoning was more to do with reading order: List.mem element theList reads well as element <is member of> theList This kind of issue (argument order) was much more important in C++, where generics represented by templates _mandate_ consistency (in the naming conventions as well). Even before the STL was finalised, it was being used enough that people argued against changing it to avoid breaking code. It is probably more important to consider how to introduce FISh 2 style polymorphism, in which functions like 'map' and 'iter' can be applied to _any_ data structure. In that case, you gain consistency automatically, since there is _really_ only one such algorithm for all data structures :-) A more limited way to achieve this is to use an STL style library; that is, iterator based algorithms, with the client supplying the iterators. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-06 13:25 Stdlib regularity Ohad Rodeh ` (2 preceding siblings ...) 1999-10-07 7:33 ` skaller @ 1999-10-07 9:18 ` Francisco Valverde Albacete 1999-10-08 14:56 ` skaller 1999-10-08 16:38 ` Proposal for study: Add a categorical Initial type to ocaml skaller 3 siblings, 2 replies; 36+ messages in thread From: Francisco Valverde Albacete @ 1999-10-07 9:18 UTC (permalink / raw) To: Ohad Rodeh; +Cc: caml-list Ohad Rodeh wrote: > Caml list, > I have used OCaml extensively in the past few years, and I've had > some misgivings about the CAML standard library argument ordering. It > is a little bit confusing and not standard. For example: Greetings, everybody, I have my opinion about argument order too! However, I have also read the proposals by other people and I guess all of us have their bit of reason. C'mon, even the implementors of the language and its their own child! Now, the work of the OLabl group is much more daring and advanced than any of the proposals *we* can make: It allows free order of parameters (with certain restrictions). It involves also named parameters, missing arguments and default values (which can deal more elegantly with obnoxious default comparison functions for aggregate types like sets and priority queues). It can also accept OCaml-like function types. Take a look at http://pauillac.inria.fr/olabl Alas, for one of particular "features" above mentioned (I don't remember which) there is a time penalty in runtime, although minor if I remember well (Any explanations from J.Garrigue or J.P. Furuse)? But we never got into OCaml strictly for time-efficiency reasons, did we? The most important consideration is that the OLabl people are not *the OCaml people* and their work remains different (if faithful to releases of the original language. I guess both teams are closely related?). And that has always been my main deterrent for not adhering to OLabl: I use all its tools *except* this particular feature with the types of functions, for fear I'll be left out of the main OCaml tool development. Any gues what comes next? *YES* can I ask the OCaml implemtors if there are any plans to consider implementing OLabl's typing discipline/suggestions as an alternative to OCaml's more rigid one while maintaining the present scheme for compatibility's sake? Thanks for your attention! Francisco Valverde Universidad Carlos III de Madrid Resume' en francais: Pourquoi ne profite-t-on pas des advantages de la discipline de typage de OLabl dans OCaml? Merci. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-07 9:18 ` Francisco Valverde Albacete @ 1999-10-08 14:56 ` skaller 1999-10-09 22:26 ` Francois Rouaix 1999-10-08 16:38 ` Proposal for study: Add a categorical Initial type to ocaml skaller 1 sibling, 1 reply; 36+ messages in thread From: skaller @ 1999-10-08 14:56 UTC (permalink / raw) To: Francisco Valverde Albacete; +Cc: Ohad Rodeh, caml-list Francisco Valverde Albacete wrote: > But we never got into > OCaml strictly for time-efficiency reasons, did we? In my case, no, but it is still vital, otherwise I'd use Haskell instead :-) In particular, because procedural algorithms can be implemented in ocaml just like in C++, the same complexity _should_ be obtainable, and work on optimisation plus standard library support should allow 'close' to unit performance factors. There are several cases where the core language prevents this, because it lacks functionality available in C++: the ability to create uninitialised values, and the ability to destroy them are two that I've become aware of trying to build a variable length array module. Both these features seem mandatory. Both are unsafe. A reasonable compromise might be: do not implement the features in the standard bytecode interpreter do not allow these features in the compiler unless a special switch is specified Unfortunately, I do not think it is 'enough' to provide a variable length array in the standard library, because that is only one data structure which cannot be implemented efficiently or cleanly without these features. There is another more serious problem: ocaml doesn't handle recursive types well. I'm not sure I fully understand this. When all values are boxed, all recursion of algebraic types is sound. [Proof: it is sound in C, where all non datum values are represented by pointers; note that _initialising_ the structure may not be possible in ocaml unless uninitialised values are permitted] It is not clear to me if the result extends to modules. However, the lack of recursion across module boundaries is also pain. In trying to implement a variable length array, I found that I needed to work around the inability to create an uninitialised array by using a functor whose argument supplied the array value type and special value of that type to initialise the array with. Unfortunately, the type of the instantiated functor's aggregate component was a variant of the type over which it needed to be instantiated. So this solution cannot work. To exhibit the problem more plainly, it is something like: type t' = X | Y of G.t where module G = V.Make(sig type t = t'; val default = X) [In the actual code, the type of a python expression, 'expr', includes a case Initial suitable for initialising an array, and a case 'V of expr varray', where varray is a variable length array of expressions, this works as is, but I cannot make varray from a functor that needs t = expr and default = Initial; even if the recursion could work, there is no syntax for it] -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-08 14:56 ` skaller @ 1999-10-09 22:26 ` Francois Rouaix 1999-10-10 5:38 ` skaller 0 siblings, 1 reply; 36+ messages in thread From: Francois Rouaix @ 1999-10-09 22:26 UTC (permalink / raw) To: skaller; +Cc: caml-list <skaller@maxtal.com.au> writes: skaller> There are several cases where the core language prevents skaller> this, because it lacks functionality available in C++: the skaller> ability to create uninitialised values, and the ability to skaller> destroy them are two that I've become aware of trying to build skaller> a variable length array module. Uninitialized values are easily implemented with the 'a option type. Of course, the code is then ugly, because you have to match your values everywhere to None | Some x. In C/C++ terms, this forces you to check for NULL pointers systematically, which is a Really Good Thing. Adding uninitialised values is a major source of bugs, and it's kind of natural to pay the price for it in the readability of the source, if you want your code to be robust. skaller> [About a functor for an extensible array type, and the problem skaller> of a dummy value] Builtin arrays require you to provide a initialization value, even for zero-length array. Why don't you carry the same requirement to your extensible arrays, and simply use a polymorphic type: type 'a earray = { mutable current : 'a array; mutable used : int; } let create n i = { current = Array.create n i; used = n } And then, if you want to have the equivalent of NULL pointers, use None, and option types everywhere. It seems to me that you are trying to force the language to do something it has been purposely designed against. I'm not sure you can win this fight. --f ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-09 22:26 ` Francois Rouaix @ 1999-10-10 5:38 ` skaller 1999-10-10 20:44 ` William Chesters 0 siblings, 1 reply; 36+ messages in thread From: skaller @ 1999-10-10 5:38 UTC (permalink / raw) To: Francois Rouaix; +Cc: caml-list Francois Rouaix wrote: > > <skaller@maxtal.com.au> writes: > > skaller> There are several cases where the core language prevents > skaller> this, because it lacks functionality available in C++: the > skaller> ability to create uninitialised values, and the ability to > skaller> destroy them are two that I've become aware of trying to build > skaller> a variable length array module. > > Uninitialized values are easily implemented with the 'a option type. > Of course, the code is then ugly, because you have to match your > values everywhere to None | Some x. In C/C++ terms, this forces you > to check for NULL pointers systematically, which is a Really Good Thing. There is a difference: Some/None must be matched EVERY use. Null pointers only need to be checked where they can occur. In the case of an array of variable length, with extra uninitialised slots, the Some/None typing of each element is extra overhead that isn't required if the accessing codes are correct, it is only necessary to check if the index is in range. > Adding uninitialised values is a major source of bugs, and it's kind > of natural to pay the price for it in the readability of the source, > if you want your code to be robust. No. This isn't necesarily the case. There are other solutions. In C++, the philosophy is to provide as much protection as possible, without costing at run time, and with protection that does cost at run time being optional. For example, the designer of a class can determine whether or not use of a variable of the class requires initialisation, whether a default value makes sense, etc. [in this case, the client of the class cannot override the designers decisions]. To put this another way: static typing, and other safety guarrantees, are intrinsically limited in what can be expressed: it cannot prevent all run time errors, and it may _exclude_ correct cases as well. So it is always a compromise. It makes sense to improve these systems to be more expressive, and provide even better guarrantees, but there will (almost always) be a need to escape the system when the programmer knows better. In C++, such escapes are provided, for example with new style casts, but in a way which discourages use, rather than outright prevention. In ocaml, Obj.magic does some of the same kind of things, I believe, and there is always the ability to write C, or even patch the compiler. In general, I really think it is necessary to provide a 'low level unsafe' interface in a language, which at least will not be used by accident, and where deliberate uses are at least easy to find. These low level features are not needed where reasonable and complete safe solutions are available: for example, it has been proved that 'goto' is not required in a language with a few sensible looping constructions, and so one can argue this feature is not required. I think you can argue that a powerful enough functional language does _not_ require uninitialised values (that is, there is always an efficient way to initialise variables), but that isn't the case in a procedural language. Because referential transparency is lost, compilers cannot reason as easily about program structure, and hence cannot optimise away useless dummy initialisations, so efficiency is lost. > Builtin arrays require you to provide a initialization value, even for > zero-length array. Why don't you carry the same requirement to your > extensible arrays, and simply use a polymorphic type: > > type 'a earray = { > mutable current : 'a array; > mutable used : int; > } > > let create n i = { current = Array.create n i; used = n } Sure, but that only works for the 'create' method. In other cases, like, concatenation, the solution requires testing if the two arrays to be concatenated are zero length, if so creating a zero length array using [| |], otherwise finding an object from one of the two arrays that contains one, to initialise the result array. It is always possible to do this, but it is messy, and it _requires_ that the physical length of an array used to hold no elements be zero in cases when there is no available 'dummy' value to fill the unused slots. > And then, if you want to have the equivalent of NULL pointers, use None, > and option types everywhere. Ocaml arrays are already slower than the C ones Python uses, and I am trying to match Python performance as closely as possible. > It seems to me that you are trying to force the language to do something > it has been purposely designed against. I'm not sure you can win this fight. I am not trying to do this per se, since I believe, more or less, that the design principles are good. Rather, I would seek a 'proper' solution, in terms of these principles, believing one should exist, since the principles surely do not deliberately try to make inefficient code. This is why I proposed a categorical 'Initial' value, since this should fit well into the category theoretic type system framework. By having a special initial value, which can be used to initialise anything, and testing for it in the 'safe' versions of the generated code (and eliding the test in the unsafe ones), there may be a well typed solution, and a possibility the compiler can reason about the code enough to optimise even the safe code. I happen to 'know' that this is the state of the art in case of array bounds checking: in Modula code, apparently, 70% of the mandatory array bounds checks can be elided by adding suitable reasoning algorithms to the code. I do not make a numerical claim about the case of uninitialised values, although I note that some C compilers including gcc can issue warnings in some cases when it appears an uninitialised value is used. I think what I am saying, is that this kind of solution probably _is_ within the spirit of ocaml. My particular suggestion may be no good. Perhaps there is a better one? For example, a standard variable length array module would solve the problem, but only in that special case. Is there a more general, if not complete, solution, that is not as risky as the one I propose? -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-10 5:38 ` skaller @ 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 0 siblings, 2 replies; 36+ messages in thread From: William Chesters @ 1999-10-10 20:44 UTC (permalink / raw) To: caml-list skaller writes: > No. This isn't necesarily the case. There are other > solutions. In C++, the philosophy is to provide as much protection > as possible, without costing at run time, and with protection > that does cost at run time being optional. Are you sure that initialising arrays etc. carries enough cost to be worth avoiding? After all, one of the two problems, namely the requirement to keep legal values in slots at all times, is quite easy to work around when you have to---my basic Vector is about 100 lines, generously spaced---while the other, performance, worry seems a priori likely to be misplaced, because if you are constructing an array then your time complexity is presumably at least k×n for some nontrivial k, so that the extra few instructions × n are unlikely to make a big difference to the overall program, however annoying they look "in the small". ocaml already goes some way beyond what C++ considers acceptible inefficiency. That's fine for a vast number of applications on modern desktop hardware. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:44 ` William Chesters @ 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 1 sibling, 0 replies; 36+ messages in thread From: Hongwei Xi @ 1999-10-10 21:43 UTC (permalink / raw) To: William Chesters; +Cc: caml-list Yes, I agree with the argument. It is not clear to me that there is a significant amount of efficiency lost on array initialization, which is always a one-time thing. Is there any convincing data to support otherwise? However, there is one case which could result in some significant loss of efficency (I learned this from Robert Harper). The case is where you represent a sparse array using an (ordinary) array representation (for quick access). This means O(n^2) time is to be spent on initialization, which may contribute to a significant portion of the entire running time of some program. Cheers, --Hongwei \~~~~/ \\ // \\ // @ Mail: hwxi@ececs.uc.edu C-o^o, ))__|| \\__//_ // \\ Url: http://www.ececs.uc.edu/~hwxi ( ^ ) ))__|| \--/-\\ \\ Tel: +1 513 871 4947 (home) / \V\ )) || // \\ \\ Tel: +1 513 556 4762 (office) ------ // || o // \\ \\//Fax: +1 513 556 7326 (department) Rhodes Hall 811-D Department of ECE & CS University of Cincinnati P. O. Box 210030 Cincinnati, OH 45221-0030 On Sun, 10 Oct 1999, William Chesters wrote: > skaller writes: > > No. This isn't necesarily the case. There are other > > solutions. In C++, the philosophy is to provide as much protection > > as possible, without costing at run time, and with protection > > that does cost at run time being optional. > > Are you sure that initialising arrays etc. carries enough cost to be > worth avoiding? After all, one of the two problems, namely the > requirement to keep legal values in slots at all times, is quite easy > to work around when you have to---my basic Vector is about 100 lines, > generously spaced---while the other, performance, worry seems a priori > likely to be misplaced, because if you are constructing an array then > your time complexity is presumably at least k×n for some nontrivial k, > so that the extra few instructions × n are unlikely to make a big > difference to the overall program, however annoying they look "in the > small". > > ocaml already goes some way beyond what C++ considers acceptible > inefficiency. That's fine for a vast number of applications on modern > desktop hardware. > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi @ 1999-10-11 0:36 ` skaller 1999-10-12 7:20 ` David Mentr{'e} 1 sibling, 1 reply; 36+ messages in thread From: skaller @ 1999-10-11 0:36 UTC (permalink / raw) To: William Chesters; +Cc: caml-list William Chesters wrote: > > skaller writes: > > No. This isn't necesarily the case. There are other > > solutions. In C++, the philosophy is to provide as much protection > > as possible, without costing at run time, and with protection > > that does cost at run time being optional. > > Are you sure that initialising arrays etc. carries enough cost to be > worth avoiding? No, I'm not. > After all, one of the two problems, namely the > requirement to keep legal values in slots at all times, is quite easy > to work around when you have to---my basic Vector is about 100 lines, > generously spaced---while the other, performance, worry seems a priori > likely to be misplaced, because if you are constructing an array then > your time complexity is presumably at least k×n for some nontrivial k, > so that the extra few instructions × n are unlikely to make a big > difference to the overall program, however annoying they look "in the > small". > > ocaml already goes some way beyond what C++ considers acceptible > inefficiency. That's fine for a vast number of applications on modern > desktop hardware. Unfortunately, when writing an interpreter such as one for Python, reasonable efficieny plus some other advantages seem neccessary to get any users at all. JPython (Java implementation), for example, is 3 times slower than CPython (C implementation), and that was enough for me to discount it. In fact, I am writing a python interpreter because CPython is vastly (several THOUSAND times) too slow for what I need: extremely fast string operations are required to support Interscript, my literate programming tool. Example: Interscript includes documentation for ISO10646 (unicode) characters, the complete build of interscript itself, which is written in interscript, takes several HOURS on my 120Mhz pentium: the generated documentation is around 5Meg, mainly consisting of the character code tables. This is mainly because I have to do things like convert a string or plain text to HTML, which requires replacing characters '<' with '<'; that is, scan each individual character .. in an interpretive loop. It may seem strange to believe I can actually achieve this performance in ocaml! But I do! Because of the 'higher level algorithms are easier to write in ocaml than C' factor, optimisations using techniques such as type inference, special pattern recognition, etc, and generation of fast C or assembler, are possible. FYI the design is to use the interpreter only to load the modules required for a main program, then do whole program analysis to generate a single executable. But because Python also has dynamic features like 'exec string' it is necessary to make the interpreter available at run time, even in the compiled code. So it needs to be reasonably efficient. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Stdlib regularity 1999-10-11 0:36 ` skaller @ 1999-10-12 7:20 ` David Mentr{'e} 0 siblings, 0 replies; 36+ messages in thread From: David Mentr{'e} @ 1999-10-12 7:20 UTC (permalink / raw) To: skaller; +Cc: William Chesters, caml-list skaller <skaller@maxtal.com.au> writes: > This is mainly because I have to do things like convert > a string or plain text to HTML, which requires replacing > characters '<' with '<'; that is, scan each individual > character .. in an interpretive loop. It could be choking for you, especially in a Caml mailing-list, but have you: 1. consider using regular expressions? They are typically made for the kind of thing you are trying to do. And regex engines have optimization. 2. consider using the Perl language? Is far from perfect, not very clean (in the Caml way at least (in other way either ;)) but has a very powerful bultin regex engine. Even without Perl, OCaml has a regex library I think. Any way, I think you should have a look at _Mastering Regular Expressions_ O'Reilly book. Best regards, d. -- David.Mentre@irisa.fr -- http://www.irisa.fr/prive/dmentre/ Opinions expressed here are only mine. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Proposal for study: Add a categorical Initial type to ocaml 1999-10-07 9:18 ` Francisco Valverde Albacete 1999-10-08 14:56 ` skaller @ 1999-10-08 16:38 ` skaller 1999-10-09 22:43 ` John Prevost 1 sibling, 1 reply; 36+ messages in thread From: skaller @ 1999-10-08 16:38 UTC (permalink / raw) To: caml-list I would like to propose adding a new special type to ocaml, a categorical initial type. This type is the categorical dual of the categorical terminal type, unit. There proposal is for a syntactic designator (say '$') for the non-existant value of the initial type, which can be bound to a variable of any type. [You could say it has type 'a, as does 'raise SomeException'] The effect of attempting to read this value from any type should be to raise the exception Uninitialised_value. Example: type A = { data: t } let x = { data = $ } in x.data (* raises exception *) type B = {mutable data: t } let y = { data = $ } in if something x.data <- t_value; x.data (* raise exception if not something *) The use of initial is something like a Haskell monad with all types T changed to T option, and all read acesses changed to match t with | Some t' -> t' | None -> raise Uninitialised_value A better name than $ is probably 'none'. Boxed values can use a null pointer for none. Integers and floats can be treated as follows: do not initialise them at all, if -unsafe is specified. Otherwise, use the spare value of integers and some NaN for floats, and insert tests for all read accesses. Chars can be handled too, but it is probably not worth the effort until they are lifted to ISO10646, which has suitable code points available. [We could use 0xFF for 8 bit chars] -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-08 16:38 ` Proposal for study: Add a categorical Initial type to ocaml skaller @ 1999-10-09 22:43 ` John Prevost 1999-10-10 3:18 ` chet ` (2 more replies) 0 siblings, 3 replies; 36+ messages in thread From: John Prevost @ 1999-10-09 22:43 UTC (permalink / raw) To: skaller; +Cc: caml-list skaller <skaller@maxtal.com.au> writes: > A better name than $ is probably 'none'. > Boxed values can use a null pointer for none. > Integers and floats can be treated as follows: > do not initialise them at all, if -unsafe is > specified. Otherwise, use the spare value of integers > and some NaN for floats, and insert tests > for all read accesses. I have a question--how is this different (except for the efficiency of using null) from using a 'a option array? The array still must be initialized--in this case to all null, or in the unsafe case, well, it's unsafe. If anything, I would argue that this points at representing 'a option specially for already boxed values by using a null pointer for None, rather than having a special strange value that comes out of arrays and magically creates exceptions. One of the frustrations I have with things like Java is that it's not possible to type references in a way which guarantees a value is not null. With my proposal, your special thing turns into an 'a option array, lookup is the same (but is not guaranteed to provide a Some value), and there's a new array function: val Array.make_with_option (or something) : unit -> 'a option array which in some sort of "unsafe" mode doesn't actually initialize the array and guarantee that all values are well-formed. This provides a failure mechanism similar to unsafe array bounds, without adding strange null values into the language. Again, rather than introduce new features into the language, add a new unsafe optimization and make the option type more efficient for boxed values. John. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-09 22:43 ` John Prevost @ 1999-10-10 3:18 ` chet 1999-10-10 6:14 ` skaller 1999-10-10 16:10 ` chet 2 siblings, 0 replies; 36+ messages in thread From: chet @ 1999-10-10 3:18 UTC (permalink / raw) To: John Prevost; +Cc: skaller, caml-list I second John's point. In order to capture Java's semantics "on the nose", you need to be able to model a slot of the T, which can contain a "null". Doing that by transforming Java's "T" to Caml's "T option", while using a more-efficient representation under-the-covers, preserves the possibility of disabling that efficiency-hack and debugging in a cleaner environment. Moreover, Java's semantics for variable-access is that fetching a null value doesn't raise an exception -- only invoking a method on it raises an exception. So one can't use this "initial value" to model Java's "null". Now, nobody claimed that Java's "null" was the target here, but, hey, that's my $0.02. --chet-- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-09 22:43 ` John Prevost 1999-10-10 3:18 ` chet @ 1999-10-10 6:14 ` skaller 1999-10-10 21:05 ` William Chesters ` (2 more replies) 1999-10-10 16:10 ` chet 2 siblings, 3 replies; 36+ messages in thread From: skaller @ 1999-10-10 6:14 UTC (permalink / raw) To: John Prevost; +Cc: caml-list John Prevost wrote: > > skaller <skaller@maxtal.com.au> writes: > > > A better name than $ is probably 'none'. > > Boxed values can use a null pointer for none. > > Integers and floats can be treated as follows: > > do not initialise them at all, if -unsafe is > > specified. Otherwise, use the spare value of integers > > and some NaN for floats, and insert tests > > for all read accesses. > > I have a question--how is this different (except for the efficiency of > using null) from using a 'a option array? The array still must be > initialized--in this case to all null, or in the unsafe case, well, > it's unsafe. It isn't, in theory, it is exactly the same (and therefore sound). The difference lies in the way it is represented: using the initial the 'a option wrapping is done by the system, not the client programmer (sort of like a Haskell monad) and therefore is more robust, more efficient, and leaves the client code cleaner. This is much the same argument as for exceptions. > If anything, I would argue that this points at representing 'a option > specially for already boxed values by using a null pointer for None, > rather than having a special strange value that comes out of arrays > and magically creates exceptions. That would make 'a option faster to use, but still require the overhead of writing code that needless does matches where it is known (dynamically) that only one of the cases can be used (such as in a variable length array). > Again, rather than introduce new features into the language, add a new > unsafe optimization and make the option type more efficient for boxed > values. The initial is not 'strange', rather it is fundamental, the dual of the 'unit' type. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 6:14 ` skaller @ 1999-10-10 21:05 ` William Chesters 1999-10-10 22:36 ` chet ` (2 more replies) 1999-10-11 12:40 ` John Prevost 1999-10-12 11:33 ` Jean-Francois Monin 2 siblings, 3 replies; 36+ messages in thread From: William Chesters @ 1999-10-10 21:05 UTC (permalink / raw) To: caml-list skaller writes: > > > Boxed values can use a null pointer for none. > > > > I have a question--how is this different (except for the efficiency of > > using null) from using a 'a option array? > > It isn't, in theory, it is exactly the same (and therefore sound). I think (correct me if I'm wrong, sorry) that this is a return to an issue which seems to be a bit of an FAQ. The problem is that if you have a value of the type `int option option', you have to be able to distinguish between `None' and `Some None'! If both the enumeration-indirections are elided, you can't; if only one of them is, you have inconsistency which would presumably have to be resolved using whole-program optimisation or something. For the lowdown and an interesting proposal for a fix, see http://caml.inria.fr/caml-list/0967.html The same idea would obviously apply to arrays. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 21:05 ` William Chesters @ 1999-10-10 22:36 ` chet 1999-10-10 22:38 ` chet 1999-10-11 0:51 ` Proposal for study: Add a categorical Initial type to ocaml skaller 2 siblings, 0 replies; 36+ messages in thread From: chet @ 1999-10-10 22:36 UTC (permalink / raw) To: William Chesters; +Cc: caml-list, Pierre.Weis I read the referenced note [http://caml.inria.fr/caml-list/0967.html] and while I agree with most of what's said there, I think that putting the "option" attribute onto a field would reduce its value. Specifically, it is a highly common (albeit problematic) idiom in C, C++, Java, and other languages, to program with a pointer to X, which is perhaps null, and is only checked late, at the point when it is destructured somehow. That sort of idiom would not be possible to capture -- when we fetch the value of an "option" field, we are implicitly asking for the field to be checked for non-null-ness. In other words, I think that "nullability" is a useful property of types in general, and not merely of types of fields. --chet-- ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 21:05 ` William Chesters 1999-10-10 22:36 ` chet @ 1999-10-10 22:38 ` chet 1999-10-11 19:30 ` John Prevost 1999-10-12 8:34 ` Option types and O'Labl merger Jacques Garrigue 1999-10-11 0:51 ` Proposal for study: Add a categorical Initial type to ocaml skaller 2 siblings, 2 replies; 36+ messages in thread From: chet @ 1999-10-10 22:38 UTC (permalink / raw) To: William Chesters; +Cc: caml-list, Pierre.Weis Of course, it is problematic that "None" and "Some None" are indistinguishable. But is it a reason to not have such a facility (as a storage-cost-free "option" type constructor)? I'd like to believe that the answer is "no" -- that the efficiency values of such a type-constructor outweigh the semantic difficulties. Of course, there's only one way to prove that -- by implementing both and trying it out on large programs. --chet-- >>>>> "WC" == William Chesters <williamc@dai.ed.ac.uk> writes: WC> The problem is that if you have a value of the type `int WC> option option', you have to be able to distinguish between WC> `None' and `Some None'! If both the enumeration-indirections WC> are elided, you can't; if only one of them is, you have WC> inconsistency which would presumably have to be resolved using WC> whole-program optimisation or something. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 22:38 ` chet @ 1999-10-11 19:30 ` John Prevost 1999-10-12 8:34 ` Option types and O'Labl merger Jacques Garrigue 1 sibling, 0 replies; 36+ messages in thread From: John Prevost @ 1999-10-11 19:30 UTC (permalink / raw) To: chet; +Cc: William Chesters, caml-list, Pierre.Weis chet@watson.ibm.com writes: > Of course, it is problematic that "None" and "Some None" are > indistinguishable. But is it a reason to not have such a facility (as > a storage-cost-free "option" type constructor)? > > I'd like to believe that the answer is "no" -- that the efficiency > values of such a type-constructor outweigh the semantic difficulties. > > Of course, there's only one way to prove that -- by implementing both > and trying it out on large programs. One might presume that the definition that "'a nullOption === 'a nullOption nullOption" isn't too hard to understand (although it might muck up the inference engine.) (Some (Some x)) would be the same as (Some x) in this model, just as None would be the same as Some None. Probably not suitable for a replacement of the option type, but a useful additional type? John. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Option types and O'Labl merger 1999-10-10 22:38 ` chet 1999-10-11 19:30 ` John Prevost @ 1999-10-12 8:34 ` Jacques Garrigue 1999-10-12 14:38 ` William Chesters 1 sibling, 1 reply; 36+ messages in thread From: Jacques Garrigue @ 1999-10-12 8:34 UTC (permalink / raw) To: chet; +Cc: caml-list From: chet@watson.ibm.com > Of course, it is problematic that "None" and "Some None" are > indistinguishable. But is it a reason to not have such a facility (as > a storage-cost-free "option" type constructor)? > > I'd like to believe that the answer is "no" -- that the efficiency > values of such a type-constructor outweigh the semantic difficulties. In fact the semantic problem is not that bad. When I implemented O'Labl, I had also the idea that it would be nice to have option types without space (and dereference) cost, particularly since options are used to implement optional arguments. I even went as far as implementing it once, but dropped it since didn't make a big difference in efficiency, and might become a pain to maintain. The idea is just to represent Some^n(v) (the Some constructor applied n times to v) as v when v is not an option, and Some^n(None) as n. To be more precise, since you want to distinguish it from (n : int), you represent it as a pointer in an unused memory area. Word addresses starting at 0 would be a possible approach. Semantically this was correct since ML polymorphism is not strong enough to allow the dynamic creation of Some^n(None) for an arbitrary n. That is, the maximum n is a function of the program, and not of its input. Remark however that since you have polymorphic recursion in O'Labl now, this is no longer true, meaning that you would need a runtime check to make sure that you don't create wrong values. By the way, and this also an answer to another mail I saw recently, O'Caml and O'Labl are going to merge soon. That is, O'Labl features will be merged into the official O'Caml release (beware though that the syntax may be slightly changed to fit everybody's taste). If you really think that having such costless options would be useful, it is the time to make pressure to have them included in the merger. Regards, Jacques > >>>>> "WC" == William Chesters <williamc@dai.ed.ac.uk> writes: > > WC> The problem is that if you have a value of the type `int > WC> option option', you have to be able to distinguish between > WC> `None' and `Some None'! If both the enumeration-indirections > WC> are elided, you can't; if only one of them is, you have > WC> inconsistency which would presumably have to be resolved using > WC> whole-program optimisation or something. > ------------------------------------------------------ Jacques Garrigue, visiting INRIA from Kyoto University Jacques.Garrigue at inria.fr ^ permalink raw reply [flat|nested] 36+ messages in thread
* Option types and O'Labl merger 1999-10-12 8:34 ` Option types and O'Labl merger Jacques Garrigue @ 1999-10-12 14:38 ` William Chesters 1999-10-13 5:35 ` Frank A. Christoph 0 siblings, 1 reply; 36+ messages in thread From: William Chesters @ 1999-10-12 14:38 UTC (permalink / raw) To: caml-list Jacques Garrigue writes: > By the way, and this also an answer to another mail I saw recently, > O'Caml and O'Labl are going to merge soon. hey hey :-) > If you really think that having such costless options would be useful, > it is the time to make pressure to have them included in the merger. "Could we have costless options please, if it's REALLY not going to complicate the compiler too much?" ^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: Option types and O'Labl merger 1999-10-12 14:38 ` William Chesters @ 1999-10-13 5:35 ` Frank A. Christoph 1999-10-13 8:48 ` Jacques Garrigue 0 siblings, 1 reply; 36+ messages in thread From: Frank A. Christoph @ 1999-10-13 5:35 UTC (permalink / raw) To: caml-list > > If you really think that having such costless options would be useful, > > it is the time to make pressure to have them included in the merger. > > "Could we have costless options please, if it's REALLY not going to > complicate the compiler too much?" I don't really have an opinion on "costless options", but it is worth noting that including them is going to change the runtime's data representation, undoubtedly breaking many people's C-interfacing code (unless they use CamlIDL, I guess). --FAC ^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: Option types and O'Labl merger 1999-10-13 5:35 ` Frank A. Christoph @ 1999-10-13 8:48 ` Jacques Garrigue 0 siblings, 0 replies; 36+ messages in thread From: Jacques Garrigue @ 1999-10-13 8:48 UTC (permalink / raw) To: christo; +Cc: caml-list From: "Frank A. Christoph" <christo@nextsolution.co.jp> > > > If you really think that having such costless options would be useful, > > > it is the time to make pressure to have them included in the merger. > > > > "Could we have costless options please, if it's REALLY not going to > > complicate the compiler too much?" > > I don't really have an opinion on "costless options", but it is worth noting > that including them is going to change the runtime's data representation, > undoubtedly breaking many people's C-interfacing code (unless they use > CamlIDL, I guess). Indeed, and I was quite conscious of that when I made the remark, since most of my libraries would need extensive changes. But, then, the FFI is not part of the specification of the language but only of its implementation. And changes have already occured (unboxed float arrays for instance). If we're going to refuse a change because of incompatibilities at the FFI level, the language is dead. Another remark is that this would probably make writing C interfaces by hand a bit easier: no need to create a box everytime you return an option, with all the root registering machinery it may involve. And if you choose NULL as representation for None, lots of code may need no conversion at all. If you use CamlIDL, well, everything is automated, so you don't really. Stubs may be slightly smaller yet. Actually it's difficult to evaluate the efficiency gain of such a feature. With bytecode you cannot expect any real improvement in performance, since allocation is fast enough, and you still need to do some pointer comparisons with the unboxed scheme. The main avantage I would see is that there is more room for optimisation in the native code with such a representation. Take for instance a default parameter "fun ?x [< 5 >] -> ..." in O'Labl. The resolution might be reduced to (x86 code): cmp $ax, 0 ; check whether null pointer jne l1 mov $ax, (5*2)+1 ; load Val_int(5) l1: ... But again this is very compiler dependent, and the tests may be as costly as the dereference. And, coming back to the beginning of this discussion, this is a major gain in space if we are going to use 'a option arrays (factor of 3). Regards, Jacques ------------------------------------------------------ Jacques Garrigue, visiting INRIA from Kyoto University Jacques.Garrigue at inria.fr ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 21:05 ` William Chesters 1999-10-10 22:36 ` chet 1999-10-10 22:38 ` chet @ 1999-10-11 0:51 ` skaller 2 siblings, 0 replies; 36+ messages in thread From: skaller @ 1999-10-11 0:51 UTC (permalink / raw) To: William Chesters; +Cc: caml-list William Chesters wrote: > > skaller writes: > > > > Boxed values can use a null pointer for none. > > > > > > I have a question--how is this different (except for the efficiency of > > > using null) from using a 'a option array? > > > > It isn't, in theory, it is exactly the same (and therefore sound). > > I think (correct me if I'm wrong, sorry) that this is a return to an > issue which seems to be a bit of an FAQ. > > The problem is that if you have a value of the type `int option > option', you have to be able to distinguish between `None' and `Some > None'! If both the enumeration-indirections are elided, you can't; if > only one of them is, you have inconsistency which would presumably > have to be resolved using whole-program optimisation or something. You are right. Note, however, that the 'initial' value would never be available to the client, since any attempt to get it would throw an exception, and in that case there is no need to distinguish the cases. To put it another way: programming languages generally distinguish isomorphic objects, for example ((), x) (* type is unit * int *) is isomorphic to x (* type is int *) but they are distinguished in _client_ code. The compiler may be able to use the isomorphism to achieve an optimisation in the representation. However, this is not the way initial works, since the client can never sensibly test for this value, since every attempt to do so would either fail or raise an exception. That is, while in the case of 'a option: match x with | Some x' -> do_something x' | None -> do_nothing it makes sense for do_nothing to do some kind of work, in the case of initial, it always signifies a programming error. So there is no need to distinguish the isomorphic cases here: unlike client code 'Initial' means 'forbidden', the action on the 'None' case is built in (raise an exception). -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 6:14 ` skaller 1999-10-10 21:05 ` William Chesters @ 1999-10-11 12:40 ` John Prevost 1999-10-12 19:20 ` skaller 1999-10-12 11:33 ` Jean-Francois Monin 2 siblings, 1 reply; 36+ messages in thread From: John Prevost @ 1999-10-11 12:40 UTC (permalink / raw) To: skaller; +Cc: caml-list skaller <skaller@maxtal.com.au> writes: > > I have a question--how is this different (except for the efficiency of > > using null) from using a 'a option array? The array still must be > > initialized--in this case to all null, or in the unsafe case, well, > > it's unsafe. > > It isn't, in theory, it is exactly the same (and therefore sound). > The difference lies in the way it is represented: using the initial > the 'a option wrapping is done by the system, not the client programmer > (sort of like a Haskell monad) and therefore is more robust, more > efficient, and leaves the client code cleaner. This is much the > same argument as for exceptions. Hiding implementation details from the programmer does not always make code cleaner. You can have the monad comparison, though. I'll use it in a second. > > If anything, I would argue that this points at representing 'a option > > specially for already boxed values by using a null pointer for None, > > rather than having a special strange value that comes out of arrays > > and magically creates exceptions. > > That would make 'a option faster to use, but still require the > overhead of writing code that needless does matches where it is > known (dynamically) that only one of the cases can be used (such as > in a variable length array). Okay--so here's what I would do: "Inefficient" safe array implementation: type 'a exparray = { mutable ea_size : int; mutable ea_length : int; mutable ea_array : 'a array } let length a = a.ea_length let get a i = if i < a.ea_length then a.ea_array.(i) else raise (Invalid_argument "Earray.get") let set a i v = if i < a.ea_length then a.ea_array.(i) <- v else raise (Invalid_argument "Earray.set") let expand a n v = begin (if a.ea_length + n >= a.ea_size then let new_size = if n > a.ea_size then a.ea_size + n else a.ea_size * 2 in let new_arr = unsafe_create new_size in begin Array.blit a.ea_array 0 new_arr 0 a.ea_length; a.ea_array <- new_arr; a.ea_size <- new_size; end); Array.fill a.ea_array a.ea_length n v; a.ea_length <- a.ea_length + n end let create i = { ea_size = i; ea_length = 0; ea_array = unsafe_create i } let make i v = { ea_size = i; ea_length = i; ea_array = Array.make i v } Something like that. We can probably define unsafe_create like this: let unsafe_create i = (Obj.magic (Array.make i 0) : 'a array) (The above works, I've tested it. I highly recommend not trying to print out values of type 'a earray from the toplevel, though.) So this is the pretty unsafe version. There can't be a really unsafe version because not initializing the memory at all would play havoc with the GC. Now. What can we gather from this? First: by abusing Obj.magic, we can do what we want. Obviously, it's not strictly a nice thing to do, but if we can prove to ourselves that the code works right, then from the point of view of the API's client, everything's great. What else can we gather? Well, what would be different if this were implemented with options? There'd be the boxing inefficiency, of course. Anything else? Not really. We'd be guaranteeing something we can show to be true anyway, since that's why we were okay with using Obj.magic up there. So, is there a problem with my solution of working around the type system under the API level? > > Again, rather than introduce new features into the language, add a new > > unsafe optimization and make the option type more efficient for boxed > > values. > > The initial is not 'strange', > rather it is fundamental, the dual of the 'unit' type. No--it's strange. The reason it's strange is that it's something that can happen to a user that they don't expect. Like null in Java. There's nothing unsound about null (well, unless you consider that trying to call a method on null fails to be unsound, which I guess I probably do), but that doesn't mean it's not a pain in the ass. Why is it a pain in the ass? There's no way to turn it off. Like I've been saying, with option you can turn it off, with Obj.magic, the implementor ought to be damned sure he's doing things right. But with these special "uninitialized value" sorts of things, people like me who've gotten used to good type systems keep looking over their shoulders because they're afraid it might turn around and bite them in the butt. It's appropriate for exceptions to be thrown when out of bounds array accesses happen--although I'd rather it weren't. It's reasonable for the programmer to check and make sure this doesn't happen. It's not reasonable for the programmer to check that index i of an array that was passed in has actually been assigned before now. Much like it's not reasonable for the programmer to have to check that the argument is not null on a function that doesn't claim to accept null as an argument (my Java pet peeve). I really think adding anything that can cause more runtime errors is likely to be a wart, and no language needs more warts. Hrm. That last bit is too rambly, but I'll send this anyway. John. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-11 12:40 ` John Prevost @ 1999-10-12 19:20 ` skaller 0 siblings, 0 replies; 36+ messages in thread From: skaller @ 1999-10-12 19:20 UTC (permalink / raw) To: John Prevost; +Cc: caml-list John Prevost wrote: > Something like that. We can probably define unsafe_create like this: > > let unsafe_create i = (Obj.magic (Array.make i 0) : 'a array) Thanks for the example, using Obj.magic. Since this is not really documented, I wasn't sure how it worked. This makes the code much cleaner. > (The above works, I've tested it. Thanks! > Now. What can we gather from this? First: by abusing Obj.magic, we > can do what we want. Obviously, it's not strictly a nice thing to do, > but if we can prove to ourselves that the code works right, then from > the point of view of the API's client, everything's great. Yes. You've used Obj.magic much as I'd envision I'd have used an 'initial' value. > What else can we gather? Well, what would be different if this were > implemented with options? There'd be the boxing inefficiency, of > course. Anything else? Not really. Yes: the implementation would be much messier. > So, is there a problem with my solution of working around the type > system under the API level? Obj.magic isn't documented; it isn't really part of the language. [But I think your solution is fine, provided it works] > > The initial is not 'strange', > > rather it is fundamental, the dual of the 'unit' type. > > No--it's strange. The reason it's strange is that it's something that > can happen to a user that they don't expect. Like null in Java. Which is better: initialising something with 'null', leaving it uninitialised, or initialising it with a dummy value, GIVEN that it must be declared before the first value is available? It would not be entirely unreasonable to argue that the dummy value is the _worst_ option, since failing to subsequently initialise it with the proper value is guarranteed NOT to be detected by the system (and will be reflected only in unexpected program semantics). Null seems better; even a core dump indicates there is an error. > Why is it [null] a pain in the ass? There's no way to turn it off. It's a built in 'a option, for every pointer. > Like I've been saying, with option you can turn it off, with Obj.magic, the > implementor ought to be damned sure he's doing things right. But with > these special "uninitialized value" sorts of things, people like me > who've gotten used to good type systems keep looking over their > shoulders because they're afraid it might turn around and bite them in > the butt. How is this different from Obj.magic? Can't that bite you in the ass too? > It's appropriate for exceptions to be thrown when out of bounds array > accesses happen--although I'd rather it weren't. It's better if the compiler checks. In Pascal, array bounds cannot be exceeded. [The index must be of the type for which the array is declared, which guarrantees it is in bounds]. Of course, an automatic conversion for int to a subrange can fail :-) >It's reasonable for > the programmer to check and make sure this doesn't happen. It's not > reasonable for the programmer to check that index i of an array that > was passed in has actually been assigned before now. Much like it's > not reasonable for the programmer to have to check that the argument > is not null on a function that doesn't claim to accept null as an > argument (my Java pet peeve). The way I understand 'initial', the programmer not only doesn't have to check, the programmer _cannot_ check for it. The check must be done by the system. > I really think adding anything that can cause more runtime errors is > likely to be a wart, and no language needs more warts. As I said above: I do not see that this argument holds water. The problem is that sometimes storage must be allocated before initial values are available (in a procedural language). It is possible then, to forget to initialise the storage properly. Forgetting is the problem, and it causes a run time error, no matter whether a special 'initial' value is used, Obj.magic is used, a dummy value is used, or the store is just left uninitialised. There is no getting around this: the only difference is the _kind_ of run time error. -- John Skaller, mailto:skaller@maxtal.com.au 1/10 Toxteth Rd Glebe NSW 2037 Australia homepage: http://www.maxtal.com.au/~skaller downloads: http://www.triode.net.au/~skaller ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-10 6:14 ` skaller 1999-10-10 21:05 ` William Chesters 1999-10-11 12:40 ` John Prevost @ 1999-10-12 11:33 ` Jean-Francois Monin 2 siblings, 0 replies; 36+ messages in thread From: Jean-Francois Monin @ 1999-10-12 11:33 UTC (permalink / raw) To: skaller; +Cc: John Prevost, caml-list I'm not sure that category theory helps so much here. My own background in cat. th. is weak, here is my understanding: - unit is final because there is one & only one function from any type to unit, namely fun _ -> () - an initial type, say ini, is a type s.t. we have one & only one function from ini to any type; this should be the empty sum with no contructor: type emp = ;; The initial function would be let ini (x: emp) = match x with ;; Note that this is syntactically not allowed in ocaml. I don't think there is a theoretical problem to add it (at least there are extensions of caml type system allowing this) , however such a type would be intrinsically useless (without real use). In particular your '$' seems inconsistent to me. The only way to "get" such a value is to introduce it locally in the context, e.g. fun x -> x, or in your case let f dollar = let x = { data = ini dollar } which will never help ! [John Prevost <prevost@maya.com> wrote:] > I would like to propose adding a new special type to ocaml, > a categorical initial type. This type is the categorical dual > of the categorical terminal type, unit. > > There proposal is for a syntactic designator (say '$') for the > non-existant value of the initial type, which can > be bound to a variable of any type. > [You could say it has type 'a, as does 'raise SomeException'] > > The effect of attempting to read this value from any type > should be to raise the exception Uninitialised_value. > > Example: > > type A = { data: t } > let x = { data = $ } > in x.data (* raises exception *) Jean-Francois Monin ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: Proposal for study: Add a categorical Initial type to ocaml 1999-10-09 22:43 ` John Prevost 1999-10-10 3:18 ` chet 1999-10-10 6:14 ` skaller @ 1999-10-10 16:10 ` chet 2 siblings, 0 replies; 36+ messages in thread From: chet @ 1999-10-10 16:10 UTC (permalink / raw) To: John Prevost; +Cc: chet, skaller, caml-list After some thought, it occurred to me that, given ZINC's representation of objects, it should be possible to add a new type-constructor, type 'a unboxed_option = ubSome of 'a | ubNone wherein ubSome is represented by the identitty function in the code, and ubNone by NULL -- zero. If I remember right, there is currently no use for NULL pointers in CAML. So all we need to do is to write down the proper low-level lambda-language expressions for each of the constructor and destructor operations, as well as for equality-checking, and we're done (I think?). [[ubSome]] == [lam v]v [This is really equal to "if v == NULL then NULL else v", as explained below.] [[ubNone]] == 0 [[match e with ubSome x -> B1 | ubNone -> B2]] == let v1 = [[e]] in if v1 == NULL then [[B2]] else let x = v1 in [[B1]] Equality is taken care of by just doing what we do today, but making sure that NULL == NULL. I haven't thought this thru enough to believe that I could *prove* that well-typed programs don't go wrong, but I do *believe* it. We can think of the GC maintenance bit as being an implicit discriminator used to distinguish between the two cases -- ubSome or ubNone -- for *unboxed* values. Likewise, for *boxed* values, since no boxed value can be *null* (all boxed values are really represented in the heap, even -- especially -- nullary constructors) the null-ness of the ubNone value is used to discriminate. It is a happy coincidence that you don't have to strip off that discriminator "tag" when you destructure, and that you don't have to add the "tag" when you construct. Going further, things like ('a unboxed_option) unboxed_option are somewhat flawed, but they do work, (as they must, if this is going to work) -- of three possibilities, (a) ubSome(ubNone) (b) ubSome(ubSome x) (c) ubNone at construction time, (a) and (c) collapse, together, but everything is still well-typed. So there is a behavioural difference in programs -- they can detect that unboxed_option is actually unboxed, by using the fact that these two cases collapse. But seems like a small price to pay for something which gives you a rather useful feature of systems-programming languages. Comments, --chet-- ^ permalink raw reply [flat|nested] 36+ messages in thread
[parent not found: <199910130855.KAA19800@pauillac.inria.fr>]
* Re: Option types and O'Labl merger [not found] <199910130855.KAA19800@pauillac.inria.fr> @ 1999-10-13 11:11 ` Francisco Valverde Albacete 0 siblings, 0 replies; 36+ messages in thread From: Francisco Valverde Albacete @ 1999-10-13 11:11 UTC (permalink / raw) To: Pierre Weis; +Cc: caml-list Jacques Garrigue wrote: > By the way, and this also an answer to another mail I saw recently, > O'Caml and O'Labl are going to merge soon. That is, O'Labl features > will be merged into the official O'Caml release (beware though that > the syntax may be slightly changed to fit everybody's taste). I think it is only just that we thank everybody working in the merge. I know I am more than willing to pay for the adaption cost. Francisco Valverde Universidad Carlos III de Madrid ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~1999-10-14 12:48 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 1999-10-06 13:25 Stdlib regularity Ohad Rodeh 1999-10-06 16:18 ` Markus Mottl 1999-10-08 14:06 ` Matías Giovannini 1999-10-10 20:09 ` Pierre Weis 1999-10-10 20:12 ` Matías Giovannini 1999-10-08 14:10 ` skaller 1999-10-08 19:21 ` Markus Mottl 1999-10-09 21:14 ` Dave Mason 1999-10-06 18:50 ` John Prevost 1999-10-07 7:33 ` skaller 1999-10-07 9:18 ` Francisco Valverde Albacete 1999-10-08 14:56 ` skaller 1999-10-09 22:26 ` Francois Rouaix 1999-10-10 5:38 ` skaller 1999-10-10 20:44 ` William Chesters 1999-10-10 21:43 ` Hongwei Xi 1999-10-11 0:36 ` skaller 1999-10-12 7:20 ` David Mentr{'e} 1999-10-08 16:38 ` Proposal for study: Add a categorical Initial type to ocaml skaller 1999-10-09 22:43 ` John Prevost 1999-10-10 3:18 ` chet 1999-10-10 6:14 ` skaller 1999-10-10 21:05 ` William Chesters 1999-10-10 22:36 ` chet 1999-10-10 22:38 ` chet 1999-10-11 19:30 ` John Prevost 1999-10-12 8:34 ` Option types and O'Labl merger Jacques Garrigue 1999-10-12 14:38 ` William Chesters 1999-10-13 5:35 ` Frank A. Christoph 1999-10-13 8:48 ` Jacques Garrigue 1999-10-11 0:51 ` Proposal for study: Add a categorical Initial type to ocaml skaller 1999-10-11 12:40 ` John Prevost 1999-10-12 19:20 ` skaller 1999-10-12 11:33 ` Jean-Francois Monin 1999-10-10 16:10 ` chet [not found] <199910130855.KAA19800@pauillac.inria.fr> 1999-10-13 11:11 ` Option types and O'Labl merger Francisco Valverde Albacete
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).