caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: skaller <skaller@maxtal.com.au>
To: Francois Rouaix <frouaix@mail.liquidmarket.com>
Cc: caml-list@inria.fr
Subject: Re: Stdlib regularity
Date: Sun, 10 Oct 1999 15:38:24 +1000	[thread overview]
Message-ID: <38002650.6DD48411@maxtal.com.au> (raw)
In-Reply-To: <199910092226.PAA02183@fiji01.liquidmarket.com>

Francois Rouaix wrote:
> 
> <skaller@maxtal.com.au> writes:
> 
>   skaller>      There are several cases where the core language prevents
>   skaller> this, because it lacks functionality available in C++: the
>   skaller> ability to create uninitialised values, and the ability to
>   skaller> destroy them are two that I've become aware of trying to build
>   skaller> a variable length array module.
> 
> Uninitialized values are easily implemented with the 'a option type.
> Of course, the code is then ugly, because you have to match your
> values everywhere to None | Some x. In C/C++ terms, this forces you
> to check for NULL pointers systematically, which is a Really Good Thing.

	There is a difference: Some/None must be matched EVERY use.
Null pointers only need to be checked where they can occur.
In the case of an array of variable length, with extra uninitialised
slots, the Some/None typing of each element is extra overhead that
isn't required if the accessing codes are correct, it is only
necessary to check if the index is in range.

> Adding uninitialised values is a major source of bugs, and it's kind
> of natural to pay the price for it in the readability of the source,
> if you want your code to be robust.

	No. This isn't necesarily the case. There are other
solutions. In C++, the philosophy is to provide as much protection
as possible, without costing at run time, and with protection
that does cost at run time being optional.

	For example, the designer of a class can determine
whether or not use of a variable of the class requires initialisation,
whether a default value makes sense, etc. [in this case, the client
of the class cannot override the designers decisions].
 
	To put this another way: static typing, and other safety guarrantees, 
are intrinsically limited in what can be expressed: it cannot prevent
all run time
errors, and it may _exclude_ correct cases as well. So it is always a
compromise.
It makes sense to improve these systems to be more expressive, and
provide
even better guarrantees, but there will (almost always) be a need to
escape
the system when the programmer knows better.

	In C++, such escapes are provided, for example with
new style casts, but in a way which discourages use, rather than
outright prevention. In ocaml, Obj.magic does some of the same
kind of things, I believe, and there is always the ability to
write C, or even patch the compiler.

	In general, I really think it is necessary to provide
a 'low level unsafe' interface in a language, which at least
will not be used by accident, and where deliberate uses are at least
easy to find. These low level features are not needed where reasonable
and complete safe solutions are available: for example, it has been
proved that 'goto' is not required in a language with a few sensible
looping constructions, and so one can argue this feature is not
required.

	I think you can argue that a powerful enough functional
language does _not_ require uninitialised values (that is,
there is always an efficient way to initialise variables),
but that isn't the case in a procedural language. Because
referential transparency is lost, compilers cannot reason as
easily about program structure, and hence cannot optimise away 
useless dummy initialisations, so efficiency is lost.

	
> Builtin arrays require you to provide a initialization value, even for
> zero-length array. Why don't you carry the same requirement to your
> extensible arrays, and simply use a polymorphic type:
> 
> type 'a earray = {
>      mutable current : 'a array;
>      mutable used : int;
>      }
> 
> let create n i = { current = Array.create n i; used = n }

	Sure, but that only works for the 'create' method.
In other cases, like, concatenation, the solution requires
testing if the two arrays to be concatenated are zero length,
if so creating a zero length array using [| |], otherwise
finding an object from one of the two arrays that contains one,
to initialise the result array.

	It is always possible to do this, but it is messy,
and it _requires_ that the physical length of an array used to
hold no elements be zero in cases when there is no available
'dummy' value to fill the unused slots.
 
> And then, if you want to have the equivalent of NULL pointers, use None,
> and option types everywhere.

	Ocaml arrays are already slower than the C ones Python uses,
and I am trying to match Python performance as closely as possible.

> It seems to me that you are trying to force the language to do something
> it has been purposely designed against. I'm not sure you can win this fight.

	I am not trying to do this per se, since I believe, more or less,
that the design principles are good. Rather, I would seek a 'proper'
solution,
in terms of these principles, believing one should exist, since the
principles
surely do not deliberately try to make inefficient code. This is why I
proposed
a categorical 'Initial' value, since this should fit well into the
category
theoretic type system framework. By having a special initial value,
which can be used to initialise anything, and testing for it in the
'safe'
versions of the generated code (and eliding the test in the unsafe
ones),
there may be a well typed solution, and a possibility the compiler can
reason about the code enough to optimise even the safe code.

	I happen to 'know' that this is the state of the art in case
of array bounds checking: in Modula code, apparently, 70% of the
mandatory
array bounds checks can be elided by adding suitable reasoning
algorithms
to the code.

	I do not make a numerical claim about the case of uninitialised
values, although I note that some C compilers including gcc can issue
warnings in some cases when it appears an uninitialised value is used.

	I think what I am saying, is that this kind of solution
probably _is_ within the spirit of ocaml. My particular suggestion
may be no good. Perhaps there is a better one?

	For example, a standard variable length array module would
solve the problem, but only in that special case. Is there a more
general, if not complete, solution, that is not as risky as the one I
propose?

-- 
John Skaller, mailto:skaller@maxtal.com.au
1/10 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
downloads: http://www.triode.net.au/~skaller




  reply	other threads:[~1999-10-10 20:18 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
1999-10-06 13:25 Ohad Rodeh
1999-10-06 16:18 ` Markus Mottl
1999-10-08 14:06   ` Matías Giovannini
1999-10-10 20:09     ` Pierre Weis
1999-10-10 20:12       ` Matías Giovannini
1999-10-08 14:10   ` skaller
1999-10-08 19:21     ` Markus Mottl
1999-10-09 21:14     ` Dave Mason
1999-10-06 18:50 ` John Prevost
1999-10-07  7:33 ` skaller
1999-10-07  9:18 ` Francisco Valverde Albacete
1999-10-08 14:56   ` skaller
1999-10-09 22:26     ` Francois Rouaix
1999-10-10  5:38       ` skaller [this message]
1999-10-10 20:44         ` William Chesters
1999-10-10 21:43           ` Hongwei Xi
1999-10-11  0:36           ` skaller
1999-10-12  7:20             ` David Mentr{'e}
1999-10-08 16:38   ` Proposal for study: Add a categorical Initial type to ocaml skaller
1999-10-09 22:43     ` John Prevost
1999-10-10  3:18       ` chet
1999-10-10  6:14       ` skaller
1999-10-10 21:05         ` William Chesters
1999-10-10 22:36           ` chet
1999-10-10 22:38           ` chet
1999-10-11 19:30             ` John Prevost
1999-10-12  8:34             ` Option types and O'Labl merger Jacques Garrigue
1999-10-12 14:38               ` William Chesters
1999-10-13  5:35                 ` Frank A. Christoph
1999-10-13  8:48                   ` Jacques Garrigue
1999-10-11  0:51           ` Proposal for study: Add a categorical Initial type to ocaml skaller
1999-10-11 12:40         ` John Prevost
1999-10-12 19:20           ` skaller
1999-10-12 11:33         ` Jean-Francois Monin
1999-10-10 16:10       ` chet
1999-10-09 16:58 Stdlib regularity William Chesters
1999-10-10  0:11 ` Matías Giovannini
1999-10-12 16:21 Damien Doligez
1999-10-13 12:18 ` Matías Giovannini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38002650.6DD48411@maxtal.com.au \
    --to=skaller@maxtal.com.au \
    --cc=caml-list@inria.fr \
    --cc=frouaix@mail.liquidmarket.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).