From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <perrin@apotheon.com>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.1 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id 7DB90BC0A
	for <caml-list@yquem.inria.fr>; Fri, 22 Dec 2006 21:15:23 +0100 (CET)
Received: from host15.ipowerweb.com (host15.ipowerweb.com [66.235.219.115])
	by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id kBMKFL2U015392
	(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO)
	for <caml-list@yquem.inria.fr>; Fri, 22 Dec 2006 21:15:23 +0100
Received: from c-24-9-123-251.hsd1.co.comcast.net ([24.9.123.251] helo=apotheon.com)
	by host15.ipowerweb.com with esmtp (Exim 4.52)
	id 1GxqnQ-0003q3-3c; Fri, 22 Dec 2006 12:15:12 -0800
Received: by apotheon.com (Postfix, from userid 1000)
	id 70FC9333CC; Fri, 22 Dec 2006 13:14:52 -0700 (MST)
Date: Fri, 22 Dec 2006 13:14:52 -0700
From: Chad Perrin <perrin@apotheon.com>
To: Daniel =?iso-8859-1?Q?B=FCnzli?= <daniel.buenzli@epfl.ch>
Cc: Chad Perrin <perrin@apotheon.com>,
	Jon Harrop <jon@ffconsultancy.com>, caml-list@yquem.inria.fr
Subject: Re: strong/weak typing terminology (was Re: [Caml-list] Scripting in ocaml)
Message-ID: <20061222201452.GB23003@apotheon.com>
References: <6dbd4d000612201941wcd4b09anc503a13889576512@mail.gmail.com> <20061221153413.8f99e8ed.mle+ocaml@mega-nerd.com> <1166685756.5337.4.camel@rosella.wigram> <458A8C7B.7050204@hq.idt.net> <1166709162.5653.11.camel@rosella.wigram> <458AA143.3090303@hq.idt.net> <20061221202520.GG9440@apotheon.com> <DCBCB457-1A02-4AE6-8712-4A7D41FD1E3D@epfl.ch> <20061221221650.GL9440@apotheon.com> <3EC73FC3-41A6-4FB1-9549-29286A6568CC@epfl.ch>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
In-Reply-To: <3EC73FC3-41A6-4FB1-9549-29286A6568CC@epfl.ch>
User-Agent: Mutt/1.5.13 (2006-08-11)
Content-Transfer-Encoding: quoted-printable
X-Antivirus-Scanner: Clean mail though you should still use an Antivirus
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: Primary Hostname - host15.ipowerweb.com
X-AntiAbuse: Original Domain - yquem.inria.fr
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - apotheon.com
X-Source: 
X-Source-Args: 
X-Source-Dir: 
X-Miltered: at concorde with ID 458C3CD9.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; ocaml:01 0100,:01 bunzli:01 avoided:01 statically:01 compilation:01 abstractions:01 high-level:01 abstractions:01 language's:01 higher-level:01 unreadable:01 statically:01 haskell:01 ocaml:01 

On Fri, Dec 22, 2006 at 01:21:07PM +0100, Daniel B=FCnzli wrote:
> Le 21 d=E9c. 06 =E0 23:16, Chad Perrin a =E9crit :
>=20
> >I mean that it doesn't allow you to go around doing in-place type
> >changes willy-nilly the way something like C does.
>=20
> (Well in fact you can with Obj.magic, but that's not my point)
>=20
> The problem is that this weak/strong terminology is hopelessly =20
> confused (see [1],[2]). Since there is no clear unique definition of =20
> strong/weak typing I think this terminology should be avoided. I tend =20
> to favour the definitions you can find in the introduction of this =20
> book [3] which are imho less confusing.

In point of fact, there is a clear definition.  The problem is that
people who don't pay much attention to it end up using the terms
"strongly typed" and "weakly typed" in a (ahem) weakly typed manner.
Thus, it is the colloquial definition that is almost meaningless due to
widespread conflict in its use, while there is a formal definition that
endures somewhere behind all the noise.

>=20
> Basically the author distinguishes on one hand between statically and =20
> dynamically typed languages, and on the other hand, between safe and =20
> unsafe languages.
>=20
> Static and dynamic type checking refers to whether type checks are =20
> respectively performed at compilation or run time.

That's absolutely correct, and fits the formal definitions.


>=20
> Safety is broadly defined as follows :
>=20
> "A safe language is one that protects its own abstractions. Every =20
> high-level language provides abstractions of machine services. Safety =20
> refers to the language's ability to guarantee the integrity of these =20
> abstractions and of higher-level abstractions introduced by the =20
> programmer using the definitional facilities of the language"

That's an almost useless representation of the definition,
unfortunately, because the terms used in it are vague in application and
the structure of the definition as presented is a bit too complex.
While he may well know what he's talking about, his prose reads a bit
like poorly written code -- which is to say that it's almost unreadable
in any meaningful sense.


>=20
> Later he gives the following chart
>=20
>        |Statically checked       | Dynamically checked
> -------------------------------------------------
> safe   | ML, Haskell, Java, etc. | Lisp, Scheme, Perl, Postscript, etc
> unsafe | C, C++, etc.            |
>=20
> Subsequently he adds :
>=20
> "Language safety is seldom absolute. Safe languages often offer =20
> programmers "escape hatches", such as foreign function calls to code =20
> written in other, possibly unsafe, languages. Indeed such escape =20
> hataches are sometimes provided in a controlled from within the =20
> language itself--Obj.magic in Ocaml, ... "

He comes dangerously close to touching on strong/weak typing in that
statement.

Here's a summary of type system characteristics as I learned about them,
and confirmed by experience, research, reasoning over the years:

type-safety: A type-safe language is one whose core type system does not
allow for type errors at runtime.  Such a language is almost impossible
while still being a useful general-purpose language, but many languages
come close enough to warrant the term "type-safe" for common usage.  By
contrast, a type-unsafe language in its purest meaning is one that does
nothing to prevent type errors at all.

statically typed: A statically typed language is one that determines
types at compile time.  By contrast, a dynamically typed language is one
that determines types at runtime, generally based on context.

strongly typed: A strongly typed language is one that does not allow the
programmer to simply dodge the type system and where any instantiated or
applied datum has a given type that cannot be changed, explicitly or
implicitly.  By contrast, a weakly typed language provides casts, bit
manipulation, and other in-place conversions or evasions to subvert
types.

duck typing: This is a more recent term than others related to typing,
and arose with the Ruby community.  It essentially refers to a language
that allows for a type to be unset until the datum to be typed has a
typable value, at which point the type is "obvious" and set in stone.
The most common implementation of this uses dynamic typing, but there
are static type systems that could be called "duck typed", such as OCaml
and Objective C -- though in OCaml's case, it is achieved through type
inference, and in Objective C it is limited to object types and is
determined by behavior rather than type inference.

That's how I've come to understand type system characteristics, and it's
an internally consistent set of definitions that provides meaningful
descriptive terms and agrees with the statements of all the experts who
actually seem to care about the respective terms' meanings.  Perhaps
more to the point, it fits the computer science framework of language
theory that informs the axioms on which the best programming theory
books available (at least to judge by those I've seen).

That makes OCaml type-safe, statically typed, strongly typed, and
arguably duck typed (depending on how you use it).  It also makes Ruby
duck typed, type-safe, dynamically typed, and strongly typed; it makes
Perl type-safe, dynamically typed, and strongly typed; it makes C
statically typed, weakly typed, and kind of half-assed type-safe if
you're careful; it makes C++ less type-safe, weakly typed (though
slightly less so than C), and statically typed, as well as allowing you
to fake duck typing via templates and the like; it makes Objective C
statically typed, exactly as half-assed type-safe as C, half-assed
strongly typed (more so than C or C++), and duck typed insofar as its
object system is concerned; it makes PHP weakly typed, dynamically
typed, and type-unsafe; it makes Java fairly type-safe, weakly typed,
and statically typed; and it makes Pascal statically typed, weakly
typed, and mostly type-safe.

At least, that's what comes to mind off the top of my head.

--=20
CCD CopyWrite Chad Perrin [ http://ccd.apotheon.org ]
"Real ugliness is not harsh-looking syntax, but having to
build programs out of the wrong concepts." - Paul Graham