From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.3 required=5.0 tests=AWL,DNS_FROM_RFC_ABUSE, HTML_MESSAGE autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id EFA1FBBAF for ; Sun, 9 Aug 2009 20:55:11 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AuADANu4fkqACH+TgWdsb2JhbACCVZd2AQEWJKMZhUeIToQYBYI2 X-IronPort-AV: E=Sophos;i="4.43,349,1246831200"; d="scan'208,217";a="34208566" Received: from server-nat-4.cs.umd.edu (HELO bacon.cs.umd.edu) ([128.8.127.147]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 09 Aug 2009 20:55:10 +0200 X-CSD-MailScanner-Watermark: 1250448902.97907@aKBdHC3bwy71W/X/qd5l4A Received: from [192.168.1.102] (208-59-169-96.c3-0.slvr-ubr1.lnh-slvr.md.cable.rcn.com [208.59.169.96]) (authenticated bits=0) by bacon.cs.umd.edu (8.13.1/8.14.1) with ESMTP id n79It0qf009920 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO) for ; Sun, 9 Aug 2009 14:55:00 -0400 Message-Id: <32623A80-A2C3-4AE4-B9FA-3FC0888494BD@cs.umd.edu> From: Elnatan Reisner To: Caml Mailing List In-Reply-To: <4A7ED53B.90607@frisch.fr> Content-Type: multipart/alternative; boundary=Apple-Mail-1-954022223 Mime-Version: 1.0 (Apple Message framework v936) Subject: Re: [Caml-list] Re: ocaml sefault in bytecode: unanswered questions Date: Sun, 9 Aug 2009 14:56:01 -0400 References: <000301ca184b$032b8cc0$0982a640$@chollet@free.fr> <000701ca184d$29507e90$7bf17bb0$@metastack.com> <001001ca18c7$37b22220$a7166660$@chollet@free.fr> <001501ca18cc$d59a61a0$80cf24e0$@metastack.com> <001201ca18e9$c4456810$4cd03830$@chollet@free.fr> <4A7ED53B.90607@frisch.fr> X-Mailer: Apple Mail (2.936) X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0 (bacon.cs.umd.edu [172.24.3.34]); Sun, 09 Aug 2009 14:55:00 -0400 (EDT) X-CSD-MailScanner-Information: Please email staff@cs.umd.edu for more information X-MailScanner-ID: n79It0qf009920 X-CSD-MailScanner: Found to be clean X-CSD-MailScanner-SpamCheck: not spam, SpamAssassin (not cached, score=-49.99, required 5, autolearn=not spam, ALL_TRUSTED -50.00, HTML_MESSAGE 0.01) X-CSD-MailScanner-From: elnatan@cs.umd.edu X-Spam: no; 0.00; ocaml:01 bytecode:01 pervasives:01 bug:01 pervasives:01 non-mutable:01 non-mutable:01 mutable:01 symmetric:01 frisch:01 recursive:01 ocaml:01 nans:01 bug:01 mutable:01 --Apple-Mail-1-954022223 Content-Type: text/plain; charset=WINDOWS-1252; format=flowed; delsp=yes Content-Transfer-Encoding: quoted-printable Continuing this conversation about equality... On Aug 9, 2009, at 9:20 AM, David Allsopp wrote: > Ivan Chollet wrote: > >> I would have thought physical equality implies structural equality, =20= >> but it >> doesn't seem like it. >> Can you please explain to me what=92s wrong there? > > "It does" - but only with the given caveat that comparison of cyclic > structures may not terminate. Pervasives.compare has a similar =20 > restriction, > although note that [compare q q] in your example does return 0 =20 > (which is > lucky or it would be a bug as the docs for Pervasives.(=3D=3D) state =20= > that [x =3D=3D > y] =3D> [compare x y =3D 0]). Let me start with a few pedantic comments about the documentation of =20 (=3D=3D): First, what David says isn't *quite* what the documentation says. The =20= documentation only says this is [x=3D=3Dy] implies [compare x y =3D 0] = for =20 non-mutable structures. And in fact, what the documentation says is =20 surprisingly weak: an implementation could define (=3D=3D) as *always =20= false* for non-mutable structures and yet satisfy the documentation. =20 I'm not sure of the right way to reword it, but this loophole in the =20 specification seems undesirable. My other issue is that the description of (=3D=3D) for mutable = structures =20 doesn't specify that it is symmetric; reading the documentation =20 literally only implies that e1 is a substructure of e2. Even just =20 adding 'and vice versa' might clean this up: e1 =3D=3D e2 is true if and only if physical modification of e1 also =20 affects e2 and vice versa Okay, now for more substantive questions: > The notes in http://caml.inria.fr/mantis/view.php?id=3D3322 may be of =20= > interest > too... I hadn't paid attention to the distinction drawn between 'may not =20 terminate' and 'does not terminate' until I read the discussion at =20 that link, but it's relevant to my question below... On Aug 9, 2009, at 9:55 AM, Alain Frisch wrote: > On 8/9/2009 2:06 PM, ivan chollet wrote: > >> I would have thought physical equality implies structural equality, =20= >> but >> it doesn=92t seem like it. >> >> Can you please explain to me what=92s wrong there? > > There are two modes for the generic comparison. The total mode =20 > (Pervasives.compare) creates a total ordering between values (except =20= > for functional values and custom blocks with no comparison function) =20= > and uses physical equality as a shortcut to cut the recursive =20 > traversal of sub-values. The non-total mode (used by the operators =3D = =20 > < > <=3D >=3D) has a different behavior for NaN values ([nan <=3D x], = [x =20 > <=3D nan], and [nan =3D x] all return false, including when x is nan) =20= > and does not use the physical equality shortcut (so that [let x =3D =20= > (nan, nan) in x =3D x] returns false). In terms of 'may not' versus 'does not' terminate, I just noticed that =20= the current documentation for (=3D) says 'may not terminate' while the =20= documentation for (>=3D) says 'does not terminate'. However, the example = =20 cited in the link above: type t =3D { a : t };; let rec x =3D { a =3D x } in x =3D x doesn't terminate in OCaml 3.11. This seems to be about as simple a =20 cyclic structure as there is, so shouldn't (=3D)'s documentation say =20 'Equality between cyclic structures does not terminate.'? Also, the documentation for Pervasives.compare says 'may not =20 terminate'. Is this supposed to imply that it uses the shortcut Alain =20= mentions (because if it did not use physical equality as a shortcut =20 then it would say 'does not terminate')? Or is this shortcutting meant =20= to be undocumented? And is there any reason, aside from NaN values, that (=3D) does *not* =20= use physical equality as a shortcut? The only other reason I can think =20= of is that, in cases where things are in fact *not* physically equal, =20= checking physical equality would introduce a small overhead. In any case, if I have a data structure which I know does not contain =20= any NaNs (for example, maybe it doesn't contain any floats =20 whatsoever), is there ever a reason I should prefer (=3D) to (compare x =20= y =3D 0)? It sounds to me like compare is preferable because of its =20 shortcutting behavior. -Elnatan= --Apple-Mail-1-954022223 Content-Type: text/html; charset=WINDOWS-1252 Content-Transfer-Encoding: quoted-printable
Continuing this = conversation about equality...

On Aug 9, 2009, = at 9:20 AM, David Allsopp wrote:

Ivan = Chollet wrote:

I would have thought physical equality implies structural = equality, but it
doesn't seem = like it.
Can you please = explain to me what=92s wrong there?

"It does" - but = only with the given caveat that comparison of cyclic
structures may = not terminate. Pervasives.compare has a similar restriction,
although = note that [compare q q] in your example does return 0 (which is
lucky = or it would be a bug as the docs for Pervasives.(=3D=3D) state that [x = =3D=3D
y] =3D> [compare x y =3D = 0]).

Let me start with a few = pedantic comments about the documentation of = (=3D=3D):

First, what David says isn't = *quite* what the documentation says. The documentation only says this is = [x=3D=3Dy] implies [compare x y =3D 0]  for = non-mutable structures. And in fact, what the documentation says is = surprisingly weak: an implementation could define (=3D=3D) as *always = false* for non-mutable structures and yet satisfy the documentation. I'm = not sure of the right way to reword it, but this loophole in the = specification seems undesirable.

My other issue = is that the description of (=3D=3D) for mutable structures doesn't = specify that it is symmetric; reading the documentation literally only = implies that e1 is a substructure of e2. Even just adding 'and vice = versa' might clean this up:
e1 =3D=3D e2 is true if and only if = physical modification of e1 also affects e2 and vice = versa

Okay, now for = more substantive questions:

The notes in http://caml.inria.= fr/mantis/view.php?id=3D3322 may be of = interest
too...

I hadn't paid = attention to the distinction drawn between 'may not terminate' and = 'does not terminate' until I read the discussion at that link, but = it's relevant to my question below...

On Aug 9, 2009, at 9:55 = AM, Alain Frisch wrote:

On = 8/9/2009 2:06 PM, ivan chollet wrote:

I would have thought physical equality implies structural = equality, but
it doesn=92t = seem like it.

Can you please = explain to me what=92s wrong there?

There are two = modes for the generic comparison. The total mode (Pervasives.compare) = creates a total ordering between values (except for functional values = and custom blocks with no comparison function) and uses physical = equality as a shortcut to cut the recursive traversal of sub-values. The = non-total mode (used by the operators =3D < > <=3D >=3D) has = a different behavior for NaN values ([nan <=3D x], [x <=3D nan], = and [nan =3D x] all return false, including when x is nan) and does not = use the physical equality shortcut (so that [let x =3D (nan, nan) in x =3D= x] returns false).

In terms of = 'may not' versus 'does not' terminate, I just noticed that the = current documentation for (=3D) says 'may not terminate' while the = documentation for (>=3D) says 'does not terminate'. However, the = example cited in the link above:
type t =3D { a : t = };;
let rec x =3D { a =3D x } in x =3D x
doesn't = terminate in OCaml 3.11. This seems to be about as simple a cyclic = structure as there is, so shouldn't (=3D)'s documentation say 'Equality = between cyclic structures does not = terminate.'?

Also, the documentation = for Pervasives.compare says 'may not terminate'. = Is this supposed to imply that it uses the shortcut Alain = mentions (because if it did not use physical equality as a shortcut then = it would say 'does not terminate')? Or is this shortcutting meant to be = undocumented?

And is there any reason, aside = from NaN values, that (=3D) does *not* use physical equality as a = shortcut? The only other reason I can think of is that, in cases where = things are in fact *not* physically equal, checking physical equality = would introduce a small overhead.

In any case, = if I have a data structure which I know does not contain any NaNs (for = example, maybe it doesn't contain any floats whatsoever), is there ever = a reason I should prefer (=3D) to (compare x y =3D 0)? It sounds to me = like compare is preferable because of its shortcutting = behavior.

-Elnatan
= --Apple-Mail-1-954022223--