From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82])
	by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q18EBqBw030949
	for <caml-list@sympa-roc.inria.fr>; Wed, 8 Feb 2012 15:12:04 +0100
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AqwBAHaBMk8machzl2dsb2JhbABDhQ2oMoJNAQEBAQEIFgc7gXIBAQQBIwQRQAEFCwsYAgIFFgsCAgkDAgECAUUGDQEHAQGHeKdnkgSBL4ocAQcCAgkFDQQGAQsBCAUDAwkFDAKCfhkEAwwDFAWDL4EWBKgK
X-IronPort-AV: E=Sophos;i="4.73,383,1325458800"; 
   d="scan'208";a="143401221"
Received: from mx2.janestreet.com (HELO nyc-dmz-mxout2.janestreet.com) ([38.105.200.115])
  by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 08 Feb 2012 15:12:03 +0100
Received: from nyc-qsv-mail1.delacy.com ([172.25.22.57])
	by nyc-dmz-mxout2.janestreet.com with esmtp (Exim 4.76)
	(envelope-from <dhouse@janestreet.com>)
	id 1Rv8Fe-0000Uu-GP; Wed, 08 Feb 2012 09:12:02 -0500
Received: from ldn-qws-011.delacy.com ([172.23.133.111])
	by nyc-qsv-mail1.delacy.com with esmtpsa (TLSv1:AES256-SHA:256)
	(Exim 4.76)
	(envelope-from <dhouse@janestreet.com>)
	id 1Rv8Fe-00009z-9t; Wed, 08 Feb 2012 09:12:02 -0500
Message-ID: <4F3282B1.1050205@janestreet.com>
Date: Wed, 08 Feb 2012 14:12:01 +0000
From: David House <dhouse@janestreet.com>
User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1
MIME-Version: 1.0
To: oliver <oliver@first.in-berlin.de>
CC: Gabriel Scherer <gabriel.scherer@gmail.com>,
        =?UTF-8?B?TWF0ZWogS2/FocOt?=
 =?UTF-8?B?aw==?= <5764c029b688c1c0d24a2e97cd764f@gmail.com>,
        caml-list@inria.fr
References: <4F326EA6.20900@gmail.com> <CAPFanBGE6RchhVLBCvaT8u_5HtdkhMmGWbwN9_UGSk_Mtff=yA@mail.gmail.com> <4F32741C.4040501@janestreet.com> <20120208133926.GC1823@siouxsie> <4F327CBF.4030005@janestreet.com> <20120208135818.GG1823@siouxsie>
In-Reply-To: <20120208135818.GG1823@siouxsie>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Caml-list] syntactic detail

On Wed 08 Feb 2012 01:58:18 PM GMT, oliver wrote:
>> Perhaps this could happen. But I feel this could be expressed
>> equally clearly using some other mechanism, like a comment. We don't
>> have to have syntax-level support for every weird thing people would
>> like to do.
>
> If something is a weird thing often lies in the eye of the beholder.

My definition of "weird" is "few people use this in practice". 

Clearly, delimiting groups of thousands is useful to a lot of people. 
But it hides bugs, because if you see 10_000_0000 you are much more 
likely to think it is 10^7 than you are with 100000000, where you are 
likely to be careful and take your time. We can prevent this by more 
stringent syntax rules. This would also prevent some corner cases that 
you have described, that probably barely anyone cares about. It's not a 
free restriction, but it is cheap, and definitely has value.
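
To make the proposed rule concrete, here is a minimal sketch of a checker for it, assuming the rule is "underscores may only split decimal digits into groups of exactly three, counted from the right". The names is_digits and well_grouped are hypothetical and are not part of the OCaml lexer or standard library.

```ocaml
(* Hypothetical helper: true if [g] is a non-empty run of decimal digits. *)
let is_digits (g : string) : bool =
  let ok = ref (g <> "") in
  String.iter (fun c -> if c < '0' || c > '9' then ok := false) g;
  !ok

(* Hypothetical checker for the stricter rule: the leading group has one to
   three digits and every later group has exactly three. A literal without
   any underscores is always accepted. *)
let well_grouped (s : string) : bool =
  match String.split_on_char '_' s with
  | [] -> false
  | [ whole ] -> is_digits whole
  | first :: rest ->
    is_digits first
    && String.length first <= 3
    && List.for_all (fun g -> is_digits g && String.length g = 3) rest
```

Under this sketch, 10_000_000 and 100000000 are accepted, while 10_000_0000 is rejected instead of being silently read as a different power of ten.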

> An int-value which raises an exception on overflow would be something
> much more important than making this syntax rule more restricted.

That's completely orthogonal.

> It's also somehow weird, to write   1_000_000_000 instead of 1000000000.
> Why should this weird "_" stuff be supported at all?
>
> Writing +. instead of + also might be weird from a certain view.
> So you are using a weird language.

I think this is addressed by my definition of "weird" above.

>>> Why should this case be forbidden?
>>
>> Because it is impossible to distinguish it from the
>> wrongly-delimited case that I described, which leads to the bugs I
>> described.
> [...]
>
>
> But that case is just a typo, like it would be without any "_".

I don't understand. Wouldn't it be better to have a syntax where it is 
harder to make typos?

> For some research it might make sense to delimit those digits which
> are officially rounded in a setting from those which might be rounded.
>
> like
>
>     4.526829898
>    vs.
>     4.5_26829898
>    vs.
>     4.52_6829898
>
> and so on.
>
> So, even you have a floating point value with 9 digits after the
> decimal point, if you have a case where your official rounding
> is one or two digits, but you have to use the correct value,
> you could clarify this in the code.

This could also be done, by, e.g., defining a new type with explicit 
coercions:

module Two_dp_float : sig
  type t  (* abstract: clients cannot treat it as a plain float *)
  val of_float : float -> t
  val to_float : t -> float
end = struct
  type t = float
  let of_float x = x
  let to_float x = x
end

This actually enforces that you get the notation right in your code, 
unlike the underscores, where you could make a typo and put the 
underscore too far right, or forget to put them in altogether.
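
As a rough illustration of how such a type would be used, here is a sketch with the abstract type t declared in the signature so that it compiles. The function report and the value s are hypothetical; they only show that a consumer can demand the marked type while the full-precision value is kept underneath.

```ocaml
module Two_dp_float : sig
  type t
  val of_float : float -> t
  val to_float : t -> float
end = struct
  type t = float
  let of_float x = x
  let to_float x = x
end

(* Hypothetical consumer: it only accepts values that were explicitly
   marked as "officially two decimal places", and prints them that way. *)
let report (x : Two_dp_float.t) : string =
  Printf.sprintf "%.2f" (Two_dp_float.to_float x)

(* The coercion is explicit, and the underlying value is not rounded. *)
let s = report (Two_dp_float.of_float 4.526829898)

(* Writing [report 4.526829898] instead is rejected by the type checker,
   because a plain float is not a Two_dp_float.t. *)
```

Here s is "4.53", and to_float still returns 4.526829898 for computations that need the exact value.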

But more generally, I think it is worth more, in terms of bugs saved, 
to restrict the syntax than to allow these infrequently-used cases.

>>> For Hex it might also make sense to have it all two characters.
>>>
>>> If the rule would be only all 4 characters, that would be bad.
>>
>> Sure, this seems okay.
>
> Too late, if the four-digit rule would have been implemented before the
> (weird?) two-digit rule was asked by someone...

You're right, that would be a change that would probably break a lot of 
code. I claim my suggestion would not break much code.