From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q18EBqBw030949 for ; Wed, 8 Feb 2012 15:12:04 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqwBAHaBMk8machzl2dsb2JhbABDhQ2oMoJNAQEBAQEIFgc7gXIBAQQBIwQRQAEFCwsYAgIFFgsCAgkDAgECAUUGDQEHAQGHeKdnkgSBL4ocAQcCAgkFDQQGAQsBCAUDAwkFDAKCfhkEAwwDFAWDL4EWBKgK X-IronPort-AV: E=Sophos;i="4.73,383,1325458800"; d="scan'208";a="143401221" Received: from mx2.janestreet.com (HELO nyc-dmz-mxout2.janestreet.com) ([38.105.200.115]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 08 Feb 2012 15:12:03 +0100 Received: from nyc-qsv-mail1.delacy.com ([172.25.22.57]) by nyc-dmz-mxout2.janestreet.com with esmtp (Exim 4.76) (envelope-from ) id 1Rv8Fe-0000Uu-GP; Wed, 08 Feb 2012 09:12:02 -0500 Received: from ldn-qws-011.delacy.com ([172.23.133.111]) by nyc-qsv-mail1.delacy.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.76) (envelope-from ) id 1Rv8Fe-00009z-9t; Wed, 08 Feb 2012 09:12:02 -0500 Message-ID: <4F3282B1.1050205@janestreet.com> Date: Wed, 08 Feb 2012 14:12:01 +0000 From: David House User-Agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1 MIME-Version: 1.0 To: oliver CC: Gabriel Scherer , =?UTF-8?B?TWF0ZWogS2/FocOt?= =?UTF-8?B?aw==?= <5764c029b688c1c0d24a2e97cd764f@gmail.com>, caml-list@inria.fr References: <4F326EA6.20900@gmail.com> <4F32741C.4040501@janestreet.com> <20120208133926.GC1823@siouxsie> <4F327CBF.4030005@janestreet.com> <20120208135818.GG1823@siouxsie> In-Reply-To: <20120208135818.GG1823@siouxsie> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Caml-list] syntactic detail On Wed 08 Feb 2012 01:58:18 PM GMT, oliver wrote: >> Perhaps this could happen. But I feel this could be expressed >> equally clearly using some other mechanism, like a comment. We don't >> have to have syntax-level support for every weird thing people would >> like to do. > > If something is a weird thing often lies in the eye of the beholder. My definition of "weird" is "few people use this in practice". Clearly, delimiting groups of thousands is useful to a lot of people. But it hides bugs, because if you see 10_000_0000 you are much more likely to think it is 10^7 than you are with 100000000, where you are likely to be careful and take your time. We can prevent this by more stringent syntax rules. This would also prevent some corner cases that you have described, that probably barely anyone cares about. It's not a free restriction, but it is cheap, and definitely has value. > An int-value which raises an exception on overflow would be something > much more important than making this syntax rule more restricted. That's completely orthogonal. > It's also somehow weird, to write 1_000_000_000 instead of 1000000000. > Why should this weird "_" stuff supported at all? > > Writing +. instead of + also might be weird from a certain view. > So you are using a weird language. I think this is addressed by my definition of "weird" above. >>> Why should this case be forbidden? >> >> Because it is impossible to distinguish it from the >> wrongly-deliminated case that I described, which leads to the bugs I >> described. > [...] > > > But that case is just a typo, like it would be without any "_". I don't understand. Wouldn't it be better to have a syntax where it is harder to make typos? > For some rsearch it might make sense to delimit those digits which > are officially rounded in a setting from those which might be rounded. > > like > > 4.526829898 > vs. > 4.5_26829898 > vs. > 4.52_6829898 > > and so on. > > So, even you have a floating point value with 9 digits after the > decimal point, if you have a case where your official rounding > is one or two digits, but you have to use the correct value, > you could clarify this in the code. This could also be done, by, e.g., defining a new type with explicit coercions: module Two_dp_float : sig val of_float : float -> t val to_float : t -> float end = struct type t = float let of_float x = x let to_float x = x end This actually enforces that you get the notation right in your code, rather than with the underscores, where you could typo and put the underscore too far right, or forget to put them in all together. But more generally, I think it is worth more, in terms of bugs saved, to restrict the syntax versus allowing these infrequently-used cases. >>> For Hex it might also make sense to have it all two characters. >>> >>> If the rule would be only all 4 characters, that would be bad. >> >> Sure, this seems okay. > > Too late, if the four-digit rule would have been implemented before the > (weird?) two-digit rule was asked by someone... You're right, that would be a change that would probably break a lot of code. I claim my suggestion would not break much code.