caml-list - the Caml user's mailing list
* [Caml-list] syntactic detail
@ 2012-02-08 12:46 Matej Košík
  2012-02-08 12:54 ` Gabriel Scherer
  2012-02-08 13:05 ` rixed
  0 siblings, 2 replies; 18+ messages in thread
From: Matej Košík @ 2012-02-08 12:46 UTC (permalink / raw)
  To: caml-list

Hi,

OCaml allows me to add '_' at the end of a floating-point literal, e.g.:

	1._

What is the purpose of that?

In the case of int32 ('l') or int64 ('L') literals, optionally adding '_'
between the integer and the suffix makes sense ('l' is hard to distinguish
from '1' in many fonts). But in the case of floats, I am not sure.
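
A minimal sketch of the suffix case described above, assuming the usual
meaning of the suffixes ('l' for int32, 'L' for int64). The names a and b
are only for illustration:

```ocaml
(* An underscore may sit between the digits and the int32/int64 suffix. *)
let a : int32 = 1_l   (* same value as 1l, but harder to misread as 11 *)
let b : int64 = 1_L   (* same value as 1L *)

let () =
  assert (a = 1l);
  assert (b = 1L)
```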

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 12:46 [Caml-list] syntactic detail Matej Košík
@ 2012-02-08 12:54 ` Gabriel Scherer
  2012-02-08 13:09   ` David House
  2012-02-08 13:05 ` rixed
  1 sibling, 1 reply; 18+ messages in thread
From: Gabriel Scherer @ 2012-02-08 12:54 UTC (permalink / raw)
  To: Matej Košík; +Cc: caml-list

There is no purpose, it's just an edge case of the simple lexical
specification you can find at:
  http://caml.inria.fr/pub/docs/manual-ocaml/lex.html#float-literal

Everywhere digits are allowed, you can insert extraneous underscores. There
is no restriction that there must be at least one digit for underscores to
be valid. I don't see why there should be.
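
As a small illustration of that rule (a sketch, assuming the lexer follows
the manual's float-literal grammar): the literal must start with a digit,
since something like _1 is an identifier, but after that underscores may
appear wherever digits may, including right after the point.

```ocaml
let a = 1._           (* the literal from the question *)
let b = 1_000.000_1   (* underscores on both sides of the point *)
let c = 1__0.         (* consecutive underscores are accepted too *)

let () =
  assert (a = 1.0);
  assert (b = 1000.0001);
  assert (c = 10.0)
```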

On Wed, Feb 8, 2012 at 1:46 PM, Matej Košík <
5764c029b688c1c0d24a2e97cd764f@gmail.com> wrote:

> Hi,
>
> Ocaml allows me to add '_' at the end of a floating point literal, e.g.:
>
>        1._
>
> What can be a purpose for that?
>
> In case of long or Long integers, optional adding of '_' between the
> integer and 'l' or 'L' make sense ('l' is hard to discriminate from '1'
> for many fonts). But in case of floats, I am not sure.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 12:46 [Caml-list] syntactic detail Matej Košík
  2012-02-08 12:54 ` Gabriel Scherer
@ 2012-02-08 13:05 ` rixed
  2012-02-09  9:05   ` Matej Košík
  1 sibling, 1 reply; 18+ messages in thread
From: rixed @ 2012-02-08 13:05 UTC (permalink / raw)
  To: caml-list

Just in case it was not clear from Gabriel's answer alone, the
purpose of allowing underscores is to write big integers like:

  1_000_000_000

which is very nice once you are used to it.
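
A short sketch showing that the underscores are purely visual and do not
change the value (the names billion and mask are made up for the example):

```ocaml
let billion = 1_000_000_000
let mask = 0xFF_FF            (* works in hexadecimal literals as well *)

let () =
  assert (billion = 1000000000);
  assert (mask = 65535)
```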


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 12:54 ` Gabriel Scherer
@ 2012-02-08 13:09   ` David House
  2012-02-08 13:39     ` oliver
  0 siblings, 1 reply; 18+ messages in thread
From: David House @ 2012-02-08 13:09 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: Matej Košík, caml-list

On 02/08/2012 12:54 PM, Gabriel Scherer wrote:
> There is no purpose, it's just an edge case of the simple lexical
> specification you can find at:
> http://caml.inria.fr/pub/docs/manual-ocaml/lex.html#float-literal
>
> Everywhere digits are allowed, you can insert extraneous underscores.
> There is no restriction that there must be at least one digit for
> underscores to be valid. I don't see why there should be.

I would actually prefer a slightly more constrained format. It is very 
easy to typo large numbers like:

   let ten_million = 10_000_0000 in

When eyeballing this, it is extremely easy to mistake it for 10^7, when
in actual fact it is 10^8.

I would prefer a syntax rule that only allows an underscore every three
characters (starting at the RHS of the number, i.e. complying with the
usual convention). Well, certainly that for decimal literals. For hex
literals you probably want to enforce the same, but every four characters.
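
The proposed rule can be sketched as a check on the literal's text. This is
a hypothetical helper, not something OCaml provides: strictly_grouped and
all_digits are names invented here, and only decimal digits are handled.

```ocaml
(* Hypothetical check, not an OCaml feature. *)

(* True when [s] is non-empty and consists only of decimal digits. *)
let all_digits s =
  let rec go i =
    i = String.length s || (s.[i] >= '0' && s.[i] <= '9' && go (i + 1))
  in
  s <> "" && go 0

(* [strictly_grouped ~every lit] holds when [lit] has no underscores, or
   when each underscore separates a full group of [every] digits counted
   from the right (only the leftmost group may be shorter). *)
let strictly_grouped ~every lit =
  match String.split_on_char '_' lit with
  | [] -> false
  | [ whole ] -> all_digits whole
  | first :: rest ->
    all_digits first
    && String.length first <= every
    && List.for_all (fun g -> all_digits g && String.length g = every) rest

let () =
  assert (strictly_grouped ~every:3 "10_000_000");
  assert (not (strictly_grouped ~every:3 "10_000_0000"));
  assert (strictly_grouped ~every:3 "100000000")
```

Under such a rule the mistyped 10_000_0000 would be rejected, while both
the plain and the correctly grouped spellings would still be accepted.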

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:09   ` David House
@ 2012-02-08 13:39     ` oliver
  2012-02-08 13:45       ` oliver
  2012-02-08 13:46       ` David House
  0 siblings, 2 replies; 18+ messages in thread
From: oliver @ 2012-02-08 13:39 UTC (permalink / raw)
  To: David House; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed, Feb 08, 2012 at 01:09:48PM +0000, David House wrote:
> On 02/08/2012 12:54 PM, Gabriel Scherer wrote:
> >There is no purpose, it's just an edge case of the simple lexical
> >specification you can find at:
> >http://caml.inria.fr/pub/docs/manual-ocaml/lex.html#float-literal
> >
> >Everywhere digits are allowed, you can insert extraneous underscores.
> >There is no restriction that there must be at least one digit for
> >underscores to be valid. I don't see why there should be.
> 
> I would actually prefer a slightly more constrained format. It is
> very easy to typo large numbers like:
> 
>   let ten_million = 10_000_0000 in
> 
> When eyeballing this, it is extremely easy to mistake this as 10^7,
> when it actual fact it is 10^8.
[...]

I think the problem of confusing 10_000_0000 with 100_000_000
or 10_000_0000 with 10_000_000 is less prominent than confusing
100000000 with 10000000.

So this syntax rule already makes things much easier to read.
That's its advantage.

Requiring "_" every three characters as a syntax rule could make sense,
but its absence is less of a problem than not allowing "_" in numbers at all.


For rare cases it might also make sense to insert "_" arbitrarily:
in math education, or when clarifying things to others,
one could put "_" around the digits that differ.
So it might make sense for highlighting differences and for educational
purposes.

For example, you might have an algorithm whose starting value
for a certain behaviour is
  3.36639926549992
but your colleague uses
  3.36639296549992

For clarification one could write:
  3.36639926549992 vs.
  3.36639296549992

or better

  3.36639_92_6549992 vs.
  3.36639_29_6549992

This is a case where two digits were swapped.


It might not be a case that pops up all too often,
but when it does, it is nice that both values can be written like this:

  let startval_correct = 3.36639_92_6549992 
  let startval_wrong   = 3.36639_29_6549992


So I have invented reasons why it's fine as it is.
Why should this case be forbidden?
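
A quick sketch confirming that the marked-up literals are ordinary floats:
the underscores only highlight the swapped digits and do not affect the
values.

```ocaml
let startval_correct = 3.36639_92_6549992
let startval_wrong   = 3.36639_29_6549992

let () =
  (* Same values as the literals written without underscores. *)
  assert (startval_correct = 3.36639926549992);
  assert (startval_wrong = 3.36639296549992);
  (* And the two start values really are different numbers. *)
  assert (startval_correct <> startval_wrong)
```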


> 
> I would prefer a syntax rule that only allows underscore every three
> characters (starting at the RHS of the number, i.e. complying to the
> usual convention). Well, certainly that for decimal literals. For
> hex literals you probably want to enforce the same, but every four
> characters.
[...]

For hex it might also make sense to allow it every two characters.

If the rule only allowed it every four characters, that would be bad.

Ciao,
   Oliver

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:39     ` oliver
@ 2012-02-08 13:45       ` oliver
  2012-02-08 13:46       ` David House
  1 sibling, 0 replies; 18+ messages in thread
From: oliver @ 2012-02-08 13:45 UTC (permalink / raw)
  To: David House; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed, Feb 08, 2012 at 02:39:26PM +0100, oliver wrote:
[...]
> For example you might have a startvalue of an algorithm and the staring value
> for a certin behaviour is
>   3.36639926549992
> but your colleague uses
>   3.36639296549992
> 
> For clarification one could write:
>   3.36639926549992 vs.
>   3.36639296549992
> 
> or better
> 
>   3.36639_92_6549992 vs.
>   3.36639_29_6549992
[...]

Or, of course, it would be possible to write:

  3.366_399_265_499_92
  vs.
  3.366_392_965_499_92

which is what a strict every-three-digits spacing would give.

That makes the difference less clear.

Ciao,
  Oliver

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:39     ` oliver
  2012-02-08 13:45       ` oliver
@ 2012-02-08 13:46       ` David House
  2012-02-08 13:58         ` oliver
  1 sibling, 1 reply; 18+ messages in thread
From: David House @ 2012-02-08 13:46 UTC (permalink / raw)
  To: oliver; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed 08 Feb 2012 01:39:26 PM GMT, oliver wrote:
> I think the problem of being confounding 10_000_0000 with 100_000_000
> or 10_000_0000 with 10_000_000 is less prominent then confounding
> 100000000 with 10000000.
>
> So this syntax rule already makes things much easier to read.
> That's it's advantage.
>
> To have here all three characters as syntax rule could make sense,
> but is less a problem than having no allowance of "_" in numbers at all.

That is definitely true.

> Might not be the case that pops up all to often,
> but if so, it might be fine, if both values can be used:
>
>    let startval_correct = 3.36639_92_6549992
>    let startval_wrong   = 3.36639_29_6549992
>
>
> So I have invented reasons why it's fine as it is.

Perhaps this could happen. But I feel this could be expressed equally 
clearly using some other mechanism, like a comment. We don't have to 
have syntax-level support for every weird thing people would like to do.

> Why should this case be forbidden?

Because it is impossible to distinguish it from the wrongly-delimited
case that I described, which leads to the bugs I described.

Your example actually raises another point -- I'm not sure how my ideas 
would be extended to bits after the decimal place in floating point 
literals. Doing something like "every third character going right from 
the point" would probably be sufficient.

>> I would prefer a syntax rule that only allows underscore every three
>> characters (starting at the RHS of the number, i.e. complying to the
>> usual convention). Well, certainly that for decimal literals. For
>> hex literals you probably want to enforce the same, but every four
>> characters.
> [...]
>
> For Hex it might also make sense to have it all two characters.
>
> If the rule would be only all 4 characters, that would be bad.

Sure, this seems okay.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:46       ` David House
@ 2012-02-08 13:58         ` oliver
  2012-02-08 14:12           ` David House
  0 siblings, 1 reply; 18+ messages in thread
From: oliver @ 2012-02-08 13:58 UTC (permalink / raw)
  To: David House; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed, Feb 08, 2012 at 01:46:39PM +0000, David House wrote:
> On Wed 08 Feb 2012 01:39:26 PM GMT, oliver wrote:
> >I think the problem of being confounding 10_000_0000 with 100_000_000
> >or 10_000_0000 with 10_000_000 is less prominent then confounding
> >100000000 with 10000000.
> >
> >So this syntax rule already makes things much easier to read.
> >That's it's advantage.
> >
> >To have here all three characters as syntax rule could make sense,
> >but is less a problem than having no allowance of "_" in numbers at all.
> 
> That is definitely true.
> 
> >Might not be the case that pops up all to often,
> >but if so, it might be fine, if both values can be used:
> >
> >   let startval_correct = 3.36639_92_6549992
> >   let startval_wrong   = 3.36639_29_6549992
> >
> >
> >So I have invented reasons why it's fine as it is.
> 
> Perhaps this could happen. But I feel this could be expressed
> equally clearly using some other mechanism, like a comment. We don't
> have to have syntax-level support for every weird thing people would
> like to do.

Whether something is weird often lies in the eye of the beholder.


An int-value which raises an exception on overflow would be something
much more important than making this syntax rule more restricted.

It's also somehow weird to write 1_000_000_000 instead of 1000000000.
Why should this weird "_" stuff be supported at all?

Writing +. instead of + also might be weird from a certain view.
So you are using a weird language.


> 
> >Why should this case be forbidden?
> 
> Because it is impossible to distinguish it from the
> wrongly-deliminated case that I described, which leads to the bugs I
> described.
[...]


But that case is just a typo, like it would be without any "_".

For some research it might make sense to delimit those digits which
are officially rounded in a setting from those which might be rounded.

like

   4.526829898
  vs.
   4.5_26829898
  vs.
   4.52_6829898

and so on.

So, even if you have a floating point value with 9 digits after the
decimal point, if you have a case where your official rounding
is one or two digits, but you have to use the exact value,
you could make this clear in the code.

But this might also be weird to you.


> 
> Your example actually raises another point -- I'm not sure how my
> ideas would be extended to bits after the decimal place in floating
> point literals. Doing something like "every third character going
> right from the point" would probably be sufficient.
> 
> >>I would prefer a syntax rule that only allows underscore every three
> >>characters (starting at the RHS of the number, i.e. complying to the
> >>usual convention). Well, certainly that for decimal literals. For
> >>hex literals you probably want to enforce the same, but every four
> >>characters.
> >[...]
> >
> >For Hex it might also make sense to have it all two characters.
> >
> >If the rule would be only all 4 characters, that would be bad.
> 
> Sure, this seems okay.

It would be too late if the four-digit rule had been implemented before
someone asked for the (weird?) two-digit rule...

Ciao,
   Oliver

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:58         ` oliver
@ 2012-02-08 14:12           ` David House
  2012-02-08 14:39             ` Gabriel Scherer
  2012-02-08 16:21             ` oliver
  0 siblings, 2 replies; 18+ messages in thread
From: David House @ 2012-02-08 14:12 UTC (permalink / raw)
  To: oliver; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed 08 Feb 2012 01:58:18 PM GMT, oliver wrote:
>> Perhaps this could happen. But I feel this could be expressed
>> equally clearly using some other mechanism, like a comment. We don't
>> have to have syntax-level support for every weird thing people would
>> like to do.
>
> If something is a weird thing often lies in the eye of the beholder.

My definition of "weird" is "few people use this in practice". 

Clearly, delimiting groups of thousands is useful to a lot of people. 
But it hides bugs, because if you see 10_000_0000 you are much more 
likely to think it is 10^7 than you are with 100000000, where you are 
likely to be careful and take your time. We can prevent this by more 
stringent syntax rules. This would also prevent some corner cases that 
you have described, that probably barely anyone cares about. It's not a 
free restriction, but it is cheap, and definitely has value.

> An int-value which raises an exception on overflow would be something
> much more important than making this syntax rule more restricted.

That's completely orthogonal.

> It's also somehow weird, to write   1_000_000_000 instead of 1000000000.
> Why should this weird "_" stuff supported at all?
>
> Writing +. instead of + also might be weird from a certain view.
> So you are using a weird language.

I think this is addressed by my definition of "weird" above.

>>> Why should this case be forbidden?
>>
>> Because it is impossible to distinguish it from the
>> wrongly-deliminated case that I described, which leads to the bugs I
>> described.
> [...]
>
>
> But that case is just a typo, like it would be without any "_".

I don't understand. Wouldn't it be better to have a syntax where it is 
harder to make typos?

> For some rsearch it might make sense to delimit those digits which
> are officially rounded in a setting from those which might be rounded.
>
> like
>
>     4.526829898
>    vs.
>     4.5_26829898
>    vs.
>     4.52_6829898
>
> and so on.
>
> So, even you have a floating point value with 9 digits after the
> decimal point, if you have a case where your official rounding
> is one or two digits, but you have to use the correct value,
> you could clarify this in the code.

This could also be done by, e.g., defining a new type with explicit
coercions:

module Two_dp_float : sig
  type t  (* abstract, so a t cannot be used as a plain float by accident *)
  val of_float : float -> t
  val to_float : t -> float
end = struct
  type t = float
  let of_float x = x
  let to_float x = x
end

This actually enforces that you get the notation right in your code,
rather than with the underscores, where you could typo and put the
underscore too far right, or forget to put them in altogether.
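
A usage sketch of that idea. It assumes the signature also declares the
abstract type t, which the snippet needs in order to compile; the names
rounded and next are made up for the example:

```ocaml
module Two_dp_float : sig
  type t
  val of_float : float -> t
  val to_float : t -> float
end = struct
  type t = float
  let of_float x = x
  let to_float x = x
end

let rounded : Two_dp_float.t = Two_dp_float.of_float 4.52

(* [rounded +. 1.0] would be rejected by the type checker, because
   Two_dp_float.t is abstract; the coercion has to be spelled out. *)
let next = Two_dp_float.to_float rounded +. 1.0

let () = assert (abs_float (next -. 5.52) < 1e-9)
```

The coercions are identities at run time, so the only effect is on what the
type checker will accept.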

But more generally, I think it is worth more, in terms of bugs saved, 
to restrict the syntax versus allowing these infrequently-used cases.

>>> For Hex it might also make sense to have it all two characters.
>>>
>>> If the rule would be only all 4 characters, that would be bad.
>>
>> Sure, this seems okay.
>
> Too late, if the four-digit rule would have been implemented before the
> (weird?) two-digit rule was asked by someone...

You're right, that would be a change that would probably break a lot of 
code. I claim my suggestion would not break much code.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 14:12           ` David House
@ 2012-02-08 14:39             ` Gabriel Scherer
  2012-02-08 14:50               ` David House
  2012-02-08 16:21             ` oliver
  1 sibling, 1 reply; 18+ messages in thread
From: Gabriel Scherer @ 2012-02-08 14:39 UTC (permalink / raw)
  To: David House; +Cc: oliver, Matej Košík, caml-list

People. Please. Tell me you are *not* arguing over underscores in numeric
literals !

> But it hides bugs, because if you see 10_000_0000 you are
> much more likely to think it is 10^7 than you are with 100000000,
> where you are likely to be careful and take your time.

So your point is : it is dangerous because it is clearer. I also recommend
we forbid comments, since:
- they can be abused, even by mistake, to make code *harder* to read
- removing them will force people to read code more carefully

I'm out of this discussion.


PS: Planet OCaml needs some love. If you're considering contributing to the
present debate, please also consider writing a blog story!



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 14:39             ` Gabriel Scherer
@ 2012-02-08 14:50               ` David House
  2012-02-08 15:19                 ` Vincent Aravantinos
  2012-02-08 16:30                 ` oliver
  0 siblings, 2 replies; 18+ messages in thread
From: David House @ 2012-02-08 14:50 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: oliver, Matej Košík, caml-list

On 02/08/2012 02:39 PM, Gabriel Scherer wrote:
> People. Please. Tell me you are *not* arguing over underscores in
> numeric literals !

This is not totally academic. I have come across the exact bug I 
describe. It was painful.

>  > But it hides bugs, because if you see 10_000_0000 you are
>  > much more likely to think it is 10^7 than you are with 100000000,
>  > where you are likely to be careful and take your time.
>
> So your point is : it is dangerous because it is clearer. I also
> recommend we forbid comments, since:
> - they can be abused, even by mistake, to make code *harder* to read
> - removing them will force people to read code more carefully

Allowing underscores is definitely better than not allowing them, since 
it makes code clearer. I'd rather put up with the possibility of these 
bugs than remove the feature altogether. But seeing as we can remove the 
option for these bugs with a pretty easy-going syntax restriction, why 
not? (There is no such analogous restriction that would make comments 
more accurate.)

That being said, I do agree with the general sentiment that too many 
bytes have been wasted on this thread. I'll try to extricate myself from 
the debate and get on with something useful :)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 14:50               ` David House
@ 2012-02-08 15:19                 ` Vincent Aravantinos
  2012-02-10  8:39                   ` Andrew
  2012-02-08 16:30                 ` oliver
  1 sibling, 1 reply; 18+ messages in thread
From: Vincent Aravantinos @ 2012-02-08 15:19 UTC (permalink / raw)
  To: caml-list



On 02/08/2012 09:50 AM, David House wrote:
> On 02/08/2012 02:39 PM, Gabriel Scherer wrote:
>> People. Please. Tell me you are *not* arguing over underscores in
>> numeric literals !
>
> This is not totally academic. I have come across the exact bug I 
> describe. It was painful.

I'm curious: on which occasions do you guys really have to type in such
numbers?

-- 
Vincent Aravantinos
Postdoctoral Fellow, Concordia University, Hardware Verification Group
http://users.encs.concordia.ca/~vincent


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 14:12           ` David House
  2012-02-08 14:39             ` Gabriel Scherer
@ 2012-02-08 16:21             ` oliver
  1 sibling, 0 replies; 18+ messages in thread
From: oliver @ 2012-02-08 16:21 UTC (permalink / raw)
  To: David House; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed, Feb 08, 2012 at 02:12:01PM +0000, David House wrote:
> On Wed 08 Feb 2012 01:58:18 PM GMT, oliver wrote:
> >>Perhaps this could happen. But I feel this could be expressed
> >>equally clearly using some other mechanism, like a comment. We don't
> >>have to have syntax-level support for every weird thing people would
> >>like to do.
> >
> >If something is a weird thing often lies in the eye of the beholder.
> 
> My definition of "weird" is "few people use this in practice".
> 
> Clearly, delimiting groups of thousands is useful to a lot of
> people. But it hides bugs, because if you see 10_000_0000 you are
> much more likely to think it is 10^7 than you are with 100000000,
> where you are likely to be careful and take your time. We can
> prevent this by more stringent syntax rules. This would also prevent
> some corner cases that you have described, that probably barely
> anyone cares about. It's not a free restriction, but it is cheap,
> and definitely has value.

I'm not sure it's cheap.
I don't know how much effort it would take to implement.
But I also don't see that it really is that important.


> 
> >An int-value which raises an exception on overflow would be something
> >much more important than making this syntax rule more restricted.
> 
> That's completely orthogonal.
[...]

Orthogonal when looking at the features themselves, but not when weighing
which of them is more important to implement.


> 
> >It's also somehow weird, to write   1_000_000_000 instead of 1000000000.
> >Why should this weird "_" stuff supported at all?
> >
> >Writing +. instead of + also might be weird from a certain view.
> >So you are using a weird language.
> 
> I think this is addressed by my definition of "weird" above.

No.

Of course +. must be used frequently, because it's the notation
that you need for float value addition.

So it's not a rare case; it's what you need to use when
you want to add float values in OCaml.
I doubt that most people only use int values in their code ;-)


> 
> >>>Why should this case be forbidden?
> >>
> >>Because it is impossible to distinguish it from the
> >>wrongly-deliminated case that I described, which leads to the bugs I
> >>described.
> >[...]
> >
> >
> >But that case is just a typo, like it would be without any "_".
> 
> I don't understand. Wouldn't it be better to have a syntax where it
> is harder to make typos?

Yes.

The kind of typo you have mentioned here seems to stem either from
"_" being allowed at all, or from using the "_" feature
without being used to its consequences when code has to be
changed.

More to that in a different mail.


> 
> >For some rsearch it might make sense to delimit those digits which
> >are officially rounded in a setting from those which might be rounded.
> >
> >like
> >
> >    4.526829898
> >   vs.
> >    4.5_26829898
> >   vs.
> >    4.52_6829898
> >
> >and so on.
> >
> >So, even you have a floating point value with 9 digits after the
> >decimal point, if you have a case where your official rounding
> >is one or two digits, but you have to use the correct value,
> >you could clarify this in the code.
> 
> This could also be done, by, e.g., defining a new type with explicit
> coercions:
> 
> module Two_dp_float : sig
>  val of_float : float -> t
>  val to_float : t -> float
> end = struct
>  type t = float
>  let of_float x = x
>  let to_float x = x
> end
[...]

I don't see where this addresses my example.


> 
> This actually enforces that you get the notation right in your code,
> rather than with the underscores, where you could typo and put the
> underscore too far right, or forget to put them in all together.

Not sure if you know what I was talking about.
But maybe my example was misleading or not well explained.

> 
> But more generally, I think it is worth more, in terms of bugs
> saved, to restrict the syntax versus allowing these
> infrequently-used cases.
[...]

Not sure if using "_" at all is done frequently.

Maybe a survey should clarify this; otherwise it's just a statement
on frequently used vs. not frequently used, based on your personal assumptions.



Ciao,
   Oliver

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 14:50               ` David House
  2012-02-08 15:19                 ` Vincent Aravantinos
@ 2012-02-08 16:30                 ` oliver
  2012-02-10  3:37                   ` Jun Furuse
  1 sibling, 1 reply; 18+ messages in thread
From: oliver @ 2012-02-08 16:30 UTC (permalink / raw)
  To: David House; +Cc: Gabriel Scherer, Matej Košík, caml-list

On Wed, Feb 08, 2012 at 02:50:55PM +0000, David House wrote:
> On 02/08/2012 02:39 PM, Gabriel Scherer wrote:
> >People. Please. Tell me you are *not* arguing over underscores in
> >numeric literals !
> 
> This is not totally academic. I have come across the exact bug I
> describe. It was painful.
[...]


Let me guess where the problem might have come from:

When I think of code that uses a value
  1_000_000
  and you want to change it to a value ten times higher,
  it should be changed to
  10_000_000


Coming from a notation that does NOT allow "_" in numbers,
it could be done by just adding one "0" at the end of the value:

  1000000
becomes
  10000000
         ^

with the "0" added at the end.

But also correct ("more correct") would be:

  1000000
becomes
  10000000
   ^

"0" added at the millions.

"Just add one "0" at the end" is the editing habit, which works fine.


But if "_" is allowed inside numbers
and people don't change the "wrong" editing behaviour,
then allowing "_" at all introduces a new kind
of possible error.
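
The editing habit can be sketched as follows (the names are made up for the
example). Appending the "0" still yields the right value; only the grouping
becomes misleading:

```ocaml
let before = 1_000_000
let after_append = 1_000_0000   (* "0" appended at the end *)
let after_insert = 10_000_000   (* "0" inserted after the leading "1" *)

let () =
  assert (after_append = 10 * before);
  assert (after_append = after_insert)
```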

This could be an argument to throw out "_" altogether,
because adding a "0" after the "1" instead of just
adding a "0" at the end is a rarely used editing behaviour,
and some people might call it "weird". ;-)

So this argument could also be used to disallow "_" altogether.


But no, that's not what I want to argue for ;-)



OK, let's stop that discussion now.

If someone thinks the three-digit-distance "_" is a feature that makes sense,
a feature request could be filed for OCaml. ;-)


Ciao,
  Oliver

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 13:05 ` rixed
@ 2012-02-09  9:05   ` Matej Košík
  2012-02-09 10:56     ` Wojciech Meyer
  0 siblings, 1 reply; 18+ messages in thread
From: Matej Košík @ 2012-02-09  9:05 UTC (permalink / raw)
  To: caml-list

On 02/08/2012 01:05 PM, rixed@happyleptic.org wrote:
> Just in case it was not clear from Gabriel's answer alone, the
> purpose of allowing underscores is to write big integers like:
> 
>   1_000_000_000
> 
> which is very nice once you are used to it.

So I have learnt something new. Thanks.

This could be interesting (and in some special cases also useful)
optional syntactic sugar. If it were in my power, I would drop it from
the core language.

(although I found underscore still useful putting in between
32-bit/64-bit literal decimal integer and 'l'/'L' suffix.)
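
A minimal sketch of that use of the underscore, assuming only the standard lexical rules for integer literals (the binding names are invented for the example):

```ocaml
(* The underscore separates the digits from the width suffix,
   so "100_l" cannot be misread as "1001". *)
let small : int32 = 100_l   (* same value as 100l *)
let large : int64 = 100_L   (* same value as 100L *)

let () =
  assert (small = 100l);
  assert (large = 100L);
  (* %ld prints an int32, %Ld prints an int64. *)
  Printf.printf "%ld %Ld\n" small large
```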

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-09  9:05   ` Matej Košík
@ 2012-02-09 10:56     ` Wojciech Meyer
  0 siblings, 0 replies; 18+ messages in thread
From: Wojciech Meyer @ 2012-02-09 10:56 UTC (permalink / raw)
  To: Matej Košík; +Cc: caml-list

Hi,

> This could be an interesting (and in some special cases also useful)
> piece of optional syntactic sugar. If it were in my power, I would drop
> it from the core language.

You can't really turn it into an extension, because it's a lexeme and it's
not possible to extend the lexer right now, if you are thinking about Camlp4.

Wojciech

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 16:30                 ` oliver
@ 2012-02-10  3:37                   ` Jun Furuse
  0 siblings, 0 replies; 18+ messages in thread
From: Jun Furuse @ 2012-02-10  3:37 UTC (permalink / raw)
  To: oliver; +Cc: caml-list, Matej Košík, Gabriel Scherer, David House

Hi,

Just FYI, there are more than 10_0000_0000 weird people in weird Asia who
use a weird 10^4-based digit system.
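
As a quick illustration (binding names are hypothetical), the current lexer already accepts both groupings, and they denote the same integer:

```ocaml
(* Hypothetical names, for illustration only. *)
let grouped_by_thousands = 1_000_000_000   (* 10^3-based grouping *)
let grouped_by_myriads   = 10_0000_0000    (* 10^4-based grouping *)

let () =
  (* Both literals lex to the same value; only the visual grouping differs. *)
  assert (grouped_by_thousands = grouped_by_myriads);
  Printf.printf "%d\n" grouped_by_myriads
```

A rule that only allowed "_" at three-digit distances would reject the second literal.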

j
On Feb 9, 2012 12:31 AM, "oliver" <oliver@first.in-berlin.de> wrote:

> On Wed, Feb 08, 2012 at 02:50:55PM +0000, David House wrote:
> > On 02/08/2012 02:39 PM, Gabriel Scherer wrote:
> > >People. Please. Tell me you are *not* arguing over underscores in
> > >numeric literals !
> >
> > This is not totally academic. I have come across the exact bug I
> > describe. It was painful.
> [...]
>
>
> Let me guess where the problem might have come from:
>
> Think of code that uses the value
>  1_000_000
> that you want to change to a value ten times higher.
> It should be changed to
>  10_000_000
>
>
> Coming from a notation that does NOT allow "_" in numbers,
> this could be done by just adding one "0" at the end of the value:
>
>  1000000
> becomes
>  10000000
>         ^
>
> with the "0" added at the end.
>
> But also correct ("more correct") would be:
>
>  1000000
> becomes
>  10000000
>   ^
>
> with the "0" added at the millions.
>
>  "Just add one "0" at the end"
>
> is the editing habit, and it works fine.
>
>
> But if "_" is allowed inside numbers
> and people don't change the "wrong" editing behaviour,
> then allowing "_" at all introduces a new kind
> of possible error.
>
> This could be an argument for throwing out "_" altogether,
> because adding a "0" after the "1" instead of just
> adding a "0" at the end is a rarely used editing behaviour,
> and some people might call it "weird". ;-)
>
>
> But no, that's not what I want to argue for. ;-)
>
>
>
> OK, let's stop that discussion now.
>
> If someone thinks the three-digit-distance "_" is a feature that makes
> sense,
> a feature request could be filed for OCaml. ;-)
>
>
> Ciao,
>  Oliver
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Caml-list] syntactic detail
  2012-02-08 15:19                 ` Vincent Aravantinos
@ 2012-02-10  8:39                   ` Andrew
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew @ 2012-02-10  8:39 UTC (permalink / raw)
  To: caml-list

On 2012-02-08 16:19, Vincent Aravantinos wrote:
 > I'm curious: in which occasions do you guys really have to type in such numbers?

+1, although I do when I declare a huge array of fixed size; but I usually write 10*1000*1000.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-02-10  8:39 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-08 12:46 [Caml-list] syntactic detail Matej Košík
2012-02-08 12:54 ` Gabriel Scherer
2012-02-08 13:09   ` David House
2012-02-08 13:39     ` oliver
2012-02-08 13:45       ` oliver
2012-02-08 13:46       ` David House
2012-02-08 13:58         ` oliver
2012-02-08 14:12           ` David House
2012-02-08 14:39             ` Gabriel Scherer
2012-02-08 14:50               ` David House
2012-02-08 15:19                 ` Vincent Aravantinos
2012-02-10  8:39                   ` Andrew
2012-02-08 16:30                 ` oliver
2012-02-10  3:37                   ` Jun Furuse
2012-02-08 16:21             ` oliver
2012-02-08 13:05 ` rixed
2012-02-09  9:05   ` Matej Košík
2012-02-09 10:56     ` Wojciech Meyer
