caml-list - the Caml user's mailing list
* Labels and operators
From: John Prevost @ 2000-06-19  4:14 UTC
  To: caml-list

A friend of mine recently said "if ML had regexp stuff that was as
convenient as Perl, I'd switch to it for everything", and mentioned =~
as something he specifically wanted.  So, as I was walking home
tonight, I thought "hey, I bet I could make some little operators for
the PCRE library and show him!"

But, it also occurred to me that you want to use the nice labelled
optional argument stuff, and I wasn't sure you could do that with
operators.  Here's what I've discovered.

The following definition is kind of nonsensical, since obviously you
need at least one and probably at least two arguments for an infix
operator:

# let (+) ?(x = 1) ?(y = 1) () = x + y;;
val ( + ) : ?x:int -> ?y:int -> unit -> int = <fun>

but, well, it accepts it.  Now, let's see if we can apply this
operator as a function.

# (+) ();;
- : int = 2
# (+) ~x:5 ();;
- : int = 6
# (+) ~x:5 ~y:5 ();;
- : int = 10

Okay, that's as expected.  What about as an operator?  Well, the first
case is obviously degenerate, since we're not giving enough arguments
for a two argument operator.  What about the second and third:

# ~x:5 + ();;
  ---
Syntax error
# ~x:5 + ~y:5 ();;
  ---
Syntax error

Well, that's a little disappointing.  I sort of half expected it,
though, since this isn't function application we're dealing with.

Here's what I was *trying* to do:

# let (=~) s ?iflags ?flags ?rex ?pos pat =
    pmatch ?iflags ?flags ?rex ~pat ?pos s;;
val ( =~ ) :
  string -> ?iflags:Pcre.irflag -> ?flags:Pcre.rflag list ->
    ?rex:Pcre.regexp -> ?pos:int -> string -> bool = <fun>
# "foo" =~ "\\s+";;
- : bool = false
# "foo" =~ "f";;
- : bool = true
# "foo" =~ "f" ~pos:1;;
           ---
This expression is not a function, it cannot be applied
# ("foo" =~ "f") ~pos:1;;
  --------------
This expression is not a function, it cannot be applied
# (=~) "foo" ~pos:1 "f";;
- : bool false
# (=~) "foo" "f" ~pos:1;;
- : bool false

The only solution I can think of is something like:

# let re ?iflags ?flags ?rex ?pos pat = (iflags, flags, rex, Some pat, pos);;
val re :
  ?iflags:'a -> ?flags:'b -> ?rex:'c -> ?pos:'d -> 'e ->
    'a option * 'b option * 'c option * 'e option * 'd option =
  <fun>
# let (=~) s (iflags,flags,rex,pat,pos) =
    pmatch ?iflags ?flags ?rex ?pat ?pos s;;
val ( =~ ) :
  string ->
  Pcre.irflag option * Pcre.rflag list option * Pcre.regexp option *
  string option * int option -> bool = <fun>
# "foo" =~ re "f";;
- : bool = true
# "foo" =~ re "f" ~pos:1;;
- : bool = false

Which, well, works, but seems kind of nasty.
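
For readers without PCRE installed, the same packaging trick can be tried
with only the standard library.  This is a minimal sketch, not the Pcre
API: `pmatch` here is a hypothetical stand-in that does plain substring
search (no regular expressions), and only the `?pos` option is kept.

```ocaml
(* Hypothetical stand-in for Pcre.pmatch: plain substring search, no regex.
   Returns true if [pat] occurs in [s] starting at index [pos] or later. *)
let pmatch ?(pos = 0) ~pat s =
  let n = String.length s and m = String.length pat in
  let rec go i = i + m <= n && (String.sub s i m = pat || go (i + 1)) in
  pos >= 0 && go pos

(* [re] collects the optional argument and the pattern into one value... *)
let re ?pos pat = (pos, pat)

(* ...so the infix operator only ever sees two plain arguments. *)
let ( =~ ) s (pos, pat) = pmatch ?pos ~pat s

let () =
  assert ("foo" =~ re "f");
  assert (not ("foo" =~ re "f" ~pos:1))
```

The labelled argument is accepted here because `re "f" ~pos:1` is an
ordinary function application, which is parsed before `=~` is considered.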


Since the syntax of labeled arguments is based around function
application, and since function application (juxtaposition) has higher
precedence than any other "operator", I can see why it's not
syntactically valid to try to use labels on arguments to an operator.
I don't see any clean way of "fixing" this, so I figured I ought to
warn people that while you can define an operator with labeled
arguments, you're not going to get much use of it as an infix.
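
To make the parsing issue concrete, here is a small sketch using a
hypothetical operator built on the standard library only (it is not
Pcre, and it checks for the pattern at one exact position).  Because
application binds tighter than any infix operator, `"foo" =~ "f" ~pos:1`
is read as `"foo" =~ ("f" ~pos:1)`, so the label can only be supplied in
prefix form.

```ocaml
(* Hypothetical operator: does [pat] occur in [s] exactly at index [pos]? *)
let ( =~ ) s ?(pos = 0) pat =
  let m = String.length pat in
  pos >= 0 && pos + m <= String.length s && String.sub s pos m = pat

(* Infix use works, but only with the default value for ?pos. *)
let a = "foo" =~ "f"

(* Prefix use is ordinary application, so the label is accepted. *)
let b = ( =~ ) "foo" ~pos:1 "f"
let c = ( =~ ) "foo" ~pos:1 "o"

(* "foo" =~ "f" ~pos:1 is rejected: it tries to apply the string "f". *)
```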

Well, okay, I think it might be reasonable to change the syntax to
allow this syntax:

<expr1> <op> <labelled args> <expr2>

since the labelled args could not in any way shape or form be thought
to go with either expr1 or expr2.  This would lead to things like:

# "foo" =~ ~pos:1 "f";;
- : bool = false

being possible.  Don't know whether it's a great idea, though.



John.




* Re: Labels and operators
From: Ken Wakita @ 2000-06-22  1:48 UTC
  To: prevost; +Cc: caml-list


Hello.

I tried the same with the Str module.  Except for the parentheses, it
seems ok to me.

Ken

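(* Current Str search functions take unlabelled arguments:
   regexp, string, start position. *)
let pmatch s ~pat ?(direction = `forward) ?(pos = 0) () =
  match direction with
    `forward -> Str.search_forward pat s pos
  | `backward -> Str.search_backward pat s pos

(* The operator returns a function still awaiting the optional
   arguments and the final unit. *)
let (=~) s regexp = pmatch s ~pat:(Str.regexp regexp)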

# ("Hello world!" =~ "[a-z]+") ();;
- : int = 1
# ("Hello world!" =~ "[A-Za-z]+") ();;
- : int = 0
# ("Hello world!" =~ "[A-Za-z]+") ~direction: `backward ~pos: 12 ();;
- : int = 10
# ("Hello world!" =~ "[A-Za-z]+") ~direction: `backward ~pos: 10 ();;
- : int = 10
# 
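
The parentheses are needed because the infix expression yields a function
that is still waiting for its optional arguments and the closing unit.  A
minimal sketch of that shape, using only the standard library; the `find`
helper is a hypothetical substring search standing in for the Str calls:

```ocaml
(* Hypothetical helper: index of the first occurrence of [pat] in [s]
   at or after [pos]; raises Not_found if there is none. *)
let find s ~pat ?(pos = 0) () =
  let n = String.length s and m = String.length pat in
  let rec go i =
    if i + m > n then raise Not_found
    else if String.sub s i m = pat then i
    else go (i + 1)
  in
  go (max pos 0)

(* Partial application: the result has type ?pos:int -> unit -> int. *)
let ( =~ ) s pat = find s ~pat

(* The parenthesised infix expression is then applied like any function. *)
let first = ("Hello world!" =~ "o") ()          (* 4 *)
let second = ("Hello world!" =~ "o") ~pos:5 ()  (* 7 *)
```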




* Re: Labels and operators
From: Jacques Garrigue @ 2000-06-22  5:19 UTC
  To: prevost; +Cc: caml-list

From: John Prevost <prevost@maya.com>

> A friend of mine recently said "if ML had regexp stuff that was as
> convenient as Perl, I'd switch to it for everything", and mentioned =~
> as something he specifically wanted.  So, as I was walking home
> tonight, I thought "hey, I bet I could make some little operators for
> the PCRE library and show him!"
> 
> But, it also occurred to me that you want to use the nice labelled
> optional argument stuff, and I wasn't sure you could do that with
> operators.  Here's what I've discovered.

> The only solution I can think of is something like:
> 
> # let re ?iflags ?flags ?rex ?pos pat = (iflags, flags, rex, Some pat, pos);;
> val re :
>   ?iflags:'a -> ?flags:'b -> ?rex:'c -> ?pos:'d -> 'e ->
>     'a option * 'b option * 'c option * 'e option * 'd option =
>   <fun>
> # let (=~) s (iflags,flags,rex,pat,pos) =
>     pmatch ?iflags ?flags ?rex ?pat ?pos s;;
> val ( =~ ) :
>   string ->
>   Pcre.irflag option * Pcre.rflag list option * Pcre.regexp option *
>   string option * int option -> bool = <fun>
> # "foo" =~ re "f";;
> - : bool = true
> # "foo" =~ re "f" ~pos:1;;
> - : bool = false
> 
> Which, well, works, but seems kind of nasty.

In fact you have a more general solution than that.
What you are trying to do here is just reverse application:

# let (>>) x f = f x;;
val ( >> ) : 'a -> ('a -> 'b) -> 'b = <fun>
# let sub ?(pos=0) ?len s =
    let len = match len with None -> String.length s - pos | Some x -> x in
    StringLabels.sub s ~pos ~len;;
val sub : ?pos:int -> ?len:int -> string -> string = <fun>
# "hello" >> sub ~pos:3;;
- : string = "lo"
# "hello" >> sub ~len:3;;
- : string = "hel"

And, with Pcre (artificial since I don't have it installed),

# "foo" >> pmatch ~pat:"f";;
- : bool = true
# "foo" >> pmatch ~pat:"f" ~pos:1;;
- : bool = false

That's not exactly the Perl operator, but it feels a lot like it.
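
In current OCaml the reverse-application operator is available in the
standard library as `|>`, so no definition is needed.  A small sketch of
the same idea follows; `sub` is rewritten with the unlabelled `String.sub`,
which is an adaptation for illustration and not the code above.

```ocaml
(* Substring with optional start and length; [len] defaults to the rest. *)
let sub ?(pos = 0) ?len s =
  let len = match len with None -> String.length s - pos | Some n -> n in
  String.sub s pos len

(* The subject comes first; the function and its labelled options follow. *)
let tail = "hello" |> sub ~pos:3   (* "lo" *)
let head = "hello" |> sub ~len:3   (* "hel" *)
```

Any optional argument left unsupplied is filled with its default when the
partially applied function is passed to `|>`.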

> Well, okay, I think it might be reasonable to change the syntax to
> allow this syntax:
> 
> <expr1> <op> <labelled args> <expr2>
> 
> since the labelled args could not in any way shape or form be thought
> to go with either expr1 or expr2.  This would lead to things like:
> 
> # "foo" =~ ~pos:1 "f";;
> - : bool = false
> 
> being possible.  Don't know whether it's a great idea, though.

This is technically possible, but it might be confusing:
<expr2> itself may be a complex expression without parentheses.

# "foo" =~ ~pos:1 String.sub "foo" 0 1 ^ "bar";;

It would take some practice to parse that.

        Jacques




* Re: Labels and operators
From: Markus Mottl @ 2000-06-24 14:44 UTC
  To: John Prevost; +Cc: caml-list

On Mon, 19 Jun 2000, John Prevost wrote:
> A friend of mine recently said "if ML had regexp stuff that was as
> convenient as Perl, I'd switch to it for everything", and mentioned =~
> as something he specifically wanted.  So, as I was walking home
> tonight, I thought "hey, I bet I could make some little operators for
> the PCRE library and show him!"

The "=~"-operator itself, as it is normally used in Perl for matching,
is fairly easy to replicate:

  let (=~) str pat = Pcre.pmatch ~pat str

  let _ =
    print_endline (if read_line () =~ "foo" then "has foo!" else "no foo!")

> But, it also occurred to me that you want to use the nice labelled
> optional argument stuff, and I wasn't sure you could do that with
> operators.  Here's what I've discovered.

Well, sometimes we are really stuck with the merely "two-dimensional" way
in which we can write our sources (top-down + left-right). If we could
write into the depth, there would be an elegant solution for adding
arguments to infix operators...

> The only solution I can think of is something like:
[snip]
> # "foo" =~ re "f";;
> - : bool = true
> # "foo" =~ re "f" ~pos:1;;
> - : bool = false
> 
> Which, well, works, but seems kind of nasty.

I normally try to avoid new operators, but if I wanted to have a somewhat
"powered up" version of "=~", your version here would look fine to me -
just read every piece aloud:

  "foo"      =~                re              "f"       ~pos:1
  "foo" - is matched by - regular expression - "f" - at position one

This is pretty close to the human way of expressing things. (Larry Wall,
the linguist, would be proud of you! ;-)

> since the labelled args could not in any way shape or form be thought
> to go with either expr1 or expr2.  This would lead to things like:
> 
> # "foo" =~ ~pos:1 "f";;
> - : bool = false
> 
> being possible.  Don't know whether it's a great idea, though.

I prefer your first version: "subject", "verb" and "object" are close
together, the additional modifiers only follow afterwards. To my
knowledge, most natural languages would order expressions like this.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl




* Re: Labels and operators
From: John Prevost @ 2000-06-24 18:03 UTC
  To: Markus Mottl; +Cc: caml-list

>>>>> "mm" == Markus Mottl <mottl@miss.wu-wien.ac.at> writes:

    mm> Well, sometimes we are really stuck with the merely
    mm> "two-dimensional" way in which we can write our sources
    mm> (top-down + left-right). If we could write into the depth,
    mm> there would be an elegant solution for adding arguments to
    mm> infix operators...

    >> The only solution I can think of is something like:
    mm> [snip]
    >> # "foo" =~ re "f";;
    >> - : bool = true
    >> # "foo" =~ re "f" ~pos:1;;
    >> - : bool = false
    >> 
    >> Which, well, works, but seems kind of nasty.

    mm> I normally try to avoid new operators, but if I wanted to have
    mm> a somewhat "powered up" version of "=~", your version here
    mm> would look fine to me - just read every piece aloud:

    mm>   "foo"      =~                re              "f"       ~pos:1
    mm>   "foo" - is matched by - regular expression - "f" - at position one

    mm> This is pretty close to the human way of expressing
    mm> things. (Larry Wall, the linguist, would be proud of you! ;-)

Honestly, linguistics is a hobby of mine, and Larry (whom I've met)
doesn't impress me much.  ;>

    >> since the labelled args could not in any way shape or form be
    >> thought to go with either expr1 or expr2.  This would lead to
    >> things like:
    >> 
    >> # "foo" =~ ~pos:1 "f";;
    >> - : bool = false
    >> 
    >> being possible.  Don't know whether it's a great idea, though.

    mm> I prefer your first version: "subject", "verb" and "object"
    mm> are close together, the additional modifiers only follow
    mm> afterwards. To my knowledge, most natural languages would
    mm> order expressions like this.

This is a good point, which I'd forgotten.  Natural languages have the
habit of ordering things so that the main arguments of a word
(whatever sort of word that may be) are always closer to the word than
any optional hangers on.  Essentially, this is to avoid problems like:

I gave the mayor of the northernmost city of the northernmost province
of the northernmost nation a token.

If you draw a picture of the structure of this sentence, you end up
with a big deep tree in between the verb "gave" and "a token", which
is the primary argument of that verb.  If you shift things around a
little, you don't get the problem:

I gave a token to the mayor of ...

So your assertion that most natural languages would order expressions
this way is well founded, unless you happen to know that R->L
languages are actually statistically more common than L->R languages
in the world, and therefore we should be writing:

~pos:1 "f" re =~ "foo";

;>

Anyway, the current syntax is enough.  =~ re ... also means that you
could have =~ other things, which might be useful.  And, if I really
cared, camlp4 could give me the full horror of things like:

"foo" =~ /f/i

John.



