caml-list - the Caml user's mailing list
* [Caml-list] Announce: Regexp/OCaml syntax extension
@ 2003-03-20 16:50 Yutaka OIWA
  2003-03-21 10:18 ` Richard W.M. Jones
  2003-03-28 20:13 ` Pierre Weis
  0 siblings, 2 replies; 3+ messages in thread
From: Yutaka OIWA @ 2003-03-20 16:50 UTC
  To: caml-list

Hello subscribers:

I have released a camlp4 macro package called Regexp/OCaml.

I believe Objective Caml is a powerful tool not only for writing
interpreters and compilers but also for writing casual "script"
applications, which have been the territory of Perl, Python and
Ruby. However, string decomposition is one of OCaml's weak points when
compared to those scripting languages: the interfaces of both the Str
module and the Pcre module are primitive and become cumbersome under
heavy use.  Regexp/OCaml addresses this weakness by providing syntax
support for regular expression matching.

Regexp/OCaml provides convenient syntactic sugar for matching regular
expressions against strings, using the PCRE/OCaml library. The features
of this macro package are the following:

    * Convenient syntax: similar to standard match-with expressions
    * Binding of matched substrings to variables: no more $1, $2, ...
    * Automagical, easy-to-use type coercion: no flood of int_of_string etc.
    * Support for optional patterns: gives string option type etc.
    * Default values for optional patterns

For example, parsing an entry of a log file becomes as easy as this:

  try
    while true do
      let line = input_line ic in
        Regexp.match line with
          "^\((\d\d):(\d\d):(\d\d)\) \[(.*?)\] (.*)$"
           as hour : int, min : int, sec : int, name, line ->
             let time = hour * 3600 + min * 60 + sec in
             ...
        | "^# (.*)$" as meta_info ->
             ...
        | _ -> ()
    done
  with End_of_file -> ()

This short code parses lines in either of the formats
  "(00:34:32) [foobar] something" and "# some meta info"
and binds the appropriate data to variables which can be used inside "...".
Compare the code above with an equivalent written without the syntax extension.
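
For a rough sense of that comparison, here is a minimal sketch of the
same two patterns handled with only the standard String module, with
no Pcre, Str or camlp4. The names two_digits, parse_timed and
parse_meta are hypothetical and not part of Regexp/OCaml. The sketch
assumes the sample layout shown above, with a space between ")" and
"[", and takes the name to end at the first "]".

```ocaml
(* Hypothetical helper: read two decimal digits of [s] starting at [pos].
   Raises Not_found if either character is not a digit. *)
let two_digits (s : string) (pos : int) : int =
  let digit c =
    if c >= '0' && c <= '9' then Char.code c - Char.code '0'
    else raise Not_found
  in
  10 * digit s.[pos] + digit s.[pos + 1]

(* Hand-written match for lines like "(hh:mm:ss) [name] text".
   Returns Some (seconds, name, text), or None if the shape differs. *)
let parse_timed (line : string) : (int * string * string) option =
  let len = String.length line in
  try
    (* The fixed-width prefix "(hh:mm:ss) [" occupies 12 characters. *)
    if len < 12 || line.[0] <> '(' || line.[3] <> ':' || line.[6] <> ':'
       || String.sub line 9 3 <> ") ["
    then None
    else begin
      let hour = two_digits line 1 in
      let minute = two_digits line 4 in
      let sec = two_digits line 7 in
      (* Assumption: the name ends at the first ']' after the prefix. *)
      let close = String.index_from line 12 ']' in
      if close + 1 >= len || line.[close + 1] <> ' ' then None
      else
        let name = String.sub line 12 (close - 12) in
        let text = String.sub line (close + 2) (len - close - 2) in
        Some (hour * 3600 + minute * 60 + sec, name, text)
    end
  with Not_found -> None

(* Hand-written match for lines like "# some meta info". *)
let parse_meta (line : string) : string option =
  let len = String.length line in
  if len >= 2 && line.[0] = '#' && line.[1] = ' '
  then Some (String.sub line 2 (len - 2))
  else None
```

Every offset and length has to be spelled out by hand here, which is
the bookkeeping the match-with syntax above is meant to hide.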

Regexp/OCaml is downloadable from the web location
http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/caml/ .

In addition to the main macro, called pa_regexp_match, the package also
contains two tiny macros:

 1) pa_pragma changes options of loaded camlp4 macros from inside the
    source code.
 2) pa_once provides a "once" construct, which evaluates an expression
    only once per execution (pa_regexp_match uses this internally).
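
The "once per execution" behaviour can be approximated in plain OCaml
with a lazy value. This is only an analogy, not a description of how
pa_once is implemented; evaluations, expensive_setup, cached and
use_it are hypothetical names used for illustration.

```ocaml
(* Counts how many times the setup step really runs. *)
let evaluations = ref 0

(* Hypothetical stand-in for an expensive step, such as compiling
   a regular expression. *)
let expensive_setup () : string =
  incr evaluations;
  String.concat "|" ["alpha"; "beta"]

(* The suspended computation runs on the first Lazy.force only;
   later calls reuse the stored result. *)
let cached : string Lazy.t = lazy (expensive_setup ())

let use_it () : int = String.length (Lazy.force cached)
```

Calling use_it any number of times leaves evaluations at 1, since the
suspended expression is evaluated on first use and then remembered.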

Any comments will be greatly appreciated.

-- 
Yutaka Oiwa              Yonezawa Lab., Dept. of Computer Science,
      Graduate School of Information Sci. & Tech., Univ. of Tokyo.
      <oiwa@yl.is.s.u-tokyo.ac.jp>, <yutaka@oiwa.shibuya.tokyo.jp>
PGP fingerprint = C9 8D 5C B8 86 ED D8 07  EA 59 34 D8 F4 65 53 61

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] Announce: Regexp/OCaml syntax extension
  2003-03-20 16:50 [Caml-list] Announce: Regexp/OCaml syntax extension Yutaka OIWA
@ 2003-03-21 10:18 ` Richard W.M. Jones
  2003-03-28 20:13 ` Pierre Weis
  1 sibling, 0 replies; 3+ messages in thread
From: Richard W.M. Jones @ 2003-03-21 10:18 UTC
  To: Yutaka OIWA; +Cc: caml-list

This looks like it could be one of the essential pieces which could
persuade me to use OCaml full time.

Rich.

-- 
Richard Jones, Red Hat Inc. (London, UK) http://www.redhat.com/software/ccm
http://www.annexia.org/ Freshmeat projects: http://freshmeat.net/users/rwmj
C2LIB is a library of basic Perl/STL-like types for C. Vectors, hashes,
trees, string funcs, pool allocator: http://www.annexia.org/freeware/c2lib/


* Re: [Caml-list] Announce: Regexp/OCaml syntax extension
  2003-03-20 16:50 [Caml-list] Announce: Regexp/OCaml syntax extension Yutaka OIWA
  2003-03-21 10:18 ` Richard W.M. Jones
@ 2003-03-28 20:13 ` Pierre Weis
  1 sibling, 0 replies; 3+ messages in thread
From: Pierre Weis @ 2003-03-28 20:13 UTC
  To: Yutaka OIWA; +Cc: caml-list

> Hello subscribers:
> 
> I release a camlp4-macro package called Regexp/OCaml.
[...]
> For example, parsing an entry for some log file becomes as easy as follows:
> 
>   try
>     while true do
>       let line = input_line ic in
>         Regexp.match line with
>           "^\((\d\d):(\d\d):(\d\d)\)\[(.*?)\] (.*)$" 
>            as hour : int, min: int, sec : int, name, line ->
>              let time = hour * 3600 + min * 60 + sec in
>              ...
>         | "^# (.*)$" as meta_info ->
>              ...
>         | _ -> ()
>     done
>   with End_of_file -> ()
> 
> This short code parses both line in format like
>   "(00:34:32) [foobar] something" and "# some meta info"
> and binds appropriate data into variables which can be used inside "...".
> Compare the code above with an equivalent without using syntax extension.
[...]
> -- 
> Yutaka Oiwa              Yonezawa Lab., Dept. of Computer Science,
>       Graduate School of Information Sci. & Tech., Univ. of Tokyo.
>       <oiwa@yl.is.s.u-tokyo.ac.jp>, <yutaka@oiwa.shibuya.tokyo.jp>
> PGP fingerprint = C9 8D 5C B8 86 ED D8 07  EA 59 34 D8 F4 65 53 61

Your camlp4 expertise is impressive :)

However, since you invite comparison with other ways of programming your
example, I would suggest simply using Scanf, which gives:

  (* assumes "open Scanf" *)
  try
    while true do
      let ib = Scanning.from_string (input_line ic) in
      match bscanf ib "%c" (fun x -> x) with
      | '(' ->
          bscanf ib "%d:%d:%d) [%s] %s"
           (fun hour min sec name line ->
              let time = hour * 3600 + min * 60 + sec in
              ...)
      | '#' ->
          (* scan the rest of the current line, not standard input *)
          bscanf ib " %s" (fun meta_info -> ...)
      | _ -> ()
    done
  with End_of_file -> ()

This sounds pretty simple, compact, and easy to understand as well...
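
A self-contained variant of the same idea, written against the Scanf
module of current OCaml releases, might look like the sketch below. It
assumes, going by the current Scanf documentation, that a bare %s stops
at whitespace, so it uses the character ranges %[^]] and %[^\n]
instead. The entry type and the names attempt and parse_line are
hypothetical, introduced only for illustration.

```ocaml
(* Hypothetical result type for one log line. *)
type entry =
  | Timed of int * string * string  (* seconds since midnight, name, text *)
  | Meta of string
  | Other

(* Run a scanner and turn the usual Scanf failures into None. *)
let attempt (scan : unit -> 'a) : 'a option =
  try Some (scan ())
  with Scanf.Scan_failure _ | Failure _ | End_of_file -> None

let parse_line (line : string) : entry =
  let timed () =
    (* %[^]] reads up to the closing bracket, %[^\n] reads the rest. *)
    Scanf.sscanf line "(%d:%d:%d) [%[^]]] %[^\n]"
      (fun hour min sec name text ->
         Timed (hour * 3600 + min * 60 + sec, name, text))
  in
  let meta () =
    Scanf.sscanf line "# %[^\n]" (fun meta_info -> Meta meta_info)
  in
  match attempt timed with
  | Some e -> e
  | None ->
      (match attempt meta with
       | Some e -> e
       | None -> Other)
```

Trying each format in turn against the whole line avoids consuming a
first character before the shape of the line is known.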

Best regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/

