[Caml-list] [ANN] Cmdliner 0.9.0

* [Caml-list] [ANN] Cmdliner 0.9.0
@ 2011-05-27 14:54 Daniel Bünzli
  2011-05-27 16:35 ` Gabriel Scherer
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Bünzli @ 2011-05-27 14:54 UTC (permalink / raw)
  To: caml-list, caml-hump

Hello,

I grew tired of the Arg module. For a quick and easy way to make your
functions available from the command line you may be interested in
Cmdliner :

Cmdliner is a module for the declarative definition of command line interfaces.

It provides a simple and compositional mechanism to convert command
line arguments to OCaml values and pass them to your functions. The
module automatically handles syntax errors, help messages and UNIX man
page generation. It supports programs with single or multiple commands
(like darcs or git) and respects most of the POSIX and GNU conventions.

Cmdliner is made of a single, independent, module and distributed
under the BSD3 license.

Project home page : http://erratique.ch/software/cmdliner

The basics section of the documentation can be read as a tutorial introduction:

http://erratique.ch/software/cmdliner/doc/Cmdliner#basics

Your feedback is welcome.

Daniel

P.S. The examples use syntactic constructs only available in OCaml 3.12;
however, I took care not to use them in the implementation of Cmdliner
itself.

* Re: [Caml-list] [ANN] Cmdliner 0.9.0
  2011-05-27 14:54 [Caml-list] [ANN] Cmdliner 0.9.0 Daniel Bünzli
@ 2011-05-27 16:35 ` Gabriel Scherer
  2011-05-28 16:02   ` Daniel Bünzli
  0 siblings, 1 reply; 3+ messages in thread
From: Gabriel Scherer @ 2011-05-27 16:35 UTC (permalink / raw)
  To: Daniel Bünzli; +Cc: caml-list, caml-hump

Thanks for sharing, that's very interesting.

Here are various remarks I had while browsing your documentation (which is
very good; thanks for taking the time to do something polished).

- The way I understand it, the big change from the Arg interface is that this
interface is "pure". In Arg, you define *actions* for the parameters that,
when parsing the arguments, have the effect of updating a mutable
"configuration state" of the program. In your presentation, you define
values for the parameters (with some corresponding metadata) that can be
used directly in the program as the *values* set for those parameters. You explain
this concisely at the beginning, but I think the comparison with the Arg
interface is enlightening and could help the user understand the advantages
of your approach.

- I'm not sure the presentation "oh it's just an applicative functor" is the
most accessible to the wider OCaml community. That said, your examples do a
rather good job at explaining how it's used, so maybe the scary words
upfront are not so much of a problem.

- I wasn't able to understand why you specifically chose an applicative
functor. Intuitively I would see this rather as a Reader (or Env) monad over
the (parsed, typed) "parameter state" of the program. Given that monads are
more popular and widely used, and that the monadic interface is richer than
the applicative functor one, I'm wondering why you made that choice. It
would be ok if this is not a monad (I think only structures that are
explicitly *not* monads should be presented as applicative functors), but I
don't see why your type isn't. Maybe this should be documented.

- Term.eval_choice uses an (info * 'a) association list, while varg_* use ('a
* info) lists. I'm not sure why you made different choices. Actually I don't
understand what "vflag" is. I find the documentation confusing (and am not
able to parse the last sentence). Maybe a different explanation and/or a
usage example would help. My vague idea after re-reading this documentation
is that it is somehow meant to allow a sum type to be set by parameters
(with different possible parameters for the different cases of the sum).

- This seems related to the "parameters combinators" of Eliom :
  http://ocsigen.org/eliom/dev/api/server/Eliom_parameters
  There are small variations (in particular Eliom uses some specific
metadata regarding how the parameter is passed in the URL, as a GET
parameter, a URL suffix etc.), but the main difference is that Eliom
parameters, like Scanf, are in continuation style (you pass the parameters
plus a continuation that takes the "values" as argument(s)), while you use a
direct style encoded in a monad / applicative functor. It might be fruitful
to explain the difference in the documentation.

- I think the "predefined converters" are mostly orthogonal to the rest of
the interface, and would possibly benefit from being factored out as a
separate module, or even an external "extra" library; they should certainly
be in any reasonable "extra standard library" (Extlib, Core, Batteries,
whichever you like) and you could/should propose them upstream to those
libraries if they are not. There may however be an issue with conventions
here (maybe for command-line parameters a certain syntax is usual for
sequence of elements that is not what those libraries would naturally
choose).

- The converter interface seems a bit simplistic and unsatisfying. In
particular, parsers being (string -> 'a) functions, I'm not sure how you
compose different parsers. I just had a look at the implementation and, for
pairs for example, if I understand correctly you split the input string on
"," beforehand, then call the parser on both parts. This seems wrong:
  - there seems to be an associativity issue, if I build (pair (pair int
int) int) and pass "1,2,3", it will try to parse "1" with (pair int int); so
we would need delimiters here
  - what if I pass a quoted string that contains a comma?

- Your build system (handcoded make-like script) is a bit unusual. Maybe it
would be better for the principle of least surprise if you had a Makefile just
redirecting the obvious actions to your funny build script. I first tried
(yes, I should read the README first) to run "oasis", but this failed as the
_oasis file is syntactically invalid due to the %%FOO%% configuration
variables. I'm unsure what is the benefit of having those configuration
variables in your build script rather than in the _oasis, but I understand
this is a highly subjective thing.
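
The first remark above, on Arg's actions versus pure values, can be made
concrete with a small sketch. It uses only the standard library's Arg module;
the names count, verbose, describe and run are hypothetical, and the "pure"
half only illustrates the idea of handing plain values to a plain function. It
does not use Cmdliner's API.

```ocaml
(* Stdlib Arg style: every option carries an action that mutates a global
   reference while the command line is parsed. *)
let count = ref 10
let verbose = ref false

let speclist = [
  ("-c", Arg.Set_int count, "COUNT  Number of repetitions");
  ("-v", Arg.Set verbose, " Be verbose");
]

(* "Pure" style: the program logic is an ordinary function over ordinary
   values and never looks at the references. *)
let describe count verbose =
  Printf.sprintf "count=%d verbose=%b" count verbose

(* Hypothetical driver: reset the state, parse [argv], then hand the
   resulting values over to the pure function. *)
let run argv =
  count := 10;
  verbose := false;
  Arg.parse_argv ~current:(ref 0) argv speclist
    (fun anon -> raise (Arg.Bad ("unexpected argument: " ^ anon)))
    "usage: demo [-c COUNT] [-v]";
  describe !count !verbose
```

With Arg the references have to be reset and read by hand around every parse;
in the pure presentation only something like describe would be written and the
library would supply its arguments.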

On a more open-ended side, I wanted to report that I have done a related
thing for a prototype, with a slightly different twist: I needed a
description of configuration options that would support both a command-line
interface (for which I reused the Arg module), and configuration directives
in an interactive toplevel: "#set +debug;;". I don't suppose you would or
should be interested in extending your interface to support such a thing,
but this is food for thought anyway.
One difference it makes is that I needed to handle changes in the parameters
state during the life of the program. I used simple side effects on global
references, but I suppose you could also handle this with your
applicative/monadic interface, provided the user correctly threads the
functor through the parts of her program where the configuration may be
changed.

Two other differences are that I used hierarchical options rather than a
plain option list, and that I needed to support dependencies between
configurations; some flags don't make any sense unless others are enabled,
eg. --dot-output --dot-renderer=twopi. Also, when I get the value of a
parameter, I don't want to get a meaningful output if a flag on which it
depends was disabled (by changing the configuration options): if I disable
"dot-output" I want "dot-renderer" to return None even if not directly
changed (otherwise the client code would have to first check dot-output,
then match on dot-renderer).
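
A minimal sketch of the dependency described above, assuming a hypothetical
record type config with the two options from the example. The accessor
returns None for the renderer whenever the flag it depends on is disabled, so
client code does not have to check dot-output first.

```ocaml
(* Hypothetical configuration: [dot_renderer] only makes sense when
   [dot_output] is enabled. *)
type config = {
  dot_output : bool;
  dot_renderer : string option;  (* raw setting, e.g. Some "twopi" *)
}

(* Accessor that hides the renderer whenever the flag it depends on is off,
   even though the raw setting itself was not changed. *)
let dot_renderer cfg =
  if cfg.dot_output then cfg.dot_renderer else None

let cfg = { dot_output = true; dot_renderer = Some "twopi" }
let disabled = { cfg with dot_output = false }
```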

Finally, I had a different approach wrt. option/commands spelling. If I
understand correctly, you accept an input if and only if it is a valid
prefix of only one valid option/command. I didn't use prefixes (you have to
type the whole string), but allowed small typing mistakes by computing an
"edit distance" and trying to make a reasonable but arbitrary heuristic
choice if there was no perfect fit. Included below is my code for doing
this, that may be of interest (though I'm not sure if it can be usefully
combined with a prefix-is-ok approach). I only tested it lightly, so no
guarantee of being bug-free.
This works well for a toplevel use, but I understand that you may want to be
stricter for command-line parameters. It may still be useful to suggest the
correct option when rejecting the spelling, à la git "Did you mean this ?".

(** Edit distance between two strings (dynamic programming) *)
let edit_distance sa sb =
  let la, lb = String.length sa, String.length sb in
  (* t.(i).(j) is the distance between the suffixes of sa and sb
     starting at positions i and j *)
  let t = Array.make_matrix (la + 1) (lb + 1) (-1) in
  for i = 0 to la do
    t.(i).(lb) <- (la - i)
  done;
  for j = 0 to lb do
    t.(la).(j) <- (lb - j)
  done;
  for i = la - 1 downto 0 do
    for j = lb - 1 downto 0 do
      t.(i).(j) <- min
        (1 + min t.(i+1).(j) t.(i).(j+1))
        (t.(i+1).(j+1) + if sa.[i] = sb.[j] then 0 else 1)
    done
  done;
  t.(0).(0)

(** Spell-correcting choice in a string association list

    If the given key is not found in the assoc list, use edit distance
    to determine the likely candidate. First, sort the assoc by
    increasing order of edit distance to the key. Next, we could
    choose the first candidate (minimal edit distance), but that could
    lead to absurd choices :

    - if the list is a singleton, we would always choose the first, no
    matter how big the error is. If it's really not what the user
    typed, it can be more helpful to report the mistake instead of
    making a dumb choice.

    - if there are several candidates at nearby edit distances to the
    key, we cannot make a meaningful choice : any of the first items
    would be as good.

    Therefore :

    - if the list is a singleton, we only choose its element if the
    edit distance is at most a fixed constant (3).

    - if there are at least two elements, we only choose the first one
    if its edit distance is less than half that of the second one
    (unambiguous choice).
*)
let select_nearest key_string get_string list =
  let list' =
    let weight elem =
      (edit_distance key_string (get_string elem), elem) in
    List.map weight list in
  let list' = List.sort (fun (k1, _) (k2, _) -> compare k1 k2) list' in
  match list' with
    | [] -> None
    | (key, elem)::[] -> if key <= 3 then Some elem else None
    | (k1, elem)::(k2, _)::_ -> if 2 * k1 < k2 then Some elem else None


* Re: [Caml-list] [ANN] Cmdliner 0.9.0
  2011-05-27 16:35 ` Gabriel Scherer
@ 2011-05-28 16:02   ` Daniel Bünzli
  0 siblings, 0 replies; 3+ messages in thread
From: Daniel Bünzli @ 2011-05-28 16:02 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: caml-list

Thanks for the long email.

> - I'm not sure the presentation "oh it's just an applicative functor" is the
> most accessible to the wider OCaml community. That said, your examples do a
> rather good job at explaining how it's used, so maybe the scary words
> upfront are not so much of a problem.

As you saw, no knowledge of this is needed. Maybe you are right that
this info doesn't belong here but OTOH it may help programmers who know
what it means. It was also an opportunity to link to an interesting
paper.

> - I wasn't able to understand why you specifically chose an applicative
> functor.

Me neither. I realized only afterwards that this was an applicative
functor, although I can now hint at why I went that way (see below).

> Intuitively I would see this rather as a Reader (or Env) monad over
> the (parsed, typed) "parameter state" of the program.

I didn't want to see this interaction as reading from an environment.
I wanted to see that as follows.

When I invoke a command on the command line I invoke a function and
the command line arguments are the arguments to the function. In my
function I don't want to work with special types representing command
line arguments, if my function needs an int then it should be a
regular int, not something from which I can extract an int.

So the idea was to turn the problem inside-out. Instead of
working with special types representing command line arguments, lift
your bare function in an applicative functor that handles and hides
the extraction of OCaml values from the command line arguments.
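
The lifting described here can be sketched with a toy applicative functor. This
is only an illustration under assumptions, not Cmdliner's implementation: the
type term, the helper int_opt and the pre-parsed association list standing in
for the command line are all hypothetical.

```ocaml
(* Toy model, not Cmdliner's implementation: a term pairs the option names it
   reads with a function from parsed options to a value. *)
type 'a term = {
  names : string list;                  (* options the term depends on *)
  eval : (string * string) list -> 'a;  (* parsed (name, value) pairs *)
}

(* [const v] ignores the command line. *)
let const v = { names = []; eval = (fun _ -> v) }

(* [f $ x] applies a wrapped function to a wrapped argument. *)
let ( $ ) f x =
  { names = f.names @ x.names;
    eval = (fun env -> (f.eval env) (x.eval env)) }

(* Hypothetical argument constructor: integer option with a default.
   Raises Failure if the given value is not an integer. *)
let int_opt name default =
  { names = [name];
    eval = (fun env ->
      match List.assoc_opt name env with
      | Some s -> int_of_string s
      | None -> default) }

(* The bare function only ever sees regular ints. *)
let area width height = width * height

(* Lifting it: extraction of the ints is hidden inside the term. *)
let area_t = const area $ int_opt "width" 1 $ int_opt "height" 1
```

Evaluating area_t.eval on [("width", "3"); ("height", "4")] calls area with
the plain integers 3 and 4. Because terms are combined without bind, the list
area_t.names is known before any argument is parsed.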

> Given that monads are
> more popular and widely used, and that the monadic interface is richer than
> the applicative functor one, I'm wondering why you made that choice. It
> would be ok if this is not a monad (I think only structures that are
> explicitly *not* monads should be presented as applicative functors), but I
> don't see why your type isn't. Maybe this should be documented.

First I always try to work with the weakest assumptions and
applicative was enough. Second, a monad is just applicative + bind and I
don't see what you would want bind for here. In Cmdliner, terms are
just a way to hide the command line parsing machinery from our functions,
which need regular OCaml values. It turns out that this is exactly
Applicative's domain: embedding pure computations in an effectful world
(the parsing machinery).

> - Term.eval_choice uses an (info * 'a) association list, while varg_* use ('a
> * info) lists. I'm not sure why you made different choices.

Not sure either. Can't remember. Maybe because of the order of
arguments in Term.eval.

> Actually I don't understand what "vflag" is.

It's a single value that can be defined by different flags. Maybe have
a look at the rm example.

http://erratique.ch/software/cmdliner/doc/Cmdliner.html#examples

It uses vflag_all which is like vflag except that the flags are
allowed to repeat.
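
To illustrate "a single value that can be defined by different flags", here is
a toy model in plain OCaml. It is a sketch under assumptions and does not
reproduce Cmdliner's signatures: the functions below take a raw argument list
and a hypothetical table of (value, flag names) pairs, and the error message
for a repeated flag is invented.

```ocaml
type prompt = Always | Once | Never

(* Toy stand-in for a vflag_all-style argument; not Cmdliner's signature.
   [table] maps each value to the flags that select it. Returns the values
   selected by [argv], in order of appearance. *)
let vflag_all table argv =
  List.filter_map
    (fun arg ->
      List.find_map
        (fun (v, flags) -> if List.mem arg flags then Some v else None)
        table)
    argv

(* Same idea, but the flags may appear at most once; [default] is used when
   none of them is present. *)
let vflag default table argv =
  match vflag_all table argv with
  | [] -> Ok default
  | [v] -> Ok v
  | _ -> Error "flag given more than once"

(* Three spellings that all set the same value of type [prompt]. *)
let prompt_flags =
  [ (Always, ["-i"]); (Once, ["-I"]); (Never, ["-f"; "--force"]) ]
```

With this table, vflag Always prompt_flags ["-I"] gives Ok Once, while
vflag_all prompt_flags ["-i"; "file"; "--force"] gives [Always; Never] and
leaves it to the caller to decide which occurrence wins.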

> - I think the "predefined converters" are mostly orthogonal to the rest of
> the interface, and would possibly benefit of being factored out as a
> separate module, or even an external "extra" library; they should certainly
> be in any reasonable "extra standard library" (Extlib, Core, Batteries,
> whichever you like) and you could/should propose them upstream to those
> libraries if they are not.

I think it would defeat the "quick and easy" way intended by the
library. Besides, having them in the Arg module allows more concise
definitions if you program in OCaml 3.12, e.g.:

let count = Arg.(value & opt int 10 & info ["c"; "count"] ~docv:"COUNT" ~doc)

> - The converter interface seems a bit simplistic and unsatisfying. In
> particular, parsers being (string -> 'a) functions,

The converter interface is certainly simplistic but I didn't find it
unsatisfying in practice. Take that as a stance to keep your command
line interfaces simple and reasonable.

> compose different parsers. I just had a look at the implementation and, for
> pairs for example, if I understand correctly you split the input string on
> "," beforehand, then call the parser on both parts.

Yes. Note that you don't have to look at the implementation. You can
just read the documentation :

http://erratique.ch/software/cmdliner/doc/Cmdliner.Arg.html#VALpair

> This seems wrong:
>   - there seems to be an associativity issue, if I build (pair (pair int
> int) int) and pass "1,2,3", it will try to parse "1" with (pair int int); so
> we would need delimiters here
>   - what if I pass a quoted string that contains a comma?

Implement a better parser... Given the generality of the parser
interface you are allowed to invoke whatever parsing technology suits
you.
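
The associativity issue from the quoted remark, and one way around it with a
custom parser, can be reproduced with toy converters. This is a sketch under
assumptions, not Cmdliner's code: parsers are modelled as string -> 'a option,
pair splits at the first separator as the remark describes, and the optional
sep parameter is hypothetical.

```ocaml
(* Toy converters, not Cmdliner's code: a parser is [string -> 'a option]. *)
let int s = int_of_string_opt s

(* Split at the first occurrence of [sep] and parse both halves. *)
let pair ?(sep = ',') pa pb s =
  match String.index_opt s sep with
  | None -> None
  | Some i ->
    let left = String.sub s 0 i in
    let right = String.sub s (i + 1) (String.length s - i - 1) in
    (match pa left, pb right with
     | Some a, Some b -> Some (a, b)
     | _ -> None)

(* Left-nested pair: "1" alone is handed to (pair int int), which fails. *)
let left_nested = pair (pair int int) int "1,2,3"

(* Right-nested pair: "2,3" is handed to (pair int int), which succeeds. *)
let right_nested = pair int (pair int int) "1,2,3"

(* A different outer separator acts as the missing delimiter. *)
let delimited = pair ~sep:':' (pair int int) int "1,2:3"
```

Here left_nested is None, right_nested is Some (1, (2, 3)) and delimited is
Some ((1, 2), 3). Quoted strings containing the separator would need a real
tokenizer, which this sketch does not attempt.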

> - Your build system (handcoded make-like script) is a bit unusual. Maybe it
> would be better for the principle of least surprise if you had a Makefile just
> redirecting the obvious actions to your funny build script. I first tried
> (yes, I should read the README first) to run "oasis", but this failed as the
> _oasis file is syntactically invalid due to the %%FOO%% configuration
> variables. I'm unsure what is the benefit of having those configuration
> variables in your build script rather than in the _oasis, but I understand
> this is a highly subjective thing.

I do proper software releases (yes, I'm old-fashioned, I don't just push
a repo to GitHub). If you see the %%FOO%% variables it means that you
are trying to use the repository version and you shouldn't; use the
tarballs. These variables are here so that I don't have to repeat
myself.

That said, the _oasis file in the 0.9.0 distribution has a syntax
error. If you download from oasis-db you will get one without the
syntax error (I know that you should not publish two tarballs
pretending to be the same thing that differ in content; I permitted
myself this misstep because the current oasis-db is "experimental" and
will be destroyed).

Regarding the "funny" build script, it may eventually disappear if I
finally get serious about using oasis.

> On a more open-ended side, I wanted to report that I have done a related
> thing for a prototype, with a slightly different twist: I needed a
> description of configuration options that would support both a command-line
> interface (for which I reused the Arg module), and configuration directives
> in an interactive toplevel: "#set +debug;;". I don't suppose you would or
> should be interested in extending your interface to support such a thing,
> but this is food for thought anyway.

Without thinking too much about it I don't think it would *need* to be extended.

> Finally, I had a different approach wrt. option/commands spelling. If I
> understand correctly, you accept an input if and only if it is a valid
> prefix of only one valid option/command.

Yes.
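
The prefix rule confirmed here can be sketched as follows. The type resolution
and the function resolve are hypothetical names, and letting an exact match
win over longer names that share the prefix is an assumption of this sketch,
not something stated in this thread.

```ocaml
type resolution = Found of string | Ambiguous of string list | Unknown

(* [is_prefix p s] is true when [s] starts with [p]. *)
let is_prefix p s =
  String.length p <= String.length s
  && String.sub s 0 (String.length p) = p

(* Resolve [input] against the valid [names]: accept it only if it is a
   prefix of exactly one name. An exact match always wins (assumption). *)
let resolve input names =
  if List.mem input names then Found input
  else
    match List.filter (is_prefix input) names with
    | [] -> Unknown
    | [name] -> Found name
    | candidates -> Ambiguous candidates

let commands = ["commit"; "checkout"; "clone"]
```

For instance, resolve "com" commands gives Found "commit", while
resolve "c" commands is Ambiguous and lists all three candidates, which could
be reported back to the user.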

> It may still be useful to suggest the
> correct option when rejecting the spelling, à la git "Did you mean this ?".

That could be nice. But I have to admit, I spent far too much time on
solving the (for me) mundane problem of command line argument parsing,
so don't expect that anytime soon. Well written patches are, however,
welcome.

Best,

Daniel
