Thanks for sharing, that's very interesting.

Here are various remarks I had while browsing your documentation (which is very good; thanks for taking the time to do something polished).

- The way I understand it, the big change from the Arg interface is that this interface is "pure". In Arg, you define *actions* for the parameters that, when parsing the arguments, have the effect of updating a mutable "configuration state" of the program. In your presentation, you define the parameters themselves (with some corresponding metadata), and they can then be used in the program as the *values* set for those parameters. You explain this concisely at the beginning, but I think the comparison with the Arg interface is enlightening and could help the user understand the advantages of your approach.

- I'm not sure the presentation "oh it's just an applicative functor" is the most accessible to the wider OCaml community. That said, your examples do a rather good job at explaining how it's used, so maybe the scary words upfront are not so much of a problem.

- I wasn't able to understand why you specifically chose an applicative functor. Intuitively I would see this rather as a Reader (or Env) monad over the (parsed, typed) "parameter state" of the program. Given that monads are more popular and widely used, and that the monadic interface is richer than the applicative functor one, I'm wondering why you made that choice. It would be ok if this is not a monad (I think only structures that are explicitly *not* monads should be presented as applicative functors), but I don't see why your type isn't. Maybe this should be documented.

- Term.eval_choice uses a (info * 'a) associative list, while varg_* use ('a * info) lists. I'm not sure why you made different choices. Actually I don't understand what "vflag" is. I find the documentation confusing (and am not able to parse the last sentence). Maybe a different explanation and/or a usage example would help. My vague idea after re-reading this documentation is that it is somehow meant to allow a sum type to be set by parameters (with different possible parameters for the different cases of the sum).

- This seems related to the "parameters combinators" of Eliom :
  http://ocsigen.org/eliom/dev/api/server/Eliom_parameters
  There are small variations (in particular Eliom uses some specific metadata regarding how the parameter is passed in the URL, as a GET parameter, a URL suffix etc.), but the main difference is that Eliom parameters, like Scanf, are in continuation style (you pass the parameters plus a continuation that takes the "values" as argument(s)), while you use a direct style encoded in a monad / applicative functor. It might be fruitful to explain the difference in the documentation.

- I think the "predefined converters" are mostly orthogonal to the rest of the interface, and would possibly benefit from being factored out as a separate module, or even an external "extra" library; they should certainly be in any reasonable "extra standard library" (Extlib, Core, Batteries, whichever you like) and you could/should propose them upstream to those libraries if they are not. There may however be an issue with conventions here (maybe for command-line parameters a certain syntax is usual for sequences of elements that is not what those libraries would naturally choose).

- The converter interface seems a bit simplistic and unsatisfying. In particular, parsers being (string -> 'a) functions, I'm not sure how you compose different parsers. I just had a look at the implementation and, for pairs for example, if I understand correctly you split the input string on "," beforehand, then call the parser on both parts. This seems wrong:
  - there seems to be an associativity issue, if I build (pair (pair int int) int) and pass "1,2,3", it will try to parse "1" with (pair int int); so we would need delimiters here
  - what if I pass a quoted string that contains a comma?

- Your build system (handcoded make-like script) is a bit unusual. Maybe it would be better for the principle of least surprise if you had a Makefile just redirecting the obvious actions to your funny build script. I first tried (yes, I should read the README first) to run "oasis", but this failed as the _oasis file is syntactically invalid due to the %%FOO%% configuration variables. I'm unsure what the benefit is of having those configuration variables in your build script rather than in the _oasis, but I understand this is a highly subjective thing.
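
Going back to the converter remark, here is a minimal sketch of the associativity issue. It is not Cmdliner's code: the `parser` type and the `pair` combinator below are hypothetical, and I assume the pair converter splits its input on the first comma before calling the two sub-parsers.

```ocaml
(* Hypothetical converter type: a parser is just a function from strings. *)
type 'a parser = string -> 'a

let int : int parser = int_of_string

(* Assumption: split on the first comma, then hand each half to a sub-parser. *)
let pair (p1 : 'a parser) (p2 : 'b parser) : ('a * 'b) parser = fun s ->
  match String.index_opt s ',' with
  | None -> failwith ("pair: no comma in " ^ s)
  | Some i ->
    let left = String.sub s 0 i in
    let right = String.sub s (i + 1) (String.length s - i - 1) in
    (p1 left, p2 right)

(* Right nesting happens to work, since the first comma is the outer one. *)
let right_nested = pair int (pair int int) "1,2,3"

(* Left nesting fails: (pair int int) only ever receives "1". *)
let left_nested =
  try Some (pair (pair int int) int "1,2,3") with Failure _ -> None
```

With this splitting strategy the left-nested converter can never succeed on undelimited input, which is why I think delimiters (or a parser that returns the unconsumed rest of the string) would be needed.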

On a more open-ended side, I wanted to report that I have done a related thing for a prototype, with a slightly different twist: I needed a description of configuration options that would support both a command-line interface (for which I reused the Arg module), and configuration directives in an interactive toplevel: "#set +debug;;". I don't suppose you would or should be interested in extending your interface to support such a thing, but this is food for thought anyway.
One difference it makes is that I needed to handle changes in the parameter state during the life of the program. I used simple side effects on global references, but I suppose you could also handle this with your applicative/monadic interface, provided the user correctly threads the functor through the parts of her program where the configuration may be changed.
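
To illustrate the contrast between side effects on global references and parsed values (the same contrast as in my first remark about Arg), here is a small sketch. The first half uses the standard Arg module; the `config` record and the `parse` function in the second half are hypothetical and only stand in for a "pure" interface, they are not Cmdliner's API.

```ocaml
(* Arg style: each option carries an action that mutates global state. *)
let debug = ref false
let level = ref 0

let spec = [
  "-debug", Arg.Set debug, " enable debug output";
  "-level", Arg.Set_int level, "N set the verbosity level";
]

(* Parse an explicit argv so the sketch does not depend on the real
   command line. *)
let () =
  Arg.parse_argv ~current:(ref 0) [| "prog"; "-debug"; "-level"; "3" |]
    spec (fun _anon -> ()) "usage: prog [options]"

(* Pure style (hypothetical): parsing returns a value, nothing is mutated. *)
type config = { cfg_debug : bool; cfg_level : int }

let rec parse cfg = function
  | [] -> cfg
  | "-debug" :: rest -> parse { cfg with cfg_debug = true } rest
  | "-level" :: n :: rest -> parse { cfg with cfg_level = int_of_string n } rest
  | arg :: _ -> invalid_arg ("unknown argument: " ^ arg)

let cfg = parse { cfg_debug = false; cfg_level = 0 } [ "-debug"; "-level"; "3" ]
```

In the second style a later change of configuration produces a new `config` value, which the program then has to pass along explicitly.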

Two other differences are that I used hierarchical options rather than a plain option list, and that I needed to support dependencies between configurations; some flags don't make any sense unless others are enabled, eg. --dot-output --dot-renderer=twopi. Also, when I get the value of a parameter, I don't want to get a meaningful output if a flag on which it depends was disabled (by changing the configuration options): if I disable "dot-output" I want "dot-renderer" to return None even if not directly changed (otherwise the client code would have to first check dot-output, then match on dot-renderer).
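
A minimal sketch of that dependency behaviour, assuming a hypothetical mutable record (this is not the code of my prototype, and the field names are only for illustration):

```ocaml
(* Hypothetical configuration record, for illustration only. *)
type config = {
  mutable dot_output : bool;
  mutable dot_renderer : string option;  (* raw setting, e.g. Some "twopi" *)
}

let config = { dot_output = false; dot_renderer = None }

(* Reading the dependent option goes through the flag it depends on. *)
let dot_renderer cfg =
  if cfg.dot_output then cfg.dot_renderer else None

let () =
  config.dot_output <- true;
  config.dot_renderer <- Some "twopi"

let while_enabled = dot_renderer config

(* Disabling the parent flag hides the renderer without erasing it. *)
let () = config.dot_output <- false

let while_disabled = dot_renderer config
```

The raw setting is kept, so re-enabling dot-output later would make the renderer visible again; client code only ever matches on the result of `dot_renderer`.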

Finally, I had a different approach wrt. option/commands spelling. If I understand correctly, you accept an input if and only if it is a valid prefix of only one valid option/command. I didn't use prefixes (you have to type the whole string), but allowed small typing mistakes by computing an "edit distance" and trying to make a reasonable but arbitrary heuristic choice if there was no perfect fit. Included below is my code for doing this, that may be of interest (though I'm not sure if it can be usefully combined with a prefix-is-ok approach). I only tested it lightly, so no guarantee of being bug-free.
This works well for a toplevel use, but I understand that you may want to be stricter for command-line parameters. It may still be useful to suggest the correct option when rejecting the spelling, à la git "Did you mean this ?".

(** Edit distance between two strings (dynamic programming) *)
let edit_distance sa sb =
  let la, lb = String.length sa, String.length sb in
  (* t.(i).(j) is the distance between the suffixes of sa and sb
     starting at positions i and j *)
  let t = Array.make_matrix (la + 1) (lb + 1) (-1) in
  for i = 0 to la do
    t.(i).(lb) <- la - i
  done;
  for j = 0 to lb do
    t.(la).(j) <- lb - j
  done;
  for i = la - 1 downto 0 do
    for j = lb - 1 downto 0 do
      t.(i).(j) <- min
        (1 + min t.(i+1).(j) t.(i).(j+1))
        (t.(i+1).(j+1) + if sa.[i] = sb.[j] then 0 else 1)
    done
  done;
  t.(0).(0)

(** Spell-correcting choice in a string association list
   
    If the given key is not found in the assoc list, use edit distance
    to determine the likely candidate. First, sort the assoc by
    increasing order of edit distance to the key. Next, we could
    choose the first candidate (minimal edit distance), but that could
    lead to absurd choices :

    - if the list is a singleton, we would always choose the first, no
    matter how big the error is. If it's really not what the user
    typed, it can be more helpful to report the mistake instead of
    making a dumb choice.

    - if there are different candidates of nearby edit distances to the
    key, we cannot make a meaningful choice : any of the first items
    would be as good.

    Therefore :

    - if the list is a singleton, we only choose its element if the
    edit distance is at most a fixed constant (3).

    - if there are at least two elements, we only choose the first one
    if its edit distance is less than half that of the second one
    (unambiguous choice).
*)
let select_nearest key_string get_string list =
  let list' =
    let weight elem =
      (edit_distance key_string (get_string elem), elem) in
    List.map weight list in
  let list' = List.sort (fun (k1, _) (k2, _) -> compare k1 k2) list' in
  match list' with
    | [] -> None
    | (key, elem)::[] -> if key <= 3 then Some elem else None
    | (k1, elem)::(k2, _)::_ -> if 2 * k1 < k2 then Some elem else None
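
For comparison, here is how I understand the prefix rule, as a minimal sketch. This is not taken from Cmdliner: `is_prefix` and `resolve_prefix` are hypothetical names, and letting an exact match win over longer candidates is my own assumption.

```ocaml
(* [is_prefix ~prefix s] holds when [s] starts with [prefix]. *)
let is_prefix ~prefix s =
  String.length prefix <= String.length s
  && String.sub s 0 (String.length prefix) = prefix

(* Accept [input] if it is an exact option name (assumption), or a prefix
   of exactly one option; reject it if no option or several options match. *)
let resolve_prefix input options =
  if List.mem input options then Some input
  else
    match List.filter (is_prefix ~prefix:input) options with
    | [ unique ] -> Some unique
    | _ -> None

let options = [ "debug"; "dot-output"; "dot-renderer" ]

let r1 = resolve_prefix "de" options     (* only "debug" starts with "de" *)
let r2 = resolve_prefix "dot" options    (* ambiguous: two candidates *)
let r3 = resolve_prefix "dot-o" options  (* only "dot-output" matches *)
```

When `resolve_prefix` returns None, one could fall back on select_nearest to build the "Did you mean this ?" suggestion rather than to pick an option silently.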


On Fri, May 27, 2011 at 4:54 PM, Daniel Bünzli <daniel.buenzli@erratique.ch> wrote:
Hello,

I grew tired of the Arg module. For a quick and easy way to make your
functions available from the command line you may be interested in
Cmdliner :

Cmdliner is a module for the declarative definition of command line interfaces.

It provides a simple and compositional mechanism to convert command
line arguments to OCaml values and pass them to your functions. The
module automatically handles syntax errors, help messages and UNIX man
page generation. It supports programs with single or multiple commands
(like darcs or git) and respects most of the POSIX and GNU conventions.

Cmdliner is made of a single, independent, module and distributed
under the BSD3 license.

Project home page : http://erratique.ch/software/cmdliner


The basics section of the documentation can be read as tutorial introduction:

http://erratique.ch/software/cmdliner/doc/Cmdliner#basics

Your feedback is welcome.

Daniel

P.S. The examples use syntactic constructs only available in 3.12
however I took care not to use them in the implementation of Cmdliner
itself.
