caml-list - the Caml user's mailing list
* RE: [Caml-list] SML->OCaml
@ 2005-03-07  2:14 Harrison, John R
  2005-03-07 21:39 ` Martin Jambon
  0 siblings, 1 reply; 5+ messages in thread
From: Harrison, John R @ 2005-03-07  2:14 UTC
  To: caml-list; +Cc: Harrison, John R

Does this version do anything about SML programs that violate OCaml's
"uppercase identifier" convention? I recently tried something similar,
and while it did a competent job of parsing most of the syntax of SML,
it just reported errors for SML value bindings starting with an
uppercase letter. It would be nice if it just mapped such names to
"lowercase_XXX" or something so that the result could at least be
compiled. Or is that too "context sensitive" to be easy?
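
A minimal sketch of the renaming asked about here, assuming a hypothetical
helper named lowercase_ident (it is not part of any existing converter):

```ocaml
(* Hypothetical helper: rename an SML value identifier that starts with an
   uppercase ASCII letter so that OCaml accepts it as a value name.
   Other identifiers are returned unchanged. *)
let lowercase_ident (name : string) : string =
  let starts_upper =
    name <> "" && (match name.[0] with 'A' .. 'Z' -> true | _ -> false)
  in
  if starts_upper then "lowercase_" ^ name else name
```

Applied blindly, this would also rename data constructors, which is where
the context sensitivity comes in.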

John.

-----Original Message-----
From: caml-list-admin@yquem.inria.fr
[mailto:caml-list-admin@yquem.inria.fr] On Behalf Of Martin Jambon
Sent: Sunday, March 06, 2005 2:16 PM
To: Konstantine Arkoudas
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] SML->OCaml

On Sun, 6 Mar 2005, Konstantine Arkoudas wrote:

> I'm thinking about re-implementing a fairly large SML-NJ project
> (> 20K lines) in OCaml. Is anybody aware of any tools capable
> of automatically translating SML code into OCaml, at least
> partially? Any info would be appreciated. Thanks.

It exists but it is in the "unmaintained" section of camlp4:
  http://camlcvs.inria.fr/cgi-bin/cvsweb/ocaml/camlp4/unmaintained/sml/


Martin

--
Martin Jambon, PhD
Researcher in Structural Bioinformatics since the 20th Century
The Burnham Institute http://www.burnham.org
San Diego, California



* RE: [Caml-list] SML->OCaml
  2005-03-07  2:14 [Caml-list] SML->OCaml Harrison, John R
@ 2005-03-07 21:39 ` Martin Jambon
  2005-03-08  9:11   ` Andreas Rossberg
  0 siblings, 1 reply; 5+ messages in thread
From: Martin Jambon @ 2005-03-07 21:39 UTC
  To: Harrison, John R; +Cc: caml-list

On Sun, 6 Mar 2005, Harrison, John R wrote:

> Does this version do anything about SML programs that violate OCaml's
> "uppercase identifier" convention? I recently tried something similar,
> and while it did a competent job of parsing most of the syntax of SML,
> it just reported errors for SML value bindings starting with an
> uppercase letter. It would be nice if it just mapped such names to
> "lowercase_XXX" or something so that the result could at least be
> compiled. Or is that too "context sensitive" to be easy?

[I don't know SML and I am not an expert in Camlp4. And I haven't tried
the SML-to-OCaml converter]

The converter needs a way to tell whether a given identifier is a data
constructor (such as None or Some) or not. Thus the converter needs to
remember the accessible type definitions (either from the standard
library of SML or from other modules). That is possible, by creating some
auxiliary files that contain this information (maybe .cmi files could be
parsed but anyway the type definitions have to be analysed during the
preprocessing of a file). It doesn't seem to be implemented
in pa_sml.ml but a few hundred lines of additional code could do it (or
maybe less).
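
A rough sketch of that bookkeeping, under the assumption that the set of
constructor names has already been collected from the datatype definitions
seen so far. The names add_datatype and convert_ident are hypothetical and
do not exist in pa_sml.ml; scoping is ignored entirely:

```ocaml
module StringSet = Set.Make (String)

(* Record the constructors introduced by one datatype definition. *)
let add_datatype (known : StringSet.t) (ctors : string list) : StringSet.t =
  List.fold_left (fun set c -> StringSet.add c set) known ctors

let starts_upper s =
  s <> "" && (match s.[0] with 'A' .. 'Z' -> true | _ -> false)

(* Constructors must start with an uppercase letter in OCaml, values must
   not, so the known-constructor set decides which way a name is adjusted. *)
let convert_ident (known : StringSet.t) (name : string) : string =
  if StringSet.mem name known then String.capitalize_ascii name
  else if starts_upper name then "lowercase_" ^ name
  else name
```

For example, after add_datatype StringSet.empty ["Leaf"; "node"], the name
"node" becomes "Node", while an unrelated value "Foo" becomes
"lowercase_Foo".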

There would also be a problem with record fields if several record types
use identical names for their fields (using objects is probably not a
good idea for simple records).

Maybe you could also (1) compile your SML files and (2) use the information
contained in the SML equivalent of .cmi files in combination with Camlp4.


Martin







* Re: [Caml-list] SML->OCaml
  2005-03-07 21:39 ` Martin Jambon
@ 2005-03-08  9:11   ` Andreas Rossberg
  2005-03-08 20:46     ` AST traversal functions (was: SML->OCaml) Martin Jambon
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Rossberg @ 2005-03-08  9:11 UTC
  To: caml-list

Martin Jambon <martin_jambon@emailuser.net> wrote:
>
> > Does this version do anything about SML programs that violate OCaml's
> > "uppercase identifier" convention? I recently tried something similar,
> > and while it did a competent job of parsing most of the syntax of SML,
> > it just reported errors for SML value bindings starting with an
> > uppercase letter. It would be nice if it just mapped such names to
> > "lowercase_XXX" or something so that the result could at least be
> > compiled. Or is that too "context sensitive" to be easy?
>
> [I don't know SML and I am not an expert in Camlp4. And I haven't tried
> the SML-to-OCaml converter]
>
> The converter needs a way to tell whether a given identifier is a data
> constructor (such as None or Some) or not. Thus the converter needs to
> remember the accessible type definitions (either from the standard
> library of SML or from other modules). That is possible, by creating some
> auxiliary files that contain this information (maybe .cmi files could be
> parsed but anyway the type definitions have to be analysed during the
> preprocessing of a file). It doesn't seem to be implemented
> in pa_sml.ml but a few hundred lines of additional code could do it (or
> maybe less).

In fact, it would be much more complicated. First, identifier status is
scoped. Constructors are frequently defined locally. In particular, this may
happen implicitly through the use of "open". To derive the required
information you would hence need to perform a complete binding analysis,
including modules and signatures.

Then, SML actually allows constructor status to be withdrawn from an
identifier. For example, the following program is valid:

  signature S = sig type t val x : t end
  structure A : S = struct datatype t = x end

There are few programs that make use of this possibility, but I expect them
to coincide with those that violate the usual case conventions in the first
place. An SML-to-OCaml translator would have to go to quite some lengths to
translate such programs.
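
For illustration, one possible hand translation of the example above into
OCaml. This is an assumption about what a translator would have to produce,
not the output of any existing tool: the constructor is capitalized, and an
extra value binding supplies the x that the signature exposes.

```ocaml
module type S = sig
  type t
  val x : t
end

module A : S = struct
  type t = X      (* SML: datatype t = x *)
  let x = X       (* x loses its constructor status outside the module *)
end
```

Outside A, only the value A.x is visible; the constructor X is hidden by
the signature.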

In summary, to deal with constructor status correctly (not to mention stuff
like datatype replication, local, records, user-defined fixity, etc.) you
basically need half an SML frontend. It seems out of scope for a Camlp4 hack
to be more than a simple approximation.

Cheers,

  - Andreas



* AST traversal functions (was: SML->OCaml)
  2005-03-08  9:11   ` Andreas Rossberg
@ 2005-03-08 20:46     ` Martin Jambon
  2005-03-08 21:00       ` [Caml-list] " Hal Daume III
  0 siblings, 1 reply; 5+ messages in thread
From: Martin Jambon @ 2005-03-08 20:46 UTC
  To: Andreas Rossberg; +Cc: caml-list

On Tue, 8 Mar 2005, Andreas Rossberg wrote:

> In fact, it would be much more complicated. First, identifier status is
> scoped. Constructors are frequently defined locally. In particular, this may
> happen implicitly through the use of "open". To derive the required
> information you would hence need to perform a complete binding analysis,
> including modules and signatures.
>
> Then, SML actually allows constructor status to be withdrawn from an
> identifier. For example, the following program is valid:
>
>   signature S = sig type t val x : t end
>   structure A : S = struct datatype t = x end
>
> There are few programs that make use of this possibility, but I expect them
> to coincide with those that violate the usual case conventions in the first
> place. An SML-to-OCaml translator would have to go to quite some lengths to
> translate such programs.
>
> In summary, to deal with constructor status correctly (not to mention stuff
> like datatype replication, local, records, user-defined fixity, etc.) you
> basically need half an SML frontend. It seems out of scope for a Camlp4 hack
> to be more than a simple approximation.

Yes, but I believe there should be a convenient way at least to reuse the
syntax tree produced by the current converter, and convert the incorrect
uppercase/lowercase identifiers.
It just requires a good root-to-leaves substitution function which does
not ask us to match explicitly every kind of node of a given type (which
is extremely repetitive and error-prone, even with quotations). I already
thought of doing this (actually automatically deriving such a higher-order
function from the type definition of the AST: Pcaml.expr and friends).

Example:

let X = 1 in fun x -> X

We should be able to specify that
  let $patt$ = $e1$ in $e2$

adds the binding to the current environment for converting expr.
And by default, everything is just "traversed".
For instance, the case $e1$ + $e2$ which is itself an expression should
not be explicitly deconstructed/substituted/reconstructed; that would be
done automatically.
This would let us focus only on the specific cases such as "open",
"let ... in", "fun ...", "let ..." (or "val ..."), simple identifiers and
module-related issues.
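
A minimal sketch of the idea on a toy AST. The expr type below is an
assumption made for illustration and is far smaller than Pcaml.expr; the
names map_children and convert are hypothetical. The default traversal is
written once, and only the binding forms and identifiers get special cases:

```ocaml
module StringSet = Set.Make (String)

(* Toy AST standing in for the real syntax tree. *)
type expr =
  | Int of int
  | Id of string                   (* identifier as written in the source *)
  | Add of expr * expr             (* e1 + e2 *)
  | Let of string * expr * expr    (* let x = e1 in e2 *)
  | Fun of string * expr           (* fun x -> e *)

let fix s =
  if s <> "" && (match s.[0] with 'A' .. 'Z' -> true | _ -> false)
  then "lowercase_" ^ s
  else s

(* Default case: rebuild a node by applying f to each immediate child. *)
let map_children f = function
  | (Int _ | Id _) as e -> e
  | Add (a, b) -> Add (f a, f b)
  | Let (x, e1, e2) -> Let (x, f e1, f e2)
  | Fun (x, e) -> Fun (x, f e)

(* bound holds the value names bound in the enclosing scope. Uppercase
   identifiers that are not bound are assumed to be constructors. *)
let rec convert bound e =
  match e with
  | Id x when StringSet.mem x bound -> Id (fix x)
  | Let (x, e1, e2) ->
    Let (fix x, convert bound e1, convert (StringSet.add x bound) e2)
  | Fun (x, body) -> Fun (fix x, convert (StringSet.add x bound) body)
  | _ -> map_children (convert bound) e
```

On the example above, the binding X and its use are both renamed to
lowercase_X, while the Add case never appears in convert at all.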


Martin



* Re: [Caml-list] AST traversal functions (was: SML->OCaml)
  2005-03-08 20:46     ` AST traversal functions (was: SML->OCaml) Martin Jambon
@ 2005-03-08 21:00       ` Hal Daume III
  0 siblings, 0 replies; 5+ messages in thread
From: Hal Daume III @ 2005-03-08 21:00 UTC
  To: Martin Jambon; +Cc: Andreas Rossberg, caml-list

> It just requires a good root-to-leaves substitution function which does
> not ask us to match explicitly every kind of node of a given type (which
> is extremely repetitive and error-prone, even with quotations). I already
> thought of doing this (actually automatically deriving such a higher-order
> function from the type definition of the AST: Pcaml.expr and friends).
>
> This would let us focus only on the specific cases such as "open",
> "let ... in", "fun ...", "let ..." (or "val ..."), simple identifiers and
> module-related issues.

FWIW, this is exactly what the "scrap your boilerplate" proposal in 
Haskell does.  I've used it to do something relatively similar: converting 
multiple Haskell modules into one large module.  The brunt of the work is 
basically in replacing identifiers with something unique, and is more or 
less the same as the problem you're talking about here.  The code for the 
traversal is about 300 lines, compared to the several thousand that 
would be required to match every constructor.
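
A rough OCaml analogue of that bottom-up traversal style, sketched on a toy
AST. The expr type and the names map_children, everywhere and qualify are
assumptions for illustration; the Haskell approach derives the per-node
traversal generically, whereas here it is written by hand once:

```ocaml
type expr =
  | Int of int
  | Id of string
  | Add of expr * expr
  | Let of string * expr * expr    (* let x = e1 in e2 *)

(* The only function that mentions every constructor. *)
let map_children f = function
  | (Int _ | Id _) as e -> e
  | Add (a, b) -> Add (f a, f b)
  | Let (x, e1, e2) -> Let (x, f e1, f e2)

(* Apply f to every node of the tree, children first. *)
let rec everywhere f e = f (map_children (everywhere f) e)

(* Example transformation: prefix every identifier, bound or used, with a
   module name, handling only the cases that carry names. *)
let qualify prefix =
  everywhere (function
    | Id x -> Id (prefix ^ "_" ^ x)
    | Let (x, e1, e2) -> Let (prefix ^ "_" ^ x, e1, e2)
    | e -> e)
```

The transformation itself names only the two constructors it cares about;
everything else passes through the default case.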

-- 
 Hal Daume III                                   | hdaume@isi.edu
 "Arrest this man, he talks in maths."           | www.isi.edu/~hdaume

