RE: OCaml on CLR/JVM?

caml-list - the Caml user's mailing list

* RE: OCaml on CLR/JVM?
@ 2001-02-09 15:49 Dave Berry
  2001-02-10  1:04 ` Toby Watson
  2001-02-14 19:30 ` [Caml-list] " Xavier Leroy
  0 siblings, 2 replies; 8+ messages in thread
From: Dave Berry @ 2001-02-09 15:49 UTC
  To: Xavier Leroy, caml-list

> From: Xavier Leroy [mailto:Xavier.Leroy@inria.fr]
> 
> > Now I have to say the obvious: wouldn't it be wonderful if Caml interfaced
> > with either Java or the .NET Common Language Runtime seamlessly so we
> > wouldn't have to keep facing these kinds of questions and problems, and
> > could just leverage existing libraries?

Although this view is understandable, I think it is rather naive.  As Xavier
said:

> One thing I learnt is that the real problem with language
> interoperability is not how to compile language X to virtual machine Y
> (this can always be done, albeit more or less efficiently), but rather
> how to map between X's data structures and objects and those of all
> other languages Z1 ... Zn that also compile down to Y.  

To look at it another way, OCaml already shares a platform with C (at least
with the native-code compiler), so all the C libraries are already
available.  Yet it can still be a lot of effort to link with a C library.
Why should Java and .NET be any easier?  Also, look at the effort that went
into making an ML/Java system with MLj.

(To be fair, I've never tried linking OCaml to C myself; I'm only judging
the difficulty by traffic on this mailing list.  It seems to be simpler than
with many other ML compilers, but still with potential pitfalls.)

Threads are another area of potential problems.  In fact they can be a total
minefield.  

Xavier also wrote:
> These Haskell guys sure are
> at the bleeding edge of language interoperability. 

I don't entirely agree with this -- they may be at the leading edge of
publicly documented implementations, but I know of commercial Lisp and Dylan
implementations that are notably more sophisticated.  Unfortunately the
techniques that those implementations use are proprietary, so I guess this
doesn't have much practical impact for anyone else.

That said, I was pleased (if a little envious) when the Haskell people
successfully published some papers about foreign-language interfaces.  I had
rather assumed that the academic community wouldn't be interested in such
mundane engineering issues.  I did notice that some of the papers still used
Greek letters and denotational-semantics-style equations to describe a fairly
straightforward translation - perhaps these softened the blow to the
readers?  ;-)  (I hope this doesn't sound harsh; I intend it mainly in jest.)


* Re: OCaml on CLR/JVM?
  2001-02-09 15:49 OCaml on CLR/JVM? Dave Berry
@ 2001-02-10  1:04 ` Toby Watson
  2001-02-14 19:30 ` [Caml-list] " Xavier Leroy
  1 sibling, 0 replies; 8+ messages in thread
From: Toby Watson @ 2001-02-10  1:04 UTC
  To: caml-list

> > One thing I learnt is that the real problem with language
> > interoperability is not how to compile language X to virtual machine Y
> > (this can always be done, albeit more or less efficiently), but rather
> > how to map between X's data structures and objects and those of all
> > other languages Z1 ... Zn that also compile down to Y.
>
> To look at it another way, OCaml already shares a platform with C (at least
> with the native-code compiler), so all the C libraries are already
> available.
> Yet it can still be a lot of effort to link with a C library.  Why should
> Java and .NET be any easier?  Also, look at the effort that went into making
> an ML/Java system with MLj.

As I understand it, part of the .NET effort is to specify *some* common
higher-level types to decrease the impedance mismatch between languages. I
think this is called the Common Language Specification (CLS), and there is a
further Common Type System (CTS), which probably defines a protocol for
collections and such.

Whether or not they are successful - and I know that Project 7 is still
carrying out research to improve type system compatibility - the idea is to
have practical interoperability between languages. Basically this is to
improve upon the situation where you can make method calls, or even marshal
objects between languages, but still need to generate pages of IDL or custom
headers to have a decent interface. This goal is aided by retaining the type
signatures and other information in binary files.

Some general but useful non-MS background:

http://www.cs.mu.oz.au/research/mercury/information/dotnet/mercury_and_dotnet.html

not so non-MS

http://www.dnjonline.com/articles/essentials/iss21_essentials.html

cheers,
toby


* [Caml-list] Re: OCaml on CLR/JVM?
  2001-02-09 15:49 OCaml on CLR/JVM? Dave Berry
  2001-02-10  1:04 ` Toby Watson
@ 2001-02-14 19:30 ` Xavier Leroy
  2001-02-15 12:12   ` Sven LUTHER
  1 sibling, 1 reply; 8+ messages in thread
From: Xavier Leroy @ 2001-02-14 19:30 UTC
  To: Dave Berry; +Cc: caml-list

> To look at it another way, OCaml already shares a platform with C
> (at least with the native-code compiler), so all the C libraries are
> already available.  Yet it can still be a lot of effort to link with
> a C library.  Why should Java and .NET be any easier?

In short, because it is possible to automate fully the generation of
stub code for interfacing with Java or .NET, while this is not
possible in C.  To generate stub code, you need to know quite a lot
about the library you're interfacing with: 
- types of function arguments and results, and of data structure members;
- memory protocol (who frees dynamically-allocated memory and when?)
- error reporting protocol (e.g. return "impossible" results, or
    longjmp() somewhere, or call a user-provided error callback, or ...)

For C, most of this information is implicit or written down in English
only.  E.g. the function prototypes you find in .h files are wholly
inadequate to generate stub code because the C types are not
informative enough: is "char **" an array of strings or a string
result passed as an "out" parameter?  what about macros (they are in
effect untyped)?  As for memory management and error reporting, the .h
file doesn't say anything at all.  

So, interfacing with a C library requires quite a lot of manual
intervention to supply the missing info.  This can take the form of
writing the stubs entirely by hand, or writing a higher-level
description of the library interface in an IDL-like language from
which the stubs can be generated automatically.  The latter means less
typing, but is not actually significantly easier: the programmer still
has to get the intended behavior from the English documentation, and
then shape it into an IDL file that the stub generator will
understand.
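
To make the "char **" ambiguity concrete, here is a minimal OCaml sketch.
The direction, c_type and param types and the signature function are
hypothetical, invented for this illustration; they are not the input format
of any actual stub generator.  The point is only that the same C prototype,
int f(char **p), yields two different OCaml types depending on an annotation
that the .h file does not contain.

```ocaml
(* Hypothetical mini-IDL: the facts a stub generator needs but a C
   prototype does not state. *)
type direction = In | Out

type c_type =
  | C_int
  | C_string            (* char *, assumed NUL-terminated *)
  | Array_of of c_type  (* pointer used as an array *)

type param = { name : string; dir : direction; ty : c_type }

(* OCaml type corresponding to an annotated C type. *)
let rec ocaml_type = function
  | C_int -> "int"
  | C_string -> "string"
  | Array_of t -> ocaml_type t ^ " array"

(* OCaml signature of the stub: "in" parameters become arguments,
   "out" parameters are returned alongside the C result. *)
let signature ~ret params =
  let types dir =
    List.filter_map
      (fun p -> if p.dir = dir then Some (ocaml_type p.ty) else None)
      params
  in
  let args = match types In with [] -> [ "unit" ] | ts -> ts in
  String.concat " -> " args
  ^ " -> "
  ^ String.concat " * " (ocaml_type ret :: types Out)

(* Two readings of the same prototype: int f(char **p) *)
let as_array =
  signature ~ret:C_int [ { name = "p"; dir = In; ty = Array_of C_string } ]

let as_out =
  signature ~ret:C_int [ { name = "p"; dir = Out; ty = C_string } ]
```

With these annotations, as_array is "string array -> int" and as_out is
"unit -> int * string".  Neither the memory protocol nor the error protocol
is modelled here; they would need further annotations of the same kind.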

Interfacing with Java or .NET is a completely different story:
- Libraries always come with detailed type annotations (descriptors
  in Java .class files; metadata in .NET) that are high-level enough
  to support the automatic generation of stub code.
- These systems are garbage-collected, so there is no concern about
  manual deallocation.
- These systems have built-in exceptions, which most libraries use
  consistently to report errors.

So, the goal of generating stub code entirely automatically can be
achieved with Java or .NET, but not with C.  (Or so I hope; we'll see
how it goes.)
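
As a rough sketch of why the Java side is more tractable: a .class file
records each method's type as a descriptor string such as
"(ILjava/lang/String;)V", which can be translated mechanically.  The code
below parses such a descriptor and prints a candidate OCaml type.  The
mapping chosen here (int32 for Java int, an abstract type named after the
class for anything other than java.lang.String) is an assumption made for
the example, not the mapping of any existing binding.

```ocaml
(* Translate the JVM type starting at index [i] of descriptor [d];
   return its OCaml rendering and the index just past it. *)
let rec field_type d i =
  match d.[i] with
  | 'Z' -> ("bool", i + 1)
  | 'B' | 'S' | 'C' -> ("int", i + 1)
  | 'I' -> ("int32", i + 1)
  | 'J' -> ("int64", i + 1)
  | 'F' | 'D' -> ("float", i + 1)
  | 'V' -> ("unit", i + 1)
  | '[' ->
      let elt, j = field_type d (i + 1) in
      (elt ^ " array", j)
  | 'L' ->
      (* Class reference: L<binary class name>; *)
      let stop = String.index_from d i ';' in
      let cls = String.sub d (i + 1) (stop - i - 1) in
      let ty =
        if cls = "java/lang/String" then "string"
        else
          (* Assumed convention: one abstract OCaml type per Java class. *)
          String.map
            (fun c -> if c = '/' then '_' else c)
            (String.lowercase_ascii cls)
      in
      (ty, stop + 1)
  | c -> invalid_arg (Printf.sprintf "unexpected descriptor character %c" c)

(* Translate a whole method descriptor "(<params>)<result>". *)
let method_type d =
  if String.length d = 0 || d.[0] <> '(' then
    invalid_arg "not a method descriptor";
  let rec params i acc =
    if d.[i] = ')' then (List.rev acc, i + 1)
    else
      let ty, j = field_type d i in
      params j (ty :: acc)
  in
  let args, i = params 1 [] in
  let ret, _ = field_type d i in
  let args = if args = [] then [ "unit" ] else args in
  String.concat " -> " (args @ [ ret ])

let example = method_type "(ILjava/lang/String;)V"
```

Here example evaluates to "int32 -> string -> unit".  No English
documentation had to be consulted to get this far, which is the contrast
with the C case.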

- Xavier Leroy
-------------------
To unsubscribe, mail caml-list-request@inria.fr.  Archives: http://caml.inria.fr


* Re: [Caml-list] Re: OCaml on CLR/JVM?
  2001-02-14 19:30 ` [Caml-list] " Xavier Leroy
@ 2001-02-15 12:12   ` Sven LUTHER
  0 siblings, 0 replies; 8+ messages in thread
From: Sven LUTHER @ 2001-02-15 12:12 UTC
  To: Xavier Leroy; +Cc: Dave Berry, caml-list

On Wed, Feb 14, 2001 at 08:30:51PM +0100, Xavier Leroy wrote:
> > To look at it another way, OCaml already shares a platform with C
> > (at least with the native-code compiler), so all the C libraries are
> > already available.  Yet it can still be a lot of effort to link with
> > a C library.  Why should Java and .NET be any easier?
> 
> In short, because it is possible to automate fully the generation of
> stub code for interfacing with Java or .NET, while this is not
> possible in C.  To generate stub code, you need to know quite a lot
> about the library you're interfacing with: 
> - types of function arguments and results, and of data structure members;

This can be automated with a basic parsing tool, can't it?

This is what I started to do with c2caml, but I have no time to continue work
on it.

> - memory protocol (who frees dynamically-allocated memory and when?)

This is more difficult: C has no standard way of doing this, but libraries
often have their own consistent convention for it.

> - error reporting protocol (e.g. return "impossible" results, or
>     longjmp() somewhere, or call a user-provided error callback, or ...)

Same as above ...

> For C, most of this information is implicit or written down in English
> only.  E.g. the function prototypes you find in .h files are wholly
> inadequate to generate stub code because the C types are not
> informative enough: is "char **" an array of strings or a string

Yes, this is difficult. Even a plain char * argument can be a problem, since
it is not possible to know whether it is a 0-terminated string or something
else.

> result passed as an "out" parameter?  what about macros (they are in

Just expand them before analysing the code and writing the stubs.

You could keep them, and have a special preprocessor that checks whether it is
possible to use polymorphic functions instead, as is often the case, but
for straight 1-1 stub writing, expansion is the right way of doing this.

> effect untyped)?  As for memory management and error reporting, the .h
> file doesn't say anything at all.  

Well, memory management is not easy to handle. Maybe adding per-function
manual annotations, or fixing a per-.h-file or per-library convention,
would help.

Going beyond that would need an analysis of the whole source code, which I
guess is also feasible (for open source projects), but a bit long and heavy.

That said, this applies only to pointer argument types, as the rest are
copied around.

Anyway, automating this kind of thing would perhaps not gain you a lot for
the initial stub writing, since you have to provide some info at first to help
an automated tool, but once that is done, it would help a lot in keeping track
of various versions of the same library.

Friendly,

Sven Luther

* RE: OCaml on CLR/JVM?
@ 2001-02-12  9:46 Fabrice Le Fessant
  0 siblings, 0 replies; 8+ messages in thread
From: Fabrice Le Fessant @ 2001-02-12  9:46 UTC
  To: caml-list

Is the .NET VM open source? Which part is Microsoft-independent?

After progressively losing some parts of the OS market, is Microsoft
trying to conquer the VM market? Why should all software developers
always depend on some Microsoft software? Before it was the OS. But
now, many languages are OS independent ... And now, they want to
capture the market again, through the .NET VM?

OCaml is not a _hack_, as I have read in some recent mails, but a
_good_ independent language. It should not change to follow a
commercial standard, which will itself change for commercial
reasons, as soon as the market is captured ...

If Microsoft wants its new product to be used, it is Microsoft's problem
to port more languages to its VM, and not only say: "We have ported
our homemade languages to it (C#, C++, VB.NET) [because it was
designed for them], so, you see, we have proved it's the universal
VM. Now, do the same for your languages, or your language will not be
used anymore by our customers..."

So, why do we really need a .NET port of OCaml? OCaml is working fine on
Windows, and on many other OSes ...

- Fabrice


* RE: OCaml on CLR/JVM?
@ 2001-02-09 22:05 Don Syme
  0 siblings, 0 replies; 8+ messages in thread
From: Don Syme @ 2001-02-09 22:05 UTC
  To: 'Dave Berry', Xavier Leroy, caml-list

> > > Now I have to say the obvious: wouldn't it be wonderful if Caml interfaced
> > > with either Java or the .NET Common Language Runtime seamlessly so we
> > > wouldn't have to keep facing these kinds of questions and problems, and
> > > could just leverage existing libraries?
>
> Although this view is understandable, I think it is rather naive.

Well, I didn't exactly propose a technical solution...  

Of course there's hard work to be done to realise this vision, but in
principle a clean interop story sure beats the endless rehashing of other
people's code in language X as a library in language Y.  Others involved in
Microsoft's Project 7 and I are working on one approach to achieve
this interop, i.e. compiling languages directly to .NET MS-IL, in the style
of MLj, often adding extensions to the language in order to improve the
interop.  We are also working on improving the .NET infrastructure,
proposing support for features such as parametric polymorphism in MS-IL.

Xavier is also working on a solution for OCaml, as he mentioned, though the
problem of how to reflect the constructs of an object model into ML, Haskell
or OCaml remains similar whichever approach you take to actually running the
stuff.

> To look at it another way, OCaml already shares a platform with C (at least
> with the native-code compiler), so all the C libraries are already
> available.
> Yet it can still be a lot of effort to link with a C library.  Why should
> Java and .NET be any easier?  Also, look at the effort that went into making
> an ML/Java system with MLj.

There are several reasons why it is easier: exceptions, for example, can be
propagated across the interop boundary without any effort at all if you
compile to MS-IL or Java bytecode.  If you're compiling to bytecode you can
also ensure more compatibility of representations, e.g. make sure ML
int64's are exactly representationally equivalent to C's int64s.  Note that if
you don't compile to a bytecode then you even have to marshal integers
across the interop boundary in Caml, though this could be automated.
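
For readers wondering why even integers need marshalling: OCaml's native
int is a tagged machine word, with n stored as 2n+1, and the Val_long and
Long_val macros of OCaml's C interface convert to and from that form.  The
sketch below models the encoding on nativeint.  The function names mirror
the C macros but are plain OCaml written for this illustration, not part of
any library.

```ocaml
(* Model of the tagging done at the C boundary: an OCaml int n is
   represented by the machine word 2n+1 (low bit set). *)
let val_long (n : nativeint) : nativeint =
  Nativeint.(logor (shift_left n 1) 1n)

(* Inverse: an arithmetic shift right drops the tag bit and keeps the sign. *)
let long_val (w : nativeint) : nativeint = Nativeint.shift_right w 1

let tagged_five = val_long 5n
let round_trip = long_val (val_long (-7n))
```

Here tagged_five is 11n and round_trip is -7n.  A Java or C integer has no
such tag bit, so every integer crossing the boundary has to go through a
conversion of this kind, which is the step a stub generator can automate.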

You can also transfer objects more consistently, as the semantics of the
object models of Java and .NET are fairly simple in contrast to C, e.g. no
need to have an IDL to help interpret pointers as "in-out", "in", "out"
parameters.

While at a certain level I like Xavier's approach, i.e. maintaining two
runtimes, garbage collectors etc., I have trouble seeing it scale to the
multi-language component programming envisioned as part of the .NET approach
(and indeed currently in practice with C#, C++, VB.NET and other .NET
languages).  Two GCs are already trouble enough (performance might suck as
they will both be tuned to fill up the cache), but if you have components
from 10 languages in one process?  10 GCs competing for attention?  Maybe it
can be made to work, but there's a certain conceptual clarity in just
accepting that a GC should form part of the computing infrastructure, and
sharing that service.  These are the aspects of the .NET approach that I find
quite compelling.

As an aside, I think it would be an interesting question to say "OK, let's
take it for granted that the end purpose of our language is to produce
components whose interface is expressed in terms of the Java or .NET type
systems, but which retains as many of the features and conceptual simplicity
of OCaml and ML as possible."  I'm not sure exactly what you'd end up with,
but whatever it was it could be the language to take over from C# and/or
Java (if that's what you're interested in...)  But without really taking
Java/.NET component building seriously right from the start I feel you're
always just going to end up with a bit of a hack - an interesting, usable
hack perhaps, but not a really _good_ language.

Probably the greatest recurring technical problem that I see in this kind of
work is that of type inference, and the way both the Java and .NET models
rely on both subtyping and overloading to help make APIs palatable.  Type
inference just doesn't work well with either subtyping or overloading.  This
is a great, great shame, as it's obviously one of the main things ML has to
offer to improve productivity.  
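
A small OCaml illustration of the tension, as a sketch only (the shape and
circle classes are made up for the example).  OCaml avoids overloading
altogether, so each operator pins down one type and inference needs no
annotations; subtyping exists for objects, but the programmer has to request
it with an explicit coercion rather than have it inferred.

```ocaml
(* No overloading: + works on int only and +. on float only, so the
   types of these functions are inferred without any annotation. *)
let add_ints x y = x + y       (* int -> int -> int *)
let add_floats x y = x +. y    (* float -> float -> float *)

class shape = object
  method area = 0.0
end

class circle r = object
  inherit shape
  method! area = 3.14 *. r *. r
  method radius = r
end

(* Subtyping is never inferred: without the explicit (... :> shape)
   coercion this list is rejected, because circle has an extra method. *)
let shapes : shape list = [ new shape; (new circle 1.0 :> shape) ]

let areas = List.map (fun s -> s#area) shapes
```

A Java or .NET API typically expects the compiler to choose among
overloaded methods and to insert such upcasts silently, which is what makes
these APIs awkward to expose to an ML-style type checker.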

Cheers,
Don

P.S. As for threads - I don't think the story is half as bad as you might
think.  After all, OCaml threads map down to Windows threads at some point,
and I just don't see that there are that many special logical properties of
typical ML and Caml threading libraries that make it semantically ridiculous
to share threads between languages (though it is true asynchronous
exceptions can make things hard when compiling to a bytecode).  But I'll
admit I'm not an expert on this.


* Re: OCaml on CLR/JVM?
  2001-02-08 19:03 ` OCaml on CLR/JVM? Xavier Leroy
@ 2001-02-09  9:06   ` David Mentre
  0 siblings, 0 replies; 8+ messages in thread
From: David Mentre @ 2001-02-09  9:06 UTC
  To: Xavier Leroy; +Cc: Don Syme, caml-list

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

[ about Caml and Java GC cooperation ]
> The only limitation is that a cross-heap cycle (a Java object pointing
> to a Caml block pointing back to the Java object) can never be
> reclaimed... (Thanks to Martin Odersky for pointing this out.)

Regarding GC cooperation, some work has been done on the MALI memory
system [Bekkers86-0:confs]. As Olivier Ridoux explained to me, and as far
as I remember, they consider 4 kinds of pointers, some of which let
garbage reclamation be done later by the other GC. However, I don't
know if their scheme would solve your inter-GC cycle reclamation problem.
Maybe you should ask Olivier.Ridoux@irisa.fr directly.

I would be interested to know the solution you applied to this issue. 

>  it turns out to be much simpler (for the implementation, not for the
> final user!) to map Java objects to values of abstract Caml types, and
> treat methods as functions over these abstract types, than mapping
> Java objects to Caml objects.  That was quite unexpected!

Out of pure curiosity, why is it so difficult to map Java objects to Caml
objects?  Is it that control flow between object methods works
differently?  Does OCaml's typing constrain too much the kinds of
programs that can be written, compared to Java?

Best regards,
d.

@InProceedings{Bekkers86-0:confs,
  author =       "Y. Bekkers and B. Canet and O. Ridoux and L. Ungaro",
  title =        "{MALI}: {A} Memory with a Real-time Garbage Collector
                 for Implementing Logic Programming Languages",
  booktitle =    "Proceedings of the International Symposium on Logic
                 Programming",
  organization = "IEEE Computer Society",
  year =         "1986",
  month =        sep,
  publisher =    "The Computer Society Press",
  pages =        "258--265",
  ISBN =         "0-8186-0728-9",
}

There is also a "Publication interne" #611, IRISA, 1991, by Olivier
Ridoux : ftp://ftp.irisa.fr/local/lande/or-tr-irisa611-91.ps.Z

-- 
 David.Mentre@inria.fr -- http://www.irisa.fr/prive/dmentre/
 Opinions expressed here are only mine.


* Re: OCaml on CLR/JVM?
  2001-02-06  0:03 OCaml on CLR/JVM? (RE: OCaml <--> ODBC/SQL Server) Don Syme
@ 2001-02-08 19:03 ` Xavier Leroy
  2001-02-09  9:06   ` David Mentre
  0 siblings, 1 reply; 8+ messages in thread
From: Xavier Leroy @ 2001-02-08 19:03 UTC
  To: Don Syme

> Now I have to say the obvious: wouldn't it be wonderful if Caml interfaced
> with either Java or the .NET Common Language Runtime seamlessly so we
> wouldn't have to keep facing these kinds of questions and problems, and
> could just leverage existing libraries?   

I've been working on and off (mostly off, lately) on an OCaml/Java
interface that works by coupling the two systems at the C level via
their foreign-function interfaces (Java's JNI and OCaml's C
interface).  This was strongly inspired by the work of Erik Meijer et
al on a similar Haskell/Java interface.  (These Haskell guys sure are
at the bleeding edge of language interoperability.  This is the second
interop idea I steal from them, after the IDL/COM binding.)

The low-level coupling is surprisingly easy, including making the two
garbage collectors cooperate: both the JNI and OCaml's C interface
provide enough functionality to get the coupling to work without *any*
modification on either of the implementations.  How nice!
The only limitation is that a cross-heap cycle (a Java object pointing
to a Caml block pointing back to the Java object) can never be
reclaimed... (Thanks to Martin Odersky for pointing this out.)
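
The cross-heap cycle limitation can be reproduced in a toy model.  The code
below does not use the JNI or the OCaml runtime; it simulates two collectors
under the assumption that each one treats every reference held by a
still-allocated object of the other heap as a root, without knowing whether
that holder is itself reachable.  All names (heap, obj, collect,
collect_all) are invented for the illustration.

```ocaml
type heap = Java | Caml
type obj = { id : string; heap : heap; mutable refs : obj list }

(* One collection of heap [h]; returns the objects still allocated. *)
let collect h ~roots allocated =
  let local = List.filter (fun o -> o.heap = h) allocated in
  let foreign = List.filter (fun o -> o.heap <> h) allocated in
  (* References held by the other heap act as roots for this one. *)
  let handle_targets =
    List.concat_map
      (fun o -> List.filter (fun t -> t.heap = h) o.refs)
      foreign
  in
  (* Mark phase, restricted to objects of heap [h]. *)
  let rec mark seen = function
    | [] -> seen
    | o :: rest ->
        if o.heap <> h || List.memq o seen then mark seen rest
        else mark (o :: seen) (o.refs @ rest)
  in
  let own_roots = List.filter (fun o -> o.heap = h) roots in
  let live = mark [] (own_roots @ handle_targets) in
  foreign @ List.filter (fun o -> List.memq o live) local

(* Alternate the two collectors until nothing more is reclaimed. *)
let rec collect_all ~roots allocated =
  let next = allocated |> collect Java ~roots |> collect Caml ~roots in
  if List.length next = List.length allocated then next
  else collect_all ~roots next

let j = { id = "j"; heap = Java; refs = [] }
let c = { id = "c"; heap = Caml; refs = [] }

(* Chain j -> c with no program roots: both objects get reclaimed. *)
let () = j.refs <- [ c ]
let after_chain = List.map (fun o -> o.id) (collect_all ~roots:[] [ j; c ])

(* Cycle j -> c -> j with no program roots: neither is ever reclaimed. *)
let () = c.refs <- [ j ]
let after_cycle = List.map (fun o -> o.id) (collect_all ~roots:[] [ j; c ])
```

In this model after_chain is empty, while after_cycle still contains both
"j" and "c": each collector keeps its object alive on behalf of the other.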

(Actually, the main problem is working around the bugs in Sun's JDK
1.2.2 for Linux.  These guys must be kidding.  Does anyone have a
recommendation for a solid, complete Java implementation (including
Java 2 and of course the JNI) for Linux?)

Of course, the low-level interface is type-unsafe, so the real fun is
to build a type-safe view of Java classes and objects as Caml classes
and objects, and conversely.  I'm still struggling with some of the
issues involved.  For instance, it turns out to be much simpler (for
the implementation, not for the final user!) to map Java objects to
values of abstract Caml types, and treat methods as functions over
these abstract types, than mapping Java objects to Caml objects.  That
was quite unexpected!
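
To give an idea of what the abstract-type mapping looks like, here is a
minimal sketch.  The JAVA_STRING_BUFFER signature is a hypothetical rendering
of a Java class such as java.lang.StringBuffer, and Mock_string_buffer is a
stand-in built on OCaml's own Buffer module so that the example runs without
a JVM; a real binding would implement the same signature with JNI calls.

```ocaml
(* A Java class seen as an abstract type plus functions over it. *)
module type JAVA_STRING_BUFFER = sig
  type t                            (* stands for a Java object reference *)
  val create : unit -> t            (* constructor *)
  val append : t -> string -> t     (* method: receiver is the first argument *)
  val to_string : t -> string
end

(* Stand-in implementation; an assumption made so the sketch is runnable. *)
module Mock_string_buffer : JAVA_STRING_BUFFER = struct
  type t = Buffer.t
  let create () = Buffer.create 16
  let append b s = Buffer.add_string b s; b
  let to_string = Buffer.contents
end

let greeting =
  Mock_string_buffer.(to_string (append (append (create ()) "Ca") "ml"))
```

Here greeting is "Caml".  The user writes append b s rather than b#append s,
which is the loss of convenience mentioned above, but the implementation
only needs abstract types and ordinary functions.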

One thing I learnt is that the real problem with language
interoperability is not how to compile language X to virtual machine Y
(this can always be done, albeit more or less efficiently), but rather
how to map between X's data structures and objects and those of all
other languages Z1 ... Zn that also compile down to Y.  This is obvious
in retrospect, but I think many (myself included) often overlook this
point and believe that compiling to the same virtual machine is
necessary and sufficient for interoperability.  It is actually neither
necessary nor sufficient...

While this work started with the JVM, I'm pretty sure it can be made
to work with the .NET CLR, as soon as it has a foreign-function
interface with features comparable to those of the JNI.  (And I'm sure
this will happen eventually, not only because it makes sense, but also
because Java has it, so .NET must too :-)

Stay tuned for further developments.

- Xavier Leroy


Thread overview: 8+ messages
2001-02-09 15:49 OCaml on CLR/JVM? Dave Berry
2001-02-10  1:04 ` Toby Watson
2001-02-14 19:30 ` [Caml-list] " Xavier Leroy
2001-02-15 12:12   ` Sven LUTHER
  -- strict thread matches above, loose matches on Subject: below --
2001-02-12  9:46 Fabrice Le Fessant
2001-02-09 22:05 Don Syme
2001-02-06  0:03 OCaml on CLR/JVM? (RE: OCaml <--> ODBC/SQL Server) Don Syme
2001-02-08 19:03 ` OCaml on CLR/JVM? Xavier Leroy
2001-02-09  9:06   ` David Mentre
