caml-list - the Caml user's mailing list
* [Caml-list] Big executables from ocamlopt; dynamic libraries again
@ 2002-03-16 16:05 Tim Freeman
  2002-03-18  1:12 ` Jacques Garrigue
  0 siblings, 1 reply; 13+ messages in thread
From: Tim Freeman @ 2002-03-16 16:05 UTC
  To: caml-list


When I compile a simple "hello world" app using lablgtk, the resulting
executable exceeds 800KB.  With a few apps like this, my package will
take offensively long to download, even over DSL.  The executable
seems to be including most of the lablgtk library in the executable,
which makes sense because lablgtk is statically linked in the
executable.

This isn't a problem with lablgtk.  A native code app that just prints
"hello world" on standard output takes 5K bytes if written in C but
95K in ocaml.  

The GTK executable would be smaller if lablgtk were dynamically linked
into it.  I hear that a real shared library isn't an option because
the present ocaml compiler can't generate position independent code.
However, one can do dynamic linking on many machines without position
independent code or a shared library; the dynamic linker just
relocates the library at run time.

I'd really rather write something in OCAML than the other languages
available, but if the resulting executables are so huge that I can't
distribute binaries, that's a problem.  If I controlled the libraries
myself, I could use the scaml patch from
http://algol.prosalg.no/~malc/scaml/, but I'd much rather use the
lablgtk that comes in Debian than package it myself, and I'd rather
not stick my users with a redundant copy of the lablgtk library.

Is there any reason there's no support for writing dynamically
linkable libraries in OCAML?

Hmm, if you memorized the MD5 checksum of the library at compile time,
and checked it at run time, it could even be type safe.  Or you could
just memorize the MD5 of the signature of the library, in some sense;
this would allow patches like the recent zlib double-free.  If the
library knows its checksum, and the code loading it knows the
expected checksum, then you can do this checking without computing a
checksum at run time.
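
A minimal sketch of this check, assuming the signature is available as a
plain string (the signature text and all names below are hypothetical; a
real compiler would hash a normalized form of the interface). It uses only
the standard Digest module:

```ocaml
(* Hypothetical sketch: the library records the MD5 of its signature when
   it is built; the client records the digest it was compiled against. *)

(* Stand-in for a library signature, kept as a plain string. *)
let zlib_signature = "val inflate : string -> string"

(* Both digests would be computed at build time, not at load time. *)
let library_digest = Digest.to_hex (Digest.string zlib_signature)
let expected_by_client = Digest.to_hex (Digest.string zlib_signature)

(* Load-time check: a plain string comparison, no hashing needed. *)
let compatible ~expected ~actual = String.equal expected actual

let () =
  if compatible ~expected:expected_by_client ~actual:library_digest
  then print_endline "signature matches, safe to link"
  else print_endline "signature mismatch, refusing to link"
```

A bug fix that leaves the signature untouched leaves the digest untouched
too, which is what the zlib case needs.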

-- 
Tim Freeman       
tim@fungible.com; formerly tim@infoscreen.com
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-16 16:05 [Caml-list] Big executables from ocamlopt; dynamic libraries again Tim Freeman
@ 2002-03-18  1:12 ` Jacques Garrigue
  2002-03-18  1:29   ` Tim Freeman
  2002-03-18 13:11   ` Sven
  0 siblings, 2 replies; 13+ messages in thread
From: Jacques Garrigue @ 2002-03-18  1:12 UTC
  To: tim; +Cc: caml-list

From: tim@fungible.com (Tim Freeman)

> When I compile a simple "hello world" app using lablgtk, the resulting
> executable exceeds 800KB.  With a few apps like this, my package will
> take offensively long to download, even over DSL.  The executable
> seems to be including most of the lablgtk library in the executable,
> which makes sense because lablgtk is statically linked in the
> executable.
> 
> This isn't a problem with lablgtk.  A native code app that just prints
> "hello world" on standard output takes 5K bytes if written in C but
> 95K in ocaml.  

This is indeed a problem (in my opinion a small one, but a problem
nonetheless).
But you're wrong in assuming that static linking means including the
whole library. Actually this is quite the opposite: dynamic linking
includes the whole library (but not in the executable), while static
linking only includes needed parts. In particular dynamic linking
without code sharing (using patched code) uses more memory at runtime
than static linking.
You can see this from the sizes: lablgtk.a is 1MB, while your app is
only 800K (including the ocaml runtime, etc.).
The unfortunate thing is that the structure of the LablGTK library
creates lots of spurious dependencies. For instance, using a button
links code for all kinds of buttons, or using a label links code for
the calendar widget... My decision was to favor meaning (put
related things together) over size (hacks to be smaller).
Yet, lablGTK stubs are tuned to produce small code on x86, but here
your problem is with the size of the native code produced by ocamlopt,
which is harder to optimize.

> The GTK executable would be smaller if lablgtk were dynamically linked
> into it.  I hear that a real shared library isn't an option because
> the present ocaml compiler can't generate position independent code.
> However, one can do dynamic linking on many machines without position
> independent code or a shared library; the dynamic linker just
> relocates the library at run time.

You're right, but there is a bit more about dynamic linking.
A major problem with using dynamic linking with ocaml (in particular
with native code), is that your program comes cut into small pieces,
and you must be sure that they are all compatible. Somebody posted
recently about problems when upgrading ocaml, and part of it is caused
by incompatibilities in the binary format between versions. Just
imagine the reaction of your user when, after having loaded various
packages required for your program, he only gets an error message or a
segmentation fault when trying to run it.

Static linking considerably improves that, since your program will no
longer depend on ocaml being installed, and the versions of its
different components. By the way, static linking here only concerns
the ocaml specific part of the code, libgtk itself is dynamically
linked, since one can expect to find a compatible implementation on
each platform.

The situation is a little bit better with bytecode: this time one only
depends on the version of ocaml, no longer the platform.
This is only in the CVS version (next release), but you can now use
the ocaml toplevel as a dynamic loader:
Compile your application as a .cmo or .cma
        ocamlc -a -o myapp.cma  tools.cmo main.cmo
(you don't need to include required libraries here)
Load it with ocaml
        ocaml lablgtk.cma myapp.cma

If speed is not a major problem, I think this can be nice in practice.

> Is there any reason there's no support for writing dynamically
> linkable libraries in OCAML?
> 
> Hmm, if you memorized the MD5 checksum of the library at compile time,
> and checked it at run time, it could even be type safe.  Or you could
> just memorize the MD5 of the signature of the library, in some sense;
> this would allow patches like the recent zlib double-free.  If the
> library knows its checksum, and the code loading it knows the
> expected checksum, then you can do this checking without computing a
> checksum at run time.

One point is that, on one side, you really want the checking, and on
the other side, with the current MD5 approach, the checking is too
version dependent. That is, even if you've changed nothing in your
code, and it would be perfectly safe to run it, there is a good chance
the loader will refuse to load it. And this nullifies the zlib example:
the newly compiled version would not be compatible with existing
executables!
Of course, this could be improved, but this needs some research and/or
engineering work.

Cheers,

        Jacques Garrigue

* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  1:12 ` Jacques Garrigue
@ 2002-03-18  1:29   ` Tim Freeman
  2002-03-18  5:20     ` Jacques Garrigue
  2002-03-18 10:12     ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Nicolas George
  2002-03-18 13:11   ` Sven
  1 sibling, 2 replies; 13+ messages in thread
From: Tim Freeman @ 2002-03-18  1:29 UTC
  To: garrigue; +Cc: caml-list

>Actually this is quite the opposite: dynamic linking
>includes the whole library (but not in the executable), while static
>linking only includes needed parts. In particular dynamic linking
>without code sharing (using patched code) uses more memory at runtime
>than static linking.

I agree.  Even with the tradeoff you describe, I prefer the dynamic
linking.  I'm not too worried about memory at runtime; another 5 or
10MB will go unnoticed.  If the code isn't used and there's a memory
crisis, the code will get paged out.  But the speed of the modems is
limited and the space on the disk bites you even when you aren't
running the program, so the size of the download matters more in my
opinion.

>The unfortunate thing is that the structure of the LablGTK library
>creates lots of spurious dependencies.

I see the same sort of thing happening in /usr/lib/ocaml/stdlib.a, so
lablgtk is not alone there.  If you make an object, you load oo.o from
stdlib.a, which defines an unrelated function that uses random
numbers, so static linking then grabs random.o.  
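
To illustrate how one reference drags in further object files, here is a
small sketch that computes what a static link would pull in. The dependency
table is hypothetical except for the oo.o to random.o edge described above:

```ocaml
module S = Set.Make (String)

(* Hypothetical dependency table: object file -> object files it
   references. Only the oo.o -> random.o edge comes from the text. *)
let deps = function
  | "main.o" -> [ "oo.o" ]
  | "oo.o" -> [ "random.o" ]
  | "random.o" -> []
  | _ -> []

(* Collect every object file reachable from [root]. *)
let rec closure seen root =
  if S.mem root seen then seen
  else List.fold_left closure (S.add root seen) (deps root)

(* Everything the linker has to include for main.o. *)
let linked = S.elements (closure S.empty "main.o")

let () = List.iter print_endline linked
```

The linker works at the granularity of whole object files, so a single
unrelated function in oo.o is enough to make random.o part of the result.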

>A major problem with using dynamic linking with ocaml (in particular
>with native code), is that your program comes cut into small pieces,
>and you must be sure that they are all compatible.

How is ocaml different from C in this regard?  One difference is that
ocaml is younger and therefore changing faster, but eventually that
won't be true any more.  Are there other differences?

>Static linking considerably improves that, since your program will no
>longer depend on ocaml being installed, and the versions of its
>different components.

The C people find it prudent to offer both options.  What is different
about the ocaml situation?

>..with the current MD5 approach, the checking is too version
>dependent...

Can someone point me to a description of the current MD5 approach?

>And this nullifies the zlib example: the newly compiled version would
>not be compatible with existing executables!

Is there something wrong with just checksumming the signature to
decide whether the code is compatible?  That would still be more
sensitive than I'd like, since adding to the signature ideally would
not require people using the package to recompile, but it ought to
support the zlib example.

-- 
Tim Freeman       
tim@fungible.com; formerly tim@infoscreen.com

* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  1:29   ` Tim Freeman
@ 2002-03-18  5:20     ` Jacques Garrigue
  2002-03-18 10:10       ` [Caml-list] Big executables from ocamlopt; dynamic librariesagain Warp
                         ` (4 more replies)
  2002-03-18 10:12     ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Nicolas George
  1 sibling, 5 replies; 13+ messages in thread
From: Jacques Garrigue @ 2002-03-18  5:20 UTC
  To: tim; +Cc: caml-list

From: tim@fungible.com (Tim Freeman)

> >A major problem with using dynamic linking with ocaml (in particular
> >with native code), is that your program comes cut into small pieces,
> >and you must be sure that they are all compatible.
> 
> How is ocaml different from C in this regard?  One difference is that
> ocaml is younger and therefore changing faster, but eventually that
> won't be true any more.  Are there other differences?

In short: C doesn't make sure that they are compatible.
If they are, this will work, otherwise, undefined behaviour.
Programmers and users are responsible for checking (by hand!) that the
API didn't change in an incompatible way.

If you want to have both security and allow linking every time it's
safe, then you would need to do lots of type-checking at link-time
(runtime for dynamic linking). Basically, you would check that every
module you depend on has an interface at least as good as what you
need, checking type by type. If you've had a look at the size of some
.cmi's, you may realize that including required types in executables
may require potentially huge sizes. And type-checking is sometimes too
slow.
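
A heavily simplified sketch of such a type-by-type check, assuming an
interface is just a list of value names with their types written as strings
(a hypothetical representation; real .cmi contents are far richer and would
need proper type comparison rather than string equality):

```ocaml
(* Hypothetical interface representation: value name -> type as text. *)
type interface = (string * string) list

(* [provided] is at least as good as [required] when every required value
   is present with exactly the same type. *)
let satisfies ~(provided : interface) ~(required : interface) =
  List.for_all
    (fun (name, ty) ->
      match List.assoc_opt name provided with
      | Some ty' -> String.equal ty ty'
      | None -> false)
    required

let required = [ ("length", "'a list -> int") ]
let v1 = [ ("length", "'a list -> int") ]
let v2 = ("rev", "'a list -> 'a list") :: v1 (* adds a function *)
let v3 = [ ("length", "'a array -> int") ]   (* changes a type *)

let () =
  Printf.printf "v1: %b, v2: %b, v3: %b\n"
    (satisfies ~provided:v1 ~required)
    (satisfies ~provided:v2 ~required)
    (satisfies ~provided:v3 ~required)
```

The cost is visible even here: the client has to carry the full text of
every type it requires.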

As a fall-back solution, there is MD5 hashing. The problem is that
you're then mixing information for all the contents of a module.
Any change will produce a new incompatible hash value.
For instance, every time you add a function to a library, it becomes
incompatible. And there are new functions in every release of ocaml.
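
A small sketch of the effect, assuming the whole interface is hashed as a
single text (the interface lines are hypothetical, and this is not how the
compiler actually serializes interfaces):

```ocaml
(* One digest covers the entire interface text. *)
let digest_of entries =
  Digest.to_hex (Digest.string (String.concat "\n" entries))

let v1 = [ "val length : 'a list -> int" ]
let v2 = v1 @ [ "val rev : 'a list -> 'a list" ] (* one function added *)

(* Clients of v1 only use [length], yet the digests no longer agree. *)
let same = String.equal (digest_of v1) (digest_of v2)

let () = Printf.printf "digests equal after adding a function: %b\n" same
```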

Note that for C, compatibility policies generally allow adding extra
functions to a library without changing the version number, since the
problem, should it arise, can be detected at link time.

And, even worse than that, the current MD5 computation scheme is
algorithm dependent: it is not based on a normalized view of types,
but just on a dump of an internal tree structure, which is extremely
sensitive to any change in the type checking algorithm. This means
that compatibility can be broken as often as once a week for the CVS
version!
I suppose one could define specific normalizing pickling and unpickling
procedures, rather than using output_value and input_value as is done
currently, but this would be a fair amount of work, and I'm not even
sure it would completely solve the problem.
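
To give a rough idea of what a normalized view could look like, here is a
sketch over a miniature, hypothetical type representation (not the
compiler's own). Type variables are renamed in order of first occurrence
before hashing, so the digest no longer depends on internal naming:

```ocaml
(* Hypothetical miniature type representation. *)
type ty =
  | Var of string           (* type variable, name chosen by the checker *)
  | Con of string * ty list (* constructor, e.g. int or 'a list *)
  | Arrow of ty * ty

(* Print a type with variables renamed in order of first occurrence. *)
let normalize t =
  let names = Hashtbl.create 8 in
  let rec go = function
    | Var v ->
        let n =
          match Hashtbl.find_opt names v with
          | Some n -> n
          | None ->
              let n = Hashtbl.length names in
              Hashtbl.add names v n;
              n
        in
        Printf.sprintf "'t%d" n
    | Con (c, []) -> c
    | Con (c, args) ->
        Printf.sprintf "(%s) %s" (String.concat ", " (List.map go args)) c
    | Arrow (a, b) ->
        (* Bind explicitly so the left side is always numbered first. *)
        let a' = go a in
        let b' = go b in
        Printf.sprintf "(%s -> %s)" a' b'
  in
  go t

let digest t = Digest.to_hex (Digest.string (normalize t))

(* Same type, different internal variable names. *)
let t1 = Arrow (Var "a", Arrow (Var "b", Var "a"))
let t2 = Arrow (Var "x17", Arrow (Var "y3", Var "x17"))

let () = Printf.printf "same digest: %b\n" (String.equal (digest t1) (digest t2))
```

This only addresses variable naming; sharing, abbreviations and the rest of
the internal structure would need similar treatment.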

> >And this nullifies the zlib example: the newly compiled version would
> >not be compatible with existing executables!
> 
> Is there something wrong with just checksumming the signature to
> decide whether the code is compatible?  That would still be more
> sensitive than I'd like, since adding to the signature ideally would
> not require people using the package to recompile, but it ought to
> support the zlib example.

The main problem being incompatibilities between different
versions of ocaml, code compiled with one version of ocaml cannot be
given MD5 signatures compatible with another... So, yes, your zlib
example would work, but only for the 6 months between two releases of
ocaml.
This can be OK if you have a fair control of what you are running, but
this would be a nightmare for the average user.


So probably the real answer is that dynamic linking of caml native
code is possible, but that it would be a lot of work, not so much at
the compilation level, but more at improving compatibility checking.
One could argue that the benefits would not be limited to dynamic
linking alone, but would also include easier upgrading between ocaml versions, so
this might be worth it.

Cheers,

        Jacques Garrigue

* Re: [Caml-list] Big executables from ocamlopt; dynamic librariesagain
  2002-03-18  5:20     ` Jacques Garrigue
@ 2002-03-18 10:10       ` Warp
  2002-03-18 13:14       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Sven
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Warp @ 2002-03-18 10:10 UTC
  Cc: caml-list

> If you want to have both security and allow linking every time it's
> safe, then you would need to do lots of type-checking at link-time
> (runtime for dynamic linking). Basically that every module you depend
> on has an interface at least as good as what you need, checking type
> by type. If you've had a look at the size of some .cmi's, you may
> realize that including required types in executables may require
> potentially huge sizes. And type-checking is sometimes too slow.
>
> As a fall-back solution, there is MD5 hashing. The problem is that
> you're then mixing information for all the contents of a module.
> Any change will produce a new incompatible hash value.
> For instance, every time you add a function to a library, it becomes
> incompatible. And there are new functions in every release of ocaml.

Is speed really an issue? I mean... the ocaml compiler is doing that
job, and even more, right? And its speed looks quite good. Or perhaps this
is an inv-NP problem where checking against a given signature takes exp.
time while producing a valid one is easy :)
Of course, it's true that the size of a CMI gets big quite fast. But
including the CMI would be better than including the CMA (if there is no C
dll behind; in that case, the CMI is bigger than the CMA).

> Note that for C, compatibility policies generally allow adding extra
> functions to a library without changing the version number, since the
> problem, should it arise, can be detected at link time.
>
> And, even worse than that, the current MD5 computation scheme is
> algorithm dependent: it is not based on a normalized view of types,
> but just on a dump of an internal tree structure, which is extremely
> sensitive to any change in the type checking algorithm. This means
> that compatibility can be broken as often as once a week for the CVS
> version!

That MD5 choice is of course well justified, but if that means breaking
backward compatibility, distributing precompiled ocaml binaries becomes
impossible, and then you're closing the ocaml door to many commercial uses
of ocaml, which could greatly increase the size of the community and so the
speed of ocaml development.

> So probably the real answer is that dynamic linking of caml native
> code is possible, but that it would be a lot of work, not so much at
> the compilation level, but more at improving compatibility checking.
> One could argue that the benefits would not be limited to dynamic
> linking alone, but also easier upgrading between ocaml versions, so
> this might be worth it.

Do you mean dynlink of native code from bytecode?
Without a CMI to ensure the type checking?

Nicolas Cannasse


* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  1:29   ` Tim Freeman
  2002-03-18  5:20     ` Jacques Garrigue
@ 2002-03-18 10:12     ` Nicolas George
  1 sibling, 0 replies; 13+ messages in thread
From: Nicolas George @ 2002-03-18 10:12 UTC
  To: caml-list

On octidi 28 Ventôse, year CCX, Tim Freeman wrote:
> I see the same sort of thing happening in /usr/lib/ocaml/stdlib.a, so
> lablgtk is not alone there.  If you make an object, you load oo.o from
> stdlib.a, which defines an unrelated function that uses random
> numbers, so static linking then grabs random.o.  

Something I have disliked in OCaml for a long time is the conflation of
modules and compilation units. Why should a module be in only one file? A
solution to that could be an additional feature of the compiler:

ocamlc -m -o module.cmi module1.cmi module2.cmi module3.cmi

Then the module.cmi interface would hold all the values and types of
module[123].cmi, along with a special piece of information (like the
external information): "to find the foobar value, you must look in the
module2 module".

So huge modules could be split into smaller compilation --and linking--
units.


* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  1:12 ` Jacques Garrigue
  2002-03-18  1:29   ` Tim Freeman
@ 2002-03-18 13:11   ` Sven
  1 sibling, 0 replies; 13+ messages in thread
From: Sven @ 2002-03-18 13:11 UTC
  To: Jacques Garrigue; +Cc: tim, caml-list

On Mon, Mar 18, 2002 at 10:12:25AM +0900, Jacques Garrigue wrote:
> You're right, but there is a bit more about dynamic linking.
> A major problem with using dynamic linking with ocaml (in particular
> with native code), is that your program comes cut into small pieces,
> and you must be sure that they are all compatible. Somebody posted
> recently about problems when upgrading ocaml, and part of it is caused
> by incompatibilities in the binary format between versions. Just
> imagine the reaction of your user when, after having loaded various
> packages required for your program, he only gets an error message or a
> segmentation fault when trying to run it.

You would just need a proper versioning scheme for ocaml libraries, which
follows strict rules. This way, if you make incompatible changes to a library
(the signature changes) you bump the version number, and everyone will know
about it. This would be a big gain even if we are not linking statically.

One could even imagine a scheme where there is support for compatible
signature changes and such (adding functions and so on); I guess the module
type system is able to offer this easily.
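
One possible shape for such a scheme, as a sketch only: a hypothetical
two-part version number where the major part is bumped on incompatible
signature changes and the minor part on compatible additions. The record
and function names are made up for illustration:

```ocaml
(* Hypothetical version scheme: [major] changes on incompatible signature
   changes, [minor] on compatible additions such as new functions. *)
type version = { major : int; minor : int }

(* A library can serve a client if the major numbers agree and the library
   is at least as recent as the version the client was built against. *)
let can_serve ~library ~built_against =
  library.major = built_against.major
  && library.minor >= built_against.minor

let () =
  let installed = { major = 2; minor = 3 } in
  Printf.printf "built against 2.1: %b\n"
    (can_serve ~library:installed ~built_against:{ major = 2; minor = 1 });
  Printf.printf "built against 1.9: %b\n"
    (can_serve ~library:installed ~built_against:{ major = 1; minor = 9 })
```

Unlike a checksum, nothing here verifies that the rules were actually
followed; the scheme relies on library authors bumping the right number.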

Friendly,

Sven Luther

* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  5:20     ` Jacques Garrigue
  2002-03-18 10:10       ` [Caml-list] Big executables from ocamlopt; dynamic librariesagain Warp
@ 2002-03-18 13:14       ` Sven
  2002-03-18 15:51       ` [Caml-list] Type-safe backward compatibility for .so's Tim Freeman
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Sven @ 2002-03-18 13:14 UTC
  To: Jacques Garrigue; +Cc: tim, caml-list

On Mon, Mar 18, 2002 at 02:20:17PM +0900, Jacques Garrigue wrote:
> The main problem being with incompatibilities between different
> versions of ocaml, any code compiled with ocaml cannot be given
> compatible MD5 signatures... So, yes, your zlib example would work,
> but only for the 6 months between two releases of ocaml.
> This can be OK if you have a fair control of what you are running, but
> this would be a nightmare for the average user.

This would not be so much of a problem if ocaml provided better
information on these compatibility changes. You could already look at the
version of the different files, but that is not documented.

Friendly,

Sven Luther

* [Caml-list] Type-safe backward compatibility for .so's
  2002-03-18  5:20     ` Jacques Garrigue
  2002-03-18 10:10       ` [Caml-list] Big executables from ocamlopt; dynamic librariesagain Warp
  2002-03-18 13:14       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Sven
@ 2002-03-18 15:51       ` Tim Freeman
  2002-03-18 18:46       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again malc
  2002-03-19 22:21       ` Johan Georg Granström
  4 siblings, 0 replies; 13+ messages in thread
From: Tim Freeman @ 2002-03-18 15:51 UTC
  To: garrigue; +Cc: caml-list

From: Jacques Garrigue <garrigue@kurims.kyoto-u.ac.jp>
>Note that for C, compatibility policies generally allow adding extra
>functions to a library without changing the version number, since the
>problem, should it arise, can be detected at link time.

To use MD5 checksums to check the signatures, and support multiple
versions of a DLL safely, the language would need a new keyword, say
"version", that would be used to identify in the source code in which
version of the library each identifier was introduced.  Suppose the
default version is 1.  The original version of a module might be:

   let bump_higher x = x + 1;;

and then after a revision you might have:

   let bump_higher x = x + 2;;  (* Bug fix *)
   version 2 let bump_float x = x +. 0.1;;

This library would have two MD5 checksums, one for the type signature
of version 1 and one for the type signature of version 2.  Version 2
includes everything from version 1, since 2 > 1.  When the code is
linked in, old code searches the checksum list hoping to find the
checksum for version 1, and it finds it.  New code finds the checksum
for version 2 on the list and is happy.  Code that wants version 3
doesn't find the checksum it wants and aborts unless version 3 happens
to be identical to version 2.
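
The lookup described above can be sketched in today's OCaml without the
proposed keyword, assuming each version's signature is available as text
(the signature strings and the bump_string entry are hypothetical):

```ocaml
(* Hypothetical signature texts for the two versions in the example. *)
let sig_v1 = "val bump_higher : int -> int"
let sig_v2 = sig_v1 ^ "\nval bump_float : float -> float"

let md5 s = Digest.to_hex (Digest.string s)

(* The library carries one checksum per version it supports. *)
let library_checksums = [ md5 sig_v1; md5 sig_v2 ]

(* A client records the checksum of the version it was compiled against
   and looks for it in the library's list at link time. *)
let can_link ~wanted = List.mem wanted library_checksums

let () =
  Printf.printf "old client: %b\n" (can_link ~wanted:(md5 sig_v1));
  Printf.printf "new client: %b\n" (can_link ~wanted:(md5 sig_v2));
  Printf.printf "version 3 client: %b\n"
    (can_link ~wanted:(md5 (sig_v2 ^ "\nval bump_string : string -> string")))
```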

-- 
Tim Freeman tim@fungible.com; formerly tim@infoscreen.com

* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  5:20     ` Jacques Garrigue
                         ` (2 preceding siblings ...)
  2002-03-18 15:51       ` [Caml-list] Type-safe backward compatibility for .so's Tim Freeman
@ 2002-03-18 18:46       ` malc
  2002-03-19 22:21       ` Johan Georg Granström
  4 siblings, 0 replies; 13+ messages in thread
From: malc @ 2002-03-18 18:46 UTC
  To: Jacques Garrigue; +Cc: tim, caml-list

On Mon, 18 Mar 2002, Jacques Garrigue wrote:

> From: tim@fungible.com (Tim Freeman)
> 
> > >A major problem with using dynamic linking with ocaml (in particular
> > >with native code), is that your program comes cut into small pieces,
> > >and you must be sure that they are all compatible.
> > 
> > How is ocaml different from C in this regard?  One difference is that
> > ocaml is younger and therefore changing faster, but eventually that
> > won't be true any more.  Are there other differences?
> 
> In short: C doesn't make sure that they are compatible.
> If they are, this will work, otherwise, undefined behaviour.
> Programmers and users are responsible for checking (by hand!) that the
> API didn't change in an incompatible way.
> 
> If you want to have both security and allow linking every time it's
> safe, then you would need to do lots of type-checking at link-time
> (runtime for dynamic linking). Basically that every module you depend
> on has an interface at least as good as what you need, checking type
> by type. If you've had a look at the size of some .cmi's, you may
> realize that including required types in executables may require
> potentially huge sizes. And type-checking is sometimes too slow.
> 
> As a fall-back solution, there is MD5 hashing. The problem is that
> you're then mixing information for all the contents of a module.
> Any change will produce a new incompatible hash value.
> For instance, every time you add a function to a library, it becomes
> incompatible. And there are new functions in every release of ocaml.

If by this you mean unique suffixes for symbols (Module_funcname_123)
and value address positions within a module's data storage, then there
is a workaround which I implemented in my shared patch, so that the produced
code is less dependent on such seemingly irrelevant things as
adding/removing/swapping places of globally visible functions and variables.

<skip>

-- 
mailto:malc@pulsesoft.com


* Re: [Caml-list] Big executables from ocamlopt; dynamic libraries again
  2002-03-18  5:20     ` Jacques Garrigue
                         ` (3 preceding siblings ...)
  2002-03-18 18:46       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again malc
@ 2002-03-19 22:21       ` Johan Georg Granström
  2002-03-20  2:46         ` Hashing research (was Re: [Caml-list] Big executables ...) Tim Freeman
  4 siblings, 1 reply; 13+ messages in thread
From: Johan Georg Granström @ 2002-03-19 22:21 UTC
  To: Jacques Garrigue; +Cc: caml-list

> As a fall-back solution, there is MD5 hashing. The problem is that
> you're then mixing information for all the contents of a module.
> Any change will produce a new incompatible hash value.
> For instance, every time you add a function to a library, it becomes
> incompatible. And there are new functions in every release of ocaml.
>
> Note that for C, compatibility policies generally allow adding extra
> functions to a library without changing the version number, since the
> problem, should it arise, can be detected at link time.

IMHO this is a perfect research problem:

Find a mapping H:S->B where S is the set of module signatures and
B is the set of binary (arbitrary length) strings, such that s_1 is a
subset of s_2 if and only if some relation R holds between H(s_1) and
H(s_2); thus s_1<s_2 iff H(s_1) R H(s_2).

Perhaps you could drop "and only if" and let H(s_1) R H(s_2) imply
s_1 < s_2 with 99.9...% certainty.

Finding an efficient pair H and R would really make life easier for
software maintainers. I guess reasonable demands on H and R are that
H(s) should have a binary length that is a fraction of the size of the
corresponding .cmi and that deciding b_1 R b_2 is O(|b_1| + |b_2|).
In any case R must be significantly faster than O(|b_1| * |b_2|).
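
One candidate pair, as a sketch rather than an answer to the research
question: let H(s) be the sorted list of per-entry MD5 digests and let R be
a subset test done in a single merge-style pass. The representation of a
signature as (name, type) strings is an assumption, and H(s) here grows
with the number of entries rather than being a fixed fraction of the .cmi:

```ocaml
(* H: one MD5 digest per entry, sorted and deduplicated. *)
let h (entries : (string * string) list) : string list =
  entries
  |> List.map (fun (name, ty) ->
         Digest.to_hex (Digest.string (name ^ " : " ^ ty)))
  |> List.sort_uniq String.compare

(* R: is every digest of [b1] present in [b2]? Both lists are sorted, so
   one pass over them suffices, linear in the combined length. *)
let rec r b1 b2 =
  match (b1, b2) with
  | [], _ -> true
  | _, [] -> false
  | x :: xs, y :: ys ->
      let c = String.compare x y in
      if c = 0 then r xs ys
      else if c > 0 then r b1 ys (* y is not needed by b1, skip it *)
      else false                 (* x cannot appear later in b2 *)

let s1 = [ ("length", "'a list -> int") ]
let s2 = ("rev", "'a list -> 'a list") :: s1

let () =
  Printf.printf "s1 < s2: %b, s2 < s1: %b\n" (r (h s1) (h s2)) (r (h s2) (h s1))
```

The "only if" direction holds up to MD5 collisions, which matches the
relaxed 99.9...% formulation.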



Does this make sense? Is it possible? Has it already been done?

Yours,

- Johan Granström

Ps. Using hashes of modules for compatibility checks is a pain in the
ass. The Microsoft .NET framework does it - and it doesn't work in real
life. I know from personal experience...



* Hashing research (was Re: [Caml-list] Big executables ...)
  2002-03-19 22:21       ` Johan Georg Granström
@ 2002-03-20  2:46         ` Tim Freeman
  0 siblings, 0 replies; 13+ messages in thread
From: Tim Freeman @ 2002-03-20  2:46 UTC
  To: georg.g; +Cc: garrigue, caml-list

From: georg.g@home.se
>IMHO this is a perfect research problem:
>
>Find a mapping H:S->B where S is the set of module signatures and
>B is the set of binary (arbitrary length) strings. Such that if and only if
>s_1 is a subset of s_2 then there is some relation between H(s_1) and
>H(s_2), thus  s_1<s_2 iff H(s_1) R H(s_2).
>
>Perhaps you could drop "and only if" and let H(s_1) R H(s_2) imply
>s_1 < s_2 with 99.9...% certainty.

I think you can't do it with constant-sized hashes.  For instance, if
s_2 has 100 elements, then it has 2 ** 100 subsets.  Since R has to
behave correctly on most of those 2 ** 100 subsets, those subsets need
to have almost 2 ** 100 different hashes, so your hash can't be less
than 100 bits.

You have to know the name for each entry point into the library anyway
so you can do the linking.  We could just have one hash for the type
per entry point.  Hmm; MD5 is only 16 bytes, or 32 bytes of hex, or 22
bytes of base 62 (digits plus upper and lower case letters), so maybe
we just append the MD5 checksum to the end of the symbol.  If that's
too much and we're willing to have less-than-cryptographic security we
could truncate the added checksum to whatever number of bits is small
enough and still have a very good chance of getting the right answer.
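
A sketch of that mangling, assuming the type of each entry point is
available as a string. The function name, the underscore separator and the
default of 8 hex digits are all hypothetical choices:

```ocaml
(* Symbol name plus a truncated hex MD5 of the entry point's type. *)
let mangle ?(hex_digits = 8) name ty =
  let hex = Digest.to_hex (Digest.string ty) in
  let n = max 0 (min hex_digits (String.length hex)) in
  name ^ "_" ^ String.sub hex 0 n

(* Characters needed to write a 128-bit digest in base 62. *)
let base62_digits = int_of_float (ceil (128.0 /. (log 62.0 /. log 2.0)))

let () =
  print_endline (mangle "bump_higher" "int -> int");
  print_endline (mangle ~hex_digits:32 "bump_higher" "int -> int");
  Printf.printf "base 62 digits for a full MD5: %d\n" base62_digits
```

With this scheme a type mismatch shows up as an ordinary unresolved symbol
at link time, so no separate checking pass is needed.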

-- 
Tim Freeman       
tim@fungible.com

* RE: [Caml-list] Big executables from ocamlopt; dynamic librariesagain
@ 2002-03-18 10:50 Dave Berry
  0 siblings, 0 replies; 13+ messages in thread
From: Dave Berry @ 2002-03-18 10:50 UTC
  To: Jacques Garrigue, caml-list

Do .cmi files use hash-consing?  This can greatly reduce the size of
type information.

-----Original Message-----
From: Jacques Garrigue [mailto:garrigue@kurims.kyoto-u.ac.jp]
Sent: 18 March 2002 05:20
Subject: Re: [Caml-list] Big executables from ocamlopt; dynamic
librariesagain

...
If you've had a look at the size of some .cmi's, you may
realize that including required types in executables may require
potentially huge sizes.
...

Thread overview: 13+ messages
2002-03-16 16:05 [Caml-list] Big executables from ocamlopt; dynamic libraries again Tim Freeman
2002-03-18  1:12 ` Jacques Garrigue
2002-03-18  1:29   ` Tim Freeman
2002-03-18  5:20     ` Jacques Garrigue
2002-03-18 10:10       ` [Caml-list] Big executables from ocamlopt; dynamic librariesagain Warp
2002-03-18 13:14       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Sven
2002-03-18 15:51       ` [Caml-list] Type-safe backward compatibility for .so's Tim Freeman
2002-03-18 18:46       ` [Caml-list] Big executables from ocamlopt; dynamic libraries again malc
2002-03-19 22:21       ` Johan Georg Granström
2002-03-20  2:46         ` Hashing research (was Re: [Caml-list] Big executables ...) Tim Freeman
2002-03-18 10:12     ` [Caml-list] Big executables from ocamlopt; dynamic libraries again Nicolas George
2002-03-18 13:11   ` Sven
2002-03-18 10:50 [Caml-list] Big executables from ocamlopt; dynamic librariesagain Dave Berry
