caml-list - the Caml user's mailing list
From: Fabrice Le Fessant <Fabrice.Le_fessant@inria.fr>
To: "Christoph Höger" <christoph.hoeger@tu-berlin.de>
Cc: Ocaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] Size of .cmo / .cmi workload of the compiler
Date: Thu, 20 Aug 2015 11:12:20 +0200
Message-ID: <CAHvkLrNGe1vnL11NRLv2uhuakZDL__Tj4i1mpOaRR_Vo99K+1g@mail.gmail.com>
In-Reply-To: <55D4A2E4.7000303@tu-berlin.de>

One possible explanation: object types are almost always expanded in
the .cmi files (and probably in the debug section of .cmo files), all
the more so when the classes are defined in different .cmi files
(since each type is loaded from a different file, there is no
in-memory sharing when the same type appears in two different
modules).
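
As a rough analogy for why missing sharing inflates serialized data,
here is a minimal sketch using the standard Marshal module. It does
not reproduce what the compiler does with types; the value and its
sizes are made up for illustration. It only shows that a structure
whose repeated subterm is written once is much smaller than the same
structure with every occurrence written out in full.

```ocaml
(* Illustrative data only: one large subterm referenced 50 times. *)
let shared_part = List.init 1000 (fun i -> string_of_int i)
let value = List.init 50 (fun _ -> shared_part)

(* Default marshalling preserves sharing: the sublist is written once
   and later occurrences are back-references. *)
let with_sharing = String.length (Marshal.to_string value [])

(* No_sharing writes a full copy of the sublist at every occurrence,
   which is comparable to a type being expanded each time it appears. *)
let without_sharing =
  String.length (Marshal.to_string value [Marshal.No_sharing])

let () =
  Printf.printf "with sharing:    %d bytes\nwithout sharing: %d bytes\n"
    with_sharing without_sharing
```

If the explanation above applies, the same kind of gap would be
expected between the printed .mli (small) and the .cmi on disk (large).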

When we created the Try-OCaml site for Js-of-OCaml
(https://try.ocamlpro.com/js_of_ocaml/), we ran into the same problem:
js-of-ocaml uses object types for most values, so the .cmi files that
had to be loaded were much bigger than for the original Try-OCaml site
(something like 12 MB instead of 1 MB to load the toplevel). We ended
up writing a compressor for .cmi files.

If you are using 4.01.0, you can download a bytecode image of that
compressor in the former repository of try-ocaml:

https://github.com/OCamlPro/tryocaml/tree/master/toplevellib/toplevellib-4.01.0

Then you can test it on one of your .cmi files to see whether it can
compress them (for Try-OCaml, the image decreased from 12 MB to 2 MB,
if I remember correctly). If the .cmi files are indeed the source of
the problem, you could run it automatically on each .cmi file after
ocamlc has generated it.
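
To pick a good candidate for that test, and to compare sizes before
and after compression, a small script along these lines may help. It
is only a sketch: the function names and the script name in the usage
note are made up, and it does not invoke the compressor itself, whose
command line is not described here.

```ocaml
(* Size in bytes of the file at [path]. *)
let file_size path =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  close_in ic;
  n

(* Keep only the .cmi paths and sort them from largest to smallest. *)
let largest_first paths =
  paths
  |> List.filter (fun p -> Filename.check_suffix p ".cmi")
  |> List.map (fun p -> (p, file_size p))
  |> List.sort (fun (_, a) (_, b) -> compare b a)

(* Print one "size  path" line per .cmi file given on the command line. *)
let () =
  Sys.argv |> Array.to_list |> List.tl |> largest_first
  |> List.iter (fun (p, n) -> Printf.printf "%10d  %s\n" n p)
```

Saved under a hypothetical name such as cmi_sizes.ml, it could be run
as "ocaml cmi_sizes.ml _build/*.cmi" once before and once after
compressing the files.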

--Fabrice


On Wed, Aug 19, 2015 at 5:38 PM, Christoph Höger
<christoph.hoeger@tu-berlin.de> wrote:
> Dear all,
>
> I autogenerate a rather large (> 12k) set of ocaml modules containing
> classes which are parameterized over their final representation to allow
> for hierarchic classes with polymorphic open recursion.
>
> My compilation scheme seems to work well in principle, but I am reaching
> a frustrating limit in practice: The compilation of the generated ml
> files seems to run superlinear (in fact it seems to depend on the
> hierarchical location of a class). As it turns out, the generated .cmo
> and .cmi files are quite large (up to several hundreds of kb). When I
> generate the .mli files or dump the .cmo files however, the output is
> quite small (several hundred instructions in the bytecode, the .mli file
> contains quite complex objects but still human readable).
>
> Is there any known issue that leads to:
>
> 1. non-linear runtime when compiling inter-module classes
> 2. huge .cmo files outside of the actual bytecode
>
> The generated source code can be found here:
>
> https://www.dropbox.com/s/cllc0xikv9zwu1k/doesnotscale.tar.bz2?dl=0
>
> To test the behavior, unpack and run
>
> ocamlbuild ModelicaXX5FElectricalXX5FMachines.cmo
>
> (there might be type-errors in the generated files somewhere, though).
>
> Any advice or comments are deeply appreciated.
>
> Christoph
>
> --
> Christoph Höger
>
> Technische Universität Berlin
> Fakultät IV - Elektrotechnik und Informatik
> Übersetzerbau und Programmiersprachen
>
> Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin
>
> Tel.: +49 (30) 314-24890
> E-Mail: christoph.hoeger@tu-berlin.de



-- 
Fabrice LE FESSANT
Chercheur en Informatique
INRIA Paris Rocquencourt -- OCamlPro
Programming Languages and Distributed Systems


Thread overview: 3+ messages
2015-08-19 15:38 Christoph Höger
2015-08-20  9:12 ` Fabrice Le Fessant [this message]
2015-08-20  9:21 ` Goswin von Brederlow
