From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 5424A7F312 for ; Wed, 13 Mar 2013 18:15:18 +0100 (CET) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of julien.verlaguet@gmail.com) identity=pra; client-ip=209.85.223.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="julien.verlaguet@gmail.com"; x-sender="julien.verlaguet@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of julien.verlaguet@gmail.com designates 209.85.223.178 as permitted sender) identity=mailfrom; client-ip=209.85.223.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="julien.verlaguet@gmail.com"; x-sender="julien.verlaguet@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ie0-f178.google.com) identity=helo; client-ip=209.85.223.178; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="julien.verlaguet@gmail.com"; x-sender="postmaster@mail-ie0-f178.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AmgDAKezQFHRVd+ykGdsb2JhbABDsmeJWQGIMIFRCBYOAQEBARIUFCiCKgEBBAFAARQHEgsBAwwGBQsaISIBEQEFAQoSGRKHbwEDCQYMpDOMMoJ7hF8KGScDClmIaxEBBQyNUYEsBAeDQAOIc4drhXiBH411FimETRw X-IPAS-Result: AmgDAKezQFHRVd+ykGdsb2JhbABDsmeJWQGIMIFRCBYOAQEBARIUFCiCKgEBBAFAARQHEgsBAwwGBQsaISIBEQEFAQoSGRKHbwEDCQYMpDOMMoJ7hF8KGScDClmIaxEBBQyNUYEsBAeDQAOIc4drhXiBH411FimETRw X-IronPort-AV: E=Sophos;i="4.84,838,1355094000"; d="scan'208";a="7279912" Received: from mail-ie0-f178.google.com ([209.85.223.178]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 13 Mar 2013 18:14:04 +0100 Received: by mail-ie0-f178.google.com with SMTP id c13so1750523ieb.37 for ; Wed, 13 Mar 2013 10:14:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=7dvjJyJYOwlfobTGW7EVak+zF1YRzpqZSQHVJrtPFaY=; b=blEkJD4maMHOWOmoStMK9/9WjXn7gTIBStKVmTfIt9nhkxkuEGXaueescXUn+vvkp0 H7Z5tIZhvpfFYsXB8scYWtOPtoMjJn5BZBjqIrejKA2/t/MQa6mTg/b/XkGQeUo6BMw1 ZVuH+KkFfXo7oZq/QM4tTwoORTtWmc0IrqneF+wBp7KefkupT8pPZhGIZzMiT2jEr7wQ YoGfdfME2r5qVi77nPlOlSQXQ9HcBzZtSuoHTPzPJm9A1kdzroaYusqRYHlr7ZWyQ+vj 9UDMAjPXsMt+CLM5RFHWrHG7wqocMxtZbjL/vIx2WY5DtYysuTKqDkkT40lvi7Qohv4Z g2jA== MIME-Version: 1.0 X-Received: by 10.50.13.208 with SMTP id j16mr17503882igc.73.1363194843777; Wed, 13 Mar 2013 10:14:03 -0700 (PDT) Received: by 10.64.13.47 with HTTP; Wed, 13 Mar 2013 10:14:03 -0700 (PDT) In-Reply-To: <00ba01ce200c$d23f1910$76bd4b30$@ffconsultancy.com> References: <00ba01ce200c$d23f1910$76bd4b30$@ffconsultancy.com> Date: Wed, 13 Mar 2013 10:14:03 -0700 Message-ID: From: julien verlaguet To: jon@ffconsultancy.com Cc: caml-list@inria.fr Content-Type: multipart/alternative; boundary=f46d0447f0946c736b04d7d188af Subject: Re: [Caml-list] Case study in optimization: porting a compiler from OCaml to F# --f46d0447f0946c736b04d7d188af Content-Type: text/plain; charset=ISO-8859-1 Hi Jon, Thanks for sharing this case-study with us! Have you tried parallelizing the OCaml version? I am thinking pre-forked processes communicating with pipes? We write a lot of large-scale static-analysis in OCaml here at Facebook. Parallelizing them with pre-forked processes gave us very good performances. I would be curious to see a case-study of pre-forked OCaml vs threaded F#. Julien 2013/3/13 Jon Harrop > > There has been some discussion here about the implications of single- vs > multi-threaded garbage collectors and, in particular, their performance in > the context of the kinds of metaprogramming that OCaml has traditionally > been used for. > > I recently ported a compiler written in OCaml to the F# programming > language > for a client and performance turned out to be an issue so I'd like to > present this as a case study to provide some real data. Unfortunately I > cannot disclose precise details. > > The original compiler was 15kLOC of OCaml code. The amounts of DSL code > that > it consumes and C code that it produces can be considerable and compilation > can take minutes. Consequently, performance is valued by my client's > customers and, therefore, the original code had been optimized for OCaml's > performance characteristics. > > A direct translation of the OCaml code to F# proved to be over 10x slower. > This was so slow that it impeded testing my translation so I did some > optimization early. Specifically, profiling indicated that the biggest > problem was the high rate of exceptions being raised and caught. Exceptions > are around 600x slower on .NET than in OCaml so this can quickly degrade > performance. I changed all of the hot paths to use union types (usually > option types) instead of exceptions, according to F# idioms. Although this > incurs a lot of unnecessary boxing in F# the performance improvements were > substantial and the F# version became 5x slower than the OCaml. > > On a related note, thorough testing showed that my almost-blind translation > of 15kLOC of code was completely error free. I think this is a real > testament to the power of ML's static type system. The only error I have > introduced so far occurred when I was replacing the use of an exception in > a > function with a union type. > > After demonstrating the correctness of the translation, my effort turned to > trying to improve performance in an attempt to compete with the original > OCaml code. I had believed that this could well prove to be prohibitively > difficult or even impossible because symbolic code is OCaml's main > strength. > However, I have managed to make the F# around 8x faster than it was and, in > particular, substantially faster than the original OCaml. > > So this non-trivial symbolic code base has not had its performance suffer > from the adoption of a multicore-friendly garbage collector. > > -- > Dr Jon Harrop, Flying Frog Consultancy Ltd. > http://www.ffconsultancy.com > > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --f46d0447f0946c736b04d7d188af Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi Jon,

Thanks for sharing this case-study with us!
Have you tried parallelizing the OCaml version? I am thinking pre-for= ked processes communicating with pipes?
We write a lot of large-s= cale static-analysis in OCaml here at Facebook.
Parallelizing them with pre-forked processes gave us very good perform= ances.
I would be curious to see a case-study of pre-forked OCaml= vs threaded F#.

Julien

2013/3/13 Jon Harrop <jon@ffconsultancy.com>

There has been some discussion here about the implications of single- vs
multi-threaded garbage collectors and, in particular, their performance in<= br> the context of the kinds of metaprogramming that OCaml has traditionally
been used for.

I recently ported a compiler written in OCaml to the F# programming languag= e
for a client and performance turned out to be an issue so I'd like to present this as a case study to provide some real data. Unfortunately I
cannot disclose precise details.

The original compiler was 15kLOC of OCaml code. The amounts of DSL code tha= t
it consumes and C code that it produces can be considerable and compilation=
can take minutes. Consequently, performance is valued by my client's
customers and, therefore, the original code had been optimized for OCaml= 9;s
performance characteristics.

A direct translation of the OCaml code to F# proved to be over 10x slower.<= br> This was so slow that it impeded testing my translation so I did some
optimization early. Specifically, profiling indicated that the biggest
problem was the high rate of exceptions being raised and caught. Exceptions=
are around 600x slower on .NET than in OCaml so this can quickly degrade
performance. I changed all of the hot paths to use union types (usually
option types) instead of exceptions, according to F# idioms. Although this<= br> incurs a lot of unnecessary boxing in F# the performance improvements were<= br> substantial and the F# version became 5x slower than the OCaml.

On a related note, thorough testing showed that my almost-blind translation=
of 15kLOC of code was completely error free. I think this is a real
testament to the power of ML's static type system. The only error I hav= e
introduced so far occurred when I was replacing the use of an exception in = a
function with a union type.

After demonstrating the correctness of the translation, my effort turned to=
trying to improve performance in an attempt to compete with the original
OCaml code. I had believed that this could well prove to be prohibitively difficult or even impossible because symbolic code is OCaml's main stre= ngth.
However, I have managed to make the F# around 8x faster than it was and, in=
particular, substantially faster than the original OCaml.

So this non-trivial symbolic code base has not had its performance suffer from the adoption of a multicore-friendly garbage collector.

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffcon= sultancy.com


--
Caml-list mailing list. =A0Subscription management and archives:
ht= tps://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

--f46d0447f0946c736b04d7d188af--