Re: [Caml-list] Measuring GC latencies for OCaml program

caml-list - the Caml user's mailing list

From: Yaron Minsky <yminsky@janestreet.com>
To: Stanislav Artemkin <artemkin@gmail.com>
Cc: Jon Harrop <jon@ffconsultancy.com>,
	Gabriel Scherer <gabriel.scherer@gmail.com>,
	caml users <caml-list@inria.fr>,
	Damien Doligez <damien.doligez@inria.fr>
Subject: Re: [Caml-list] Measuring GC latencies for OCaml program
Date: Fri, 10 Jun 2016 19:14:51 -0400
Message-ID: <CACLX4jT7a0jPYwDJq4XMNLn18OqWk7pBgwfmH-nzN0rVHDb4Pg@mail.gmail.com>
In-Reply-To: <CAL4yANnwJQyJsnHsvrvSD19bgGxkfTyMqCvgnBR-hZGShxNjqw@mail.gmail.com>

10ms and better, more like it.  You have to know what you're doing,
but you can do as well as C++ in OCaml for many workloads.  Just don't
allocate anything.

And even without any extreme style changes, you can do a hell of a lot
better than 10ms, depending on exactly what you're doing.
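
To make "don't allocate anything" concrete, here is a minimal sketch (not from the thread) of one way to check whether a hot path allocates: read `Gc.minor_words` before and after running it. The names `minor_words_allocated`, `sum_array` and `sum_via_list` are made up for illustration, and the exact counts may differ by a few words between bytecode and native code.

```ocaml
(* Words allocated in the minor heap while running [f]. *)
let minor_words_allocated f =
  let before = Gc.minor_words () in
  f ();
  let after = Gc.minor_words () in
  after -. before

(* Sums an int array in place. Ints are unboxed, so the loop itself
   does not allocate. *)
let sum_array (a : int array) =
  let total = ref 0 in
  for i = 0 to Array.length a - 1 do
    total := !total + a.(i)
  done;
  !total

(* Same result, but builds an intermediate list first: one cons cell
   (3 words) per element. *)
let sum_via_list (a : int array) =
  List.fold_left ( + ) 0 (Array.to_list a)

let () =
  let a = Array.make 10_000 1 in
  Printf.printf "in-place loop: %.0f words\n"
    (minor_words_allocated (fun () -> ignore (sum_array a)));
  Printf.printf "via list:      %.0f words\n"
    (minor_words_allocated (fun () -> ignore (sum_via_list a)))
```

Comparing the two numbers is more informative than either one alone, since the measurement itself can cost a few words.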

y

On Fri, Jun 10, 2016 at 5:34 PM, Stanislav Artemkin <artemkin@gmail.com> wrote:
> Very interesting! It seems it was a completely broken C++ solution. Wish I
> could use OCaml for the current project, but we still have to use C++ to get
> microsecond latencies.
>
> Do I understand correctly that OCaml is suitable for latencies of ~10ms and
> worse?
>
> Also, there is still an issue with multithreading. Did you use any existing
> solution?
>
> Thank you
>
> On Sat, Jun 11, 2016 at 12:35 AM, Jon Harrop <jon@ffconsultancy.com> wrote:
>>
>>
>> Very interesting, thank you!
>>
>> We just implemented a substantial client and server system for the finance
>> sector with the "low" latency server written in OCaml. I have done this
>> before in other languages and seen it done in many more languages. The OCaml
>> is by far the consistently-fastest solution I have ever seen. Orders of
>> magnitude faster than the last C++ solution I saw. In particular, compared
>> to Java and .NET where we see substantial latencies from the GC at around
>> 100ms, with OCaml there is no visible peak at high latency due to the GC at
>> all. And this project was implemented to a very short deadline with no time
>> for optimisation at all.
>>
>> On a related note, we used Jane St.'s Core and Async libraries as well as
>> Cohttp and found them all to be *phenomenally* efficient and robust.
>>
>> In case anyone is interested, the only pain point I had was the
>> development environment. I actually prototyped all my hard code in
>> simplified F# in Visual Studio on Windows and then ported to OCaml. Emacs
>> and Merlin crash and hang a lot for me: maybe 50 times per day. Hence my
>> other post. :-)
>>
>> In terms of the language, OCaml was very well suited to this task. Lots of
>> purely functional data structures forming in-memory databases that can be
>> queried in different ways and have many different versions of them stored in
>> different places at different times. Perhaps the main language feature I
>> missed from F# was (surprisingly!) reflection. My F# client code uses
>> reflection to serialize and deserialize messages. With no reflection I
>> couldn't do that in OCaml so I used reflection in F# to autogenerate the
>> OCaml code.
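
As an illustration of what generated (or hand-written) serialization code can look like in OCaml without reflection, here is a small sketch. The `order` type, its fields, and the '|'-separated wire format are all hypothetical; nothing here is taken from the system described above.

```ocaml
(* Hypothetical message type; a generator would emit one such block
   per message type. *)
type order = { symbol : string; quantity : int; price : float }

(* Encode as "symbol|quantity|price". "%.17g" prints enough digits for
   a float to survive the round trip. Assumes [symbol] contains no '|'. *)
let serialize_order o =
  Printf.sprintf "%s|%d|%.17g" o.symbol o.quantity o.price

(* Decode, returning [None] on malformed input instead of raising. *)
let deserialize_order s =
  match String.split_on_char '|' s with
  | [ symbol; quantity; price ] ->
    (match int_of_string_opt quantity, float_of_string_opt price with
     | Some quantity, Some price -> Some { symbol; quantity; price }
     | _ -> None)
  | _ -> None
```

Returning an option keeps malformed network input from turning into an exception on the server's hot path; whether that is the right trade-off depends on the protocol.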
>>
>> Cheers,
>> Jon.
>>
>> -----Original Message-----
>> From: caml-list-request@inria.fr [mailto:caml-list-request@inria.fr] On
>> Behalf Of Gabriel Scherer
>> Sent: 30 May 2016 20:48
>> To: caml users
>> Cc: Damien Doligez
>> Subject: [Caml-list] Measuring GC latencies for OCaml program
>>
>> Dear caml-list,
>>
>> You may be interested in the following blog post, in which I give
>> instructions to measure the worst-case latencies incurred by the GC:
>>
>>   Measuring GC latencies in Haskell, OCaml, Racket
>>
>> http://prl.ccs.neu.edu/blog/2016/05/24/measuring-gc-latencies-in-haskell-ocaml-racket/
>>
>> In summary, the commands to measure latencies look something like:
>>
>>     % build the program with the instrumented runtime
>>     ocamlbuild -tag "runtime_variant(i)" myprog.native
>>
>>     % run with instrumentation enabled
>>     OCAML_INSTR_FILE="ocaml.log" ./myprog.native
>>
>>     % visualize results from the raw log
>>     $(OCAML_SOURCES)/tools/ocaml-instr-graph ocaml.log
>>     $(OCAML_SOURCES)/tools/ocaml-instr-report ocaml.log
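
For readers who want something to try these commands on, here is a sketch of an allocation-heavy program that could be saved as `myprog.ml`. It keeps a sliding window of fixed-size messages in a map, so old data is continuously promoted and then dropped. The function names, message size, and window size are illustrative assumptions, not the benchmark from the blog post.

```ocaml
(* myprog.ml: hypothetical allocation-heavy workload for exercising
   the GC. All sizes are illustrative assumptions. *)
module IntMap = Map.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

(* A 1 KiB message filled with a byte derived from its id. *)
let make_message id = Bytes.make 1024 (Char.chr (id land 0xff))

(* Add message [id]; once the window is full, drop the oldest one. *)
let push_message ~window_size window id =
  let window = IntMap.add id (make_message id) window in
  if id >= window_size then IntMap.remove (id - window_size) window
  else window

(* Push [message_count] messages and return the final window. *)
let run ~window_size ~message_count =
  let rec loop window id =
    if id >= message_count then window
    else loop (push_message ~window_size window id) (id + 1)
  in
  loop IntMap.empty 0

let () =
  let final = run ~window_size:10_000 ~message_count:100_000 in
  Printf.printf "messages still live: %d\n" (IntMap.cardinal final)
```

Larger windows keep more data alive in the major heap, which is what makes major-GC slices (and hence pauses) visible in the instrumentation log.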
>>
>> While the OCaml GC has had a good incremental mode with low latencies for
>> most workloads for a long time, the ability to instrument it to actually
>> measure latencies is still in its infancy: it is a side-result of Damien
>> Doligez's stint at Jane Street last year, and 4.03.0 is the first release
>> in which this work is available.
>>
>> A practical consequence of this youth is that the "user experience" of
>> actually performing these measurements is currently very bad. The GC
>> measurements are activated in an instrumented runtime variant (OCaml
>> supports having several variants of the runtime available, and deciding
>> which one to use for a specific program at link-time), which is the right
>> design choice, but right now this variant is not built by default by the
>> compiler distribution -- building it is a configure-time option disabled by
>> default. This means that, as a user interested in doing the measurements,
>> you have to compile an alternative OCaml compiler.
>> Furthermore, processing the raw instrumented log requires tools that are
>> also in their infancy, and are currently included in the compiler
>> distribution sources but not installed -- so you have to have a source
>> checkout available to use them. In contrast, GHC's instrumentation is
>> enabled by just passing the option "+RTS -s" to the Haskell program of
>> interest; this is superbly convenient and something we should aim at.
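
As a stopgap that needs no special runtime, a program can print coarse GC counters on exit using the standard `Gc` module. This is only a sketch, and it is not a substitute for the instrumented runtime: it reports collection counts and heap size, not pause times. The function name `gc_summary` is an assumption.

```ocaml
(* Summarise a few counters from the standard Gc module.
   [Gc.quick_stat] does not traverse the heap, so it is cheap. *)
let gc_summary () =
  let s = Gc.quick_stat () in
  Printf.sprintf
    "minor collections: %d\n\
     major collections: %d\n\
     compactions: %d\n\
     top heap words: %d\n"
    s.Gc.minor_collections s.Gc.major_collections
    s.Gc.compactions s.Gc.top_heap_words

(* Print the summary on stderr when the program exits. *)
let () = at_exit (fun () -> prerr_string (gc_summary ()))
```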
>>
>> I discussed with Damien whether we should enable building the instrumented
>> runtime by default (for example pass the --with-instrumented-runtime option
>> to the opam switches people are using, and encourage distributions to use it
>> in their packages as well). Of course there is a cost/benefit trade-off:
>> currently virtually nobody is using this instrumentation, but enabling it by
>> default would increase the compilation time of the compiler distribution for
>> everyone. (On my machine it only adds 5 seconds to total build time.)
>>
>> I personally think that we should aim for a rock-solid experience for
>> profiling and instrumenting OCaml programs enabled by default¹. It is worth
>> making it slightly longer for anyone to install the compiler if we can make
>> it vastly easier to measure GC pauses in our programs when the need arises
>> (even if it's not very often). But that is a discussion that should be had
>> before making any choice.
>>
>> Regarding the log analysis tools, before finding out about Damien's
>> included-but-not-installed tools (a shell and an awk script, in the finest
>> Unix tradition) I built a quick&dirty OCaml script to do some measurements,
>> which can be found in the benchmark repository below. It would not be much
>> more work to grow this into a reusable library that parses the current log
>> format into a structured representation -- the format is undocumented but
>> the provided scripts in tools/ have enough information to infer the
>> structure. Such a script/library would, of course, remain tightly coupled to
>> the OCaml version, but I think it could be useful to have it packaged for
>> users to play with.
>>
>>
>> https://gitlab.com/gasche/gc-latency-experiment/blob/master/parse_ocaml_log.ml
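
Since the log format is undocumented, the sketch below deliberately does not parse it. It only shows what the structured side of such a library might look like once events have been extracted: a hypothetical `event` record (a name plus start and stop timestamps in nanoseconds, which is an assumption about what the log carries) and two summaries, the worst-case duration and a nearest-rank percentile.

```ocaml
(* Hypothetical structured form of one instrumented runtime event. *)
type event = { name : string; start_ns : int; stop_ns : int }

let duration_ns e = e.stop_ns - e.start_ns

(* Longest event duration, or 0 for an empty list. *)
let worst_case events =
  List.fold_left (fun acc e -> max acc (duration_ns e)) 0 events

(* Nearest-rank percentile of event durations, with [p] in [0, 100].
   Returns [None] when there are no events. *)
let percentile p events =
  match List.sort compare (List.map duration_ns events) with
  | [] -> None
  | sorted ->
    let n = List.length sorted in
    let rank = int_of_float (ceil (p /. 100. *. float_of_int n)) in
    let idx = max 0 (min (n - 1) (rank - 1)) in
    Some (List.nth sorted idx)
```

For latency work the tail matters more than the mean, which is why the sketch exposes the maximum and percentiles rather than an average.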
>>
>> ¹: We cannot expect users to effectively write performant code if they
>> don't have the tool support for it. The fact that laziness in Haskell makes
>> it harder for users to reason about efficiency or memory usage has made the
>> availability of excellent performance tooling *necessary*, where it is merely
>> nice-to-have in OCaml. Rather ironically, Haskell tooling is now much better
>> than OCaml's in this area, to the point that it can be easier to write
>> efficient code in Haskell.
>>
>> Four side-notes on profiling tools:
>>
>> 1. `perf record --call-graph=dwarf` works fine for ocamlopt binaries
>>   (no need for a frame-pointers switch), and this is documented:
>>
>> https://ocaml.org/learn/tutorials/performance_and_profiling.html#UsingperfonLinux
>>
>> 2. Thomas Leonard has done excellent work on domain-specific profiling
>>    tools for Mirage, and this is the kind of tool support that I think
>>    should be available to anyone out of the box.
>>      http://roscidus.com/blog/blog/2014/08/15/optimising-the-unikernel/
>>
>> http://roscidus.com/blog/blog/2014/10/27/visualising-an-asynchronous-monad/
>>
>> 3. There is currently more debate than anyone could wish for around
>>    a pull request of Mark Shinwell for runtime support for dynamic call
>>    graph construction and its use for memory profiling.
>>      https://github.com/ocaml/ocaml/pull/585
>>
>> 4. Providing a good user experience for performance or space profiling
>>    is a fundamentally harder problem than for GC pauses. It may
>>    require specially-compiled versions of the libraries used by your
>>    program, and thus a general policy/agreement across the
>>    ecosystem. Swapping a different runtime at link-time is very easy.
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>
>
