caml-list - the Caml user's mailing list
From: Stanislav Artemkin <artemkin@gmail.com>
To: jon@ffconsultancy.com
Cc: Gabriel Scherer <gabriel.scherer@gmail.com>,
	caml users <caml-list@inria.fr>,
	 Damien Doligez <damien.doligez@inria.fr>
Subject: Re: [Caml-list] Measuring GC latencies for OCaml program
Date: Sat, 11 Jun 2016 01:34:18 +0400
Message-ID: <CAL4yANnwJQyJsnHsvrvSD19bgGxkfTyMqCvgnBR-hZGShxNjqw@mail.gmail.com>
In-Reply-To: <00c801d1c357$95598ca0$c00ca5e0$@ffconsultancy.com>

Very interesting! It seems that was a completely broken C++ solution. I wish I
could use OCaml for my current project, but we still have to use C++ to get
microsecond latencies.

Do I understand correctly that OCaml is suitable for latencies of ~10ms and
worse?

Also, there is still an issue with multithreading. Did you use any existing
solution?

Thank you

On Sat, Jun 11, 2016 at 12:35 AM, Jon Harrop <jon@ffconsultancy.com> wrote:

>
> Very interesting, thank you!
>
> We just implemented a substantial client and server system for the finance
> sector with the "low" latency server written in OCaml. I have done this
> before in other languages and seen it done in many more languages. The
> OCaml version is by far the most consistently fast solution I have ever seen:
> orders of magnitude faster than the last C++ solution I saw. In particular,
> compared to Java and .NET where we see substantial latencies from the GC at
> around 100ms, with OCaml there is no visible peak at high latency due to
> the GC at all. And this project was implemented to a very short deadline
> with no time for optimisation at all.
>
> On a related note, we used Jane St.'s Core and Async libraries as well as
> Cohttp and found them all to be *phenomenally* efficient and robust.
>
> In case anyone is interested, the only pain point I had was the
> development environment. I actually prototyped all my hard code in
> simplified F# in Visual Studio on Windows and then ported to OCaml. Emacs
> and Merlin crash and hang a lot for me: maybe 50 times per day. Hence my
> other post. :-)
>
> In terms of the language, OCaml was very well suited to this task. Lots of
> purely functional data structures forming in-memory databases that can be
> queried in different ways and have many different versions of them stored
> in different places at different times. Perhaps the main language feature I
> missed from F# was (surprisingly!) reflection. My F# client code uses
> reflection to serialize and deserialize messages. With no reflection I
> couldn't do that in OCaml so I used reflection in F# to autogenerate the
> OCaml code.
>
> Cheers,
> Jon.
>
> -----Original Message-----
> From: caml-list-request@inria.fr [mailto:caml-list-request@inria.fr] On
> Behalf Of Gabriel Scherer
> Sent: 30 May 2016 20:48
> To: caml users
> Cc: Damien Doligez
> Subject: [Caml-list] Measuring GC latencies for OCaml program
>
> Dear caml-list,
>
> You may be interested in the following blog post, in which I give
> instructions to measure the worst-case latencies incurred by the GC:
>
>   Measuring GC latencies in Haskell, OCaml, Racket
>
> http://prl.ccs.neu.edu/blog/2016/05/24/measuring-gc-latencies-in-haskell-ocaml-racket/
>
> In summary, the commands to measure latencies look something like:
>
>     % build the program with the instrumented runtime
>     ocamlbuild -tag "runtime_variant(i)" myprog.native
>
>     % run with instrumentation enabled
>     OCAML_INSTR_FILE="ocaml.log" ./myprog.native
>
>     % visualize results from the raw log
>     $(OCAML_SOURCES)/tools/ocaml-instr-graph ocaml.log
>     $(OCAML_SOURCES)/tools/ocaml-instr-report ocaml.log
>
> While the OCaml GC has had a good incremental mode with low latencies for
> most workloads for a long time, the ability to instrument it to actually
> measure latencies is still in its infancy: it is a side-result of Damien
> Doligez's stint at Jane Street last year, and 4.03.0 is the first release in
> which this work is available.
>
> A practical consequence of this youth is that the "user experience" of
> actually performing these measurements is currently very bad. The GC
> measurements are activated in an instrumented runtime variant (OCaml
> supports having several variants of the runtime available, and deciding
> which one to use for a specific program at link-time), which is the right
> design choice, but right now this variant is not built by default by the
> compiler distribution -- building it is a configure-time option disabled by
> default. This means that, as a user interested in doing the measurements,
> you have to compile an alternative OCaml compiler.
> Furthermore, processing the raw instrumented log requires tools that are
> also in their infancy, and are currently included in the compiler
> distribution sources but not installed -- so you have to have a source
> checkout available to use them. In contrast, GHC's instrumentation is
> enabled by just passing the option "+RTS -s" to the Haskell program of
> interest; this is superbly convenient and something we should aim at.
>
> I discussed with Damien whether we should enable building the instrumented
> runtime by default (for example pass the --with-instrumented-runtime option
> to the opam switches people are using, and encourage distributions to use
> it in their packages as well). Of course there is a cost/benefit trade-off:
> currently virtually nobody is using this instrumentation, but enabling it
> by default would increase the compilation time of the compiler distribution
> for everyone. (On my machine it only adds 5 seconds to total build time.)
>
> I personally think that we should aim for a rock-solid experience for
> profiling and instrumenting OCaml programs, enabled by default¹. It is worth
> making it slightly longer for anyone to install the compiler if we can make
> it vastly easier to measure GC pauses in our program when the need arises
> (even if it's not very often). But that is a discussion that should be had
> before making any choice.
>
> Regarding the log analysis tools, before finding out about Damien's
> included-but-not-installed tools (a shell and an awk script, in the finest
> Unix tradition) I built a quick&dirty OCaml script to do some measurements,
> which can be found in the benchmark repository below. It would not be much
> more work to grow this into a reusable library that parses the current log
> format into a structured representation -- the format is undocumented but
> the provided scripts in tools/ have enough information to infer the
> structure. Such a script/library would, of course, remain tightly coupled
> to the OCaml version, but I think it could be useful to have it packaged
> for users to play with.
>
>
> https://gitlab.com/gasche/gc-latency-experiment/blob/master/parse_ocaml_log.ml
>
> ¹: We cannot expect users to effectively write performant code if they
> don't have the tool support for it. The fact that laziness in Haskell makes
> it harder for users to reason about efficiency or memory usage has made the
> availability of excellent performance tooling *necessary*, where it is merely
> nice-to-have in OCaml. Rather ironically, Haskell tooling is now much
> better than OCaml's in this area, to the point that it can be easier to
> write efficient code in Haskell.
>
> Four side-notes on profiling tools:
>
> 1. `perf record --call-graph=dwarf` works fine for ocamlopt binaries
>   (no need for a frame-pointers switch), and this is documented:
>
> https://ocaml.org/learn/tutorials/performance_and_profiling.html#UsingperfonLinux
>
> 2. Thomas Leonard has done excellent work on domain-specific profiling
>    tools for Mirage, and this is the kind of tool support that I think
>    should be available to anyone out of the box.
>      http://roscidus.com/blog/blog/2014/08/15/optimising-the-unikernel/
>
> http://roscidus.com/blog/blog/2014/10/27/visualising-an-asynchronous-monad/
>
> 3. There is currently more debate than anyone could wish for around
>    a pull request of Mark Shinwell for runtime support for dynamic call
>    graph construction and its use for memory profiling.
>      https://github.com/ocaml/ocaml/pull/585
>
> 4. Providing a good user experience for performance or space profiling
>    is a fundamentally harder problem than for GC pauses. It may
>    require specially-compiled versions of the libraries used by your
>    program, and thus a general policy/agreement across the
>    ecosystem. Swapping a different runtime at link-time is very easy.
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
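
For anyone who wants to try the quoted workflow, here is a minimal sketch of a
program to measure. It is not code from this thread: the module name `IMap`,
the functions `push_msg` and `run`, and all three sizes are illustrative
assumptions. It keeps a sliding window of freshly allocated messages in an
immutable map, so the major GC always has live data to trace while older
messages become garbage. It uses only the standard library.

```ocaml
(* Hypothetical workload, not taken from the thread: a sliding window of
   fixed-size messages held in an immutable map, so that the major GC always
   has live data to trace while old messages become garbage. *)
module IMap = Map.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

(* All three sizes are illustrative assumptions; tune them to your workload. *)
let window_size = 200_000
let msg_count = 1_000_000
let msg_bytes = 1024

(* Allocate a fresh message so that every step produces new heap data. *)
let message id = Bytes.make msg_bytes (Char.chr (id land 0xff))

(* Insert message [id], then drop the one that just left the window. *)
let push_msg ~window chan id =
  let chan = IMap.add id (message id) chan in
  if id >= window then IMap.remove (id - window) chan else chan

(* Push [count] messages and return the final window. *)
let run ~window ~count =
  let chan = ref IMap.empty in
  for id = 0 to count - 1 do
    chan := push_msg ~window !chan id
  done;
  !chan

let () =
  let chan = run ~window:window_size ~count:msg_count in
  (* Gc.quick_stat reports collection counts without walking the heap. *)
  let stat = Gc.quick_stat () in
  Printf.printf "live messages: %d, minor GCs: %d, major GCs: %d\n"
    (IMap.cardinal chan) stat.Gc.minor_collections stat.Gc.major_collections
```

Saved as `myprog.ml` (a file name assumed to match the quoted commands), this
should build with the `ocamlbuild -tag "runtime_variant(i)" myprog.native`
line above, provided the compiler was configured with the instrumented
runtime. The collection counts it prints only confirm that the GC ran; the
pause times themselves come from the log written via `OCAML_INSTR_FILE`.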


Thread overview: 9+ messages
2016-05-30 19:48 Gabriel Scherer
2016-05-31  1:13 ` Yaron Minsky
2016-05-31  5:39 ` Malcolm Matalka
2016-06-10 20:35 ` Jon Harrop
2016-06-10 21:34   ` Stanislav Artemkin [this message]
2016-06-10 23:14     ` Yaron Minsky
2016-06-11  8:53     ` Jon Harrop
2016-09-14  2:51 ` pratikfegade
2016-09-14  8:38   ` Gabriel Scherer
