
OCaml Weekly News


Hello

Here is the latest OCaml Weekly News, for the week of December 31, 2019 to January 07, 2020.

ocaml-lsp preview

Continuing this thread, Edwin Török said

Here is an example with ALE and Neovim (tested with v0.3.8):

  • Install the Ale plugin. If your Vim has support for packages (Vim 8+ or Neovim), you can simply clone it into the correct subdirectory; no plugin manager is needed: git clone https://github.com/w0rp/ale.git .vim/pack/my-plugins/start/ale
  • Add this to your .vimrc:
" only invoke merlin to check for errors when
" exiting insert mode, not on each keystroke.
let g:ale_lint_on_text_changed="never"
let g:ale_lint_on_insert_leave=1

" enable ALE's internal completion if deoplete is not used
let g:ale_completion_enabled=1

" only pop up completion when stopped typing for ~0.5s,
" to avoid distracting when completion is not needed
let g:ale_completion_delay=500

" see ale-completion-completeopt-bug
set completeopt=menu,menuone,preview,noselect,noinsert

if has('packages')
    packloadall

    " This should be part of ALE itself, like ols.vim
    call ale#linter#Define('ocaml',{
                \ 'name':'ocaml-lsp',
                \ 'lsp': 'stdio',
                \ 'executable': 'ocamllsp',
                \ 'command': '%e',
                \ 'project_root': function('ale#handlers#ols#GetProjectRoot')
                \})

    " remap 'gd' like Merlin would
    nmap <silent><buffer> gd <Plug>(ale_go_to_definition_in_split)<CR>

    " go back
    nnoremap <silent> <LocalLeader>gb <C-O>

    " show list of file:line:col of references for symbol under cursor
    nmap <silent><buffer> <LocalLeader>go :ALEFindReferences -relative<CR>

    " Show documentation if available, and type
    nmap <silent><buffer> <LocalLeader>hh <Plug>(ale_hover)<CR>

    " So I can type ,hh. More convenient than \hh.
    nmap , <LocalLeader>
    vmap , <LocalLeader>
endif

Mkocaml Release - Project generator

Chris Nevers announced

I recently created a tool to generate OCaml projects. I constantly have difficulties with dune commands and setting up opam files, etc. Mkocaml generates a dune project with inline tests, an opam file, a git repository, a .gitignore, and a Makefile with easy commands. This tool can be of great help to newcomers, allowing them to get up and running faster!

https://github.com/chrisnevers/mkocaml
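
For readers new to dune's inline tests, here is a minimal sketch of the kind of test such a generated project lets you write. This is an illustration of the standard ppx_inline_test setup, not Mkocaml's literal output, and it assumes the generated library's dune stanza enables (inline_tests) with (preprocess (pps ppx_inline_test)); the module name is hypothetical.

(* lib/myproject.ml: a hypothetical module in the generated project. *)
let add x y = x + y

(* Inline tests live next to the code and run with `dune runtest`. *)
let%test "add is commutative" = add 2 3 = add 3 2

let%test_unit "adding zero is the identity" =
  assert (add 0 7 = 7)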

Garbage Collection, Side-effects and Purity

Gerard asked

GC = Garbage Collection

GC in a pure program is a point that has always confused me. I always understood that freeing memory from a program was impure and would create side effects, but it seems it doesn't matter as long as the program is removed from all consequences of those impure acts and side effects.

Basically, if any memory block has no remaining references in the program, then freeing that block will have no consequences on the running program, so it's allowed to happen behind the scenes.

Is this correct reasoning?

Guillaume Munch-Maccagnoni replied

To answer your question “does de-allocation create a side-effect?”:

To state the obvious: if you care about the memory consumption of your program, then you care about the side-effect of de-allocation, and this indeed voids purity.

A language like OCaml lets you reason about de-allocation. Memory is collected when values are no longer reachable. Like in other languages, 1) a value that does not escape and goes out of scope will get collected, and 2) you can reason about when a value escapes and goes out of scope thanks to OCaml respecting the strict evaluation order of value types. OCaml (like other compiled languages) is in fact more precise: it ties the dynamic notion of reachability to the lexical notion of variable occurrence. For instance, in the following:

let x = get_huge_data () in
let z = long_running_function x in
f z

OCaml will be able to collect the value in x before x goes out of scope, and thus if possible before long_running_function returns. Indeed, OCaml performs liveness analysis during compilation, and records the information about variable occurrences in frame descriptors, for consumption by the GC when it scans for roots. In fact, you can rely on call-by-value operational semantics to (loosely) reason that a value no longer appears in a program, and therefore that the corresponding memory will be collected by the GC¹ (Morrisett, Felleisen and Harper, "Abstract Models of Memory Management"). Of course, using lazy or higher-order interfaces (when closures escape; with many idioms they do not) will make it harder to reason about the lifetime of values.
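
As an illustration of this claim (an editor's sketch, not part of the original thread): attach a finaliser to the huge value and force a major collection inside long_running_function after its last use of the data. With ocamlopt, the liveness information recorded in frame descriptors means the value should no longer be a root at that point, so the finaliser can run before the function returns; the outcome may still depend on inlining and optimisation settings.

let get_huge_data () =
  let data = Bytes.create (100 * 1024 * 1024) in
  Gc.finalise (fun _ -> print_endline "huge data collected") data;
  data

let long_running_function data =
  let n = Bytes.length data in
  (* [data] is not used below this point; with ocamlopt's liveness
     analysis it is no longer a GC root here, so the full major
     collection can reclaim it before the function returns. *)
  Gc.full_major ();
  n * 2

let f z = Printf.printf "result: %d\n" z

let () =
  let x = get_huge_data () in
  let z = long_running_function x in
  f z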

(¹: For OCaml, this is a conjecture I make, for subsets which could be given such operational semantics, and only for native compilation. Morrisett, Felleisen and Harper's semantics obviously assumes that the results of liveness analysis are made available to the GC, but this is not written, nor is there any mention of the link between liveness analysis and accuracy of garbage collection in Appel's "Modern Compiler Implementation in C". I assume that it was part of folklore at the time, though recently I mentioned it to some functional PL researcher and they seemed surprised. I only found it explicitly mentioned in later papers from the OOP community. I checked that everything seems in place for OCaml to allow such reasoning, but only the authors of the original code, @xavierleroy and @damiendoligez, can tell us if this is intended to be part of the language semantics.)

Furthermore, memory is not collected immediately when a value becomes unreachable. Instead:

  • Short-lived values are allocated contiguously and deallocated in a batch, so that allocating and deallocating short-lived values is very cheap, with additional benefits in terms of cache locality. This replaces stack allocation from languages with explicit memory management.
  • Longer-lived values are moved to a heap that is scanned incrementally, to ensure a bounded latency. In contrast, naive reference-counting and unique pointers from C++/Rust make you pay the cost of deallocation up-front. (See the sketch just after this list.)
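
As a rough illustration of these two regimes (an editor's sketch, not part of the original thread), the counters returned by Gc.quick_stat distinguish words allocated in the minor heap from words promoted to the major heap:

let () =
  let before = Gc.quick_stat () in
  (* Allocate a lot of short-lived pairs: almost all of them die in the
     minor heap and are reclaimed in a batch, with no individual frees. *)
  for i = 1 to 1_000_000 do
    ignore (Sys.opaque_identity (i, i + 1))
  done;
  let after = Gc.quick_stat () in
  Printf.printf "minor words allocated: %.0f\n"
    (after.Gc.minor_words -. before.Gc.minor_words);
  Printf.printf "words promoted to the major heap: %.0f\n"
    (after.Gc.promoted_words -. before.Gc.promoted_words)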

While this is essential for understanding the performance of OCaml programs, from the point of view of deallocation-as-an-effect, the delaying of the collection of unreachable memory can be seen as a runtime optimisation that does not change the effectful status of deallocation (the memory still gets freed). [The intuition is that an effect can support some degree of reordering without requiring purity, as illustrated by strong monads which can be commutative without being idempotent, one possible definition of purity for semanticists.]

But is de-allocation an effect in practice? Faced with the scepticism and misunderstandings from this thread, I offer two hypotheses:

  1. Memory consumption is not an issue in functional programming, for application areas that interest functional programmers.
  2. Memory management in OCaml is efficient in such a way that programmers do not need to think about it in their day-to-day programming activities in those terms.

Hypothesis 2) could be explained, for instance, if OCaml programmers are already dealing with effects and thinking about the order in which their code executes (my experience), and only deal with deallocation as an afterthought, e.g. when chasing leaks with a profiler.

Let us turn to two programming language experiments from the 1990s that allow me to reject hypothesis 1). Both show what happens when one denies the status of deallocation as an effect controlled by the programmer.

  • Region-based memory management consisted of allocating in a stack of memory regions deallocated at once, with the regions determined by a whole-program static analysis. Now regarded as a failed idea but a successful experiment (i.e. good science!), it taught us a lot about the structure of functional programs in relation to memory management (see this retrospective). There were some good performance results, but also pathological cases “where lifetimes were not nested or where higher-order functions were used extensively”, sometimes requiring programs to be altered to be “region friendly”, which was “time-consuming” and required knowledge of the inference algorithm. In addition, the regions changed unpredictably when the programs evolved, and memory leaks appeared when the compiler inferred too-wide regions.
  • Haskell was (at the time) an experiment with lazy functional programming. Pervasive laziness prevents reasoning about the lifetime of values, and purity is a central assumption used by the compiler for program transformations, which is antithetical to reasoning about deallocation as an effect. It is well known that naive Haskell code has issues with memory leaks, and that realistic Haskell programs have to follow "best practices" to avoid leaks, by making extensive use of strictness annotations (e.g. bang patterns). Unfortunately, I found it hard to find reliable academic sources about lessons drawn from the experiment like the RBMM retrospective. The best I could find on the topic of memory leaks is the following blog post: https://queue.acm.org/detail.cfm?id=2538488, from a Haskell programmer who wrote in another post (linked from that one) “My suspicion is that many (most?) large Haskell programs have space leaks, but they often go unnoticed”. This is consistent with comments I received from people with Haskell experience (first-hand, one academic and one industrial) and about an industrial Haskell consultant (second-hand) who reportedly commented that their main job was to fix memory leaks (but maybe in jest). Of course, take this with a grain of salt. At least, I believe that the Haskell academic community has accumulated empirical evidence of the extent and manner in which deallocation voids purity assumptions. Having an authoritative source about it would be pretty important to me, given the initial promises of functional programs being more tractable mathematically specifically via “referential transparency” and independence of execution order, whose theoretical justification already looks shaky to me from a semantic point of view. Some parts of the literature continue to promise far-reaching consequences of equational reasoning, without clear statements of limitation of the application domain. I have the impression that the Haskell which is practiced in the real world is very different from what you can read in some academic papers.

The hypothesis that deallocation matters as an effect, and that ML makes it easy to program and reason about effects, seems to me a strong argument explaining OCaml's predictable and competitive performance.

So, thank you for your healthy scepticism.

Xavier Leroy replied

Concerning the "don't scan local variables that are dead" trick:

  • Technically it is not "intended to be part of the language semantics", because the bytecode compiler (ocamlc) doesn't implement it, only the native-code compiler (ocamlopt).
  • As far as I remember, I reinvented this trick circa 1993, but it seems it was used earlier in the Lazy ML compiler by Augustsson and Johnsson. See Appel and Shao's paper "An Empirical and Analytic Study of Stack vs. Heap Cost for Languages with Closures", JFP, 1996, end of section 5.

Guillaume Munch-Maccagnoni then asked

TL;DR: the paper mentioned by @xavierleroy provides additional references regarding the importance of liveness analysis for GC, including a demonstration by Appel that this actually matters for space complexity (thanks!). I find that a link is still missing with an abstract semantics à la Morrisett, Felleisen & Harper. This seems important to me because more theoretical works about time & space complexity in the lambda-calculus seem to take for granted that garbage collection implements something like the latter (i.e., how does one specify and certify that a compiler is sound for space complexity?).

Xavier Leroy replied

See for example Closure Conversion is Safe for Space, by Zoe Paraskevopoulou and Andrew W. Appel, ICFP 2019.

A Lightweight OCaml Webapp Tutorial (Using Opium, Caqti, and Tyxml)

Shon announced

The tutorial is hosted on GitLab Pages, out of this repository.

I put this together in response to some requests for introductory material on the topic (here and on /r/ocaml). I don't have much expertise to offer in this area, but I had hacked together some simple servers based on Opium in the past few months, so it seemed like I should be able to memorialize some of what I learned for the benefit of others. I received some critical guidance from the Opium maintainers, rgrinberg and anuragsoni, and from other resources online (mentioned at the end of the tutorial).

Any feedback or improvements are welcome: this is my first time writing such lengthy instructional material, and I'm sure there's lots of room to make it better.

Release of owl-symbolic 0.1.0

jrzhao42 announced

The Owl tutorial book URL has now changed to: https://ocaml.xyz/book/symbolic.html.

Static lifetime

André asked and Guillaume Munch-Maccagnoni replied

> Is it possible to “statically” allocate a value? By this I mean mark a value such that it gets ignored by the GC and lives until the program exits?

This is indeed the purpose of Ancient, which comes with limitations and does not allow you to reclaim the memory until you exit the program. (I am curious to know how well it works with recent OCaml versions.)
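
For concreteness, here is a rough sketch of the kind of use Ancient is meant for (an editor's illustration based on a reading of its mark/follow interface; treat the exact signatures as an assumption and check the library's documentation before relying on them):

(* Build a large, long-lived structure on the normal OCaml heap. *)
let table = Array.init 1_000_000 (fun i -> (string_of_int i, i))

(* Copy it out of the OCaml heap; the GC no longer scans its contents.
   As noted above, that memory then stays around until the program exits. *)
let ancient_table = Ancient.mark table

(* Dereference the out-of-heap copy whenever it is needed. *)
let lookup i = snd (Ancient.follow ancient_table).(i)

let () = Printf.printf "%d\n" (lookup 42)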

> It would be really interesting to learn whether OCaml forbids blocks outside the heap.

The OCaml runtime has two modes (chosen at compilation) for dealing with so-called "out-of-heap" pointers. In the legacy one that Chet remembers, the GC uses a page table when scanning to be able to tell which pointers it owns. In the "no-naked-pointers" mode devised more recently for efficiency reasons, the page table is replaced by looking at the colour in the header of the dereferenced value. Out-of-heap values must be preceded by a header with colour black. The no-naked-pointers mode is more restricted, because once a static value is referenced, it can no longer be deallocated, as you never know whether it is still reachable by the GC. This should be enough to support Ancient.

> One should verify such intuitions experimentally, before trying to fix them, but I'm not familiar with what OCaml profilers can do…

Excluding large long-lived data from the GC is an old idea. Among recent developments, Nguyen et al. [1] distinguish a "control path" (where the generational hypothesis is assumed to hold) from a "data path" (where values are assumed to follow an "epochal" behaviour: long-lived, with similar lifetimes, benefiting from locality) that is excluded from GC. They give so-called "big data" as motivation, and cite figures of pathological GC usage of up to 50% of total runtime. I remember reading similar figures in blog posts about large data sets in OCaml. In reality this indeed depends on knobs you can turn on your GC, which can result in increased peak memory usage among other things. (Assuming infinite available memory, it is even possible to let the GC share drop to 0%.)

@ppedrot reported to me that in a recent experiment with Coq, using an Ancient-like trick to exclude some large, long-lived and rarely-accessed values from being scanned (namely serialising them into bigarrays), they saw an 8% performance improvement across the board in benchmarks.
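
A rough sketch of that kind of trick (an editor's reconstruction of the general idea, not Coq's actual code): marshal the large, rarely-accessed value into a char bigarray, whose payload lives outside the OCaml heap and is therefore never scanned, and unmarshal a fresh copy on the rare occasions the value is needed.

type 'a stashed =
  (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Serialise a value into an off-heap bigarray. *)
let stash (v : 'a) : 'a stashed =
  let b = Marshal.to_bytes v [] in
  let big =
    Bigarray.Array1.create Bigarray.char Bigarray.c_layout (Bytes.length b)
  in
  for i = 0 to Bytes.length b - 1 do
    big.{i} <- Bytes.get b i
  done;
  big

(* Rebuild a fresh, GC-allocated copy when the value is needed. *)
let unstash (big : 'a stashed) : 'a =
  let n = Bigarray.Array1.dim big in
  let b = Bytes.create n in
  for i = 0 to n - 1 do
    Bytes.set b i big.{i}
  done;
  Marshal.from_bytes b 0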

Multicore, if I understood correctly, aims to support only the no-naked-pointers mode, and I am not sure what the page table will become. Coq currently does some out-of-heap allocation in the VM, and has been adapted to be compatible with the no-naked-pointers mode by wrapping out-of-heap pointers into custom blocks. For scanning its custom stack (which mixes in-heap and out-of-heap values), Coq sets up a custom root-scanning function (`caml_scan_roots_hook`), which still relies on the page table.

Note that having to wrap out-of-heap pointers in custom blocks is (much!) less expressive: for instance, with Ancient you can call `List.filter` on a statically-allocated list (and correctly get a GC-allocated list of statically-allocated values). With custom blocks you cannot mix in-heap and out-of-heap values in this way.

For a type system to deal with "statically" allocated values, have a look at Rust, which: 1) prevents cycles in reference-counting schemes thanks to uniqueness, and 2) can treat GC roots as resources to deal with backpointers at the leaves of the value (cf. the interoperability with SpiderMonkey's GC in Servo). A point of view that I like is that tracing GCs and static allocation differ fundamentally in how they traverse values for collection: traversing live values for the first, and traversing values at the moment of their death for the other. This gives them distinct advantages and drawbacks, so one can see them as complementary. (See notably [2,3].) Static allocation is interesting for performance in some respects (no tracing, no read-write barrier, reusability of memory cells, avoids calling the GC at inappropriate times), but I find it even more interesting for interoperability (e.g. exchanging values freely with C or Rust, or applications from that other thread). It is natural to want to mix them in a language.

As far as I understand, developing the runtime capabilities for OCaml to deal with out-of-heap pointers without resorting to an expensive page table is an engineering problem, not a fundamental one. If anyone is interested in this, please contact me.

[1] Nguyen et al., Yak: A High-Performance Big-Data-Friendly Garbage Collector, 2016

[2] Bacon, Cheng and Rajan, A Unified Theory of Garbage Collection, 2004

[3] Shahriyar, Blackburn and Frampton, Down for the Count? Getting Reference Counting Back in the Ring, 2012

UnixJunkie also replied

If you can store your long-lived data in a bigarray, I think you would achieve the effect you are looking for (no more GC scanning of this data).

This was once advised to me by Oleg, for a performance-critical section of some code.
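
A minimal sketch of this suggestion (an editor's example, not UnixJunkie's code): a bigarray's payload is allocated outside the OCaml heap, so the GC only sees the small wrapper block and never traverses the data itself. This works directly for scalar data; for structured values one needs something like the marshalling trick sketched earlier.

(* Ten million floats stored outside the OCaml heap. *)
let data =
  Bigarray.Array1.create Bigarray.float64 Bigarray.c_layout 10_000_000

let () =
  for i = 0 to Bigarray.Array1.dim data - 1 do
    data.{i} <- float_of_int i
  done;
  (* Major collections do not traverse the ~80 MB of float data above. *)
  Printf.printf "last element: %f\n" data.{Bigarray.Array1.dim data - 1}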

Old CWN

If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a look at the archive or the RSS feed of the archives.

If you also wish to receive it every week by mail, you may subscribe online.
