# OCaml Weekly News

Hello

Here is the latest OCaml Weekly News, for the week of December 28, 2021 to = January 04, 2022.

## A hack for toplevel breakpoints using effect handlers

### wiktor announced

I started playing with effect handlers and wondered if they could be used t= o implement toplevel breakpoints. It's a big hack and probably unsound at the moment, but it works and here's an exa= mple interaction:

```let arr =3D
let fact n =
=3D
let arr =3D Array=
.make (n+1) 1 in<=
/span>
let rec loop i =3D
if i <=3D n then begin
Break.break ["i", i; "arr", arr];
arr.(i) <- arr.(i-1) * i;
loop (i+1)
end
in
(loop 1; arr)
in
fact 5;;
# (* We hit a breakpoint and obtain the continuation k *)
k ();;
- : bool =3D true
# (* the bools are leaking from the execute_phrase function
* inside the toplevel *)
k ();;
- : bool =3D true
# i;;
- : int =3D 3
# arr;;
- : int array =3D [|1; 1; 2; 1; 1; 1|]
# (* let's disturb the computation of factorials *)
arr.(i-1) <- 42;;
- : unit =3D ()
# k ();;
- : bool =3D true
# (* btw: here the user is like a scheduler for yield-based async *)
k ();;
- : bool =3D true
# k ();;
val arr : int array =3D [|1; 1; 42; 126; 504; 2520|]
- : bool =3D true
```

Currently I don't try to clean up bindings or values, which is a source of = unsoundness. After the last `k ()` we got two results: First the computation of `let arr ...` finished, an= d then the computation of `k ()` finished. But `k` is a part of the execution of `let arr ...`, so these two execution= s "intersect" without one being contained in the other. This makes the question of what should the current variable bindings= be complicated. Setting the bindings at end of execution is futile, when a continuation may in such a way leak bind= ings from breakpoint time.

Possibly a stack discipline for the execution of phrases is required to mak= e the environments behave properly: at the end of executing a phrase we cancel (with another effect, maybe) other exec= utions which "depend" on the current execution (evaluate the `k` obtained from a breakpoint in the cu= rrent execution). This should eliminate these "intersections" and we could throw out the bindings added by the cancelled = executions.

I haven't tried anything with polymorphism yet, but type variables should p= robably be changed into abstract types inside the binders.

### wiktor later said

Well, this might have been unnecessary, as most of it can be done properly = in userspace (with more syntactic overhead).

```open EffectHandlers
open Deep

type ('a, 'b) res =3D
| Bp of 'a * ((unit, ('a, 'b) res) continua=
tion)
| Fin <=
span style=3D"color: #a020f0;">of 'b

module type P1 =3D sig  val i : int  val arr : int array end
type payload =3D P1 of=
(module P1)
type _ eff +=3D Break : payload -> unit eff

let arr () =3D
let fact n =
=3D
let arr =3D Array=
.make (n+1) 1 in<=
/span>
let rec loop i =3D
if i <=3D n then begin
perform (Break (P=
1 (module =
struct let i =3D i =
let arr =3D arr end)));
arr.(i) <- arr.(i-1) * i;
loop (i+1)
end
in
(loop 1; arr)
in
fact 5;;

let with_break th =3D
try_with (fun () -> Fin (th ())) ()
{ effc =3D fun (type a) (e : a eff<=
/span>) ->
match e with
| Break p -> Som=
e (fun (k : (a,_) continuation) -> Bp (p, k))
| _ -> =
None }

let cont =3D functi=
on
| Bp (_=
, k) -> continue k ()
| Fin _=
-> failwith "computation finished, cannot continue"

let get =3D functio=
n
| Bp (r=
, _) -> r
| Fin _=
-> failwith "computation finished, no breakpoint payload"

let get1 r =
=3D match get r with P1 m -> m
```
```# let r =3D with_break a=
rr;;
val r : (payload, int array) res =3D Bp (P1 <module>, <abstr>)
# open (val get1 r);;
val i : int =3D 1
val arr : int array =3D [|1; 1; 1; 1; 1; 1|]
```

The main pain point is having to define the payload types. In basic cases t= he payload type could be just a simple polymorphic variant. It would be nice if it could be completely inferred, b= ut it's unlikely as `Break` has to have a statically known argument.

With a bit of help from tooling (ppxes for code generation and shorthands i= n the toplevel), this could be better than printf debugging.

### Guillaume Munch-Maccagnoni then said

This is an interesting experiment.

• This reminds me of the idea of high-level stack inspection for debuggin= g and security (articulated for instance in Clements' PhD thesis Portable and high-level access to the stack with Continuati= on Marks; here's another more recent paper from the Racket people that might= be relevant). One can ask whether a PPX can provide high-level stack inspe= ction or if one needs support from the compiler for that. It's nice to expe= riment.
• A few years ago someone asked whether there could be a use to untyped a= lgebraic effects in OCaml (in the sense that they do not appear in the effe= ct annotation in function types). I proposed debugging as an example. Someo= ne suggested that it is not too hard to adapt the interface types of all fu= nctions in the call chain to add the appropriate effect annotation (and rem= ove it afterwards), but I was not convinced.

## Multi-shot continuations gone forever?

What happens with multi-shot continuations now that Obj.clone_continuation = was removed? (ht= tps://github.com/ocaml-multicore/ocaml-multicore/pull/651)

Anything that requires a "fork" operation, like say, a probabilistic progra= mming EDSL, needs this. None of the old examples I've looked at like Delimcc on top of effects have been updated to use a new method, and I haven't been able to find any hints of one.

Are multi-shot continuations just not possible now? Are there plans to add = something equivalent back in later?

### Nicol=C3=A1s Ojeda B=C3=A4r replied

Yes, multi-shot continuations are gone and is unlikely that they will find = their way back any time soon. One (good) reason is explained in h= ttps://dl.acm.org/doi/10.1145/3434314 :

and

### Guillaume Munch-Maccagnoni then said

I think the question still stands. You cut the sentence =E2=80=9C_Extending= our system with multi-shot continuations is future work (=C2=A78)_=E2=80=9D. Also the paper is about a particular model= based on separation logic rather than OCaml itself (for instance the authors also mention that their continuations are affine = instead of linear unlike in OCaml multicore).

Nevertheless, the multicore designers were aware that duplicating continuat= ions makes it complicated to reason about resources. The topic of mixing continuations and linearity has been better = studied from the angle of algebraic models of computation and proof theory. Essentially, with an effect system you cou= ld ensure that certain kinds of effects do not happen in the delimited part of the program (including allocating a res= ource), which controls copiability of the stack from the point of view of reasoning about the program. This is inspir= ed by some logics that mix classical and intuitionistic or linear logic. From this angle the ability to copy a conti= nuation would be restricted to a sub-part of the language which is pure to some degree. This should also be a suitabl= e starting point if one wanted to develop a program logic to formalise the reasoning about such programs.

However according to #651 there were more technical reasons to drop `clone_continuation`, such as breaking compiler = optimizations. I am curious as well to know whether there are plans to reintroduce `clone_continuation` at some poin= t, but obviously this would require some effort.

### KC Sivaramakrishnan said

@nojb and @gadmm have already answered why we've dropped support for ```= clone_continuation``` now. We will need to track the copiability of the continuation in the continuation type and compiler o= ptimisations also need to be made aware of the possibility of copying. Given the pervasive nature of its effects, ther= e are no immediate plans to bring the feature back. We will have to come back to this after we have typed effects.

Anything that requires a =E2=80=9Cfork=E2=80=9D operation, like say, a prob= abilistic programming EDSL

One can get pretty far with PPL with just one-shot continuations. My studen= t and I did some experiments building a DSL for a PPL to learn about the space: https://github.com/Arnhav-Datar/EffPPL. Having spoken t= o PPL experts there are indeed some usecases where multi-shot continuations are useful, but fro= m what I understand, the one-shotness isn't a blocker for PPL.

I would be interested in collecting usecases where multi-shot continuations= are absolutely necessary.

### gasche then said

Interesting!

My (probably naive) mental model of HANSEI-style libraries, using multishot= continuations, is that they are extensions/generalization of a non-probabilistic "logic/non-deterministic m= onad" that searches for the set of solutions to a problem. Multishot continuations are naturally used in non-d= eterministic computations at backtracking points, to explore different search directions and collect the result. It i= s possible to avoid multishot continuations by replaying the whole search from the start each time (refer= ence: Capturing the future by = replaying the past), but this involves duplicated computations so it is less effi= cient (reference: Asymptotic speedup= with first-class control).

Can you give some intuition of how other approaches to probalistic inferenc= e work, that do not require multishot continuations? Are they also duplicating computations, or are they using a = magic trick to avoid this issue with a different inference algorithm?

I tried to find an answer to this question by reading the internship report, but I couldn't locate an answer. (The report mentions HANSEI in the related work, but it does not discuss this qu= estion.) The report explains that the inference algorithm, called HMC (Hamiltonian Monte Carlo), uses automatic d= ifferenciation; so it uses a sort of symbolic manipulation / analysis of the probabilistic program to sample. Bu= t does this avoid repeated computations? It may be the case instead that the differential is as large or larger than= the program itself, and that the search algorithm using this differential in effect perform a program-sized computa= tion at each search step, duplicating computations.

Not a PPL but I've been hacking on a little effects-based model checker for= concurrent data structures that implements dynamic partial order reduction (https://github.com/sadiqj/dscheck/ - a WIP!). Multi-sho= t continuations would have been very useful.

I ended up implementing something that involves maintaining a schedule and = repeatedly replaying the computation. It looks very similar to what Cap= turing the future.. proposes.

## New release of Menhir (20211230)

### Fran=C3=A7ois Pottier announced

Dear OCaml & Menhir users,

I am pleased to announce a new release of Menhir, with a major improvement.

The code back-end has been rewritten from the ground up by =C3=89mile Troti= gnon and by myself, and now produces efficient and well-typed OCaml code. The infamous Obj.magic is not used any more.

Furthermore, the new code back-end produces code that is more aggressively optimized, leading to a significant reduction in memory allocation and a typical performance improvement of up to 20% compared to the previous code back-end.

```opam update
opam install menhir.20211230
```

Happy well-typed parsing in 2022!

#### 2021/12/30

• The code back-end has been rewritten from the ground up by =C3=89mile Troti= gnon and Fran=C3=A7ois Pottier, and now produces efficient and well-typed= OCaml code. The infamous `Obj.magic` is not used any more.

The table back-end and the Coq back-end are unaffected by this change.

The main side effects of this change are as follows:

• The code back-end now needs type information. This means that
either Menhir's type inference mechanism must be enabled (the easiest way of enabling it is to use Menhir via `dune` and to check that the `dune-project` file says `(using menhir 2.0)` or later)
or the type of every nonterminal symbol must be explicitly given via a `%type` declaration.
• The code back-end no longer allows the type of any symbol to be an open polymorphic variant type, such as `[> `A ]`. As a workar= ound, we suggest using a closed polymorphic variant instead.
• The code back-end now adheres to the simplified error-handling strat= egy, as opposed to the legacy strategy.

For grammars that do not use the `error` token, this make= s no difference.

For grammars that use the `error` token in the limited way permi= tted by the simplified strategy, this makes no difference either. The simplified strategy makes the following requirement: the `error` token shou= ld always appear at the end of a production, whose semantic action should abort the parser by raising an exception.

Grammars that make more complex use of the `error` token, and th= erefore need the `legacy` strategy, cannot be compiled by the new code b= ack-end. As a workaround, it is possible to switch to the table back-end (using `--table --strategy legacy`) or to the ancient code back-end (us= ing `--code-ancient`). In the long run, we recommend abandoning t= he use of the `error` token. Support for the `error` token = may be removed entirely at some point in the future.

The original code back-end, which has been around since the early days of Menhir (2005), temporarily remains available (using ```--code-ancient). It will be removed at some point in the future. ```

``` The new code back-end offers several levels of optimization, which remain undocumented and are subject to change in the future. At present, the main levels are roughly as follows: -O 0 --represent-everything uses a uniform representation = of the stack and produces straightforward code. -O 0 uses a non-uniform representation of the stack; some = stack cells have fewer fields; some stack cells disappear altogether. -O 1 reduces memory traffic by moving PUSH op= erations so that they meet POP operations and cancel out. -O 2 optimizes the reduction of unit productions (that is,= productions whose right-hand side has length 1) by performing a limited amount of code specialization. The default level of optimization is the maximum level, -O 2. ```
• ``` The new command line switch --exn-carries-state causes the= exception Error to carry an integer parameter: exception Error of = int. When the parser detects a syntax error, the number of the current state is reported in this way. This allows the caller to select a suitable syntax error message, along the lines described in Sect= ion 11 of the manual. This command line switch is currently supported by the code back-end only. The \$syntaxerror keyword is no longer supported. Document the trick of wrapping module aliases in open struct ... = end, like this: %{ open struct module alias M =3D MyLongModuleName end %}<= /code>. This allows you to use the short name M in your grammar, but f= orces OCaml to infer types that refer to the long name MyLongModuleName. (Suggested by Fr=C3=A9d=C3=A9ric Bour.) ```
``` ```
``` ```
``` ```
``` ```
``` Improved documentation for Fix Archive: https://discuss.ocaml.org/t/ann-improved-documentation-for-= fix/9079/1 Fran=C3=A7ois Pottier announced My last contribution for 2021 is an improved documentation for Fix, a libra= ry that helps with various algorithmic constructions that involve memoization, recursion, and numbering. The documentation can be viewed online. It can also be viewed locally (on your own machine) as follows: opam update opam install fix.20211231 opam install odig odig odoc # this may take some time odig doc fix # this opens the doc in your browser Happy fix'in' in 2022! pp-binary-ints 0.1.1 Archive: https://discuss.ocaml.org/t/ann-pp-binary-ints-0-1-1/9080/1 Ifaz Kabir announced Tired of printing octals and hexadecimals and then mentally converting them= to bits. Ever wanted to just see the bits in an int? Now you can! Just run opam install pp-binary-ints and off you go: # Pp_binary_in= ts.Int.to_string 0b10101001;; - : string =3D "10101001" You can find the documentation for the project and more examples of how to = use it here. The library is very customizable. You can choose to print with 0b prefixes and _ separators. You can choose to print zeros just like the non-zeros, with prefixes an= d separators. If you use zero padding, you can control how many leading zeros show up= with the ~min_width argument. It correctly handles the edge cases when adding _ separato= rs: you won=E2=80=99t get leading underscores. It includes pretty printers that work with Format and Fmt , not just to_string functions. Supports int, int32, int64, and = nativeint. Don't like the default prefixes and suffixes? Customize the prefixes an= d suffixes with the provided functor. Old CWN If you happen to miss a CWN, you can send me a message and I'll mail it to you, or go take a loo= k at the archive or the <= a href=3D"https://alan.petitepomme.net/cwn/cwn.rss">RSS feed of the archive= s. If you also wish to receive it every week by mail, you may subscribe online. Alan Schmitt ```
``` --=-=-=-- ```