I am pleased to announce the 109.58.00 release of the Core suite.

Starting from 109.55.00, the core libraries contain inline benchmarks.
It is now possible to run them:

    echo 'let () = Inline_benchmarks.Runner.main ~libname:"core_kernel"' > bench.ml
    ocamlfind ocamlopt -thread -linkpkg -linkall -package str,core_kernel,core_bench.inline_benchmarks bench.ml -o bench
    ./bench -benchmarks-runner

Replace core_kernel with the library you want to run benchmarks for.
Currently only core and core_kernel contain inline benchmarks.

In this release the following packages were upgraded:

- async
- async_extra
- async_inotify
- async_kernel
- async_parallel
- async_unix
- core
- core_bench
- core_extended
- core_kernel
- jenga
- sexplib

Files and documentation for this release are available on our website
and all packages are in opam:

  https://ocaml.janestreet.com/ocaml-core/109.58.00/individual/
  https://ocaml.janestreet.com/ocaml-core/109.58.00/doc/

Here is the list of changes for this version:

# 109.58.00

## async_extra

- Changed `Cpu_usage` to use `Core.Percent` instead of `float` where
  appropriate.
- Made `Bus.unsubscribe` check that the subscriber is subscribed to
  the given bus.
- Made `Log.t` support `with sexp_of`.
- Fixed `Tcp.on_port 0` to return the port actually being listened on,
  like `Tcp.on_port_chosen_by_os`.

  Previously, a server listening on `Tcp.on_port 0` would have its
  `Tcp.Server.listening_on` as `0`, which of course is not the port
  the server is listening on.

## async_kernel

- Renamed the `Async_core` library as `Async_kernel`, to parallel
  `Core_kernel`.

  Someday `Async_core` will depend only on `Core_kernel`, but not yet.
- Added a thread-safe queue of "external actions" that is checked
  after each job.
- Fixed a race condition in `Clock.Event.abort`.

  Here is the race condition:

  * `Clock.Event.at` adds an alarm; its value is a job (let's call it
    job1) with this run function:

    ```ocaml
    let fire () =
      t := `Happened;
      Ivar.fill ready `Happened;
    ```
  * later, a job (let's call it job2) aborting the clock event is
    queued in the async scheduler
  * in the same cycle, `Timing_wheel.advance_clock` fires the alarm
    and job1 is scheduled
  * at this point:
    + job1 and job2 are still pending
    + the alarm was removed so it is invalid
    + the clock event is still in the state `Waiting`
  * job2 is executed before job1: the clock event is still in the
    `Waiting` state, so the abort tries to remove the alarm from the
    timing wheel: CRASH

  The bugfix is for `Clock.Event.abort` to check whether the alarm has
  already been removed from the timing wheel and, if so, not to remove
  it again.
- Changed `Monitor.try_with` when run with ``~rest:`Ignore``, the
  default, so that the created monitor is detached from the monitor
  tree.

  The detached monitor has no parent, rather than being a child of the
  current monitor. This will eliminate recently observed space leaks
  in `Sequencer_table` and `Throttle`, like:

  ```ocaml
  let leak () =
    let seq = Throttle.Sequencer.create () in
    let rec loop n =
      Throttle.enqueue seq (fun () ->
        loop (n + 1);
        Deferred.unit
      )
      |> don't_wait_for
    in
    loop 0
  ```
- Changed Async's scheduler to pool jobs rather than heap allocate
  them, decreasing the cost of a job by 30-40%.

  Changed the main scheduler queue of jobs to be an `Obj_array.t` that
  is essentially a specialized `Flat_queue` (the specialization was
  necessary for speed).

  Also, cleaned up the scheduler run-job loop.

  With these changes, the cost of a simple job decreases significantly
  (30-40%), across a range of live data sizes. Here are the
  nanoseconds-per-job numbers for a microbenchmark with the old and
  new approaches.

  ```
  | num live jobs | old ns/job | new ns/job |
  |---------------+------------+------------|
  |             1 |         74 |         53 |
  |             2 |         75 |         47 |
  |             4 |         76 |         41 |
  |             8 |         63 |         39 |
  |            16 |         62 |         38 |
  |            32 |         61 |         37 |
  |            64 |         61 |         37 |
  |           128 |         60 |         37 |
  |           256 |         60 |         38 |
  |           512 |         60 |         38 |
  |          1024 |         60 |         39 |
  |          2048 |         61 |         40 |
  |          4096 |         67 |         41 |
  |          8192 |         65 |         45 |
  |         16384 |         75 |         56 |
  |         32768 |        115 |         67 |
  |         65536 |        171 |        108 |
  |        131072 |        255 |        158 |
  |        262144 |        191 |        130 |
  |        524288 |        216 |        139 |
  |       1048576 |        238 |        152 |
  ```

  See async/bench/nanos\_per\_job.ml for the benchmark.
- Removed `debug_space_leaks` from Async's internals. It hadn't been
  used in years.

## async_unix

- Improved fairness of the async scheduler with respect to external
  threads, including I/O done in external threads.

  The change is to add a thread-safe queue of "external actions" that
  is checked after each job.

  Previously, when a job given to `In_thread.run` finished,
  `In_thread.run` would take the async lock, fill the result ivar and
  run a cycle. The problem is that in some situations, due to poor OS
  scheduling, the helper thread never had a chance to grab the lock.
  Now, `In_thread.run` tries to take the lock:

  - if it can it does as before
  - if it can't it enqueues a thunk in the external actions queue and
    wakes up the scheduler

  With this change, the helper thread doing an `In_thread.run` will
  always quickly finish once the work is done, and the async scheduler
  will fill in the result ivar as soon as the current job finishes.
- Fixed `Epoll_file_descr_watcher.invariant` to deal with the timerfd,
  which has the edge-triggered flag set.
- Added `Writer.write_gen`, a generic functor for blitting directly to
  a writer's buffer.

## core

- Added `Debug.should_print_backtrace : bool ref`, to control whether
  `Debug.am*` functions print backtraces.
- Added to `Float` inline benchmarks.
- Moved all of the `Gc` module into `Core_kernel`.

  Part of the `Gc` module used to be in `Core` because it used
  threads.
  But it doesn't use threads anymore, so can be all in `Core_kernel`.
- Improved `Iobuf` support for copying to/from strings and bigstrings.

  The new modules are `Iobuf.{Consume,Peek}.To_{bigstring,string}`.
  They match a `Blit`-like interface. We don't yet implement the
  `Blit` interface in all applicable contexts, but do use `Blit.Make`
  and expose some of the operations we want in the forms we expect
  them to take under a `Blit` interface.
- Added `Linux_ext.Timerfd.to_file_descr`.
- Added to `Time.next_multiple` an optional argument to control
  whether the inequality with `after` is strict.
- Added `Time.Zone.local`, a lazily cached `Time.Zone.machine_zone ()`.

  This is the first stage in a plan to make explicit timezones more
  pervasive. First, they are made more convenient, by replacing the
  relatively wordy `Time.Zone.machine_zone ()` with `Time.Zone.local`.
  This involves making the underlying timezone type lazy.

  The next stage will be to remove `machine_zone` and use
  `Time.Zone.local` everywhere instead. Then (it is hoped) instead of
  `of_local_whatever`, we just say e.g. `of_date_ofday Time.Zone.local`
  and currently-implicitly-local functions will be able to switch over
  to explicit timezones without becoming too verbose.
- Added `Timing_wheel.Alarm.null`.
- Made `Unix.File_descr.t` have `with sexp`.

  Closes janestreet/async_unix#3
- Fixed OpenBSD compilation failures in C stubs.
- Fixed `Lock_file.is_locked` to require read permission, not write
  permission, on the lock file.
- Added to `Unix.mcast_join` an optional `?source:Inet_addr.t` argument.

  From pull-request on bitbucket:
    https://bitbucket.org/janestreet/core/pull-request/1/receive-source-specific-multicast/diff

## core_bench

- Added support for saving inline benchmark measurements to tabular
  files for easy loading into Octave.

## core_extended

- Cleaned up the `Stats_reporting` module

## core_kernel

- Moved all of the `Gc` module into `Core_kernel`.

  Part of the `Gc` module used to be in `Core` because it used
  threads. But it doesn't use threads anymore, so can be all in
  `Core_kernel`.
- Made `Stable.Map` and `Set` have `with compare`.
- Added `String.rev`.

  Closes janestreet/core#16

  We will not add `String.rev_inplace`, as we do not want to encourage
  mutation of strings.
- Made `Univ_map.Key` equivalent to `Type_equal.Id`.
- Added `Univ.view`, which exposes `Univ.t` as an existential,
  `type t = T : 'a Id.t * 'a -> t`.

  Exposing the existential makes it possible to, for example, use
  `Univ_map.set` to construct a `Univ_map.t` from a list of `Univ.t`s.

  This representation is currently the same as the underlying
  representation, but to make changes to the underlying representation
  easier, it has been put in a module `Univ.View`.

-- 
Jeremie Dimino, for the Core team