From mboxrd@z Thu Jan 1 00:00:00 1970
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
MIME-Version: 1.0
References: <86bo3ao31h.fsf@cam.ac.uk>
From: Yotam Barnoy
Date: Mon, 30 Sep 2013 10:39:59 -0400
To: Leo White
Cc: Sungwoo Park, Ocaml Mailing List
Content-Type: multipart/alternative; boundary=047d7beb9a46b64e5404e79ad07d
Subject: Re: [Caml-list] Two questions on the OCAML runtime system and compiler,

--047d7beb9a46b64e5404e79ad07d
Content-Type: text/plain; charset=ISO-8859-1

A simpler option would be to use a hash table, and to put deallocations in
a log with timestamps. When threads access caml_page_table_lookup(), they
write a timestamp into their respective slot. An OS block deallocation can
only be carried out once all threads have seen the hash table up to the
timestamp of that deallocation.

Yotam


On Mon, Sep 30, 2013 at 10:23 AM, Yotam Barnoy wrote:

> From what I understand, Luca's patch is going to be mainstreamed sometime
> in the not-so-distant future, so it might make sense to work directly off
> of his branch, no?
>
> The problem you're having is that every caml_page_table_lookup() needs to
> acquire the heap allocator's lock, right? Perhaps you can make this access
> lockless. If the global page table is a persistent data structure, like a
> Map (but in C of course) and the allocator creates a new data structure
> root every time a block is allocated from/released from the OS, then all
> you would need is a boolean per thread indicating that it's seen the new
> global root before the allocator can go ahead and deallocate whatever
> blocks need to be deallocated from the old persistent trees. Since blocks
> shouldn't be allocated from the OS that frequently, this shouldn't be too
> expensive.
>
> Yotam
>
>
> On Mon, Sep 30, 2013 at 9:24 AM, Leo White wrote:
>
>> We are also experimenting with a parallel OCaml runtime system at
>> OCamlLabs (https://github.com/ocamllabs/ocaml/tree/multicore). Out of
>> interest is the source for your experimental compiler and runtime
>> available anywhere?
>>
>> With respect to the page table, I don't think it serves any other
>> purpose. I know various people have been looking into removing
>> it (or at least removing it from common operations) without the addition
>> of a bit field in the header. So it is possible you could avoid changing
>> the header layout in your version of the compiler as well.
>>
>> Regards,
>>
>> Leo White
>>
>> Sungwoo Park writes:
>>
>> > Dear Caml users,
>> >
>> > We are currently experimenting with a parallel Ocaml runtime system,
>> > in which each thread maintains its own old and young heaps and runs
>> > in parallel with other threads, without sharing a global lock as in
>> > the current Ocaml runtime system while sharing the whole memory
>> > space. We have made changes to the current Ocaml compiler (4.00.1),
>> > and now the parallel compiler generates parallel code that runs on
>> > the parallel Ocaml runtime system. It is a cross-compiler and we can
>> > build it with any version of the Ocaml compiler, including the
>> > parallel compiler itself.
>> >
>> > This is a project very similar to Luca Saiu's reentrant multi-runtime
>> > system (https://github.com/lucasaiu/ocaml), and we benefited very
>> > much from Luca's code at several early stages in the development.
>> > Similarly to Luca's work, our runtime system is reentrant.
>> >
>> > As the next step toward a parallel runtime system with a shared heap,
>> > we are struggling to redefine the tag format. I have two specific
>> > questions on the compiler and the runtime system. Any comment and
>> > help would be greatly appreciated.
>> >
>> > Question 1.
>> >
>> > When we reassign tag numbers for block types, the runtime system
>> > crashes for some applications. We thought we made all necessary
>> > changes to the source code (in byterun/mlvalues.h, utils/config.mlp,
>> > and several files in bytecomp/) to reflect the new tag numbers.
>> > However the runtime system often crashes, for example, when it
>> > accesses the Lazy module, which in turn accesses the Obj module, in
>> > testsuite/tests/misc/hamming.ml. I wonder what other parts in the
>> > source code we should revise after reassigning tag numbers.
>> >
>> > (Our finding is that this is not a problem due to the parallel
>> > runtime system, as we could reproduce the same outcome on the
>> > sequential runtime system.)
>> >
>> > Question 2.
>> >
>> > On a 64-bit system, can we safely get rid of the page table, in
>> > particular caml_page_table_lookup(), if the header of each block is
>> > redesigned to include a field indicating which memory area it resides
>> > in (old heap, young heap, static data, etc)? I wonder what other
>> > purposes the page table serves, other than determining the memory
>> > area for a given pointer.
>> >
>> > (Perhaps adding a new field in the header would be too impractical on
>> > a 32-bit system, but on a 64-bit system, we can afford to allocate
>> > quite a few bits for new fields in the header. Windows might need the
>> > page table to check if a given pointer points to code, but luckily we
>> > don't need to consider Windows for our purpose.)
>> >
>> > Thank you very much!
>> >
>> > --- Sungwoo Park
>>
>> --
>> Caml-list mailing list. Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>

--047d7beb9a46b64e5404e79ad07d
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
--047d7beb9a46b64e5404e79ad07d--
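
The timestamp-log rule suggested at the top of this message can be sketched
in C roughly as follows. This is only an illustration under assumptions that
are not in the thread: the names (dealloc_clock, seen, log_dealloc,
note_lookup, reclaim), the fixed thread count, and the fixed log capacity are
all hypothetical; the hash table itself is left out; free() stands in for
returning a block to the OS; and the log is assumed to be touched only by the
allocator under its own lock. It is not code from the OCaml runtime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define NTHREADS 2   /* hypothetical fixed number of runtime threads */
#define LOG_CAP  64  /* hypothetical capacity of the deallocation log */

/* Logical clock, bumped once per logged deallocation. */
static atomic_ulong dealloc_clock;

/* seen[t] holds the latest clock value thread t observed during a lookup. */
static atomic_ulong seen[NTHREADS];

struct dealloc_entry {
    void *block;          /* block waiting to be released */
    unsigned long stamp;  /* clock value at the time it was logged */
};

/* Pending log; assumed to be accessed by the allocator only. */
static struct dealloc_entry pending[LOG_CAP];
static size_t pending_len;

/* Allocator side: defer the release of a block and return its timestamp. */
unsigned long log_dealloc(void *block)
{
    assert(pending_len < LOG_CAP);
    unsigned long stamp = atomic_fetch_add(&dealloc_clock, 1) + 1;
    pending[pending_len].block = block;
    pending[pending_len].stamp = stamp;
    pending_len++;
    return stamp;
}

/* Thread side: called at the start of each page-table lookup by thread tid. */
void note_lookup(int tid)
{
    atomic_store(&seen[tid], atomic_load(&dealloc_clock));
}

/* Allocator side: release every logged block whose timestamp all threads
   have reached. Returns the number of blocks released. */
size_t reclaim(void)
{
    unsigned long min_seen = atomic_load(&seen[0]);
    for (int t = 1; t < NTHREADS; t++) {
        unsigned long s = atomic_load(&seen[t]);
        if (s < min_seen)
            min_seen = s;
    }

    size_t kept = 0, freed = 0;
    for (size_t i = 0; i < pending_len; i++) {
        if (pending[i].stamp <= min_seen) {
            free(pending[i].block);       /* stand-in for an OS release */
            freed++;
        } else {
            pending[kept++] = pending[i]; /* still visible to some thread */
        }
    }
    pending_len = kept;
    return freed;
}
```

One consequence of this rule is that a thread that never performs a lookup
keeps its slot at an old timestamp, so nothing logged after that point can be
released until it does; a real design would need some answer for idle threads.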