From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id C3F8B7F907 for ; Wed, 4 Jun 2014 01:52:20 +0200 (CEST) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of interlock.public@gmail.com) identity=pra; client-ip=209.85.128.172; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="interlock.public@gmail.com"; x-sender="interlock.public@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of interlock.public@gmail.com designates 209.85.128.172 as permitted sender) identity=mailfrom; client-ip=209.85.128.172; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="interlock.public@gmail.com"; x-sender="interlock.public@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ve0-f172.google.com) identity=helo; client-ip=209.85.128.172; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="interlock.public@gmail.com"; x-sender="postmaster@mail-ve0-f172.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjcDAGpfjlPRVYCsm2dsb2JhbABZg1lYgmynWQaYJgGBAQgWDgEBAQEBBgsLCRQogiUBAQEDARIRBBkBGx0BAwELBgULAwoqAgIhAQERAQUBHAYTIogLAQMJCA2gDWqLJ4Fygw2ZdQoZJw1khSQRAQEEDIVJhmeCEgQHgnWBSwSWAoIPgXmNPoQEGCmEaTw X-IPAS-Result: AjcDAGpfjlPRVYCsm2dsb2JhbABZg1lYgmynWQaYJgGBAQgWDgEBAQEBBgsLCRQogiUBAQEDARIRBBkBGx0BAwELBgULAwoqAgIhAQERAQUBHAYTIogLAQMJCA2gDWqLJ4Fygw2ZdQoZJw1khSQRAQEEDIVJhmeCEgQHgnWBSwSWAoIPgXmNPoQEGCmEaTw X-IronPort-AV: E=Sophos;i="4.98,969,1392159600"; d="scan'208";a="65336399" Received: from mail-ve0-f172.google.com ([209.85.128.172]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 04 Jun 2014 01:51:51 +0200 Received: by mail-ve0-f172.google.com with SMTP id oz11so7891909veb.31 for ; Tue, 03 Jun 2014 16:51:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=mh+Yk7rw6i++LhOyxoYk+R4hA+K2jh9wehGSMExZUpA=; b=kqQosavck+vNeN9ex5A5Uoy6eTjV1Kar+MncR8cJezDePMT9DxW5CIrNUzRKxV4C5I Owtah7+QwnXKLaIz6sERBLOXpINgITG8J2XrP9MbxBFrpnrHfTHHWT8YPTh8P+kyhknR om05/R5USJ5Vq5z4YPeFaRKEQFK6raI8ESpLcc3S8uiVvDXIYfY9im8VluXC3C3a/Zii FFvAkBLLQWfaSm4okI4ZTlDGXniZjX8+yHCq4uHgcam8Yg1nxzPwEKaEIG5Sa7JHeCjF VJG/6cn1EAbjNLkjRmFBxxmdkJxAEbniJ/P/B0t/Mxyvc+bki0tRfzYWPvJLNzfMmu/6 QPLQ== MIME-Version: 1.0 X-Received: by 10.58.246.164 with SMTP id xx4mr5273041vec.52.1401839510220; Tue, 03 Jun 2014 16:51:50 -0700 (PDT) Received: by 10.221.45.130 with HTTP; Tue, 3 Jun 2014 16:51:50 -0700 (PDT) In-Reply-To: References: Date: Wed, 4 Jun 2014 00:51:50 +0100 Message-ID: From: Dan Stark To: Yaron Minsky Cc: David House , OCaml Mailing List Content-Type: multipart/alternative; boundary=047d7bea2ff609f58204faf732de Subject: Re: [Caml-list] How is Async implemented? --047d7bea2ff609f58204faf732de Content-Type: text/plain; charset=UTF-8 Hi Yaron Yes, I do read your book and currently reading this chapter. And this is the reason I got interested of how it is implemented behind the scene. Thanks for all the help I got from here. Dan On Wed, Jun 4, 2014 at 12:17 AM, Yaron Minsky wrote: > For what it's worth, this gives a decent overview of Async (if I do > say so myself) > > > https://realworldocaml.org/v1/en/html/concurrent-programming-with-async.html > > This isn't quite right, but a good mental model is to think of Async > as being single threaded. Scheduler.go starts up the async scheduler > on the main thread, and you do indeed need to do that for any IO to > actually happen. > > y > > On Tue, Jun 3, 2014 at 4:59 PM, Dan Stark > wrote: > > Hi David > > > > Thank you very much for this comprehensive explanation. > > > > Can I also know who is responsible for the queue and scheduler? > > > > Are they created and maintained by OCaml thread (OCaml internal) or Async > > (3rd party library, which means Async create the job queue and has its > own > > scheduler)? > > > > In addition, will the compiler got involved in handling Deferred.t? > > > > I ask above questions because I felt quite curious about what is > happening > > in the followings: > > > > Suppose we have a normal function: > > > >> let f1 () = print_endline "hello"; whatever_result;; > > > > > > Normally, no matter what whatever_result is, when I do let _ = f1 ();;, > > print_endline "hello" will be executed, am I right? For example, finally > > returning an int or a record or a lazy.t, etc, "hello" would be printed > out. > > > > However, if I do > > > >> let f2 () = print_endline "hello"; return 1;; > > > > > > let _ = f2 ();; would do nothing unless I run the schedule let _ = > > ignore(Scheduler.go());; > > > > Since for f2 I am not using any other special creation function and the > only > > special bit is return 1 after print_endline, if the compiler doesn't get > > involved, how can compiler know the whole application of f2() should be > in > > future execution? > > > > Sorry for my above verbose questions if they are boring. I am just > trying to > > understand more and I guess eventually I will look into the code once I > > grasp the big picture. > > > > thanks > > > > Dan > > > > > > > > > > > > > > > > > > > > On Tue, Jun 3, 2014 at 5:29 PM, David House > wrote: > >> > >> There is a queue of jobs in the scheduler. The scheduler runs the jobs > one > >> by one. Jobs may schedule other jobs. A job is a pair of ['a * 'a -> > unit]. > >> > >> There's a thing called a deferred. ['a Deferred.t] is an initially empty > >> box that may become filled later with something of type ['a]. There is a > >> similar type called ['a Ivar.t] -- the difference is that ivars have a > >> function to actually fill in the value, whereas deferreds do not: a > deferred > >> is a "read-only" view on an ivar. > >> > >> You can wait on a deferred using bind. Doing [x >>= f] mutates the > >> deferred x to add f as a "handler". When a deferred is filled, it adds > a job > >> to the scheduler for each handler it has. > >> > >> Doing [Deferred.return 1] allocates a deferred which is already filled > and > >> has no handlers. Binding on that will immediately schedule a job to run > your > >> function. (The job is still scheduled though, rather than being run > >> immediately, to ensure that you don't have an immediate context switch > -- in > >> async, the only context switch points are the binds.) > >> > >> The primitive operations that block are replaced with functions that > >> return deferreds, and go do their work in a separate thread. There's a > >> thread pool to make sure you don't use infinity threads. (I think the > >> default cap is 50 threads.) I think yes, async does depend on -thread. > >> > >> There is an important optimisation: if you want to read or write to > >> certain file descriptors, that doesn't use a thread. Instead there's a > >> central list of such file descriptors. There's also a central list of > all > >> "timer events" (e.g. deferreds that become deferred after some amount of > >> time). The scheduler actually is based around a select loop: it does the > >> following: > >> > >> run all the jobs > >> if more jobs have been scheduled, run those too > >> keep going until there are no more jobs, or we hit the > >> maximum-jobs-per-cycle cap > >> sleep using select until one read fd is read, or a write fd is ready, > or a > >> timer event is due to fire > >> do that thing > >> > >> There's also a way to manually interrupt the scheduler. Blocking > >> operations other than reading/writing to fds do this: they run in a > thread, > >> grab the async scheduler lock, fill in an ivar, then wake up the > scheduler > >> to ensure timely running of the jobs they just scheduled. The async > >> scheduler lock is necessary because the scheduler itself is not > re-entrant: > >> you cannot have multiple threads modifying the scheduler's internals. > >> > >> > >> On 3 June 2014 16:39, Dan Stark wrote: > >>> > >>> Hi all > >>> > >>> I am trying to get a rough overview of how Async is implemented (or the > >>> idea behind it) before I really dig into its source code. > >>> > >>> I have the following questions: > >>> > >>> Q1: Is Async event-loop like? > >>> > >>> From the API and some docs for Async's usage, I feel it is quite like a > >>> event-loop. > >>> > >>> You create Deferred.t and it might be added to a queue and a scheduler > >>> behind might be adjusting the order of running for all Deferred.t in > the > >>> queue. > >>> > >>> Am I correct? > >>> > >>> Q2: Deferred.return and Deferred.bind > >>> > >>> If I say > >>> > >>>> Deferred.return 1 > >>> > >>> > >>> It will returns me a Deferred.t, but inside the function return or bind > >>> somehow an "event" is implicitly added to the default queue for > scheduling, > >>> right? > >>> > >>> If I am correct above, > >>> > >>> Q3: Is Async depending on -thread? The queue or scheduler needs > compiler > >>> support? > >>> > >>> I just need to understand the whole picture in a rough way first. > >>> > >>> Thanks > >>> > >>> Dan > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >> > > > --047d7bea2ff609f58204faf732de Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi Yaron

Yes, I do read your book and c= urrently reading this chapter. And this is the reason I got interested of h= ow it is implemented behind the scene.

Thanks for = all the help I got from here.

Dan




On Wed, Jun 4, 2014 at = 12:17 AM, Yaron Minsky <yminsky@janestreet.com> wrote:<= br>
For what it's worth, this gives a decent= overview of Async (if I do
say so myself)

https://realworldocaml.org/v1/en/html/concu= rrent-programming-with-async.html

This isn't quite right, but a good mental model is to think of Async
as being single threaded. =C2=A0Scheduler.go starts up the async scheduler<= br> on the main thread, and you do indeed need to do that for any IO to
actually happen.

y

On Tue, Jun 3, 2014 at 4:59 PM, Dan Stark <interlock.public@gmail.com> wrote:
> Hi David
>
> Thank you very much for this comprehensive explanation.
>
> Can I also know who is responsible for the queue and scheduler?
>
> Are they created and maintained by OCaml thread (OCaml internal) or As= ync
> (3rd party library, which means Async create the job queue and has its= own
> scheduler)?
>
> In addition, will the compiler got involved in handling Deferred.t?
>
> I ask above questions because I felt quite curious about what is happe= ning
> in the followings:
>
> Suppose we have a normal function:
>
>> let f1 () =3D print_endline "hello"; whatever_result;; >
>
> Normally, no matter what whatever_result is, when I do let _ =3D f1 ()= ;;,
> print_endline "hello" will be executed, am I right? For exam= ple, finally
> returning an int or a record or a lazy.t, etc, "hello" would= be printed out.
>
> However, if I do
>
>> let f2 () =3D print_endline "hello"; return 1;;
>
>
> let _ =3D f2 ();; would do nothing unless I run the schedule let _ =3D=
> ignore(Scheduler.go());;
>
> Since for f2 I am not using any other special creation function and th= e only
> special bit is return 1 after print_endline, if the compiler doesn'= ;t get
> involved, how can compiler know the whole application of f2() should b= e in
> future execution?
>
> Sorry for my above verbose questions if they are boring. I am just try= ing to
> understand more and I guess eventually I will look into the code once = I
> grasp the big picture.
>
> thanks
>
> Dan
>
>
>
>
>
>
>
>
>
> On Tue, Jun 3, 2014 at 5:29 PM, David House <dhouse@janestreet.com> wrote:
>>
>> There is a queue of jobs in the scheduler. The scheduler runs the = jobs one
>> by one. Jobs may schedule other jobs. A job is a pair of ['a *= 'a -> unit].
>>
>> There's a thing called a deferred. ['a Deferred.t] is an i= nitially empty
>> box that may become filled later with something of type ['a]. = There is a
>> similar type called ['a Ivar.t] -- the difference is that ivar= s have a
>> function to actually fill in the value, whereas deferreds do not: = a deferred
>> is a "read-only" view on an ivar.
>>
>> You can wait on a deferred using bind. Doing [x >>=3D f] mut= ates the
>> deferred x to add f as a "handler". When a deferred is f= illed, it adds a job
>> to the scheduler for each handler it has.
>>
>> Doing [Deferred.return 1] allocates a deferred which is already fi= lled and
>> has no handlers. Binding on that will immediately schedule a job t= o run your
>> function. (The job is still scheduled though, rather than being ru= n
>> immediately, to ensure that you don't have an immediate contex= t switch -- in
>> async, the only context switch points are the binds.)
>>
>> The primitive operations that block are replaced with functions th= at
>> return deferreds, and go do their work in a separate thread. There= 's a
>> thread pool to make sure you don't use infinity threads. (I th= ink the
>> default cap is 50 threads.) I think yes, async does depend on -thr= ead.
>>
>> There is an important optimisation: if you want to read or write t= o
>> certain file descriptors, that doesn't use a thread. Instead t= here's a
>> central list of such file descriptors. There's also a central = list of all
>> "timer events" (e.g. deferreds that become deferred afte= r some amount of
>> time). The scheduler actually is based around a select loop: it do= es the
>> following:
>>
>> run all the jobs
>> if more jobs have been scheduled, run those too
>> keep going until there are no more jobs, or we hit the
>> maximum-jobs-per-cycle cap
>> sleep using select until one read fd is read, or a write fd is rea= dy, or a
>> timer event is due to fire
>> do that thing
>>
>> There's also a way to manually interrupt the scheduler. Blocki= ng
>> operations other than reading/writing to fds do this: they run in = a thread,
>> grab the async scheduler lock, fill in an ivar, then wake up the s= cheduler
>> to ensure timely running of the jobs they just scheduled. The asyn= c
>> scheduler lock is necessary because the scheduler itself is not re= -entrant:
>> you cannot have multiple threads modifying the scheduler's int= ernals.
>>
>>
>> On 3 June 2014 16:39, Dan Stark <interlock.public@gmail.com> wrote:
>>>
>>> Hi all
>>>
>>> I am trying to get a rough overview of how Async is implemente= d (or the
>>> idea behind it) before I really dig into its source code.
>>>
>>> I have the following questions:
>>>
>>> Q1: Is Async event-loop like?
>>>
>>> From the API and some docs for Async's usage, I feel it is= quite like a
>>> event-loop.
>>>
>>> You create Deferred.t and it might be added to a queue and a s= cheduler
>>> behind might be adjusting the order of running for all Deferre= d.t in the
>>> queue.
>>>
>>> Am I correct?
>>>
>>> Q2: Deferred.return and Deferred.bind
>>>
>>> If I say
>>>
>>>> Deferred.return 1
>>>
>>>
>>> It will returns me a Deferred.t, but inside the function retur= n or bind
>>> somehow an "event" is implicitly added to the defaul= t queue for scheduling,
>>> right?
>>>
>>> If I am correct above,
>>>
>>> Q3: Is Async depending on -thread? The queue or scheduler need= s compiler
>>> support?
>>>
>>> I just need to understand the whole picture in a rough way fir= st.
>>>
>>> Thanks
>>>
>>> Dan
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>

--047d7bea2ff609f58204faf732de--