From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ashish Agarwal <agarwal1975@gmail.com>
Date: Tue, 3 Jun 2014 18:33:01 -0400
To: Dan Stark
Cc: David House, OCaml Mailing List
Subject: Re: [Caml-list] How is Async implemented?

When you use Async, you must do `open Async.Std`, which shadows the blocking
functions from the standard library. Thus, in f2, it's not that the "return 1"
part somehow changes the behavior of the preceding code. Rather, since you've
written "return 1", you've presumably done `open Async.Std`, so the
print_endline function is actually the one from Async. So no, the compiler
doesn't get involved: Async is implemented purely as a library.
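To make "purely as a library" concrete, here is a minimal toy model of the
idea -- invented for illustration, not Async's real code or API. It has a job
queue, a deferred-like mutable cell, and a shadowed print_endline that only
schedules its write; every name in it (jobs, schedule, fill, upon, go) is made
up, and the real library is far more careful about errors, ordering, and
threads.

(* A toy model -- NOT Async's real code -- of a job queue, a deferred-like
   cell, and a shadowed print_endline that only schedules its write. *)

(* The scheduler's job queue: a FIFO of thunks. *)
let jobs : (unit -> unit) Queue.t = Queue.create ()
let schedule job = Queue.add job jobs

(* An ['a deferred] is an initially empty box plus the handlers waiting on it. *)
type 'a deferred =
  { mutable value    : 'a option
  ; mutable handlers : ('a -> unit) list
  }

let create () = { value = None; handlers = [] }

(* [return x] is a deferred that is already filled and has no handlers. *)
let return x = { value = Some x; handlers = [] }

(* Filling a deferred schedules one job per waiting handler. *)
let fill d x =
  d.value <- Some x;
  List.iter (fun h -> schedule (fun () -> h x)) d.handlers;
  d.handlers <- []

(* [upon d h] runs [h], as a scheduled job, once [d] is filled.  Even an
   already-filled deferred only schedules the job rather than running it. *)
let upon d h =
  match d.value with
  | Some x -> schedule (fun () -> h x)
  | None   -> d.handlers <- h :: d.handlers

(* [bind]: the result is filled with whatever [f]'s deferred eventually holds. *)
let ( >>= ) d f =
  let result = create () in
  upon d (fun x -> upon (f x) (fun y -> fill result y));
  result

(* A shadowed print_endline: it does not print now, it only schedules the
   write.  (The real Async version differs in detail; this is just the shape.) *)
let print_endline s = schedule (fun () -> Printf.printf "%s\n" s)

(* Dan's f2, written exactly as before. *)
let f2 () = print_endline "hello"; return 1

(* A toy Scheduler.go: drain the queue, including jobs added along the way. *)
let go () = while not (Queue.is_empty jobs) do (Queue.pop jobs) () done

let () =
  let d = f2 () in                                      (* nothing printed yet *)
  let _ = d >>= fun n -> Printf.printf "got %d\n" n; return () in
  go ()                                  (* prints "hello" and then "got 1" *)

Nothing here needs compiler support: calling f2 only fills the queue, and the
prints happen when go () drains it, which is the behaviour observed with
Scheduler.go.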
On Tue, Jun 3, 2014 at 4:59 PM, Dan Stark wrote:

> Hi David
>
> Thank you very much for this comprehensive explanation.
>
> Can I also ask who is responsible for the queue and the scheduler?
>
> Are they created and maintained by OCaml threads (OCaml internals) or by
> Async (a third-party library, which would mean Async creates the job queue
> and has its own scheduler)?
>
> In addition, does the compiler get involved in handling Deferred.t?
>
> I ask the questions above because I am quite curious about what happens in
> the following.
>
> Suppose we have a normal function:
>
> let f1 () = print_endline "hello"; whatever_result;;
>
> *Normally*, no matter what *whatever_result* is, when I do *let _ = f1 ();;*,
> *print_endline "hello"* will be executed, am I right? For example, whether it
> finally returns an int, a record, a Lazy.t, etc., "hello" would be printed out.
>
> However, if I do
>
> let f2 () = print_endline "hello"; return 1;;
>
> *let _ = f2 ();;* does nothing unless I run the scheduler with
> *let _ = ignore (Scheduler.go ());;*
>
> Since in *f2* I am not using any other special creation function, and the
> only special bit is *return 1* after *print_endline*, if the compiler doesn't
> get involved, how can it know that the whole application of *f2 ()* should be
> deferred for future execution?
>
> Sorry if my verbose questions above are boring. I am just trying to
> understand more, and I expect I will eventually look into the code once I
> grasp the big picture.
>
> thanks
>
> Dan
>
> On Tue, Jun 3, 2014 at 5:29 PM, David House wrote:
>
>> There is a queue of jobs in the scheduler. The scheduler runs the jobs
>> one by one. Jobs may schedule other jobs. A job is a pair of type
>> ['a * ('a -> unit)].
>>
>> There's a thing called a deferred. ['a Deferred.t] is an initially empty
>> box that may become filled later with something of type ['a]. There is a
>> similar type called ['a Ivar.t] -- the difference is that ivars have a
>> function to actually fill in the value, whereas deferreds do not: a
>> deferred is a "read-only" view on an ivar.
>>
>> You can wait on a deferred using bind. Doing [x >>= f] mutates the
>> deferred x to add f as a "handler". When a deferred is filled, it adds a
>> job to the scheduler for each handler it has.
>>
>> Doing [Deferred.return 1] allocates a deferred which is already filled
>> and has no handlers. Binding on that will immediately schedule a job to
>> run your function. (The job is still scheduled, rather than run
>> immediately, to ensure that you don't have an immediate context switch --
>> in Async, the only context-switch points are the binds.)
>>
>> The primitive operations that block are replaced with functions that
>> return deferreds and do their work in a separate thread. There's a thread
>> pool to make sure you don't use an unbounded number of threads. (I think
>> the default cap is 50 threads.) So yes, I think Async does depend on
>> -thread.
>>
>> There is an important optimisation: reading from or writing to certain
>> file descriptors doesn't use a thread. Instead there's a central list of
>> such file descriptors. There's also a central list of all "timer events"
>> (e.g. deferreds that become determined after some amount of time).
>> The scheduler is actually based around a select loop; it does the following:
>>
>> - run all the jobs
>> - if more jobs have been scheduled, run those too
>> - keep going until there are no more jobs, or we hit the
>>   maximum-jobs-per-cycle cap
>> - sleep using select until a read fd is ready, or a write fd is ready, or
>>   a timer event is due to fire
>> - do that thing
>>
>> There's also a way to manually interrupt the scheduler. Blocking
>> operations other than reading from or writing to fds use this: they run in
>> a thread, grab the Async scheduler lock, fill in an ivar, then wake up the
>> scheduler to ensure timely running of the jobs they just scheduled. The
>> scheduler lock is necessary because the scheduler itself is not re-entrant:
>> you cannot have multiple threads modifying the scheduler's internals.
>>
>> On 3 June 2014 16:39, Dan Stark wrote:
>>
>>> Hi all
>>>
>>> I am trying to get a rough overview of how Async is implemented (or the
>>> idea behind it) before I really dig into its source code.
>>>
>>> I have the following questions:
>>>
>>> *Q1:* Is Async event-loop-like?
>>>
>>> From the API and some docs on Async's usage, I feel it is quite like an
>>> event loop.
>>>
>>> You create a Deferred.t, it might be added to a queue, and a scheduler
>>> behind the scenes might adjust the order in which all the Deferred.t in
>>> the queue run.
>>>
>>> Am I correct?
>>>
>>> *Q2:* Deferred.return and Deferred.bind
>>>
>>> If I say
>>>
>>> Deferred.return 1
>>>
>>> it returns me a Deferred.t, but inside the function *return* or *bind*
>>> somehow an "event" is implicitly added to the default queue for
>>> scheduling, right?
>>>
>>> If I am correct above,
>>>
>>> *Q3:* Does Async depend on -thread? Do the queue and scheduler need
>>> compiler support?
>>>
>>> I just need to understand the whole picture in a rough way first.
>>>
>>> Thanks
>>>
>>> Dan
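For completeness, the select-loop cycle David describes above can be sketched
roughly as follows. This is only an illustration of his description, not
Async's actual scheduler: it assumes the unix library, and every name in it
(jobs, watched_read, watched_write, timers, max_jobs_per_cycle, cycle) is
invented, as is the cap of 500 jobs per cycle.

(* A rough, self-contained sketch of the scheduler cycle described above. *)

let jobs : (unit -> unit) Queue.t = Queue.create ()

(* fds we are interested in, with the handler to schedule when one is ready *)
let watched_read  : (Unix.file_descr * (unit -> unit)) list ref = ref []
let watched_write : (Unix.file_descr * (unit -> unit)) list ref = ref []

(* timer events: (absolute time, handler) *)
let timers : (float * (unit -> unit)) list ref = ref []

let max_jobs_per_cycle = 500   (* invented cap, purely illustrative *)

let rec cycle () =
  (* Run queued jobs; jobs may enqueue more jobs.  Stop at the cap so one
     busy cycle cannot postpone the I/O wait forever. *)
  let ran = ref 0 in
  while not (Queue.is_empty jobs) && !ran < max_jobs_per_cycle do
    (Queue.pop jobs) ();
    incr ran
  done;
  (* Decide how long to sleep: not at all if jobs remain, until the next
     timer if there is one, otherwise until an fd becomes ready.  (Per the
     description above, the real scheduler can also be woken up early by
     blocking operations finishing in other threads.) *)
  let now = Unix.gettimeofday () in
  let timeout =
    if not (Queue.is_empty jobs) then 0.
    else
      (match !timers with
       | [] -> -1.   (* no timers: block until an fd is ready *)
       | ts ->
         max 0. (List.fold_left (fun acc (t, _) -> min acc t) infinity ts -. now))
  in
  (* Sleep in select until a read fd is ready, a write fd is ready, or a
     timer event is due to fire. *)
  let ready_r, ready_w, _ =
    Unix.select (List.map fst !watched_read) (List.map fst !watched_write) [] timeout
  in
  (* "Do that thing": turn whatever became ready or due into ordinary jobs
     (a real loop would also deregister or rearm the fds it fired). *)
  List.iter (fun (fd, h) -> if List.mem fd ready_r then Queue.add h jobs) !watched_read;
  List.iter (fun (fd, h) -> if List.mem fd ready_w then Queue.add h jobs) !watched_write;
  let now = Unix.gettimeofday () in
  let due, pending = List.partition (fun (t, _) -> t <= now) !timers in
  timers := pending;
  List.iter (fun (_, h) -> Queue.add h jobs) due;
  cycle ()

The design point the sketch tries to show is that the loop never sleeps while
jobs are queued, and otherwise sleeps no longer than the earliest timer, so fd
readiness and timer events both end up as ordinary jobs in the same queue.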