From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200601145801.GE22016@mcvoy.com> <20200604090436.GJ279@server.rulingia.com> <20200604165011.GC18437@mcvoy.com>
In-Reply-To: <20200604165011.GC18437@mcvoy.com>
From: Dan Cross
Date: Fri, 5 Jun 2020 12:00:52 -0400
To: Larry McVoy
Cc: The Eunuchs Hysterical Society
Subject: Re: [TUHS] non-blocking IO
List-Id: The Unix Heritage Society mailing list
Content-Type: multipart/alternative; boundary="000000000000bfc3f605a7585fbb"

--000000000000bfc3f605a7585fbb
Content-Type: text/plain; charset="UTF-8"

On Thu, Jun 4, 2020 at 12:51 PM Larry McVoy <lm@mcvoy.com> wrote:

> On Thu, Jun 04, 2020 at 08:19:58AM -0600, Warner Losh wrote:
> > The kicker is that all of the kernel is callback driven. The
> > upper half queues the request and then sleeps until the lower half
> > signals it to wakeup.
> > And that signal is often just a wakeup done from the
> > completion routine in the original request. All of that would be useful
> > in userland for high volume activity, none of it is exposed...
>
> Yeah, I've often wondered why this stuff wasn't exposed.  We already have
> signal handlers, seems like that maps.

Was it Rob who said that signals were really just for SIGKILL? Here, signals
would be press-ganged into service as a general IPC mechanism. In fairness,
they've mutated that way, but they didn't start out that way. While I
obviously wasn't there, the strong impression I get is that by the time
people were seriously thinking about async IO in Unix, the die had already
been cast, for better or worse.

> I tried to get the NFS guys at Sun to rethink the biod junk and do it like
> UFS does, where it queues something and gets a callback.  I strongly suspect
> that two processes, one to queue, one to handle callbacks, would be more
> efficient and actually faster than the biod nonsense.
>
> That's one of the arguments I lost unfortunately.
>
> Warner, exposing that stuff in FreeBSD is not really that hard, I suspect.
> Might be a fun project for a young kernel hacker with some old dude like
> you or me or someone, watching over it and thinking about the API.

I'm going to actually disagree with you here, Larry. While I think a basic
mechanism wouldn't be THAT hard to implement, it wouldn't compose nicely
with the existing primitives. I suspect the edge cases would be really
thorny, particularly without a real AST (asynchronous system trap)
abstraction. For instance, what happens if you initiate an async IO
operation, then block on a `read`? Where does the callback happen? If on the
same thread, The real challenge isn't providing the operation, it's
integrating it into the existing model.
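To make the "where does the callback happen?" question concrete, here is a minimal user-space sketch in Python. It is an emulation, not a kernel facility: `async_read`, `on_complete`, and `demo` are hypothetical names invented for illustration, and the "async" operation is just a worker thread doing an ordinary read. The initiating thread starts the operation and then blocks in a plain `os.read`; because that thread is asleep in the kernel, the only way the completion callback can run without interrupting it (signal- or AST-style) is on some other thread.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


def async_read(pool, fd, nbytes, callback):
    """Hypothetical async read: a worker does the read, then runs callback(data)."""
    def work():
        data = os.read(fd, nbytes)
        callback(data)
    return pool.submit(work)


def demo():
    events = []            # (what, thread name, data), in completion order
    a_r, a_w = os.pipe()   # pipe used for the emulated async read
    b_r, b_w = os.pipe()   # pipe the initiating thread blocks on

    def on_complete(data):
        # Runs on the worker thread; the initiating thread is still blocked.
        events.append(("callback", threading.current_thread().name, data))
        os.write(b_w, b"go")   # completion is what finally unblocks the reader

    with ThreadPoolExecutor(max_workers=1, thread_name_prefix="aio") as pool:
        fut = async_read(pool, a_r, 5, on_complete)
        os.write(a_w, b"hello")
        data = os.read(b_r, 2)  # initiating thread blocks in a plain read
        events.append(("read returned", threading.current_thread().name, data))
        fut.result()            # surface any exception raised in the worker

    for fd in (a_r, a_w, b_r, b_w):
        os.close(fd)
    return events


if __name__ == "__main__":
    for event in demo():
        print(event)
```

In this sketch the callback necessarily runs on a different thread from the one that initiated the operation. Delivering it on the initiating thread instead would mean interrupting the blocked `read`, which is exactly the integration problem described above.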
As a counterpoint to the idea that it's completely unruly, in Akaros this
was solved in the C library: all IO operations were fundamentally
asynchronous, but the C library provided blocking read(), write(), etc. by
building those from the async primitives. It worked well, but Akaros had
something akin to an AST environment, and fine-grained scheduling decisions
were made in userspace: in Akaros the unit of processor allocation is a CPU
core, not a thread, and support exists for determining the status of all
cores allocated to a process. There are edge cases (you can't roll your own
mutex, for example, and the basic threading library does a lot of heavy
lifting for you, making it challenging to integrate into the runtime of a
language that doesn't use the same ABI), but by and large it worked. It was
also provided by a kernel that was a pretty radical departure from a
Unix-like kernel.

        - Dan C.

--000000000000bfc3f605a7585fbb
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000bfc3f605a7585fbb--