From: Venkatesh Srinivas
Date: Wed, 27 Oct 2010 10:48:01 -0400
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] A little more ado about async Tclunk

First, let me hop up and down and say: this was me seeing what could be
accomplished in roughly two hours of hacking; it is by far not the best
(or even a near approximation of) way to accomplish delayed Tclunk.
There was no thought given at all to how to handle build-up of files to
close or how to overlap sending Tclunks. I do track the peak number of
outstanding files to close; in this simple implementation, while fcp
-R 16 -W 16-ing down the p9 kernel sources, the peak number of files
waiting to close I've seen was 10. I did not try using more than one
clunk process.

brucee solved this ten years ago; I'd love to hear how. Plan 9 has had
delayed close for tearing down fdgrps for nearly 4 years; dual-purposing
that logic to handle this case would be very straightforward.

erik quanstrom wrote:
> how do you flow control the clunking processes?  are you using
> qio?  if so, what's the size of the buffer?
>
> also, do you have any test results for different latencies?  how
> about a local network with ~0.5ms rtt and a long distance connection
> at ~150ms?
>
> have you tried this in user space?  one could imagine passing the
> fds to a separate proc that does the closes without involving the
> kernel.

The clunking process just grabs Chans to close off a list, clunks them
one at a time, and sleeps if the list is empty. It's woken by a deferred
clunk call. I tried a little hysteresis there, letting it sleep longer
but burst close requests above another threshold, but that was
considerably slower overall than even the default (synchronous Tclunk)
case. I do not use qio. There is no limit on the number of outstanding
files.
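For a concrete picture, here is a minimal userspace sketch of the same
idea, in the spirit of the "separate proc that does the closes" erik
mentions above: one worker thread drains a list of file descriptors and
closes them one at a time, sleeping while the list is empty. This is
illustrative only, not the inferno-npe change (the real diff, linked
below, queues Chans inside the kernel); the names Pending,
deferred_close, and closeproc are made up here.

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct Pending Pending;
struct Pending {
	int fd;
	Pending *next;
};

static Pending *pending;	/* fds waiting to be closed */
static int npending, peak;	/* current and peak list length */
static pthread_mutex_t lk = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t haswork = PTHREAD_COND_INITIALIZER;

/* called in place of close(): queue the fd and wake the worker */
void
deferred_close(int fd)
{
	Pending *p;

	p = malloc(sizeof *p);
	if(p == NULL){
		close(fd);	/* no memory: fall back to a synchronous close */
		return;
	}
	p->fd = fd;
	pthread_mutex_lock(&lk);
	p->next = pending;
	pending = p;
	if(++npending > peak)
		peak = npending;	/* track peak outstanding, as mentioned above */
	pthread_mutex_unlock(&lk);
	pthread_cond_signal(&haswork);
}

/* the "clunking process": pop one entry at a time, close it,
   and sleep when the list is empty */
void *
closeproc(void *arg)
{
	Pending *p;

	for(;;){
		pthread_mutex_lock(&lk);
		while(pending == NULL)
			pthread_cond_wait(&haswork, &lk);
		p = pending;
		pending = p->next;
		npending--;
		pthread_mutex_unlock(&lk);

		close(p->fd);	/* this is where the Tclunk would go out */
		free(p);
	}
	return arg;	/* not reached */
}

Start the worker once with pthread_create(&tid, NULL, closeproc, NULL)
and build with -lpthread; the kernel version sleeps and wakes a kernel
proc instead, mirroring the sleep/wakeup described above.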
If you'd like to look it over, the main diff is visible at:
http://code.google.com/p/inferno-npe/source/detail?r=09a2e719616e1c842e602485edee9d03020909a6
Plan 9 folks will notice a strong similarity to 9/port/chan.c:ccloseq
and :closeproc.

I do not have tests at different latencies (reliable ones, anyway); for
whatever reason, sources.cs.bell-labs.com was varying between 20ms and
300ms from me yesterday, even over short timespans. The code is very
easy to test, though: just grab and build inferno-npe and pass the '-j'
flag to mount in addition to whatever you'd normally use. You can
monitor the number of files waiting to close by reading /dev/vmstat.

I've not tried this in userspace (recently). Wes and I did try it about
a year ago as a side project to the Journal Callbacks work; I'd have to
look back to see what the numbers looked like then, but IIRC there was a
significant penalty to using our userspace wrapper process and rewriting
lots of 9P traffic, though that was one of my first Limbo programs and
could be massively improved upon.

Eric Van Hensbergen wrote:
> More specifically, with the semantic restrictions that VS imposes, are
> there remaining semantic problems when using this with cacheable file
> systems?

All this approach does is delay Tclunk messages, generally by very
little, and only on files not marked ORCLOSE (perhaps it should also
consider OEXCL) and only on mounts flagged Sys->MCACHE. It would
certainly be possible to write a file server for which this approach is
bad, but for the vast majority of them this is a safe optimization. It
does not depend on MCACHE caching; inferno's port/cache.c is just a
stub.
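To make that restriction concrete, here is a small illustrative test for
whether a clunk may be deferred. This is not the actual inferno-npe
code: the flag values, the Chan fields, and the name candefer are
placeholders standing in for the real open-mode and mount-flag bits.

enum {
	ORCLOSE = 0x40,		/* placeholder: remove-on-close open mode */
	OEXCL   = 0x1000,	/* placeholder: exclusive-use open mode */
	MCACHE  = 0x10		/* placeholder: mount marked cacheable */
};

typedef struct Chan Chan;
struct Chan {
	int mode;		/* how the file was opened */
	int mflag;		/* flags from the mount it came through */
};

/* defer the Tclunk only when nothing depends on it happening promptly */
static int
candefer(Chan *c)
{
	if(c->mode & (ORCLOSE|OEXCL))
		return 0;	/* ORCLOSE (and arguably OEXCL) want a prompt clunk */
	if((c->mflag & MCACHE) == 0)
		return 0;	/* only on mounts the user flagged MCACHE */
	return 1;
}

Anything that fails this test takes the normal synchronous clunk path.

-- vs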