9fans - fans of the OS Plan 9 from Bell Labs
From: "ron minnich" <rminnich@gmail.com>
To: "Fans of the OS Plan 9 from Bell Labs" <9fans@9fans.net>
Subject: Re: [9fans] 9pfuse and O_APPEND
Date: Fri, 19 Dec 2008 08:44:13 -0800
Message-ID: <13426df10812190844p2f10c4afr85147e1f0a41a723@mail.gmail.com>
In-Reply-To: <E6F29B1F-555B-4B98-9695-A756B18781F7@sun.com>

On Thu, Dec 18, 2008 at 7:59 PM, Roman Shaposhnik <rvs@sun.com> wrote:
> On Dec 18, 2008, at 7:26 PM, ron minnich wrote:
>>
>> On Thu, Dec 18, 2008 at 7:06 PM, Roman Shaposhnik <rvs@sun.com> wrote:
>>>
>>> Its fun, yes. But I believe this is more of a testament to the
>>> statelessness
>>> of the NFS
>>> plus the fact that the "end of file" is not a well defined offset (unlike
>>> beginning of
>>> the file).
>>
>> no, it's even worse with stateful systems.
>


You want to write at EOF. Where is EOF? On Plan 9, for an append-only
file, the server by definition always knows: it's where the last write
ended. So writes go at EOF.
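
To make the server-side case concrete, here is a minimal sketch of a toy server that owns the end-of-file position. It is an illustration only, not 9P or any real file server; the class name AppendServer and its write method are made up for this example. The point it shows is that the client's offset does not matter, because the server decides where the data lands.

```python
class AppendServer:
    """Toy server for an append-only file (hypothetical, not real 9P)."""

    def __init__(self):
        self.data = bytearray()

    def write(self, offset, payload):
        # The client-supplied offset is ignored for an append-only file:
        # the server places the data at its own current end of file.
        at = len(self.data)
        self.data += payload
        return at  # where the data actually landed


server = AppendServer()
first = server.write(0, b"alpha\n")
second = server.write(0, b"beta\n")  # stale offset, still lands at EOF
```

Both writes carry offset 0, yet the second one lands after the first, with no coordination between writers.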

What about writing append files in a stateful FS where it's up to the
client to figure out where the end is?

The client by definition knows more than the server. So the client has
to do this:
1. Get the metadata in a way that guarantees nobody else gets to
write: the client calls the server to get exclusive access to the
metadata/file. This can result in server-to-client callbacks to all
other clients. This is fun to watch on 1000s of nodes. Before the right
hacks went in it could take 30 minutes. I am not making this up. Why?
Well, what if *every* one of the thousands of clients is trying to
write at EOF and they're all fighting for the metadata? Congestive
collapse, that's what.
2. The client writes at EOF. Since the client has exclusive access at
this point, it's pretty fast.
3. The client releases the metadata lock to the server and hence to the
other thousands of clients.
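
The three steps above can be sketched as a toy simulation that only counts messages. Everything here is an assumption made for illustration: the names LockingServer and client_append are hypothetical, and the cost model (one callback plus one acknowledgement to every other client on each exclusive acquire) is a deliberate worst case, not a measurement of any real file system.

```python
class LockingServer:
    """Toy server where clients must find EOF themselves (hypothetical)."""

    def __init__(self, clients):
        self.data = bytearray()
        self.clients = clients  # number of clients caching the metadata
        self.messages = 0       # every client<->server message is counted

    def acquire(self, who):
        # Request and reply, plus (assumed worst case) a callback and an
        # acknowledgement for every other client holding the metadata.
        self.messages += 2 + 2 * (self.clients - 1)
        return len(self.data)   # the exclusive holder learns the EOF

    def write(self, offset, payload):
        self.messages += 2      # request and reply
        self.data[offset:offset + len(payload)] = payload

    def release(self, who):
        self.messages += 2      # request and reply


def client_append(server, who, payload):
    eof = server.acquire(who)   # step 1: exclusive access to metadata
    server.write(eof, payload)  # step 2: write at the EOF just learned
    server.release(who)         # step 3: hand the lock back


n = 1000
server = LockingServer(n)
for who in range(n):
    client_append(server, who, b"x")
total_messages = server.messages
```

Under these assumptions each append costs 2004 messages with 1000 clients, against a single request and reply when the server places the write itself. The real numbers depend entirely on the file system; the sketch only shows why the cost grows with the number of clients.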

The 'client write at EOF' is bad for precisely the same reason that
you don't want to use shared memory for locks in a CC-NUMA machine;
you want to send the operation to the data, not move the data to the
operation. Lots of great papers on this over the years ...

ron