9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08  2:01 Rob
  0 siblings, 0 replies; 23+ messages in thread
From: Rob @ 1998-02-08  2:01 UTC (permalink / raw)


Am I missing something?   If everyone does create(file, OWRITE, 000)
only one process can succeed.   Perm==0 means no one can read or
write the file; if you want one writer, many readers, 004 will do.

In other words, it's a create/open race, so don't run that race.

-rob




^ permalink raw reply	[flat|nested] 23+ messages in thread

* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26 23:30 Russ
  0 siblings, 0 replies; 23+ messages in thread
From: Russ @ 1998-02-26 23:30 UTC (permalink / raw)


> unnecessary and A-Bad-ThingTM.

Wouldn't that be "A-Bad-Thing™" ?






* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26 23:01 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-26 23:01 UTC (permalink / raw)


>gdb wrote:
>> 
>> I'm sure many of you wish this topic would go away, but...
>
>I actually think it's quite interesting :)  
>
>Admittedly I do not know the entire scope of the problem, but 
>it seems to me that the effect that is desired is to use the fs 
>as a centralized database, with update collisions being resolved
>at the os-layer.

No, I want them resolved at the fs; as they are.  I only want
the os-layer to make the fs resolution visible.

>                  This seems to me to be a challenging problem
>for the fs/os, particularly since the fs/os as seen by applications
>may have mounts and binds going across the network to a variety
>of locations.

Even open(2) does NOT have the same behavior across arbitrary
bind(2) structures.  That is NOT my expectation.  I DO expect
open(2) and create(2) to have the same behavior on directories
that have the same bind(2) structure.

I did notice an interesting response on this thread in comp.os.plan9
that didn't make it to the list about a property of union directories.
It went something like "the behavior of a properly implemented
union directory should not differ from the behavior of a flat
directory with the same contents".  That would be a hard trick to
pull off since the constituent directories would have to be
incorporated in the union in the same order, always, no matter
who, no matter when, no matter from where.  For that to happen
in Plan9, the union directory would need to be constructed using
an RPC between the fileservers [fileserver in the generic sense, not
the physical machine called a fileserver].  The mount would be allowed
only after it was constructed.  It also means the directory should not
be available if any of the constituent fileservers is unavailable.
After implementing this trick, the bind(2) system call would be
unnecessary and A-Bad-ThingTM.

Anybody want to take that on as a research project?

>Might it be appropriate to use a multithreaded server process
>to handle the transactions, and have the clients attach to

[snip]

This idea has been suggested many times during this discussion.
In Plan9 this is both unnecessary and overly complicated.  The 9P
protocol is sufficient to express the ideas completely and the
fileservers as implemented support 9P properly.  The only thing
that corrupts the design is the implementation of one system call.

Just Fix It.

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26 22:16 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-26 22:16 UTC (permalink / raw)


>I'll get back to you on this one.
I'm back...!

The "feature" is Plan9's shared environment space.  Like many of
Plan9's features, it is a powerful force that can be used for
either good or evil.

This problem first showed up when I compiled a new cpu/terminal
kernel and the message "rc: can't open #e/fn#mkextract" would occur.
If you look at the rc scripts in /sys/src/9/port of mkdevlist,
mklinklist, mkmisclist and mkstreamlist you will see a function
called mkextract being defined and used.  During a build multiple
processes are trying to create their version of mkextract and writing
it to #e.  In this case the effect is benign since the scripts don't
invoke other scripts that read the function definition from #e and
attempt to use the possibly incorrect result.

I don't know why rc shares the environment since it doesn't use
the result.  For example (I'm adding a newline to the cat output
for readability):

term% a=1
term% echo $a
1
term% cat /env/a
1
term% rc
term% echo $a
1
term% cat /env/a
1
term% a=2
term% cat /env/a
2
term% exit
term% cat /env/a
2
term% echo $a
1
term%

Since rc doesn't re-initialize from the environment perhaps it
should copy it before it starts.  

Before I go adding Fork() to /sys/src/cmd/rc/plan9.c and changing
fork() to rfork(RFFDG|RFENVG|RFPROC), I think fork(2) should be
changed....  Here we go again.

I don't know of any programs that expect another concurrent process
to change the environment.  In Plan9 it can be used as an IPC, but
is it?

David Butler
gdb@dbSystems.com
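
For comparison, a minimal sketch of the copy-on-fork behavior that
rfork(RFFDG|RFENVG|RFPROC) asks for, written against POSIX rather than
Plan 9.  POSIX has no #e device, so this is only an analogy: POSIX
fork() gives the child its own copy of the environment, and a change
made by the child never reaches the parent.  The helper name
parent_keeps_own_env and the variable name passed to it are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: set `name` to "1", let a child process set it
 * to "2", then report what the parent sees afterwards.
 * Returns 1 if the parent still sees "1" (it kept a private copy),
 * 0 if it sees anything else, and -1 on error.
 */
int parent_keeps_own_env(const char *name)
{
    if (setenv(name, "1", 1) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: this write lands in the child's private copy. */
        setenv(name, "2", 1);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Parent: read the variable back after the child has exited. */
    const char *v = getenv(name);
    return v != NULL && strcmp(v, "1") == 0;
}
```

With a shared environment, as in the rc transcript above, the
equivalent of the child's write would be visible in /env afterwards.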





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26 19:33 Eric
  0 siblings, 0 replies; 23+ messages in thread
From: Eric @ 1998-02-26 19:33 UTC (permalink / raw)


gdb wrote:
> 
> I'm sure many of you wish this topic would go away, but...

I actually think it's quite interesting :)  

Admittedly I do not know the entire scope of the problem, but 
it seems to me that the effect that is desired is to use the fs 
as a centralized database, with update collisions being resolved
at the os-layer.  This seems to me to be a challenging problem
for the fs/os, particularly since the fs/os as seen by applications
may have mounts and binds going across the network to a variety
of locations.

Might it be appropriate to use a multithreaded server process
to handle the transactions, and have the clients attach to
the server process rather than directly to an fs?  This way 
the collisions and race problems can be resolved at the applications 
layer where the environment is restricted to the context of 
the application.  This also has the effect of hiding the 
implementation of data storage from the applications,
which could be advantageous, and perhaps allowing the server
to cache information in ways the fs couldn't, if appropriate.

Such a centralized server process approach could put a considerable
load on the 'database' server (1 process per client, with 30Hz
transactions), but this can be addressed with hardware 
multiprocessing.  It also increases the number of systems required 
to support the database to 2, which could be a problem.
  
> David Butler
> gdb@dbSystems.com

Yours,

Eric Dorman
edorman@tanya.ucsd.edu






* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26  0:24 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-26  0:24 UTC (permalink / raw)


>>I took out the Updenv() in Execute to follow /sys/src/cmd/rc/unix.c
>>and because it makes sense.
>
>But, then you have to add another.  The one in /sys/src/cmd/rc/simple.c
>makes more sense.

Well, making the child not re-write the environment only slows
down the problem.  This is not the root cause.

I'll get back to you on this one.





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-26  0:04 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-26  0:04 UTC (permalink / raw)


>I took out the Updenv() in Execute to follow /sys/src/cmd/rc/unix.c
>and because it makes sense.

But, then you have to add another.  The one in /sys/src/cmd/rc/simple.c
makes more sense.





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-25 23:42 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-25 23:42 UTC (permalink / raw)


I'm sure many of you wish this topic would go away, but...

As you can guess I have made the change I have been advocating in
this thread and have found an interesting race that affects everybody.

In /sys/src/cmd/rc/plan9.c in the function Execute the statement

Updenv();

is in a race with the same call in /sys/src/cmd/rc/simple.c in
the function Xsimple().

During the fork and subsequent exec, both the parent and child
are writing the environment with sometimes spectacular results.

With the new create(2), I saw periodic rc: can't open #e/blah since
Create() in /sys/src/cmd/rc/plan9.c now has open(OWRITE|OTRUNC)
and only if that fails create(OWRITE).  If I add the additional
open(OWRITE|OTRUNC) call to emulate the old create(2) call, the
race stays hidden.

I took out the Updenv() in Execute to follow /sys/src/cmd/rc/unix.c
and because it makes sense.

Also, if anybody is interested, the OTRUNC handling in
/sys/src/9/port/devenv.c needs to include things other than OWRITE.

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-10 14:34 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-10 14:34 UTC (permalink / raw)


>From: forsyth@caldo.demon.co.uk

[snip]
>it's not even worth a reference to the Race Relations Board.

OK, how about:

Form two sets of 9P protocol servers.

Take the operations of the system calls and examine the mapping
between the two 9P sets through the system call operators.

The create(2) system call is not a function.





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-10  8:59 forsyth
  0 siblings, 0 replies; 23+ messages in thread
From: forsyth @ 1998-02-10  8:59 UTC (permalink / raw)


i'm not sure i follow this.

in order to simplify one application that
purportedly has unusual needs, it is proposed as a sensible action,
and even dignified by reference to `research', to rush to change
a fundamental interface for dozens of other programs
(more if you include the ones gdb hasn't seen).
even then i wasn't sure i could understand the timing restrictions accurately
that prevented use of CHEXCL -- it seemed the race was
built into that application more than the kernel! but i hadn't
time to pursue it this week.

it's even more surprising, because (if one were to accept the
underlying assumptions, which as yet i don't), gdb himself proposed
an alternative solution by a flag to open (or something)
that didn't require as much messing about, isolated the change
and allowed me to ignore the whole matter for the moment!

i recall a caustic comment made by a friend of mine when
someone from the CSRG included things such as `increasing dev_t to 64 bits'
on a 1987 slide describing `research' they were undertaking with Unix.
i'd concluded from that talk that clever people can still be tempted just
to prat about with trivial things, and even then without THINKING.

if i were truly exercised by its existence, i might consider removing the
race in the implementation of the system call (in port/chan.c).
in practice, i'd probably think a little bit about the implications
of that, and then decide against bothering to do even that.
in a distributed system with processes
concurrently creating and destroying names, especially in the presence
of union mounts (where it isn't possible in general to rendezvous at a shared
file server because there isn't guaranteed to be a unique one),
some form of name-race is a fact of life,
and one might as well look to other mechanisms for a solution
to general synchronisation problems.
it's not even worth a reference to the Race Relations Board.





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-09 21:21 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-09 21:21 UTC (permalink / raw)


>From: "G. David Butler" <gdb@dbSystems.com>

>The current create(2) call, IMHO, is broken.  When the call was
>implemented, it introduced a race condition that was not necessary
>and can't be easily avoided.  Since Plan9 is a "research" system,
>I don't consider the API set in stone and was trying to open a
>discussion on the subject.  I offered two solutions, an option
>to open (POSIX O_EXCL) or a fix to create(2).  There is a third

I have found some interesting information while considering
the change of create(2).

Look at the code in /sys/src/cmd/exportfs/exportsrv.c and
/sys/src/cmd/iostats/statsrv.c for the comment with the word
"race" in it.  It seems that any user space "file server"
laying over a real file server can't transfer the Tcreate
correctly.  Big surprise!

Another comment in /sys/src/cmd/samterm/plan9.c about an
"existing guy" is interesting in that it assumes that create(2)
fails if the file exists!

It doesn't take much to make the change in the kernel and a simple
change in /sys/src/cmd/rc/plan9.c will let the system boot.  Beyond
that the changes are mechanical and I have listed most of them below.
Overall, the change of create(2) is not very hard.

The actual change in the cpu/terminal kernel is:

delete from the comment in namec() in /sys/src/9/port/chan.c near
line 582
	/*
	* Walk the element
to near line 600
	poperror();

delete from the comment near line 610
	/*
	* protect against the open/create race.
to near line 620
	}

delete the line near line 625
	poperror();

also remove the local variables cc and createrr and the
label Open: to keep the compile warning free.

In the following list, if there is no comment then the change is
create becomes open/create/open as previously discussed.

/sys/man/2
open	documentation change

/sys/man/5
open	documentation change

/sys/src/9/boot
aux.c

/sys/src/alef/8
output.c

/sys/src/alef/k
output.c

/sys/src/alef/lib/libauth
newns.l

/sys/src/alef/lib/libbio
binit.l

/sys/src/alef/lib/libg
binit.l	create should be open OTRUNC

/sys/src/alef/lib/p9
putenv.l

/sys/src/alef/test
test16.l

/sys/src/alef/test/Y
files.l

/sys/src/alef/v
output.c

/sys/src/ape/9src
ptyfs.c	create there is correct to keep two from running

/sys/src/ape/lib/ap/plan9
execve.c	fix _CREATE
mkdir.c	access is race, but _CREATE is correct
open.c	access is race, but _CREATE is correct
rename.c	fix _CREATE
tmpfile.c	fix _CREATE

/sys/src/cmd
ar.c
char.c
cp.c
dd.c
ed.c
fortune.c
init.c
mkdir.c	access is race but create is correct
mv.c
news.c
ramfs.c	create there is correct to keep two from running
sed.c	create there is correct
sh.C
sort.c	to avoid races, some create become open OTRUNC
split.c
srv.c		access is race but create is correct
srvfs.c	create there is correct to keep two from running
strip.c
swap.c	fix create on env but swapfd create is correct
tar.c
tee.c
touch.c
tweak.c
½char.c

/sys/src/cmd/2l
obj.c

/sys/src/cmd/6l
obj.c

/sys/src/cmd/8l
obj.c

/sys/src/cmd/8½
main.c	fix create on env but srv create is correct

/sys/src/cmd/9660srv
main.c	create there is correct to keep two from running

/sys/src/cmd/acid
builtin.c	already does the create/open trick

/sys/src/cmd/acme
disk.l	stat is race but create is correct
exec.l
rows.l
util.l

/sys/src/cmd/art
fileio.c

/sys/src/cmd/auth
adduser.c		create is correct
changeuser.c	create is correct
cron.c		create is correct

/sys/src/cmd/aux
consolefs.l	create there is correct to keep two from running
depend.l		create there is correct to keep two from running

/sys/src/cmd/aux/icmp
icmp.c		create there is correct to keep two from running

/sys/src/cmd/cc
compat.c

/sys/src/cmd/chdb
sub.c

/sys/src/cmd/chdb/cdb
cdb.c

/sys/src/cmd/con
xmr.c

/sys/src/cmd/cpp
nlist.c

/sys/src/cmd/db
output.c	create ok
setup.c	create ok
trcrun.c

/sys/src/cmd/diff
main.c	create ok

/sys/src/cmd/disk
format.c
mkext.c	create ok
mkfs.c	create ok

/sys/src/cmd/disk/kfs
main.c	create there is correct to keep two from running

/sys/src/cmd/disk/pip
disk.c

/sys/src/cmd/dossrv
xfssrv.c	create there is correct to keep two from running

/sys/src/cmd/exportfs
exportsrv.c	create now will work like it should! see race comment

/sys/src/cmd/fax
file.c		create is correct

/sys/src/cmd/fone
plan9.c	create there is correct to keep two from running

/sys/src/cmd/ftpfs
file.c		stat is race but create is correct

/sys/src/cmd/hp
hp.c

/sys/src/cmd/hp/hp-vt
main.c

/sys/src/cmd/iostats
iostats.c	create there may be correct if two run in debug mode
statsrv.c	create now will work like it should! see race comment

/sys/src/cmd/ip
tftpd.c

/sys/src/cmd/kl
obj.c

/sys/src/cmd/lex
sub1.c

/sys/src/cmd/lp
LOCK.c	create now will work like it should!
lpdaemon.c	create ok
lpsend.c	create ok

/sys/src/cmd/mk
main.c	create ok
plan9.c
t_ar.c	create ok
t_file.c	create ok

/sys/src/cmd/mothra
gopher2html.c	create ok
http.c	create ok
mothra.c

/sys/src/cmd/ndb
cs.c	create there is correct to keep two from running
dns.c	create there is correct to keep two from running
mkhash.c
mkhosts.c

/sys/src/cmd/plot
plot.c	create there is correct to keep two from running

/sys/src/cmd/postscript/postio
postio.l	create ok

/sys/src/cmd/postscript/tcpostio
tcpostio.l	create ok

/sys/src/cmd/rc
plan9.c

/sys/src/cmd/rschar
rschar.c

/sys/src/cmd/sam
io.c
mesg.c
plan9.c	create ok
sam.c
shell.c

/sys/src/cmd/samterm
plan9.c	one create ok (see comment) other open OTRUNC

/sys/src/cmd/scuzz
scuzz.c

/sys/src/cmd/service
ftp.c		two creates need changing, not all

/sys/src/cmd/service/nfs
chat.c	create there is correct to keep two from running

/sys/src/cmd/spin
pangen1.h	create ok

/sys/src/cmd/telco
telco.c	create there is correct to keep two from running

/sys/src/cmd/upas/common
libsys.c	fix create in syscreate

/sys/src/cmd/upas/q
qer.c		look at more
runq.c	fix create on /dev/user

/sys/src/cmd/vl
obj.c

/sys/src/cmd/xl
obj.c

/sys/src/fb
cvt2pic.c	create ok

/sys/src/games/gps
gps55.c	create ok

/sys/src/games/plumb
pced.c

/sys/src/games/smiley
mkfont.c

/sys/src/libauth
newns.c

/sys/src/libbio
binit.c

/sys/src/libc/9sys
putenv.c

/sys/src/libc/port
profile.c

/sys/src/libfb
picopen_w.c

/sys/src/libg
binit.c

/sys/src/libstdio
freopen.c

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-09 16:42 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-09 16:42 UTC (permalink / raw)


I have seen several suggestions that I use a process somewhere
to control the create race.  I am trying to do just that.  The
process already exists!  It is in the 9P fileserver and the IPC
used to reach it is the Tcreate message.  What I'm trying to do
is determine the best way to get to it from a system call.

The current create(2) call, IMHO, is broken.  When the call was
implemented, it introduced a race condition that was not necessary
and can't be easily avoided.  Since Plan9 is a "research" system,
I don't consider the API set in stone and was trying to open a
discussion on the subject.  I offered two solutions, an option
to open (POSIX O_EXCL) or a fix to create(2).  There is a third
option that someone hinted at, implement a new system call which
provides access to the primitive create(5).

Since many of you see the problem with the race well enough to suggest
the other process solution, perhaps we can focus on trying to
eliminate the problem instead of coding our way around it.

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-09  8:05 Dan
  0 siblings, 0 replies; 23+ messages in thread
From: Dan @ 1998-02-09  8:05 UTC (permalink / raw)


> If you're trying to create lock files dozens of times per second
> in a union directory you deserve whatever failures the system
> sees fit to provide.

Hahahaha....

dang, this comment nearly made me wet my pants...

	- Dan C.






* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 22:48 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08 22:48 UTC (permalink / raw)


>From: "Rob Pike" <rob@plan9.bell-labs.com>
>If you're trying to create lock files dozens of times per second
>in a union directory you deserve whatever failures the system
>sees fit to provide.
>
>-rob

Since I have documented over 70 create(2)s per second in a
single directory on my file server with a million entries, and
a small cpu server is able to reliably execute at least an order
of magnitude more items in the for loop it uses to cover the union
per second, and computers and software don't have bad days nor get
fatigued; it is entirely possible to create lock files, or any other
type of file, at the small rate of dozens per second in any directory,
union or not, with NO failure.  I hope you don't mind if I disregard
your "you're crazy" response.

On the other hand, I was discussing the implications of changing
the semantics of the create(2) system call as it relates to union
directories.  I didn't chance to think that the software should
change behavior to convince the programmer that he was crazy.
Perhaps I should, Microsoft does...

David





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 22:18 Photon
  0 siblings, 0 replies; 23+ messages in thread
From: Photon @ 1998-02-08 22:18 UTC (permalink / raw)


Would it not be possible to create a system call "atomic
test-and-create" called stat_create() or something?  And let the system
use atomic test-and-set ipc mechanisms to make it atomic across other
calls to stat_create(), stat(), and create()?  Then the only question
would still be making this work with calls on different cpu servers,
right?  Maybe another 9P call to the fileserver would have to be created
to let the atomic-ness exist at the file server and not in the cpu
server's kernel?

Brandon

Rob Pike wrote:
> 
> If you're trying to create lock files dozens of times per second
> in a union directory you deserve whatever failures the system
> sees fit to provide.
> 
> -rob
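
A minimal sketch of what an atomic test-and-create looks like where
one already exists.  It assumes POSIX rather than Plan 9 and uses the
O_CREAT|O_EXCL option to open that this thread mentions as one
possible fix.  Several processes race to create the same file on a
local file system, and the count of successful creators is returned.
The helper name count_create_winners and the path passed to it are
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork `nproc` children that all try to create
 * `path` exclusively.  Returns the number of children that
 * succeeded, or -1 if waiting for a child failed.
 */
int count_create_winners(const char *path, int nproc)
{
    int winners = 0;
    int started = 0;

    unlink(path);               /* start clean; failure is fine */

    for (int i = 0; i < nproc; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;              /* could not start more children */
        if (pid == 0) {
            /* Child: the existence test and the create are one step. */
            int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
            if (fd >= 0) {
                close(fd);
                _exit(0);       /* this child won the race */
            }
            _exit(1);           /* file already existed, or other error */
        }
        started++;
    }

    /* Parent: collect exit codes and count the winners. */
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            winners++;
    }

    unlink(path);
    return winners;
}
```

Calling count_create_winners("/tmp/demo.lock", 8) should report a
single winner, because the losers are refused by the file system
rather than by a separate check made beforehand.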






* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 21:44 Rob
  0 siblings, 0 replies; 23+ messages in thread
From: Rob @ 1998-02-08 21:44 UTC (permalink / raw)


If you're trying to create lock files dozens of times per second
in a union directory you deserve whatever failures the system
sees fit to provide.

-rob





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 21:10 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08 21:10 UTC (permalink / raw)


>From: "G. David Butler" <gdb@dbSystems.com>
>
>Another alternative would be to change create(2) to simply call
>create(5) and return the results.  There will need to be some
>cleanup of programs that assume that a create on an existing file
>is OK, but if that is the case it is easy to change:
>
>if ((fd = create(file, mode, perm)) < 0) {
>	error...
>}
>
>to:
>
>if ((fd = create(file, mode, perm)) < 0 &&
>    (fd = open(file, mode|OTRUNC)) < 0) {
>	error...
>}
>
>In those programs.
>
>Any comments?

I'm surprised I haven't yet seen "What about union directories?"

If create(2) is changed then it could succeed even though a
file with that name exists in the union.  Then the above:

if ((fd = create(file, mode, perm)) < 0) {
	error...
}

Would need to become:

if ((fd = open(file, mode|OTRUNC)) < 0 &&
    (fd = create(file, mode, perm)) < 0 &&
    (fd = open(file, mode|OTRUNC)) < 0) {
	error...
}

This is precisely the current create(2) call and the nasty
race is clear.

At this point an application could remove the OTRUNC from the
last open and know what is happening and deal with it as
appropriate.

So another advantage of changing create(2) to simply use create(5)
is the application now has more control over the behavior of
union directories.

For example one could mount a creatable directory before a
directory of readonly files and could have file updates "replace"
the readonly ones.  Currently one would get an error because you
can't write the file that exists.  To do that the application
would have to "know" the creatable directory exists and issue
create(2) to that directory.

In this case the application could do:

if ((fd = create(file, mode, perm)) < 0 &&
    (fd = open(file, mode|OTRUNC)) < 0) {
	error...
}

with no race.

David Butler
gdb@dbSystems.com
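
A minimal sketch of the create-then-open pattern, assuming POSIX
rather than Plan 9.  Here open with O_CREAT|O_EXCL stands in for a
create(2) that fails when the file exists, and a plain open with
O_TRUNC stands in for open(file, mode|OTRUNC).  The helper name
create_or_truncate and its `created` out-parameter are hypothetical.
The out-parameter is the point: the caller learns which of the two
paths was taken.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: try an exclusive create first; only if the
 * file already exists, fall back to opening and truncating it.
 * *created is set to 1 when this call created the file and to 0 when
 * it opened an existing one.  Returns a file descriptor, or -1 on
 * error (in which case *created may be left unset).
 */
int create_or_truncate(const char *path, mode_t perm, int *created)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, perm);
    if (fd >= 0) {
        *created = 1;
        return fd;
    }
    if (errno != EEXIST)
        return -1;              /* failed for some other reason */

    /* The file exists: open it and discard its old contents. */
    *created = 0;
    return open(path, O_WRONLY | O_TRUNC);
}
```

If another process removes the file between the two calls, the
fallback open fails and the caller sees -1.  An application that cares
can retry, which is the kind of control over the outcome argued for
above.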





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 16:52 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08 16:52 UTC (permalink / raw)


I hate when I'm in too much of a hurry.  I need to be more
complete here.  Also, this happens 30 times a *second* in
my application.

>From: "G. David Butler" <gdb@dbSystems.com>
>>From: "Dave Presotto" <presotto@plan9.bell-labs.com>
>>This is why I got ken to introduce exclusive access files.  I
>>use one as a lock for a directory/file/whatever.
>
>I tried that with:
>
>open(lock file in directory)
>if open fail {
>	create
>	write
>	close
>}
>close(lock file)

The real code is more like:

while(open(lock file) fails)
	/*sleep(some amount if you want)*/;

because the open to an exclusive file does not block.
Also I forgot to add the many clone(5)s.

>if open fail {
>	create
>	write
>	close
>}
>close(lock file)
>
>But the 9p overhead and the bottleneck created convinced me that
>it was not a good idea.  The above becomes:
>
>open(2)
	clone(5)
>	walk(5)
>	open(5)
>open(2)
	clone(5)
>	walk(5) [fails]
	clunk(5)
>create(2)
	clone(5)
>	walk(5) [create(2) trying to determine whether to send the create(5)]
	clunk(5)
>	create(5)
>write(2)...
>close(2)
>	clunk(5)
>close(2)
>	clunk(5)
>
>If we change create(2) to do create(5) then:
>
>if create() success {
>	write
>	close
>}
>
>becomes:
>
>create(2)
>	create(5)
>write(2)...
>close(2)
>	clunk(5)
>
>Much better.

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 16:16 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08 16:16 UTC (permalink / raw)


>From: "Dave Presotto" <presotto@plan9.bell-labs.com>
>This is why I got ken to introduce exclusive access files.  I
>use one as a lock for a directory/file/whatever.

I tried that with:

open(lock file in directory)
if open fail {
	create
	write
	close
}
close(lock file)

But the 9p overhead and the bottleneck created convinced me that
it was not a good idea.  The above becomes:

open(2)
	walk(5)
	open(5)
open(2)
	walk(5) [fails]
create(2)
	walk(5) [create(2) trying to determine whether to send the create(5)]
	create(5)
write(2)...
close(2)
	clunk(5)
close(2)
	clunk(5)

If we change create(2) to do create(5) then:

if create() success {
	write
	close
}

becomes:

create(2)
	create(5)
write(2)...
close(2)
	clunk(5)

Much better.

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 16:10 Dave
  0 siblings, 0 replies; 23+ messages in thread
From: Dave @ 1998-02-08 16:10 UTC (permalink / raw)


This is why I got ken to introduce exclusive access files.  I
use one as a lock for a directory/file/whatever.

------ forwarded message follows ------

Date: Sat, 7 Feb 1998 18:27:03 -0600
From: "G. David Butler" <dbSystems.com!gdb>
To: cse.psu.edu!9fans
Subject: [9fans] create(2)/open(2) race for file creation

What is the best way for many processes to race for a file create
with only one winner?

In /sys/src/9/port/chan.c we have in namec():

/*
 * protect against the open/create race.  This is not a complete
 * fix.  It just reduces the window.
 */

In man open(5) the algorithm for create(2) is:
if walk(5) is good {
	return open(5, OTRUNC)
} else {
	if ret = create(5) is good {
		return ret
	} else {
		return open(5, OTRUNC)
	}
}

The first idea would be to stat(2) the file and if that fails
then create(2) (That is what ape does), but if the second process
stats before the first creates... NOT!

Using the fact that one can create a file OWRITE with permissions
that do not allow writing, one can use a stat(2), create(2, mode 0),
chmod(2) sequence as long as the amount of time between the create and
the chmod is "long enough" for the other process... NOT!

One could create the file with CHEXCL then clear the flag later
as long as you wait "long enough" for the other process to create(2)
after the stat(2)... NOT!

It would seem that what is needed is the old *NIX O_EXCL flag.

Another alternative would be to change create(2) to simply call
create(5) and return the results.  There will need to be some
cleanup of programs that assume that a create on an existing file
is OK, but if that is the case it is easy to change:

if ((fd = create(file, mode, perm)) < 0) {
	error...
}

to:

if ((fd = create(file, mode, perm)) < 0 &&
    (fd = open(file, mode|OTRUNC)) < 0) {
	error...
}

In those programs.

Any comments?

David Butler
gdb@dbSystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08 15:48 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08 15:48 UTC (permalink / raw)


>From: forsyth@caldo.demon.co.uk
>>>Any comments?
>
>it's hopeless trying to offer advice without knowing
>what you're actually trying to achieve at the application level
>(except that i can already say i really don't approve of the proposed
>change to create).
>
>what's the aim?

I have data coming from many sources, each item with a unique key
associated with it.  I want to be able to receive this data
using the key as a file name without the data being corrupted.
This data may need to be updated at times.  I have several
processes receiving this data and I don't want the data if I
already have it because my copy may have already been updated.

If I use the algorithm:

if stat fails {
	create
	write
	close
}

I have no guarantee about the amount of time between the stat
and the create for each process.  What I need is an atomic test
and set on each possible name.  create(5) gives me that, but there
is no way to get to it from the available system calls.  I can't
use rendezvous because the processes are on different cpu servers.
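
The "atomic test and set on each possible name" can be sketched in
POSIX C, where open(2) with O_CREAT|O_EXCL provides it.  This is only
an illustration under that assumption; it is not available through
the Plan 9 system calls discussed here.  The helper name
store_if_absent and its return convention are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * store_if_absent is a hypothetical helper (POSIX, not Plan 9).
 * Returns 1 if this caller created the file and wrote the record,
 * 0 if a file with that name already existed (it is left untouched),
 * -1 on any other error.
 */
int
store_if_absent(const char *path, const void *data, size_t len)
{
	const char *p = data;
	size_t done = 0;
	ssize_t n;
	int fd;

	assert(path != NULL && (data != NULL || len == 0));

	/* Test and create in one step: there is no separate stat. */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
	if (fd < 0)
		return errno == EEXIST ? 0 : -1;

	while (done < len) {
		n = write(fd, p + done, len - done);
		if (n <= 0) {
			if (n < 0 && errno == EINTR)
				continue;
			/* Do not leave a truncated record behind. */
			close(fd);
			unlink(path);
			return -1;
		}
		done += (size_t)n;
	}
	return close(fd) == 0 ? 1 : -1;
}
```

Each receiver calls store_if_absent(key, data, len) and simply drops
the data when the result is 0.  The file is left mode 0600, so the
winner or a later updater can still open it for writing.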

So if you don't want to change create(2), do we implement O_EXCL?

David Butler
gdb@dbsystems.com





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08  5:09 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08  5:09 UTC (permalink / raw)


>From: "Rob Pike" <rob@plan9.bell-labs.com>
>Am I missing something?   If everyone does create(file, OWRITE, 000)
>only one process can succeed.   Perm==0 means no one can read or
>write the file; if you want one writer, many readers, 004 will do.

If I want to create the file, write data and close so that it can
be read/written again then I can't leave it 000 or 400, I must
leave it 600.  So, if many processes are doing:

if stat() fails {
	create()
	write()
	chmod()
	close()
}

Then it is possible for two or more processes to fail the stat
but for one to create, write, chmod and close before the others
do their create.  It is a race that can't be avoided, that I can see.

>In other words, it's a create/open race, so don't run that race.

I would like to avoid it.  Your point is valid if the file is
created once and never written to again.  That is a big restriction.





* [9fans] create(2)/open(2) race for file creation
@ 1998-02-08  0:27 G.David
  0 siblings, 0 replies; 23+ messages in thread
From: G.David @ 1998-02-08  0:27 UTC (permalink / raw)


What is the best way for many processes to race for a file create
with only one winner?

In /sys/src/9/port/chan.c we have in namec():

/*
 * protect against the open/create race.  This is not a complete
 * fix.  It just reduces the window.
 */

In man open(5) the algorithm for create(2) is:
if walk(5) is good {
	return open(5, OTRUNC)
} else {
	if ret = create(5) is good {
		return ret
	} else {
		return open(5, OTRUNC)
	}
}

The first idea would be to stat(2) the file and if that fails
then create(2) (That is what ape does), but if the second process
stats before the first creates... NOT!

Using the fact that one can create a file OWRITE with permissions
that do not allow writing, one can use a stat(2), create(2, mode 0),
chmod(2) sequence as long as the amount of time between the create and
the chmod is "long enough" for the other process... NOT!

One could create the file with CHEXCL then clear the flag later
as long as you wait "long enough" for the other process to create(2)
after the stat(2)... NOT!

It would seem that what is needed is the old *NIX O_EXCL flag.
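
For reference, the *NIX behavior referred to here looks roughly like
the following in POSIX C.  This is a minimal sketch; the wrapper name
create_exclusive is hypothetical, and nothing here is a Plan 9
interface.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * create_exclusive is a hypothetical helper (POSIX, not Plan 9).
 * It returns a write-only descriptor if this call created the file,
 * or -1 with errno set to EEXIST if the name was already taken.
 */
int
create_exclusive(const char *path, mode_t perm)
{
	assert(path != NULL);
	/* O_CREAT|O_EXCL: "does it exist?" and "create it" are one step. */
	return open(path, O_WRONLY | O_CREAT | O_EXCL, perm);
}
```

However many processes call create_exclusive on the same name, at
most one gets a descriptor back; the rest see EEXIST.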

Another alternative would be to change create(2) to simply call
create(5) and return the results.  There will need to be some
cleanup of programs that assume that a create on an existing file
is OK, but if that is the case it is easy to change:

if ((fd = create(file, mode, perm)) < 0) {
	error...
}

to:

if ((fd = create(file, mode, perm)) < 0 &&
    (fd = open(file, mode|OTRUNC)) < 0) {
	error...
}

In those programs.
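
The same "create, else open and truncate" fallback can be sketched in
POSIX C, with O_CREAT|O_EXCL standing in for a create that fails on an
existing file.  This is an illustration only; create_or_truncate and
its created flag are hypothetical names, not part of any Plan 9 or
POSIX interface.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * create_or_truncate is a hypothetical helper (POSIX, not Plan 9).
 * It first tries an exclusive create.  If the file already exists it
 * falls back to opening and truncating it.  *created reports which
 * path was taken.  Returns a write-only descriptor, or -1 on error.
 */
int
create_or_truncate(const char *path, mode_t perm, int *created)
{
	int fd;

	assert(path != NULL && created != NULL);

	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, perm);
	if (fd >= 0) {
		*created = 1;
		return fd;
	}
	if (errno != EEXIST)
		return -1;

	/* Someone else owns the name; reuse the existing file. */
	*created = 0;
	return open(path, O_WRONLY | O_TRUNC);
}
```

Programs that do not care who created the file ignore the created
flag.  If the file is removed between the two open calls, the fallback
fails with ENOENT and the caller has to decide whether to retry.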

Any comments?

David Butler
gdb@dbSystems.com





Thread overview: 23+ messages
1998-02-08  2:01 [9fans] create(2)/open(2) race for file creation Rob
  -- strict thread matches above, loose matches on Subject: below --
1998-02-26 23:30 Russ
1998-02-26 23:01 G.David
1998-02-26 22:16 G.David
1998-02-26 19:33 Eric
1998-02-26  0:24 G.David
1998-02-26  0:04 G.David
1998-02-25 23:42 G.David
1998-02-10 14:34 G.David
1998-02-10  8:59 forsyth
1998-02-09 21:21 G.David
1998-02-09 16:42 G.David
1998-02-09  8:05 Dan
1998-02-08 22:48 G.David
1998-02-08 22:18 Photon
1998-02-08 21:44 Rob
1998-02-08 21:10 G.David
1998-02-08 16:52 G.David
1998-02-08 16:16 G.David
1998-02-08 16:10 Dave
1998-02-08 15:48 G.David
1998-02-08  5:09 G.David
1998-02-08  0:27 G.David
