9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] Re: Any significant gotchas?
@ 2000-07-12 14:43 jmk
  2000-07-12 14:55 ` Greg Hudson
  2000-07-12 16:36 ` John S. Dyson
  0 siblings, 2 replies; 10+ messages in thread
From: jmk @ 2000-07-12 14:43 UTC (permalink / raw)
  To: 9fans

Stephen C. Harris <sharris@sch1.NCTR.FDA.GOV> wrote:
	>I'm not saying this is the right system for plan9, I'm just noting that it has
	>apparently helped traditional UNIX in web-server performance a great deal.
	>
	>I'm sure Plan9 could do something cleaner.

Maybe Plan 9 can do this cleaner, maybe not, the trick is it is a very mutable
system that isn't UNIX and offers you the opportunity to look at things in a
different way.



* Re: [9fans] Re: Any significant gotchas?
  2000-07-12 14:43 [9fans] Re: Any significant gotchas? jmk
@ 2000-07-12 14:55 ` Greg Hudson
  2000-07-14  9:20   ` Wesley Felter
  2000-07-12 16:36 ` John S. Dyson
  1 sibling, 1 reply; 10+ messages in thread
From: Greg Hudson @ 2000-07-12 14:55 UTC (permalink / raw)
  To: jmk; +Cc: 9fans

> Maybe Plan 9 can do this cleaner, maybe not, the trick is it is a
> very mutable system that isn't UNIX and offers you the opportunity
> to look at things in a different way.

Just to relay one such idea, I'm told Solaris has introduced a
/dev/poll in recent versions.  (Though it doesn't happen to be in
Solaris 7, the most recent version I have access to at the moment.)
The idea seems fairly straightforward and should scale quite well.
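
The core of the /dev/poll idea, as I understand it, is that the interest set is handed to the kernel once and each wait returns only the descriptors that are ready. A minimal sketch of that pattern follows, assuming Python's standard `selectors` module as a stand-in (it picks epoll, kqueue, /dev/poll, poll, or select depending on the platform); the names `drain` and `demo_selector` are made up for illustration and this is not the Solaris interface itself.

```python
import selectors
import socket


def drain(sel, timeout=0.1):
    """Read from ready sockets until one full timeout passes with no events."""
    received = {}
    while True:
        events = sel.select(timeout)  # returns only the ready registrations
        if not events:
            return received
        for key, _mask in events:
            data = key.fileobj.recv(4096)
            received[key.data] = received.get(key.data, b"") + data


def demo_selector(n=8, active=(2, 5)):
    """Register n sockets once, make a few of them readable, collect the data."""
    sel = selectors.DefaultSelector()
    pairs = [socket.socketpair() for _ in range(n)]
    try:
        for i, (reader, _writer) in enumerate(pairs):
            reader.setblocking(False)
            # The interest set is registered once, not re-sent on every wait.
            sel.register(reader, selectors.EVENT_READ, data=i)
        for i in active:
            pairs[i][1].sendall(b"hello %d" % i)
        return drain(sel)
    finally:
        sel.close()
        for reader, writer in pairs:
            reader.close()
            writer.close()
```

Calling `demo_selector()` returns data only for the two sockets that were written to; the six idle ones are never visited by the loop.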



* Re: [9fans] Re: Any significant gotchas?
  2000-07-12 14:43 [9fans] Re: Any significant gotchas? jmk
  2000-07-12 14:55 ` Greg Hudson
@ 2000-07-12 16:36 ` John S. Dyson
  1 sibling, 0 replies; 10+ messages in thread
From: John S. Dyson @ 2000-07-12 16:36 UTC (permalink / raw)
  To: 9fans

In article <200007121443.KAA25866@cse.psu.edu>,
	jmk@plan9.bell-labs.com writes:
> Stephen C. Harris <sharris@sch1.NCTR.FDA.GOV> wrote:
>	>I'm not saying this is the right system for plan9, I'm just noting that it has
>	>apparently helped traditional UNIX in web-server performance a great deal.
>	>
>	>I'm sure Plan9 could do something cleaner.
> 
> Maybe Plan 9 can do this cleaner, maybe not, the trick is it is a very mutable
> system that isn't UNIX and offers you the opportunity to look at things in a
> different way.
>
Good point.  That will certainly be part of my education.  If y'all notice
a ridiculous position on my part, I'd appreciate a kindly correction!!!

John



* Re: [9fans] Re: Any significant gotchas?
  2000-07-12 14:55 ` Greg Hudson
@ 2000-07-14  9:20   ` Wesley Felter
  2000-07-14 14:55     ` Douglas A. Gwyn
  0 siblings, 1 reply; 10+ messages in thread
From: Wesley Felter @ 2000-07-14  9:20 UTC (permalink / raw)
  To: 9fans

in article 200007121455.KAA01408@egyptian-gods.mit.edu, Greg Hudson at
ghudson@mit.edu wrote on 7/12/00 10:13 AM:

>> Maybe Plan 9 can do this cleaner, maybe not, the trick is it is a
>> very mutable system that isn't UNIX and offers you the opportunity
>> to look at things in a different way.
> 
> Just to relay one such idea, I'm told Solaris has introduced a
> /dev/poll in recent versions.  (Though it doesn't happen to be in
> Solaris 7, the most recent version I have access to at the moment.)
> The idea seems fairly straightforward and should scale quite well.

The Linux Scalability Project has /dev/poll patches for Linux and a paper
comparing poll(), /dev/poll, and queued realtime signals (which are really
more like event queues than signals IMO).

http://www.citi.umich.edu/projects/linux-scalability/index.html

The ScalaServer project proposed another event queue API, which may be
similar to the one recently adopted in FreeBSD.

http://www.cs.rice.edu/CS/Systems/ScalaServer/index.html

SGI has a very lightweight threads library, although I'm skeptical that it
can beat the event-driven model.

http://oss.sgi.com/projects/state-threads/

Wesley Felter - wesf@cs.utexas.edu



* Re: [9fans] Re: Any significant gotchas?
  2000-07-14  9:20   ` Wesley Felter
@ 2000-07-14 14:55     ` Douglas A. Gwyn
  0 siblings, 0 replies; 10+ messages in thread
From: Douglas A. Gwyn @ 2000-07-14 14:55 UTC (permalink / raw)
  To: 9fans

Wesley Felter wrote:
> The ScalaServer project proposed another event queue API, which may be
> similar to the one recently adopted in FreeBSD.
> [etc.]

The thing is, Plan 9 is exploring how much can be done using
the generalized file access model (9P), not how many
different types of things can be crammed into the system.
If it can indeed be shown that efficient Web service (for
example) is impossible using that model, that is useful and
important information.  I don't think that has yet been
demonstrated.



* Re: [9fans] Re: Any significant gotchas?
       [not found] <10007111441.ZM757223@marvin>
@ 2000-07-12 13:39 ` Stephen C. Harris
  0 siblings, 0 replies; 10+ messages in thread
From: Stephen C. Harris @ 2000-07-12 13:39 UTC (permalink / raw)
  To: 9fans

"Tom Duff" <td@pixar.com> wrote:
> What application needs to do select on 1000s of file descriptors?

I was thinking web serving and IRC.  

Linux for one has benefited greatly as a web server performance-wise
just recently, now that it has a good async-io implementation (3x the
performance of W2K on SpecWeb99!). When not many descriptors have data
available, signals deliver information about which descriptors are ready
(so you don't have to scan the long list of file descriptors).  On the
other hand, when lots of descriptors are ready, you get a "signal-queue full"
signal and can fall back to poll()/select(), which will reap lots of
ready descriptors (guaranteeing the full descriptor list is worth scanning).
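
The fallback logic described above can be sketched as a small simulation. This is plain Python with no real signals involved: `ReadinessQueue`, `notify`, and `collect_ready` are hypothetical names, the bounded deque stands in for the kernel's signal queue, and the `is_ready` callback stands in for what poll() would report.

```python
from collections import deque


class ReadinessQueue:
    """Bounded queue of 'descriptor N is ready' notifications (simulation)."""

    def __init__(self, limit):
        self.limit = limit
        self.pending = deque()
        self.overflowed = False

    def notify(self, fd):
        # Stand-in for the kernel queuing one signal per ready descriptor.
        if len(self.pending) >= self.limit:
            self.overflowed = True  # stand-in for the "signal-queue full" signal
        else:
            self.pending.append(fd)


def collect_ready(queue, all_fds, is_ready):
    """Return (ready descriptors, strategy used)."""
    if queue.overflowed:
        # Many descriptors are ready, so one scan of the full list pays off.
        queue.pending.clear()
        queue.overflowed = False
        return [fd for fd in all_fds if is_ready(fd)], "full-scan"
    # Few descriptors are ready: the queue names them, no scan needed.
    ready = list(queue.pending)
    queue.pending.clear()
    return ready, "queue"
```

With 1000 descriptors and a queue limit of 4, two notifications are served straight from the queue, while a burst of 100 overflows it and triggers a single full scan.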

I don't know how light-weight Plan9 processes are, but I assume each gets its
own stack and consumes some kernel memory. In Linux (for comparison) that's
8k (stack) + 20k kernel (at least) per connection, which adds up quickly
when you start talking about large numbers of connections (I think ~500 MB
for 15000 connections was a lower-bound estimate for Linux).
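
As a rough check of that arithmetic, here is a back-of-the-envelope calculation using only the 8k and 20k figures quoted above. The function name and the one-stack-per-connection assumption are mine; the 20k kernel figure is stated as a minimum, so real totals would be higher.

```python
KIB = 1024
MIB = 1024 * KIB


def per_connection_memory(connections, stack_kib=8, kernel_kib=20):
    """Total bytes if every connection owns one stack plus kernel state."""
    return connections * (stack_kib + kernel_kib) * KIB


total = per_connection_memory(15000)
print(f"{total / MIB:.0f} MiB")  # prints: 410 MiB
```

That is about 430 million bytes from the minimum figures alone, the same order of magnitude as the ~500 MB estimate.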

I'm not saying this is the right system for plan9, I'm just noting that it has
apparently helped traditional UNIX in web-server performance a great deal.

I'm sure Plan9 could do something cleaner.




* Re: [9fans] Re: Any significant gotchas?
       [not found] <sharris@sch1.NCTR.FDA.GOV>
  2000-07-11 21:10 ` Stephen C. Harris
@ 2000-07-12 12:08 ` John S. Dyson
  1 sibling, 0 replies; 10+ messages in thread
From: John S. Dyson @ 2000-07-12 12:08 UTC (permalink / raw)
  To: 9fans

In article <396BD2AC.9BA22564@anu.edu.au>,
	Jason Ozolins <jason.ozolins@anu.edu.au> writes:
> Scott Schwartz wrote:
> 
>> | a select()-based approach is orders of magnitude better at throughput
>> | and maximum number of connections.
>> 
>> And yet people are constantly writing papers about how select() and poll()
>> are too inefficient, and proposing various other schemes.
> 
> Maybe so, but how does that increase the merit of spawning lots of
> threads just to get select()-like behaviour, if the experience on other
> OSs suggests that the thread approach is even less efficient than
> select()?
> 
> If people have some references to articles on approaches other than
> fd-per-thread or traditional select(), could they maybe post them to the
> group?
> 
I haven't run benchmarks for this problem exactly, but depending on
the technique for threading, you can end up with severe performance
problems due to the cache not being able to map the entire range
of memory required by the threads.  Some multi-threading schemes
require a different stack for every thread (perhaps both in user
and in kernel mode.)  The jumping from stack frame to stack frame
fills the cache very nicely :-).

A distributed kernel development that I have put on hold doesn't depend
on kernel stacks for per-process (or thread) kernel context.  This
leaves open the issue of the user space stack, but a carefully designed user
level multi-threading scheme might be able to minimize the user stack
needs also (in certain circumstances.)

There are some places where multi-threaded implementations for I/O
concurrency are better than poll() type implementations and vice
versa.  I implemented AIO on FreeBSD as a threaded scheme, so as
to avoid modifying the entire framework for all of the I/O types.
For RAW disk I/O, I did place hooks into the code, and had a table
driven scheme so that the optimizations to avoid threads could be
added over time.

John



* Re: [9fans] Re: Any significant gotchas?
  2000-07-11 21:20   ` Scott Schwartz
@ 2000-07-12  9:31     ` Jason Ozolins
  0 siblings, 0 replies; 10+ messages in thread
From: Jason Ozolins @ 2000-07-12  9:31 UTC (permalink / raw)
  To: 9fans

Scott Schwartz wrote:

> | a select()-based approach is orders of magnitude better at throughput
> | and maximum number of connections.
> 
> And yet people are constantly writing papers about how select() and poll()
> are too inefficient, and proposing various other schemes.

Maybe so, but how does that increase the merit of spawning lots of
threads just to get select()-like behaviour, if the experience on other
OSs suggests that the thread approach is even less efficient than
select()?

If people have some references to articles on approaches other than
fd-per-thread or traditional select(), could they maybe post them to the
group?

Thanks,
   Jason =:^)
-- 
Jason Ozolins
Technical Support Group                 Local: x5449
Department of Computer Science         Global: +61 2 6249 5449          
The Australian National University      Email: jason.ozolins@anu.edu.au



* Re: [9fans] Re: Any significant gotchas?
  2000-07-11 21:10 ` Stephen C. Harris
@ 2000-07-11 21:20   ` Scott Schwartz
  2000-07-12  9:31     ` Jason Ozolins
  0 siblings, 1 reply; 10+ messages in thread
From: Scott Schwartz @ 2000-07-11 21:20 UTC (permalink / raw)
  To: 9fans

| One thing that concerns me though is the lack of a select() call or 
| some type of async i/o in Plan9, so one could efficiently handle 
| i/o on a large number of files/services without forking too many processes.  

If that turns out to be a problem in real life, then I'd much
prefer to see rfork tuned, rather than just reintroduce old hacks.

| a select()-based approach is orders of magnitude better at throughput
| and maximum number of connections. 
 
And yet people are constantly writing papers about how select() and poll()
are too inefficient, and proposing various other schemes.




* [9fans] Re: Any significant gotchas?
@ 2000-07-11 21:10 ` Stephen C. Harris
  2000-07-11 21:20   ` Scott Schwartz
  0 siblings, 1 reply; 10+ messages in thread
From: Stephen C. Harris @ 2000-07-11 21:10 UTC (permalink / raw)
  To: 9fans

John Dyson said:

>Do you guys think that it would be worthwhile to invest
> serious time on upgrading and improving the kernel for
> large scale usage (on a per-cpu basis.)  

Yes, I think so, I am terribly excited about Plan9 and hope the
best for it.

	One thing that concerns me though is the lack of a select() call or 
some type of async i/o in Plan9, so one could efficiently handle 
i/o on a large number of files/services without forking too many processes.  

	Somebody please correct me if I'm wrong, (I've only poked around briefly in 
the APE select.c and so far I can't really grok how it's emulated), but I think 
to handle i/o in a timely manner on several file descriptors you're supposed to 
rfork() a process for each descriptor to handle the i/o.  This solves
the problem, but I think the cost is too high when the number of descriptors
is even moderately high (more than 1000, say).
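
To make the one-worker-per-descriptor pattern concrete, here is a minimal sketch in which Python threads stand in for rfork()ed processes and a `queue.Queue` stands in for whatever the workers use to report back. Those substitutions, and the names `reader` and `demo_worker_per_fd`, are assumptions for illustration, not how Plan 9 or APE actually does it.

```python
import queue
import socket
import threading


def reader(index, sock, out):
    """Block on a single descriptor and forward whatever arrives."""
    while True:
        data = sock.recv(4096)
        if not data:  # peer closed its end
            out.put((index, None))
            return
        out.put((index, data))


def demo_worker_per_fd(n=4):
    """Start one blocking reader per descriptor and gather their output."""
    out = queue.Queue()
    pairs = [socket.socketpair() for _ in range(n)]
    workers = []
    for i, (r, _w) in enumerate(pairs):
        t = threading.Thread(target=reader, args=(i, r, out))
        t.start()
        workers.append(t)
    for i, (_r, w) in enumerate(pairs):
        w.sendall(b"msg %d" % i)
        w.close()
    messages, closed = {}, 0
    while closed < n:
        i, data = out.get()
        if data is None:
            closed += 1
        else:
            messages[i] = messages.get(i, b"") + data
    for t in workers:
        t.join()
    for r, _w in pairs:
        r.close()
    return messages
```

Each descriptor costs one worker with its own stack, which is exactly the per-descriptor overhead in question once n grows into the thousands.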

	I've seen Java try to handle all I/O this way, and even on
architectures with very lightweight processes/threads (like Solaris), 
a select()-based approach is orders of magnitude better at throughput
and maximum number of connections. (Thus Java is a poor choice for 
writing scalable servers in the eyes of many people.)
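
For comparison, a minimal sketch of the select()-based alternative, using Python's standard `select.select` as a stand-in for the C call: a single thread waits on all descriptors at once. The function name `demo_select` is hypothetical, and the sketch says nothing about relative performance.

```python
import select
import socket


def demo_select(n=4):
    """Serve n descriptors from a single thread using select()."""
    pairs = [socket.socketpair() for _ in range(n)]
    index_of = {r: i for i, (r, _w) in enumerate(pairs)}
    for i, (_r, w) in enumerate(pairs):
        w.sendall(b"msg %d" % i)
        w.close()
    messages = {}
    open_readers = list(index_of)
    while open_readers:
        # The whole interest list is passed in on every call.
        ready, _, _ = select.select(open_readers, [], [])
        for sock in ready:
            data = sock.recv(4096)
            if data:
                i = index_of[sock]
                messages[i] = messages.get(i, b"") + data
            else:  # peer closed its end
                open_readers.remove(sock)
                sock.close()
    return messages
```

No per-descriptor stack is needed here, but the full descriptor list is handed over on every call, which is the cost the papers proposing other schemes tend to complain about.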

	This seems like the biggest scalability issue to me, anyway.
I'd like to hear opinions about it, or someone to tell me I'm 
missing something.

cheers,

Steve Harris




Thread overview: 10+ messages
2000-07-12 14:43 [9fans] Re: Any significant gotchas? jmk
2000-07-12 14:55 ` Greg Hudson
2000-07-14  9:20   ` Wesley Felter
2000-07-14 14:55     ` Douglas A. Gwyn
2000-07-12 16:36 ` John S. Dyson
     [not found] <10007111441.ZM757223@marvin>
2000-07-12 13:39 ` Stephen C. Harris
     [not found] <sharris@sch1.NCTR.FDA.GOV>
2000-07-11 21:10 ` Stephen C. Harris
2000-07-11 21:20   ` Scott Schwartz
2000-07-12  9:31     ` Jason Ozolins
2000-07-12 12:08 ` John S. Dyson
