From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47CB236A.2020402@free.fr>
Date: Sun, 2 Mar 2008 23:00:10 +0100
From: Philippe Anel
User-Agent: Thunderbird 2.0.0.12 (Windows/20080213)
MIME-Version: 1.0
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] GCC/G++: some stress testing
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: 6ce946fa-ead3-11e9-9d60-3106f5b1d025

Paul Lalonde wrote:
> CSP doesn't scale very well to hundreds of simultaneously executing
> threads (my claim, not, as far as I've found yet, anyone else's). It
> is very well suited to a small number of threads that need to
> communicate, and as a model of concurrency for tasks with few points
> of contact. For performance, the channel locks become a bottleneck as
> the number of cores scale up. As far as expressiveness, there are
> still issues with composability and correctness as the number of
> threads interacting increases. Yes, you at least get local stacks,
> but the work seems to get exponentially harder as the number of
> systems in the simulation (um, game engine) increases.

Interesting.
I agree with you: taking care of the memory hierarchy is becoming very
important, especially if you think about the upcoming ccNUMA systems
(Opterons are already there, though).

But the fact that it doesn't scale well is not about CSP itself, but
about the way it has been implemented. If the CSP system itself takes
care of the memory hierarchy and uses no synchronisation (using an IPI,
an inter-processor interrupt, to send a message to another core, for
example), CSP scales very well.

Of course, the IPI mechanism requires a switch to kernel mode, which
costs a lot. But this is necessary only if the destination thread is
running on another core, and I don't think latency is very important
in algorithms requiring a lot of CPUs.

What do you think?

Phil;
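
To make the routing rule above concrete, here is a toy model in Python.
It is only a sketch under stated assumptions, not kernel code and not
how any existing CSP runtime works: Machine, Core, spawn, send and recv
are hypothetical names invented for the illustration, and an IPI is
merely counted, never raised. The one property it tries to show is that
each mailbox is touched only by the core that owns it, so no channel
lock is needed, and that the expensive path is taken only when the
destination thread lives on another core.

```python
from collections import deque


class Core:
    """One simulated core; it alone touches the mailboxes of its threads."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.mailboxes = {}      # thread name -> deque of pending messages
        self.ipis_received = 0   # stands in for kernel entries caused by IPIs

    def deliver_local(self, thread, msg):
        # Same-core send: a plain enqueue, no lock and no kernel transition.
        self.mailboxes[thread].append(msg)

    def handle_ipi(self, thread, msg):
        # Models the interrupt handler running ON the destination core,
        # so the enqueue is still performed by the mailbox owner.
        self.ipis_received += 1
        self.mailboxes[thread].append(msg)


class Machine:
    """Hypothetical machine: a set of cores plus a thread -> core map."""

    def __init__(self, n_cores):
        self.cores = [Core(i) for i in range(n_cores)]
        self.home = {}           # thread name -> id of the core it runs on

    def spawn(self, thread, core_id):
        self.home[thread] = core_id
        self.cores[core_id].mailboxes[thread] = deque()

    def send(self, src, dst, msg):
        dst_core = self.cores[self.home[dst]]
        if self.home[src] == self.home[dst]:
            dst_core.deliver_local(dst, msg)
            return "local"
        dst_core.handle_ipi(dst, msg)   # the costly cross-core path
        return "ipi"

    def recv(self, thread):
        box = self.cores[self.home[thread]].mailboxes[thread]
        return box.popleft() if box else None


# Threads a and b share core 0; thread c runs on core 1.
m = Machine(2)
m.spawn("a", 0)
m.spawn("b", 0)
m.spawn("c", 1)
paths = [m.send("a", "b", "x"), m.send("a", "c", "y")]
```

With this placement, only the send from a to c takes the IPI path. A
scheduler that keeps chatty threads on the same core would therefore pay
the kernel-mode cost rarely, which is the case the argument above relies
on.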