From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Fri, 11 Jun 2010 11:03:19 -0400
To: 9fans@9fans.net
Message-ID: <70d80f50da355772daa7d21f195c7b4b@kw.quanstro.net>
In-Reply-To: <4C124E2C.7010008@bouyapop.org>
References: <4C1242CD.5020202@bouyapop.org> <4C124E2C.7010008@bouyapop.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] 9vx, kproc and *double sleep*
Topicbox-Message-UUID: 317e4054-ead6-11e9-9d60-3106f5b1d025

On Fri Jun 11 10:54:40 EDT 2010, xigh@bouyapop.org wrote:
> I don't think either splhi fixes the problem ... it only hides it for
> the 99.999999999% cases.

on a casual reading, i agree. unfortunately, the current simplified
promela model disagrees, and coraid has run millions of cpu-hrs on
quad processor machines running near 100% load with up to 1500 procs,
and never seen this.

unless you have a good reason why we've never seen such a deadlock,
i'm inclined to believe we're missing something. we need better
reasons for sticking locks in than guesswork. multiple locks can
easily lead to deadlock.

have you tried your solution with a single Mach?

> No ... I don't think so. I think the problem comes from the fact the
> process is no longer exclusively tied to the current Mach when going
> (back) to schedinit() ... hence the change I did.

have you tried? worst case is you'll have more information on
the problem.

- erik