From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Tue, 22 Jun 2010 21:09:01 -0400
To: 9fans@9fans.net
Message-ID: <370f0af00a45e85654888bdbc1deebe9@kw.quanstro.net>
In-Reply-To: 
References: 
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Subject: Re: [9fans] interesting timing tests
Topicbox-Message-UUID: 3666fed0-ead6-11e9-9d60-3106f5b1d025

> Do you have a way to turn off one of the sockets on "c" (2 x E5540)
> and get the numbers with HT (8 processors) and without HT (4
> processors)?  It would also be interesting to see "c" with HT
> turned off.

here's the progression:

	ncpu	user	sys	real	%ilock
	4	4.41u	1.83s	4.06r	0.
	8	4.47u	2.37s	3.60r	2.0
	12	4.49u	8.34s	4.40r	11.0
	16	4.36u	13.16s	4.43r	14.7

here's a fun little calculation:

	16 threads * 4.43s * 0.147 + 1.83s baseline
		= 10.41936 thread*s + 1.83 thread*s
		= 12.25 thread*s

that's close to the 13.16s of system time measured at 16 cpus, so it
seems that increased ilock contention accounts for most of the growth
in system time.

ilock accounting shows that most (>80%) of the long-held ilocks
(>8.5µs, ~21k cycles) start here: /sys/src/libc/port/pool.c:1318.
this is no surprise.  technically, a long-held ilock is not really a
problem until somebody else wants it, but we can be fairly certain
that allocb/malloc is a heavily contended code path.  hopefully i'll
be able to test a less-contended replacement for allocb/freeb before
i run out of time with this machine (rough sketch of the idea in the
p.s. below).

> Certainly it seems to me that idlehands needs to be fixed,
> your bit array "active.schedwait" is one way.

i'm not convinced that idlehands is anything but a power-waster.
performance-wise, it's nearly ideal.  (a toy of the schedwait idea is
in the p.p.s., for reference.)

- erik
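
p.s. to make "less-contended replacement" concrete, here's the shape
i have in mind: a small per-processor free list in front of the
locked pool, so the common alloc/free path never touches the global
lock, and the lock is taken once per batch of Cachemax blocks rather
than once per block.  this is an untested user-space sketch (pthreads
and a mutex stand in for the kernel and its ilock; Block, Cachemax
and friends are invented names), not the code i'm actually testing:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

enum { Blocksize = 1500, Cachemax = 32, Nthread = 4 };

typedef struct Block Block;
struct Block {
	Block	*next;
	unsigned char data[Blocksize];
};

/* the contended resource: one global pool behind one lock,
 * standing in for the ilock at pool.c:1318 */
static pthread_mutex_t poollock = PTHREAD_MUTEX_INITIALIZER;
static Block *pool;

/* per-thread cache; the hot path touches only this */
static _Thread_local Block *cache;
static _Thread_local int ncache;

Block*
allocb(void)
{
	Block *b;

	if(cache != NULL){			/* fast path: no lock */
		b = cache;
		cache = b->next;
		ncache--;
		return b;
	}
	pthread_mutex_lock(&poollock);		/* slow path */
	b = pool;
	if(b != NULL)
		pool = b->next;
	pthread_mutex_unlock(&poollock);
	if(b == NULL)
		b = malloc(sizeof *b);
	return b;
}

void
freeb(Block *b)
{
	Block *l;

	b->next = cache;			/* fast path: no lock */
	cache = b;
	if(++ncache <= Cachemax)
		return;
	/* cache overflowed: return the whole batch under one lock */
	for(l = cache; l->next != NULL; l = l->next)
		;
	pthread_mutex_lock(&poollock);
	l->next = pool;
	pool = cache;
	pthread_mutex_unlock(&poollock);
	cache = NULL;
	ncache = 0;
}

static void*
worker(void *v)
{
	Block *b[16];
	int i, j;

	(void)v;
	for(i = 0; i < 100000; i++){
		for(j = 0; j < 16; j++)
			b[j] = allocb();
		for(j = 0; j < 16; j++)
			freeb(b[j]);
	}
	return NULL;
}

int
main(void)
{
	pthread_t t[Nthread];
	int i;

	for(i = 0; i < Nthread; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for(i = 0; i < Nthread; i++)
		pthread_join(t[i], NULL);
	printf("done\n");
	return 0;
}

after the first pass the workers here never take poollock at all; in
the kernel the win would be smaller, but the lock should be hit
roughly Cachemax times less often.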
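
p.p.s. for reference, here's the schedwait scheme as i understand it,
as a user-space toy: an idle cpu advertises itself in a bitmask
before halting, and ready() claims and kicks exactly one advertised
cpu.  semaphores stand in for hlt and the wakeup ipi, and all the
names are invented -- this is the mechanism under discussion, not
kernel code.  note it buys power (idle cpus halt instead of
spinning), which is my point: the spinning version loses little
performance.

#include <stdio.h>
#include <stdint.h>
#include <stdatomic.h>
#include <pthread.h>
#include <semaphore.h>
#include <unistd.h>

enum { NMACH = 4 };

static _Atomic unsigned long schedwait;	/* bit i set: cpu i halted, waiting for work */
static sem_t idlesem[NMACH];		/* stands in for hlt + the wakeup ipi */

static void
idlehands(int me)
{
	atomic_fetch_or(&schedwait, 1UL<<me);
	sem_wait(&idlesem[me]);		/* "hlt": sleep until kicked */
	/* whoever kicked us already cleared our bit */
}

/* a proc became runnable: claim and kick at most one idle cpu */
static void
ready(void)
{
	unsigned long w, bit;

	for(;;){
		w = atomic_load(&schedwait);
		if(w == 0)
			return;		/* nobody idle; running cpus will find the work */
		bit = w & -w;		/* lowest idle cpu */
		if(atomic_compare_exchange_weak(&schedwait, &w, w & ~bit))
			break;		/* claimed: no other ready() kicks this cpu */
	}
	sem_post(&idlesem[__builtin_ctzl(bit)]);	/* the "ipi" (gcc/clang builtin) */
}

static void*
idler(void *v)
{
	int me = (int)(uintptr_t)v;

	idlehands(me);
	printf("cpu%d woke\n", me);
	return NULL;
}

int
main(void)
{
	pthread_t t[NMACH];
	int i;

	for(i = 0; i < NMACH; i++){
		sem_init(&idlesem[i], 0, 0);
		pthread_create(&t[i], NULL, idler, (void*)(uintptr_t)i);
	}
	sleep(1);			/* let all cpus reach "hlt" */
	for(i = 0; i < NMACH; i++)	/* four runnable procs -> four distinct wakeups */
		ready();
	for(i = 0; i < NMACH; i++)
		pthread_join(t[i], NULL);
	return 0;
}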