From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <1b3bfeff9117336767b881178d1f8b6a@brasstown.quanstro.net>
References: <399b6c5c08b7aeab53f52ce0414714bf@brasstown.quanstro.net> <1b3bfeff9117336767b881178d1f8b6a@brasstown.quanstro.net>
Date: Fri, 20 Jun 2014 08:47:15 -0400
Message-ID:
From: "Devon H. O'Dell"
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [9fans] cache lines, and 60000 cycles of doom
Topicbox-Message-UUID: fc291066-ead8-11e9-9d60-3106f5b1d025

2014-06-20 7:50 GMT-04:00 erik quanstrom :
> On Fri Jun 20 01:04:20 EDT 2014, devon.odell@gmail.com wrote:
>
>> Weird. I assume cycles is using rdtsc or rdtscp. Perhaps some of it is due
>> to a combination of contention and rdtsc(p) being serializing instructions?

I forget that rdtsc isn't, and one uses cpuid to get that behavior.

>> On Jun 19, 2014 12:04 PM, "erik quanstrom" wrote:
>
> other than the code i posted, nobody else touching the Srb,
> and it's bigger than a cacheline.
>
> why would serialization cause a big issue?

It disables out-of-order execution by the processor, so there's a
pipeline stall.

There's overhead to calling the tsc instructions, but not that much.

Does `srb->wmach != m->machno` imply that t0 and t1 could be run on
different CPUs? TSC is synchronized between cores (unless someone does
wrmsr), but if you bounce to another processor, there's no guarantee.
Perhaps the difference between when the CPUs came online was on the
order of 60k cycles. No clue how cheap sched() is these days.

I should probably start reading the code again before I reply to these
things. Sorry.

--dho

> - erik
>
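
To make the migration guess above concrete, here is a toy model in C. It is only a sketch and not Plan 9 kernel code: Cpu, read_tsc, and apparent_delta are made-up names, and the assumption that each CPU's counter starts from zero when that CPU comes online is a simplification for illustration. It shows how a 60k-cycle difference in online time would show up directly in t1 - t0 if the two reads happen on different CPUs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: every CPU has its own cycle counter that starts
 * at zero when that CPU comes online. "now" is an imaginary global
 * cycle count shared by all CPUs.
 */
typedef struct {
	uint64_t online_at;	/* global cycle at which this CPU's counter started */
} Cpu;

/* Value this CPU's counter would show at global cycle "now". */
uint64_t
read_tsc(const Cpu *c, uint64_t now)
{
	assert(now >= c->online_at);
	return now - c->online_at;
}

/*
 * Apparent elapsed cycles when t0 is read on "from" at global cycle
 * "start" and t1 is read on "to" at global cycle "end". If from and to
 * differ, the result includes the offset between the two counters.
 */
int64_t
apparent_delta(const Cpu *from, const Cpu *to, uint64_t start, uint64_t end)
{
	uint64_t t0 = read_tsc(from, start);
	uint64_t t1 = read_tsc(to, end);

	return (int64_t)(t1 - t0);
}
```

With cpu0 online at cycle 0 and cpu1 online at cycle 60000, a 200-cycle region timed entirely on cpu0 reports 200, while the same region started on cpu1 and finished on cpu0 reports 60200. Going the other way the delta comes out negative, so a skew like this would only look like a large positive spike in one migration direction.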