9fans - fans of the OS Plan 9 from Bell Labs
From: Mike Haertel <mike@ducky.net>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] bochs still no go
Date: Tue, 11 Dec 2001 00:01:19 -0800
Message-ID: <200112110801.fBB81J357621@ducky.net>
In-Reply-To: <20011211032545.43452199B5@mail.cse.psu.edu>

>> If "RDMSR" is being used to read the time stamp counter,
>> it should be replaced with RDTSC (0x0F 0x31).  RDMSR is
>> a much slower instruction.
>
>That's not at all clear.  I bet they're approximately
>the same on real hardware.  RDMSR is much slower under
>VMware because it requires trapping into the VMware
>runtime, while RDTSC, an unprivileged instruction, does not.

Ok, I'll admit to a bit of an unfair advantage on this issue: I
can't speak for AMD processors, but I used to work at Intel, as an
architect on the team that did the Pentium Pro and Pentium 4
processors.  I've seen the microcode, and I can assure you that on
Intel processors RDMSR is indeed substantially slower.

The reason is that many of the so-called "machine-specific registers"
that you can read by RDMSR don't really exist as registers in
the hardware at all; instead they are just magic numbers specifying
particular values that the processor microcode can put together
for you by poking around at bits and pieces of internal state
that are often widely distributed throughout the hardware.

So the processor's microcode for the RDMSR instruction is roughly
equivalent to the following C-like pseudocode:

	RDMSR:
		if (not in kernel mode)
			fault;
		switch (ecx) {
		...
		case 0x10:
			copy the time stamp counter to (eax:edx);
			break;
		...
		}

whereas the microcode for RDTSC is just:

	RDTSC:
		copy the time stamp counter to (eax:edx);

On Intel processors, an indirect jump in the microcode (the switch)
is guaranteed to be mispredicted, since the usual branch prediction
mechanisms for macroinstruction branches do not apply to microcode
branches (and especially not microcode indirect jumps), so RDMSR
causes the pipeline to get flushed at least one extra time.
In addition RDMSR is specified to be a "serializing instruction",
which means that the pipeline is drained of older instructions
before the first microinstruction of RDMSR even starts executing.

On x86 processors with RDTSC, you can get pretty high precision
timing for even very fast operations with the following approach:
	x = rdtsc();
	y = rdtsc();
	thing_you_want_to_measure();
	z = rdtsc();
	cycles = (z - y) - (y - x);
(The idea is that "y - x" subtracts out the time required by RDTSC itself.)
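
For anyone who wants to try this, here is a minimal C sketch of the
x/y/z scheme above.  It assumes GCC or Clang on x86, where the
__rdtsc() intrinsic from <x86intrin.h> emits RDTSC.  The names
read_ticks, measure_cycles, do_nothing and busy_loop are made up for
illustration.  Two things here are additions, not part of the approach
as described: the non-x86 fallback to a nanosecond clock (only so the
file still compiles elsewhere), and taking the minimum over several
runs to damp noise from interrupts and cache misses.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>
/* Read the time stamp counter with RDTSC (0x0F 0x31). */
static uint64_t read_ticks(void) { return __rdtsc(); }
#else
#include <time.h>
/* Assumption: no RDTSC here, so count nanoseconds instead of cycles. */
static uint64_t read_ticks(void)
{
	struct timespec ts;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

typedef void (*work_fn)(void);

/*
 * Returns (z - y) - (y - x), using the smallest (z - y) and the
 * smallest (y - x) seen over `runs` repetitions (runs should be >= 1).
 * The result is signed because noise can push it slightly below zero
 * for very short operations.
 */
static int64_t measure_cycles(work_fn fn, int runs)
{
	uint64_t best_overhead = UINT64_MAX, best_total = UINT64_MAX;

	for (int i = 0; i < runs; i++) {
		uint64_t x = read_ticks();
		uint64_t y = read_ticks();
		fn();                       /* thing you want to measure */
		uint64_t z = read_ticks();

		if (y - x < best_overhead)
			best_overhead = y - x;  /* cost of one counter read */
		if (z - y < best_total)
			best_total = z - y;     /* fn plus one counter read */
	}
	return (int64_t)best_total - (int64_t)best_overhead;
}

/* Hypothetical workloads, only here to exercise the timer. */
static void do_nothing(void) {}

static void busy_loop(void)
{
	volatile unsigned n = 0;
	for (unsigned i = 0; i < 100000; i++)
		n += i;
}
```

Calling measure_cycles(busy_loop, 5) should give a count well above
measure_cycles(do_nothing, 5); the exact numbers depend on the
processor and on whatever else the machine is doing.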

Using this method on a Pentium III, I measured RDMSR with ecx == 0x10
to require ~90 cycles, and RDTSC to require "only" ~30 cycles.  The timing
will be similar or identical on the rest of the P6 family (Pentium Pro,
Pentium II, Celeron).

I don't have a Pentium 4 handy to try this on, but I expect the performance
difference between RDMSR and RDTSC would be even more pronounced due
to the deeper pipeline among other things.


