From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <399b6c5c08b7aeab53f52ce0414714bf@brasstown.quanstro.net>
References: <399b6c5c08b7aeab53f52ce0414714bf@brasstown.quanstro.net>
Date: Fri, 20 Jun 2014 01:03:05 -0400
Message-ID:
From: "Devon H. O'Dell"
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=047d7b3a81e8a17c5204fc3d6881
Subject: Re: [9fans] cache lines, and 60000 cycles of doom
Topicbox-Message-UUID: fb8e6f84-ead8-11e9-9d60-3106f5b1d025

--047d7b3a81e8a17c5204fc3d6881
Content-Type: text/plain; charset=UTF-8

Weird. I assume cycles is using rdtsc or rdtscp. Perhaps some of it is due to a combination of contention and rdtsc(p) being serializing instructions?
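For reference, a fenced timestamp read on x86-64 might look like the sketch below. It is not the Plan 9 cycles() implementation, which is not shown in this thread; tscstart and tscstop are hypothetical names. Per Intel's documentation, rdtsc is not itself a serializing instruction, and rdtscp only waits for earlier instructions to execute, so the ordering comes from the explicit lfence instructions. The clock() branch exists only so the sketch builds on other architectures.

```c
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__)
#include <x86intrin.h>
#endif

/* hypothetical helper: timestamp taken before the timed region */
static inline uint64_t
tscstart(void)
{
#if defined(__x86_64__)
	uint64_t t;

	_mm_lfence();		/* let earlier instructions finish first */
	t = __rdtsc();
	_mm_lfence();		/* do not start the timed code before the read */
	return t;
#else
	return (uint64_t)clock();	/* fallback: not a cycle counter */
#endif
}

/* hypothetical helper: timestamp taken after the timed region */
static inline uint64_t
tscstop(void)
{
#if defined(__x86_64__)
	unsigned int aux;	/* receives IA32_TSC_AUX; unused here */
	uint64_t t;

	t = __rdtscp(&aux);	/* waits for earlier instructions to execute */
	_mm_lfence();		/* keep later instructions after the read */
	return t;
#else
	return (uint64_t)clock();
#endif
}
```

Timing an empty region with tscstart() followed directly by tscstop() gives an estimate of the cost of the measurement itself on a given machine.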

On Jun 19, 2014 12:04 PM, "erik quanstrom" <quanstro@quanstro.net> wrote:
i'm seeing some mighty interesting timing on my intel ivy bridge.
i found a bug in the file server aoe implementation (can't happen
if you're using the uniprocessor x86 version) that happens because
the Srb is freed before wakeup completes.  to solve this there is
some code that sets the state (this is from ken's ancient scheduler, by way of sape)

        wakeup(&srb);
        srb->state = Free;

code that receives it is like this

        sleep(&srb, srbdone, srb);
        cycles(&t0);
        for(n = 0; srb->state != Free; n++){
                if(srb->wmach == m->machno)
                        sched();
                else
                        monmwait(&srb->state, Alloc);
        }
        cycles(&t1);
        free(srb);
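The handshake in the two fragments above, where the waker's last touch of the Srb is the store of Free and the sleeper refuses to free the Srb until it has seen that store, can be sketched with portable C11 atomics. This is an illustration under assumptions, not the file server's code: srbrelease and srbwaitfree are hypothetical names, the Srb is cut down to its state field, and the sched()/monmwait() branch is replaced by a bare spin.

```c
#include <stdatomic.h>
#include <stdlib.h>

enum { Alloc, Free };

typedef struct Srb Srb;
struct Srb {
	_Atomic int state;	/* Alloc while the waker may still touch the Srb */
};

/* waker side: the store of Free is the last access to the Srb */
static void
srbrelease(Srb *srb)
{
	/* the real code calls wakeup() before this store */
	atomic_store_explicit(&srb->state, Free, memory_order_release);
}

/* sleeper side: spin until the waker is done; returns the loop count */
static long
srbwaitfree(Srb *srb)
{
	long n;

	for(n = 0; atomic_load_explicit(&srb->state, memory_order_acquire) != Free; n++)
		;	/* the original yields with sched() or parks in monmwait() here */
	return n;
}
```

Called in a single thread with srbrelease() first, srbwaitfree() returns 0 at once. With a waker running on another processor, the return value plays the role of n in the loop above, and only after it returns is it safe to call free().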

the astounding thing is that t1-t0 is often ~ 60,000 cycles.
it only hits a small fraction of the time, and the average is
much lower.  but that just blows the mind.  60000 cycles!

(other versions with sched were much worse.)

as far as i can tell, there are no funny bits in the scheduler that
can cause this, and no weird scheduling is going on.

i'm baffled.

- erik

--047d7b3a81e8a17c5204fc3d6881--