9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] dumb kw question
@ 2010-12-27  2:52 erik quanstrom
  2010-12-27  9:06 ` Richard Miller
  0 siblings, 1 reply; 8+ messages in thread
From: erik quanstrom @ 2010-12-27  2:52 UTC (permalink / raw)
  To: 9fans

what does BY2SE stand for?
bytes per what?  and why is this
different from BY2WD?

- erik




* Re: [9fans] dumb kw question
  2010-12-27  2:52 [9fans] dumb kw question erik quanstrom
@ 2010-12-27  9:06 ` Richard Miller
  2010-12-27 14:00   ` erik quanstrom
  2010-12-30  1:05   ` Charles Forsyth
  0 siblings, 2 replies; 8+ messages in thread
From: Richard Miller @ 2010-12-27  9:06 UTC (permalink / raw)
  To: 9fans

> what does BY2SE stand for?
> bytes per what?

I don't know the answer but here's a clue:

ether1116.c:573: 		cachedwbse(&r->cs, BY2SE);

l.s:342: TEXT cachedwbse(SB), 1, $-4			/* D writeback SE */





* Re: [9fans] dumb kw question
  2010-12-27  9:06 ` Richard Miller
@ 2010-12-27 14:00   ` erik quanstrom
  2010-12-28 10:41     ` Gorka Guardiola
  2010-12-30  1:05   ` Charles Forsyth
  1 sibling, 1 reply; 8+ messages in thread
From: erik quanstrom @ 2010-12-27 14:00 UTC (permalink / raw)
  To: 9fans

On Mon Dec 27 04:07:57 EST 2010, 9fans@hamnavoe.com wrote:
> > what does BY2SE stand for?
> > bytes per what?
>
> I don't know the answer but here's a clue:
>
> ether1116.c:573: 		cachedwbse(&r->cs, BY2SE);
>
> l.s:342: TEXT cachedwbse(SB), 1, $-4			/* D writeback SE */

after quite the go around, BY2SE stands for "single entry".
which begs the question "single entry" of what.  it appears
that it means a cache line, which is 32 bytes, judging from the
rounding that both l2cacheuwbse and cachedwbse do: all
the cache lines covered by the given object are flushed.

if i'm reading the code correctly, ...
- shouldn't BY2SE be replaced by either BY2WD or sizeof(thing)?

- erik




* Re: [9fans] dumb kw question
  2010-12-27 14:00   ` erik quanstrom
@ 2010-12-28 10:41     ` Gorka Guardiola
  2010-12-28 10:42       ` Gorka Guardiola
  0 siblings, 1 reply; 8+ messages in thread
From: Gorka Guardiola @ 2010-12-28 10:41 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Mon, Dec 27, 2010 at 3:00 PM, erik quanstrom <quanstro@quanstro.net> wrote:
> On Mon Dec 27 04:07:57 EST 2010, 9fans@hamnavoe.com wrote:
>> > what does BY2SE stand for?
>> > bytes per what?
>>
>> I don't know the answer but here's a clue:
>>
>> ether1116.c:573:              cachedwbse(&r->cs, BY2SE);
>>
>> l.s:342: TEXT cachedwbse(SB), 1, $-4                  /* D writeback SE */
>
> after quite the go around, BY2SE stands for "single entry".
> which begs the question "single entry" of what.  it appears
> that it means a cache line, which is 32 bytes, judging from the
> rounding that both l2cacheuwbse and cachedwbse do: all
> the cache lines covered by the given object are flushed.
>
> if i'm reading the code correctly, ...
> - shouldn't BY2SE be replaced by either BY2WD or sizeof(thing)?
>

Se is in fact single entry in the cache, i.e. cache line. It is not a word
necessarily (on other arm architectures it is 64 bytes if I remember right)
and it is used in assembly so a define is the way to go.

G.




* Re: [9fans] dumb kw question
  2010-12-28 10:41     ` Gorka Guardiola
@ 2010-12-28 10:42       ` Gorka Guardiola
  2010-12-28 13:59         ` erik quanstrom
  0 siblings, 1 reply; 8+ messages in thread
From: Gorka Guardiola @ 2010-12-28 10:42 UTC (permalink / raw)
  To: Fans of the OS Plan 9 from Bell Labs

On Tue, Dec 28, 2010 at 11:41 AM, Gorka Guardiola <paurea@gmail.com> wrote:
> On Mon, Dec 27, 2010 at 3:00 PM, erik quanstrom <quanstro@quanstro.net> wro

>
> Se is in fact single entry in the cache, i.e. cache line. It is not a word
> necessarily (on other arm architectures it is 64 bytes if I remember right)
> and it is used in assembly so a define is the way to go.
>

I mean other arm processors, same architecture.

G.




* Re: [9fans] dumb kw question
  2010-12-28 10:42       ` Gorka Guardiola
@ 2010-12-28 13:59         ` erik quanstrom
  0 siblings, 0 replies; 8+ messages in thread
From: erik quanstrom @ 2010-12-28 13:59 UTC (permalink / raw)
  To: 9fans

> > On Mon, Dec 27, 2010 at 3:00 PM, erik quanstrom <quanstro@quanstro.net> wro
>
> >
> > Se is in fact single entry in the cache, i.e. cache line. It is not a word
> > necessarily (on other arm architectures it is 64 bytes if I remember right)
> > and it is used in assembly so a define is the way to go.
> >
>
> I mean other arm processors, same architecture.

i think you misunderstand what i'm saying.  i don't care that it's a define,
i suspect that the definition makes no sense, and calling
cache lines "single entries" is obtuse, even for kernel
assembly code.  :-)

at least my copy of the kw code has
	BY2SE		= 4,
	CACHELINESZ	= 32,	( verified by kw l2 cache doc )
i would think that BY2SE would need to be the
same as CACHELINESZ, so BY2SE is incorrect
and redundant.

regardless, the *se(base, sz) functions don't clear a single entry;
they clear all cache lines intersecting
	[base & ~(CACHELINESZ-1), base+sz)
so the second argument really should be the size of the object
that needs to be flushed from cache.
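
as a rough illustration of that range (a sketch, not the kw kernel code):
the helper names linebase and nlines are hypothetical, BY2WD = 4 is an
assumption for 32-bit arm, and CACHELINESZ = 32 is the value given above.
it only counts how many cache lines a (base, sz) pair touches; it performs
no cache operations.

```c
#include <assert.h>
#include <stdint.h>

enum {
	CACHELINESZ	= 32,	/* cache line size given in the thread */
	BY2WD		= 4,	/* assumed word size on 32-bit arm */
};

/* hypothetical helper: address of the first byte of the line holding addr */
static uintptr_t
linebase(uintptr_t addr)
{
	return addr & ~(uintptr_t)(CACHELINESZ - 1);
}

/* hypothetical helper: number of cache lines intersecting [base, base+sz) */
static unsigned
nlines(uintptr_t base, uintptr_t sz)
{
	uintptr_t first, last;

	assert(sz > 0);
	first = linebase(base);
	last = linebase(base + sz - 1);	/* line holding the final byte */
	return (unsigned)((last - first) / CACHELINESZ + 1);
}
```

under these assumptions, nlines(0x1000, BY2WD) is 1, while a word starting
at 0x101e straddles a boundary and touches 2 lines.  either way the count
depends on the object's address and size, not on any "single entry" constant.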

to me it would make sense to remove BY2SE and replace it with
BY2WD, since it's always used to indicate that the object to be flushed
is word-sized.  and replace *se(base, sz) with *cls(base, sz) or
*entries(base, sz).

- erik




* Re: [9fans] dumb kw question
  2010-12-27  9:06 ` Richard Miller
  2010-12-27 14:00   ` erik quanstrom
@ 2010-12-30  1:05   ` Charles Forsyth
  2010-12-30  1:24     ` erik quanstrom
  1 sibling, 1 reply; 8+ messages in thread
From: Charles Forsyth @ 2010-12-30  1:05 UTC (permalink / raw)
  To: 9fans

>ether1116.c:573: 		cachedwbse(&r->cs, BY2SE);

led me to read:

		/* set up receive descriptor */
		r = &ctlr->rx[ctlr->rxtail];
		assert(((uintptr)r & (Descralign - 1)) == 0);
		r->countsize = Bufsize(Rxblklen);
		r->buf = PADDR(b->rp);
		cachedwbse(r, sizeof *r);
		l2cacheuwbse(r, sizeof *r);

		/* and fire */
		r->cs = RCSdmaown | RCSenableintr;
		cachedwbse(&r->cs, BY2SE);
		l2cacheuwbse(&r->cs, BY2SE);

if Descralign is 16, and sizeof(Rx) is 16, but the cache line size is 32,
i'm surprised it works. there are two descriptors per cache line,
and two processors, but only one sees both caches. the other processor -
the ethernet controller - depending how it's wired up,
sees at best L2 but more usually uncached RAM. i don't see how this cache
flushing (and invalidation elsewhere) can give a coherent view of both descriptors.
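
a small sketch of the layout concern, under stated assumptions: the Rx
layout below is hypothetical (only cs, countsize and buf appear in the
driver excerpt; the next field and the 32-bit widths are assumed to get
to 16 bytes), and lineof and sameline are made-up helper names.  it just
shows which ring entries land in the same 32-byte line.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	CACHELINESZ	= 32,	/* cache line size given in the thread */
	Descralign	= 16,	/* descriptor alignment assumed above */
};

/* hypothetical 16-byte receive descriptor */
typedef struct Rx Rx;
struct Rx {
	uint32_t cs;		/* command/status, as in the excerpt */
	uint32_t countsize;	/* as in the excerpt */
	uint32_t buf;		/* as in the excerpt */
	uint32_t next;		/* assumed fourth word */
};

/* index of the cache line that holds addr */
static uintptr_t
lineof(uintptr_t addr)
{
	return addr / CACHELINESZ;
}

/* nonzero if ring entries i and j start in the same cache line */
static int
sameline(uintptr_t ringbase, size_t i, size_t j)
{
	return lineof(ringbase + i*sizeof(Rx)) == lineof(ringbase + j*sizeof(Rx));
}
```

with a ring starting on a line boundary, entries 0 and 1 share a line, as
do 2 and 3, so writing back or invalidating the line for one entry also
moves its neighbour.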

i'd expect to see peculiar errors, such as non-trivial levels of retransmission
(both sides) causing poor performance.




* Re: [9fans] dumb kw question
  2010-12-30  1:05   ` Charles Forsyth
@ 2010-12-30  1:24     ` erik quanstrom
  0 siblings, 0 replies; 8+ messages in thread
From: erik quanstrom @ 2010-12-30  1:24 UTC (permalink / raw)
  To: 9fans

> if Descralign is 16, and sizeof(Rx) is 16, but the cache line size is 32,
> i'm surprised it works. there are two descriptors per cache line,
> and two processors, but only one sees both caches. the other processor -
> the ethernet controller - depending how it's wired up,
> sees at best L2 but more usually uncached RAM. i don't see how this cache
> flushing (and invalidation elsewhere) can give a coherent view of both descriptors.
>
> i'd expect to see peculiar errors, such as non-trivial levels of retransmission
> (both sides) causing poor performance.

that was the sort of confusion i was expecting to ensue.
i suppose the proper course of action would be to only
fill out full cache lines.  the myricom hardware demands
this sort of treatment.  but at the very least, it will require
half as many l2 flushes.
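
one way to picture "only fill out full cache lines" is to pad each
descriptor to the line size.  this is a sketch under assumptions, not a
tested driver change: Rxpad and descline are hypothetical names, the
field layout is assumed, and whether the ethernet controller accepts a
32-byte descriptor stride is not established here.

```c
#include <stdint.h>

enum {
	CACHELINESZ	= 32,	/* cache line size given in the thread */
};

/* hypothetical descriptor padded out to exactly one cache line */
typedef struct Rxpad Rxpad;
struct Rxpad {
	uint32_t cs;
	uint32_t countsize;
	uint32_t buf;
	uint32_t next;		/* assumed fourth word */
	uint8_t pad[CACHELINESZ - 4*sizeof(uint32_t)];
};

/* fail the build if the padding ever stops matching the line size */
_Static_assert(sizeof(Rxpad) == CACHELINESZ,
	"descriptor must fill exactly one cache line");

/* cache line index of entry i; ringbase is assumed CACHELINESZ-aligned */
static uintptr_t
descline(uintptr_t ringbase, unsigned i)
{
	return (ringbase + (uintptr_t)i*sizeof(Rxpad)) / CACHELINESZ;
}
```

with this layout no two entries share a line, so a write-back of one
descriptor cannot carry stale contents of its neighbour along with it.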

however, there's only one rx kproc to be scheduled.  could that
alleviate your concerns?  i'm very fuzzy on arm caching.
for instance, when will the l2 sync up with main memory,
without being manually flushed?

- erik





Thread overview: 8+ messages
2010-12-27  2:52 [9fans] dumb kw question erik quanstrom
2010-12-27  9:06 ` Richard Miller
2010-12-27 14:00   ` erik quanstrom
2010-12-28 10:41     ` Gorka Guardiola
2010-12-28 10:42       ` Gorka Guardiola
2010-12-28 13:59         ` erik quanstrom
2010-12-30  1:05   ` Charles Forsyth
2010-12-30  1:24     ` erik quanstrom
