9front - general discussion about 9front
 help / color / mirror / Atom feed
* [9front] [patch] ethervgbe: add rx checksum offloading
@ 2022-11-11 16:13 Arne Meyer
  2022-11-11 17:01 ` hiro
  0 siblings, 1 reply; 8+ messages in thread
From: Arne Meyer @ 2022-11-11 16:13 UTC (permalink / raw)
  To: 9front

[-- Attachment #1: Type: text/plain, Size: 120 bytes --]

Hello,

this patch implements rx tcp and udp checksum offloading for ipv4 for the ethervgbe nic driver.

Greetings,
Arne

[-- Attachment #2: ethervgbe.patch --]
[-- Type: application/octet-stream, Size: 2114 bytes --]

diff 4e75dd52d6dc4601e1f9912779c85f760e8d71e1 uncommitted
--- a/sys/src/9/pc/ethervgbe.c
+++ b/sys/src/9/pc/ethervgbe.c
@@ -298,6 +298,13 @@
 	RxDesc_Status_Shutdown	= (1<<30),	/* shutdown during RX */
 	RxDesc_Status_Own	= (1<<31),	/* own bit */
 
+	RxDesc_Control_Udp		= (1<<16),	/* UDP packet */ 
+	RxDesc_Control_Tcp		= (2<<16),	/* TCP packet */
+	RxDesc_Control_Ip		= (4<<16),	/* IP packet */
+
+	RxDesc_Control_Protocsum	= (2<<20),	/* TCP/UDP checksum ok */
+	RxDesc_Control_Ipcsum		= (4<<20),	/* IP checksum ok */
+
 	/* ... */
 	TxDesc_Status_Own	= (1<<31),	/* own bit */
 
@@ -310,6 +317,9 @@
 struct Stats
 {
 	ulong	rx;
+	ulong	tcpck;
+	ulong	udpck;
+	ulong	ipck;
 	ulong	tx;
 	ulong	txe;
 	ulong	intr;
@@ -468,6 +478,9 @@
 	l += snprint(p+l, READSTR-l, "tx: %uld\n", ctlr->stats.tx);
 	l += snprint(p+l, READSTR-l, "tx [errs]: %uld\n", ctlr->stats.txe);
 	l += snprint(p+l, READSTR-l, "rx: %uld\n", ctlr->stats.rx);
+	l += snprint(p+l, READSTR-l, "tcpck: %uld\n", ctlr->stats.tcpck);
+	l += snprint(p+l, READSTR-l, "udpck: %uld\n", ctlr->stats.udpck);
+	l += snprint(p+l, READSTR-l, "ipck: %uld\n", ctlr->stats.ipck);
 	l += snprint(p+l, READSTR-l, "intr: %uld\n", ctlr->stats.intr);
 	snprint(p+l, READSTR-l, "\n");
 
@@ -560,7 +573,7 @@
 	Ctlr* ctlr;
 	int i;
 	Block* block;
-	ulong length, status;
+	ulong length, status, control;
 	RxDesc* desc;
 
 	ctlr = edev->ctlr;
@@ -573,6 +586,7 @@
 		desc = &ctlr->rx_ring[i];
 
 		status = le32toh(desc->status);
+		control = le32toh(desc->control);
 
 		if(status & RxDesc_Status_Own)
 			continue;
@@ -589,6 +603,21 @@
 
 			/* remember the block */
 			block = ctlr->rx_blocks[i];
+
+			if(control & (RxDesc_Control_Tcp | RxDesc_Control_Protocsum)){
+				block->flag |= Btcpck;
+				ctlr->stats.tcpck++;
+			}
+
+			if(control & (RxDesc_Control_Udp | RxDesc_Control_Protocsum)){
+				block->flag |= Budpck;
+				ctlr->stats.udpck++;
+			}
+
+			if(control & (RxDesc_Control_Ip | RxDesc_Control_Ipcsum)){
+				block->flag |= Bipck;
+				ctlr->stats.ipck++;
+			}
 
 			/* plant new block, might fail if out of memory */
 			if(vgbenewrx(ctlr, i) == 0){

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-11 16:13 [9front] [patch] ethervgbe: add rx checksum offloading Arne Meyer
@ 2022-11-11 17:01 ` hiro
  2022-11-11 17:35   ` Arne Meyer
  0 siblings, 1 reply; 8+ messages in thread
From: hiro @ 2022-11-11 17:01 UTC (permalink / raw)
  To: 9front

did you find concrete use cases where it's worth offloading this?
some kind of actual bottleneck?
in practice one of my most frequent network administration related
tasks is to turn off all kinds of weird (broken) offloading features
on random ethernet cards.

On 11/11/22, Arne Meyer <meyer.arne83@netcologne.de> wrote:
> Hello,
>
> this patch implements rx tcp and udp checksum offloading for ipv4 for the
> ethervgbe nic driver.
>
> Greetings,
> Arne

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-11 17:01 ` hiro
@ 2022-11-11 17:35   ` Arne Meyer
  2022-11-19 20:09     ` Arne Meyer
  0 siblings, 1 reply; 8+ messages in thread
From: Arne Meyer @ 2022-11-11 17:35 UTC (permalink / raw)
  To: 9front

To be honest, no. I was reading through the source code and found out that
the ip stack supports RX checksum offloading, so i thought it was fun to implement it.
The vgbe driver is my pet project.
It's been in my local repo and kernel for about a week now and it didn't hurt. Blocks get send up the stack anyway, 
the only difference is that the code sets the flags for the blocks. Worst case scenario here is that the driver flags checksums as valid even when they are not.
OpenBSD and FreeBSD both support it, so i think it's stable(ish). But as a fellow network admin, I somewhat agree
with you.

> hiro <23hiro@gmail.com> hat am 11.11.2022 17:01 GMT geschrieben:
> 
>  
> did you find concrete use cases where it's worth offloading this?
> some kind of actual bottleneck?
> in practice one of my most frequent network administration related
> tasks is to turn off all kinds of weird (broken) offloading features
> on random ethernet cards.
> 
> On 11/11/22, Arne Meyer <meyer.arne83@netcologne.de> wrote:
> > Hello,
> >
> > this patch implements rx tcp and udp checksum offloading for ipv4 for the
> > ethervgbe nic driver.
> >
> > Greetings,
> > Arne

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-11 17:35   ` Arne Meyer
@ 2022-11-19 20:09     ` Arne Meyer
  2022-11-19 21:36       ` ori
  0 siblings, 1 reply; 8+ messages in thread
From: Arne Meyer @ 2022-11-19 20:09 UTC (permalink / raw)
  To: 9front

Is hw offloading something we want? I ask because I think there is something wrong with ether8169 offloading and I'm working on a patch to fix or axe this.

> Arne Meyer <meyer.arne83@netcologne.de> hat am 11.11.2022 17:35 GMT geschrieben:
> 
>  
> To be honest, no. I was reading through the source code and found out that
> the ip stack supports RX checksum offloading, so i thought it was fun to implement it.
> The vgbe driver is my pet project.
> It's been in my local repo and kernel for about a week now and it didn't hurt. Blocks get send up the stack anyway, 
> the only difference is that the code sets the flags for the blocks. Worst case scenario here is that the driver flags checksums as valid even when they are not.
> OpenBSD and FreeBSD both support it, so i think it's stable(ish). But as a fellow network admin, I somewhat agree
> with you.
> 
> > hiro <23hiro@gmail.com> hat am 11.11.2022 17:01 GMT geschrieben:
> > 
> >  
> > did you find concrete use cases where it's worth offloading this?
> > some kind of actual bottleneck?
> > in practice one of my most frequent network administration related
> > tasks is to turn off all kinds of weird (broken) offloading features
> > on random ethernet cards.
> > 
> > On 11/11/22, Arne Meyer <meyer.arne83@netcologne.de> wrote:
> > > Hello,
> > >
> > > this patch implements rx tcp and udp checksum offloading for ipv4 for the
> > > ethervgbe nic driver.
> > >
> > > Greetings,
> > > Arne

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-19 20:09     ` Arne Meyer
@ 2022-11-19 21:36       ` ori
  2022-11-20 12:44         ` hiro
  0 siblings, 1 reply; 8+ messages in thread
From: ori @ 2022-11-19 21:36 UTC (permalink / raw)
  To: 9front

Quoth Arne Meyer <meyer.arne83@netcologne.de>:
> Is hw offloading something we want? I ask because I think there is something wrong with ether8169 offloading and I'm working on a patch to fix or axe this.
> 

if it makes a real difference in performance and isn't too complex, it may be worth it, but I suspect software is more than fast enough for
most of the cards we support.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-19 21:36       ` ori
@ 2022-11-20 12:44         ` hiro
  2022-12-07 19:12           ` hiro
  0 siblings, 1 reply; 8+ messages in thread
From: hiro @ 2022-11-20 12:44 UTC (permalink / raw)
  To: 9front

my hunch is also that checksum offloading is not worth it, since we
don't have any kind of fastpath in our packet forwarding (we memcpy
every packet).

also, if we did more computationally intensive work on the packet
data, like aes encryption, i understand why somebody might want to use
accelerators for that part. but those checksums are dirt-cheap, on any
CPU.

On 11/19/22, ori@eigenstate.org <ori@eigenstate.org> wrote:
> Quoth Arne Meyer <meyer.arne83@netcologne.de>:
>> Is hw offloading something we want? I ask because I think there is
>> something wrong with ether8169 offloading and I'm working on a patch to
>> fix or axe this.
>>
>
> if it makes a real difference in performance and isn't too complex, it may
> be worth it, but I suspect software is more than fast enough for
> most of the cards we support.
>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-11-20 12:44         ` hiro
@ 2022-12-07 19:12           ` hiro
  2022-12-08  1:41             ` cinap_lenrek
  0 siblings, 1 reply; 8+ messages in thread
From: hiro @ 2022-12-07 19:12 UTC (permalink / raw)
  To: 9front

this netflix guy mentions some important points:
https://www.youtube.com/watch?v=36qZYL5RlgY

e.g. if they don't enable *all* those optimizations there's barely any
benefit. turn one off and performance degrades miserably.
basically you have to use TCP segmentation offload, TCP large receive
offload, some kind of sendfile() thingy, and other stuff that i don't
understand, all in unison, or you basically get nothing at all out of
it.

for us i think it's best we just skip all this badly designed over-complexity.

On 11/20/22, hiro <23hiro@gmail.com> wrote:
> my hunch is also that checksum offloading is not worth it, since we
> don't have any kind of fastpath in our packet forwarding (we memcpy
> every packet).
>
> also, if we did more computationally intensive work on the packet
> data, like aes encryption, i understand why somebody might want to use
> accelerators for that part. but those checksums are dirt-cheap, on any
> CPU.
>
> On 11/19/22, ori@eigenstate.org <ori@eigenstate.org> wrote:
>> Quoth Arne Meyer <meyer.arne83@netcologne.de>:
>>> Is hw offloading something we want? I ask because I think there is
>>> something wrong with ether8169 offloading and I'm working on a patch to
>>> fix or axe this.
>>>
>>
>> if it makes a real difference in performance and isn't too complex, it
>> may
>> be worth it, but I suspect software is more than fast enough for
>> most of the cards we support.
>>
>>
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [9front] [patch] ethervgbe: add rx checksum offloading
  2022-12-07 19:12           ` hiro
@ 2022-12-08  1:41             ` cinap_lenrek
  0 siblings, 0 replies; 8+ messages in thread
From: cinap_lenrek @ 2022-12-08  1:41 UTC (permalink / raw)
  To: 9front

sorry, but yes. we have a fast-path where we skip checking the ip header
cheksum if the Block has the flag that says it was already checked
by hardware:

	if((bp->flag & Bipck) == 0 && ipcsum(&h->vihl)) {
		ip->stats[InHdrErrors]++;
		netlog(f, Logip, "%V -> %V: bad ip header checksum\n", h->src, h->dst);
		goto drop;
	}

i cannot say if this is really worth it tho. i have no measured it. and
the ip *HEADER* is not very big and the code in ipcsum() has a fast-path
in itself handling the common fixed header size case.

i would say the complexity of this is very low. all the ethernet driver
needs todo is set a flag on the Block to say hw already calculated the
ip header checksum.

also, we do not memcpy every packet. thats what Block's and Block lists
are about. the ip fragments are stored as lists of packets (with all the
headers) without touching them until it becomes really neccessary.
the Block's are directly filled by hardware dma in the common ethernet
case.

maybe you ment when the payload gets passed to userspace? in that case,
yes, when data is tranferred between userspace and kernel memory it gets
copied. but within the kernel, qio/blocks are avoiding doing that.

--
cinap

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-12-08  1:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-11 16:13 [9front] [patch] ethervgbe: add rx checksum offloading Arne Meyer
2022-11-11 17:01 ` hiro
2022-11-11 17:35   ` Arne Meyer
2022-11-19 20:09     ` Arne Meyer
2022-11-19 21:36       ` ori
2022-11-20 12:44         ` hiro
2022-12-07 19:12           ` hiro
2022-12-08  1:41             ` cinap_lenrek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).