From: "Jason A. Donenfeld"
Date: Mon, 12 Apr 2021 11:03:15 -0600
Subject: Re: Allowing space for packet headers in Wintun Tx/Rx
To: David Woodhouse
Cc: Simon Rozman, Daniel Lenski, WireGuard mailing list
In-Reply-To: <38E774FD-16C8-4788-8C31-634A7AA4248A@infradead.org>
List-Id: Development discussion of WireGuard

Hey guys,

Sorry I'm a bit late to this thread. I'm happy to see there's a prototype for benchmarking, though I do wonder if this is a bit of overeager optimization? That is, why is this necessary, and does it actually help?

By returning packets to the Wintun ring later, more of the ring winds up being used, which in turn means more cache misses as it spans additional cache lines. In other words, it seems like this might be comparing the performance of memcpy+cache vs. no-memcpy+cachemiss.
Which is better, and is it actually measurable? Is it possible that adding this functionality actually has zero measurable impact on performance?

Given the complexity this adds, it'd be nice to see some numbers to help make the argument, or perhaps reasoning that's more sophisticated than my own napkin thoughts here.

Jason
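
To put rough numbers on the napkin argument above, here is a minimal sketch of how many cache lines the live part of a ring covers when packets are released right after a memcpy, compared with being held in place. It is only a toy model, not a measurement and not Wintun code: the function name `cache_lines_spanned`, the 64-byte cache line, the 1500-byte packet, the 1 MiB ring, and the 32 packets held in flight are all assumptions chosen for illustration.

```python
CACHE_LINE = 64  # bytes; assumed, typical for x86-64


def cache_lines_spanned(packet_size: int, packets_in_flight: int,
                        ring_capacity: int) -> int:
    """Cache lines covered by the live (unreleased) region of a ring.

    Hypothetical model: packets sit back to back, and the live region
    can never be larger than the ring itself.
    """
    live_bytes = min(packet_size * packets_in_flight, ring_capacity)
    # Ceiling division: a partially used cache line still counts.
    return -(-live_bytes // CACHE_LINE)


RING = 1 << 20  # hypothetical 1 MiB ring

# memcpy, then release immediately: one packet live at a time.
copy_and_release = cache_lines_spanned(1500, 1, RING)

# No memcpy, but 32 packets held until they have been sent.
hold_in_place = cache_lines_spanned(1500, 32, RING)

print(copy_and_release, hold_in_place)
```

The model only counts how much of the ring is live at once. It says nothing about whether the extra footprint costs more than the memcpy it avoids, which is the part that would need real benchmark numbers.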