Date: Fri, 10 Mar 2023 07:10:07 -0800
From: Larry McVoy
To: Ralph Corderoy
CC: tuhs@tuhs.org
Subject: [TUHS] Re: scaling on TCP socket connections
Message-ID: <20230310151007.GU9225@mcvoy.com>
In-Reply-To: <20230310101401.CFFB51FBE4@orac.inputplus.co.uk>
References: <20230310013216.GP9225@mcvoy.com> <20230310101401.CFFB51FBE4@orac.inputplus.co.uk>
List-Id: The Unix Heritage Society mailing list

On Fri, Mar 10, 2023 at 10:14:01AM +0000, Ralph Corderoy wrote:
> Hi Larry,
>
> > SGI made TCP go very fast on 200Mhz MIPS processors.  The tricks were
> > to mark the page copy on write on output so the driver could use the
> > page without copying
>
> I understand that bit.
>
> > and to page flip on input.
>
> Could you expand that to a sentence or two.

Sure.  If you are using something like HIPPI, where the data size is at
least as big as a page, the networking stack can arrange to send data
in multiples of the page size.  When that data works its way up the
stack to where a process is waiting in a read(), and the process has
asked for a read whose size is a multiple of the page size, the kernel
can just diddle the page tables: take out the pages that the process
had and give them to the kernel, and put in the pages that the kernel
had and give them to the process.  That seems like a lot, but it's
basically what happens in a TLB miss, and that code is very small and
fast.

As I was thinking about this, I remembered another thing SGI did that
was new to me at the time: interrupt coalescing.  When a packet came in
and the driver pulled it off the wire, the driver could look to see if
there was going to be another interrupt because another packet had
arrived.  If so, the driver grabbed that packet as well, saving all the
overhead of an interrupt.

MIPS CPUs were not fast, but SGI wallpapered over that problem by being
very clever when they needed to be.
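
For readers who want something concrete, here is a minimal toy model of
the two receive-side tricks described above, written in Python.  It is
not SGI's code and does not touch real page tables or hardware.
Everything in it is an assumption made for illustration: the 4096-byte
PAGE_SIZE, the function names deliver_to_reader() and
count_interrupts(), the use of a Python list to stand in for page-table
entries, and the idea that a "burst" is a set of packets already
sitting on the receive ring when the driver services the first one.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page size for this toy model


def deliver_to_reader(user_pages, kernel_pages, nbytes):
    """Hand nbytes of received data to a reader; return bytes physically copied.

    Each list stands in for page-table entries and each bytearray for a
    page frame (hypothetical model, not a real kernel interface).
    """
    if nbytes % PAGE_SIZE == 0:
        # Page flip: swap the page-table entries and leave the data in place.
        for i in range(nbytes // PAGE_SIZE):
            user_pages[i], kernel_pages[i] = kernel_pages[i], user_pages[i]
        return 0
    # Fallback for sizes that are not whole pages: copy into the reader's pages.
    remaining, i = nbytes, 0
    while remaining > 0:
        chunk = min(PAGE_SIZE, remaining)
        user_pages[i][:chunk] = kernel_pages[i][:chunk]
        remaining -= chunk
        i += 1
    return nbytes


def count_interrupts(bursts, coalesce):
    """Return (interrupts taken, packets delivered in arrival order).

    Each burst is a list of packets assumed to be on the receive ring
    by the time the driver services the first of them.
    """
    interrupts, delivered = 0, []
    for burst in bursts:
        ring = deque(burst)
        while ring:
            interrupts += 1                   # one trip through the handler
            delivered.append(ring.popleft())
            while coalesce and ring:          # more waiting? take them now
                delivered.append(ring.popleft())
    return interrupts, delivered


if __name__ == "__main__":
    kernel = [bytearray(b"k" * PAGE_SIZE) for _ in range(2)]
    user = [bytearray(PAGE_SIZE) for _ in range(2)]
    print("bytes copied:", deliver_to_reader(user, kernel, 2 * PAGE_SIZE))
    print("interrupts:", count_interrupts([["a", "b", "c"], ["d"]], True)[0])
```

In this model a page-multiple read costs a handful of pointer swaps
regardless of how much data moves, and a burst of packets costs one
pass through the handler rather than one per packet.  Those are the two
savings the message describes.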