From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: tuhs-bounces@minnie.tuhs.org X-Spam-Checker-Version: SpamAssassin 3.4.1 (2015-04-28) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE, T_DKIM_INVALID autolearn=ham autolearn_force=no version=3.4.1 Received: from minnie.tuhs.org (minnie.tuhs.org [45.79.103.53]) by inbox.vuxu.org (OpenSMTPD) with ESMTP id 9c5fade7 for ; Sat, 1 Sep 2018 22:19:49 +0000 (UTC) Received: by minnie.tuhs.org (Postfix, from userid 112) id EA62CA2099; Sun, 2 Sep 2018 08:19:47 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 1E849A1A90; Sun, 2 Sep 2018 08:19:38 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=thunk.org header.i=@thunk.org header.b=WdHoCWan; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id EA83CA1A90; Sun, 2 Sep 2018 08:19:36 +1000 (AEST) Received: from imap.thunk.org (imap.thunk.org [74.207.234.97]) by minnie.tuhs.org (Postfix) with ESMTPS id 71CD7A1A20 for ; Sun, 2 Sep 2018 08:19:36 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=thunk.org; s=ef5046eb; h=In-Reply-To:Content-Type:MIME-Version:References:Message-ID: Subject:Cc:To:From:Date:Sender:Reply-To:Content-Transfer-Encoding:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=IneqipeTvmeTl+TNxQEaHtLo6g5XhvbIoghoMjemh4U=; b=WdHoCWanjgqHgfIEjoggs9fNiy rHcSlJzcXTwGyIHbKYde3x4lr+zN+BzPvxgsaItPYMej9MCC4UPFN0m/cKeqNTkDlfpUVeQ6HZrtX YrKu+mZ8I2CmUAytFJpyTElv2zdYKe0vRIkzMVPvFZOT0EUsYiza97H6ZbLZagAdwSTA=; Received: from root (helo=callcc.thunk.org) by imap.thunk.org with local-esmtp (Exim 4.89) (envelope-from ) id 1fwEEw-0006z4-O9; Sat, 01 Sep 2018 22:19:34 +0000 Received: by callcc.thunk.org (Postfix, from userid 15806) id 973F17A4B95; Sat, 1 Sep 2018 18:19:33 -0400 (EDT) Date: Sat, 1 Sep 2018 18:19:33 -0400 From: "Theodore Y. Ts'o" To: Arthur Krewat Message-ID: <20180901221933.GA2214@thunk.org> References: <20180830213407.6DC4718C0A6@mercury.lcs.mit.edu> <20180831213451.r7LAj%ca6c@bitmessage.ch> <20180831215854.GB28971@mcvoy.com> <7ed51612-82d7-90ca-ceaf-37b0c869ff93@kilonet.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: tytso@thunk.org X-SA-Exim-Scanned: No (on imap.thunk.org); SAEximRunCond expanded to false Subject: Re: [TUHS] SunOS code? X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: TUHS main list Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" On Sat, Sep 01, 2018 at 01:17:59PM -0400, Arthur Krewat wrote: > On 9/1/2018 12:27 PM, Kevin Bowling wrote: > > > I find it's about equal, and even exceeds Linux in terms of it's NUMA > > > support and multi-processor support. I need to move some systems away from > > > Solaris and off to Linux, and I find it's NUMA support lacking in certain > > > ways. > > This is pure fantasy. To understand Linux performance on high core > > count and multi-socket machines is to have at least passing knowledge > > of Paul McKenney's genius work on RCU [1] and NUMA [2] at Sequent [3] > > and on Linux. IBM bought Sequent, made a favorable patent grant of > > RCU for Linux, and the rest history. > > Thanks :) - I'm basing this on Oracle database performance, for the most > part, and it's weird way of supporting NUMA on Linux in a bass-ackwards sort > of way. Nothing I see in the latest RedHat/CentOS tells me it even cares > about NUMA, but maybe that's more of their "we know better than you" > mentality and it's all hidden under the covers somewhere. It wouldn't surprise me if Linux's NUMA performance is pretty weak compared to Solaris. There was an attempt to try to make NUMA work well on Linux, with a lot of the effort coming from IBM and SGI, but that effort was overtaken by events. Back in Sequent's day, the remote to local memory latency was ten to one, so making the system NUMA aware was critical. But by early 2000's, the remote to local ratio was under 3:1 (or 2:1) for 4 socket systems, and with AMD's "Sufficiently Uniform Memory Organization" (SUMO), the ratio was under 1.5:1 or less. The main reason for this was that Windows was (and as far as I know, still is) NUMA oblivious. So x86 chip and motherboard designers solved the problem, by brute foruce, in hardware. So by 2003 or 2004, the Linux Scalability Effort had more or less petered out. (You can see the leftover remnants at http://lse.sourceforge.net) Fundamentally, the economics of 4 socket and higher machines was such that for many workloads, scale out was much cheaper than scale up. So why buy super-expensive IBM X440, x450, and x460 servers, which were huge cabinets connected by one or more "scalability cables" (sometimes referred to as the "scalability bottleneck"), when most of the time, you could just buy a rack of 2U x86 servers which would be much, much cheaper? There were certainly workloads this wasn't applicable, of course. But when Sun was selling Sun 10k's to web startups during the dot com boom, and they were using it to serve web traffic, they probably had too much VC money to burn, because that was *not* the most cost effective way to do things. Don't get me wrong; the Read Copy Update (RCU) technique was certainly very important, and is responsible for much of Linux's SMP scalability today. But these days, when you can get up to 28 cores (56 threads) on a single socket, the need for more than 2 socket systems is already somewhat niche, and by the time you get to more than 4 sockets, it's positively microscopic. As a result, NUMA support on Linux is certainly not as strong as it could be, and it wouldn't surprise me that Solaris has developed much better ways of handling the behemoths such as Sun Enterprise 10k. - Ted P.S. IBM made the RCU patent available for any GPL code, well before Sun decided on the CDDL for Solaris. So if Sun management had chosen GPL, they could have used RCU....