The Unix Heritage Society mailing list
From: Lawrence Stewart <stewart@serissa.com>
To: Clem Cole <clemc@ccc.com>
Cc: TUHS main list <tuhs@minnie.tuhs.org>
Subject: Re: [TUHS] PDP-11 legacy, C, and modern architectures
Date: Thu, 28 Jun 2018 16:37:53 -0400
Message-ID: <0B27CA75-F9AC-4DAE-95D9-858155980B12@serissa.com>
In-Reply-To: <CAC20D2NMPscxBpHif4YY8Uwnj8gd_9x997iU1AYHaWHPaR4cqA@mail.gmail.com>


Thanks for the promotion to CTO, Clem!  I was merely the software architect at SiCortex.

The SC systems had 6-core MIPS-64 cpus at 700 MHz, two-channel DDR-2, and a really fast interconnect.  (Seriously fast for its day: 800 ns PUT, 1.6 µs GET, north of 2 GB/sec end-to-end, and this was in 2008.)  The Achilles heel was low memory bandwidth, because each core was limited to a single outstanding miss.  The new chip would have fixed that (and delivered about 8x the performance), but we ran out of money in 2009, which was not a good time to look for more.

We had delighted customers who appreciated the reliability and the network.  For latency-limited codes (GUPS) we did extremely well, and we still did well on the rest from a flops/watt perspective.  However, lots of commercial prospects didn’t have codes that needed the network and did need single-stream performance.  We talked to Urs Hölzle at Google and he was very clear - they needed fast single threads.  The low power was very nice, … but we were welcome to try and parallelize their benchmarks.

Which brings me back to the original issue - does C constrain our architectural thinking? 

I’ve spent a fair amount of time recently digging into Nyx, which is an adaptive mesh refinement cosmological hydrodynamics code.  The framework is in C++ because the inheritance stuff makes it straightforward to adapt the AMR machinery to different problems.  This isn’t the kind of horrible C++ where you can’t tell what is going to happen, but something pretty close to C style, in which you can visualize what the compiler will do.  The “solvers” tend to be Fortran modules because, I think, Fortran is just sensible about multidimensional arrays and indexing in a way that you have to use weird macros to replicate in C.  It isn’t, I think, that C or C++ compilers cannot generate good code - it is about the syntax for arrays.
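
To make the "weird macros" point concrete, here is a minimal sketch of the kind of indexing macro that C code tends to grow for 3-D fields.  The names IDX3 and neighbor_sum are made up for illustration and are not taken from Nyx; the layout assumed is first-index-fastest, as Fortran uses for u(i,j,k).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro (not from Nyx): flatten (i, j, k) into a 1-D
   offset with the first index varying fastest, i.e. Fortran's layout for
   an array declared u(nx, ny, nz). */
#define IDX3(i, j, k, nx, ny) \
    ((size_t)(i) + (size_t)(nx) * ((size_t)(j) + (size_t)(ny) * (size_t)(k)))

/* Sum the six face neighbors of an interior cell.  In Fortran this reads
   u(i-1,j,k) + u(i+1,j,k) + ...; in C every access goes through the macro,
   and the array extents have to be passed along by hand. */
double neighbor_sum(const double *u, int nx, int ny, int nz,
                    int i, int j, int k)
{
    /* Only interior cells have all six neighbors. */
    assert(i > 0 && i < nx - 1);
    assert(j > 0 && j < ny - 1);
    assert(k > 0 && k < nz - 1);

    return u[IDX3(i - 1, j, k, nx, ny)] + u[IDX3(i + 1, j, k, nx, ny)]
         + u[IDX3(i, j - 1, k, nx, ny)] + u[IDX3(i, j + 1, k, nx, ny)]
         + u[IDX3(i, j, k - 1, nx, ny)] + u[IDX3(i, j, k + 1, nx, ny)];
}
```

The generated code for this is no worse than what a Fortran compiler would produce; the cost is in readability, since the extents travel with every access instead of with the array.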

For anyone interested in architectural arm wrestling, memory IS the main issue.  It is worth reading the papers on BLIS, an analytical model for writing Basic Linear Algebra libraries.  Once you figure out the flops per byte, you are nearly done - the rest is complicated but straightforward code tuning.  Matrix multiply has O(n^3) computation for O(n^2) memory, and that immediately says you can get close to 100% of the ALUs running if you have a clue about blocking in the caches.  This is just as easy or hard to do in C as in Fortran.  The kernels tend to wind up in asm(“”) no matter what you wish for, just to get the prefetch instructions placed just so.  As far as I can tell, compilers still do not have very good models for cache hierarchies, although there isn’t really any reason why they shouldn’t.  Similarly, if your code is mainly doing inner products, you are doomed to run at memory speeds rather than ALU speeds.  Multithreading usually doesn’t help, because often other cores are farther away than main memory.
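
As a rough illustration of the flops-per-byte argument, here is a sketch in C of a cache-blocked matrix multiply together with back-of-envelope intensity estimates.  Everything here is an assumption for illustration: the function names, the tile size BS, and the idealized model in which each operand crosses the memory bus exactly once.  It is not the BLIS kernel structure, which is considerably more elaborate.

```c
#include <stddef.h>

/* Assumed tile edge; in practice it is tuned so that the tiles being
   worked on stay resident in cache. */
#ifndef BS
#define BS 32
#endif

/* C += A * B for n x n row-major matrices, walking BS x BS tiles so that
   each tile is reused many times once it has been loaded. */
void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    for (size_t ii = 0; ii < n; ii += BS)
        for (size_t kk = 0; kk < n; kk += BS)
            for (size_t jj = 0; jj < n; jj += BS) {
                /* Clamp tile bounds when n is not a multiple of BS. */
                size_t imax = ii + BS < n ? ii + BS : n;
                size_t kmax = kk + BS < n ? kk + BS : n;
                size_t jmax = jj + BS < n ? jj + BS : n;
                for (size_t i = ii; i < imax; i++)
                    for (size_t k = kk; k < kmax; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < jmax; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}

/* Idealized intensity of matrix multiply: 2n^3 flops over three n x n
   matrices of doubles, which works out to n/12 flops per byte. */
double matmul_flops_per_byte(size_t n)
{
    return (2.0 * n * n * n) / (3.0 * n * n * sizeof(double));
}

/* Inner product of two length-n vectors (n > 0): 2n flops over 2n doubles,
   so 1/8 flop per byte no matter how large n gets. */
double dot_flops_per_byte(size_t n)
{
    return (2.0 * n) / (2.0 * n * sizeof(double));
}
```

Under this model the matrix multiply intensity grows linearly with n, so blocking can keep the ALUs fed, while the inner product is stuck at a constant 1/8 flop per byte and no amount of tuning changes that.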

My summary of the language question comes down to: if you knew what code would run fast, you could code it in C.  Thinking that a new language will explain how to make it run fast is just wishful thinking.  It just pushes the problem onto the compiler writers, and they don’t know how to code it to run fast either.  The only argument I like for new languages is that at least they might be able to let you describe the problem in a way that others will recognize.  I’m sure everyone here has had the sad experience of trying to figure out what the idea is behind a chunk of code.  Comments are usually useless.  I wind up trying to match the physics papers with the math against the code and it makes my brain hurt.  It sure would be nice if there were a series of representations between math and hardware transitioning from why to what to how.  I think that is what Steele was trying to do with Fortress.

I do think the current environment is the best for architectural innovation since the ‘90s.  We have The Machine, we have Dover Micro trying to add security, we have Microsoft’s EDGE stuff, and the multiway battle between Intel/AMD/ARM and the GPU guys and the FPGA guys.  It is a lot more interesting than 2005!  

> On 2018, Jun 28, at 11:37 AM, Clem Cole <clemc@ccc.com> wrote:
> 
> 
> 
> On Thu, Jun 28, 2018 at 10:40 AM, Larry McVoy <lm@mcvoy.com> wrote:
> Yep.  Lots of cpus are nice when doing a parallel make but there is 
> always some task that just uses one cpu.  And then you want the fastest
> one you can get.  Lots of wimpy cpus is just, um, wimpy.
> 
> Larry Stewart would be better to reply as SiCortex's CTO - but that was the basic logic behind their system -- lots of cheap MIPS chips. Truth is they made a pretty neat system and it scaled pretty well.   My observation is that with them, as with most of the attempts I have been a part of, in the end architecture does not matter nearly as much as economics.
> 
> In my career I have built 4 or 5 specially architected systems.  You can basically live through one or two generations using some technology argument and 'win'.   But in the end, people buy computers to do a job and they really don't give a s*t about how the job gets done, as long as it gets done cheaply.   Whoever wins the economic war has the 'winning' architecture.   Look, x86/Intel*64 would never win awards as a 'Computer Science Architecture'; or on the SW side: Fortran vs. Algol etc...; Windows beat UNIX workstations for the same reasons... as we all know.
> 
> Hey, I used to race sailboats ...  there is a term called a 'sea lawyer' - where you are screaming you have been fouled but you are drowning as your boat is sinking.   I keep thinking about it here.   You can scream all you want about goodness or badness of architecture or language, but in the end, users really don't care.   They buy computers to do a job.   You really cannot forget that is the purpose.
> 
> As Larry says: Lots of wimpy cpus is just wimpy.    Hey, Intel, nVidia and AMD's job is to sell expensive hot rocks.   They are going to do what they can to make those rocks useful for people.  They want to help people get their jobs done -- period. That is what they do.   The Atmel and RPi folks take the 'jelly bean' approach - which is one of selling enough to make it worth it for the chip manufacturer, and if the simple machine can do the customer's job, very cool.  In those cases simple is good (hey, the PDP-11 is pretty complex compared to, say, the 6502).
> 
> So, I think the author of the paper trashing C as too high level misses the point, and arguing about architecture is silly.  In the end it is about what it costs to get the job done.   People will use what is most economical for them.
> 
> Clem
> 



