On Sun, May 17, 2020 at 12:38 PM Paul Winalski <paul.winalski@gmail.com> wrote:
Well, the function in question is called getchar().  And although
these days "byte" is synonymous with "8 bits", historically it meant
"the number of bits needed to store a single character".
 
Yep, I think that is the real crux of the issue.  If you grew up with systems that used a 5-, 6-, or even a 7-bit byte, you have an appreciation of the difference.   Remember, B, like BCPL and BLISS, only has a 'word' as the storage unit.  But by the late 1960s a byte had been declared to be 8 bits (thanks to Fred Brooks shutting down Gene Amdahl's desires), at least at IBM.**  Of course, the issue was that ASCII used only 7 bits to store a character.

DEC was still sort of transitioning from word-oriented hardware (a lesson you and I, Paul, lived through being forgotten a few years later with Alpha); but the PDP-11, unlike the 18-, 36-, or 12-bit systems, followed IBM's lead and used the 8-bit byte and byte addressing.  But that nasty 7-bit ASCII thing messed it up a little bit.    When C was created (for the 8-bit, byte-addressed PDP-11), Dennis, unlike in B, introduced distinct types.   As he says, "C is quirky," and one of those quirks is that he created a "char" type, which was thus naturally 8 bits on the PDP-11 but was storing 7-bit ASCII data with a bit left over.
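Just to make that "bit left over" point concrete, here is a quick sketch of mine (modern C with <limits.h>, nothing from Dennis's compiler or the PDP-11 toolchain):

    #include <stdio.h>
    #include <limits.h>

    int main(void)
    {
        char c = 'A';                       /* 0x41 in ASCII */

        /* A char is 8 bits on the PDP-11 and essentially everything since. */
        printf("CHAR_BIT = %d\n", CHAR_BIT);

        /* ...but every ASCII code is <= 0x7F, so the high bit is always
         * clear -- the "bit left over" that later got used for parity,
         * Meta keys, and eventually ISO-8859 and UTF-8. */
        printf("'%c' = 0x%02X, high bit = %d\n",
               c, (unsigned)(unsigned char)c, (c >> 7) & 1);

        return 0;
    }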

As previously said in this discussion, to me the issue is that it was called a char, not a byte.  But I wonder, even if Dennis and team had had that foresight, whether it would have made that much difference in practice.   It took many years, many lines of code, and trying to encode the glyphs of many different natural languages to get to ideas like UTF. 

As someone else pointed out, one of the other quirks of C was trying to encode the return value of a function into a single 'word.'    But like many things in the world, we have to build it first and let it succeed before we can find the real flaws.   C was incredibly successful, and as I said before, I'll not trade it for any other language yet, given what it has allowed me and my peers to do over the years.  I am humbled by what Dennis did; I doubt many of us would have done as well. That doesn't make C perfect, or mean that we cannot strive to do better, and maybe time will show Rust or Go to be that.  But I suspect that may still be a long time in the future.   All my CMU professors in the 1970s said Fortran was dead then.   However, remember that it still pays my salary and my company makes a ton of money building hardware that runs Fortran codes - it's not even close when you look at what is number one [check out the application usage on one of the bigger HPC sites in Europe -- I offer it because it's easy to find the data and the graphics make it obvious what is happening: https://www.archer.ac.uk/status/codes/ - other sites have similar stats, but finding them is harder].
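And to tie that back to Paul's getchar() observation at the top: because the byte value and the end-of-file indication both have to come back in that one return value, getchar() is defined to return an int rather than a char.  A minimal sketch of the classic pitfall (the standard textbook idiom, not anything specific to this thread):

    #include <stdio.h>

    int main(void)
    {
        int c;   /* int, not char: EOF (typically -1) must be
                    distinguishable from all 256 possible byte values */

        while ((c = getchar()) != EOF)
            putchar(c);

        /* Had c been declared as char, either EOF would collide with the
         * byte 0xFF (signed char) or the loop would never terminate
         * (unsigned char) -- the price of packing "a character or EOF"
         * into one word. */
        return 0;
    }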

Clem

** As my friend Russ Robeolen (who was the chief designer of the S/360 Model 50) tells the story, Amdahl was madder than a hornet about it, but Brooks pulled rank and kicked him out of his office.  The S/360 was supposed to be an ASCII machine - Amdahl thought the extra bit for a byte was a waste -- Brooks told him if it wasn't a power of 2, don't come back -- that is, "if a byte was not a power of two he did not know how to program for it efficiently, and the SW being efficient was more important than Amdahl's HW implementation!" (imagine that).   Amdahl did get a 24-bit word type, but Brooks made him define it so that 32 bits stored everything, which again Amdahl thought was a waste of HW.  Bell would later note that it was the single greatest design choice in the computer industry.