From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_INVALID,DKIM_SIGNED, HTML_MESSAGE,MAILING_LIST_MULTI,RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 29645 invoked from network); 17 May 2020 20:09:31 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 17 May 2020 20:09:31 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 16AE89C1AF; Mon, 18 May 2020 06:09:30 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id 865DE9C178; Mon, 18 May 2020 06:08:58 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=ccc.com header.i=@ccc.com header.b="qvQztfp+"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 7AC569C160; Mon, 18 May 2020 06:08:55 +1000 (AEST) Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) by minnie.tuhs.org (Postfix) with ESMTPS id 333A59C15F for ; Mon, 18 May 2020 06:08:54 +1000 (AEST) Received: by mail-qv1-f47.google.com with SMTP id ee19so3740866qvb.11 for ; Sun, 17 May 2020 13:08:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ccc.com; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=nkpBXybZUYGYq7xT9ueMxh/pK87hWiQ8uwXcuCsMnuo=; b=qvQztfp+VSluIA+w9ZfU9vyDIVHSncA7vo5IZTilJtuiT03t5Ke4X9Zoquu8LKYay8 f7rpYY3UbVtgrBb8NvROCWA1wzmPUal3YbTuo7z/JiWnBe/4B2m4xE//SnCwinrpe1C8 lqp/DPAEM3ghBs5B+YvbX5efpd8dl235XSs38= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=nkpBXybZUYGYq7xT9ueMxh/pK87hWiQ8uwXcuCsMnuo=; 
b=EFHixANFS2WWrfbQQa2rTiXLjMKwT0ckKOqfa8CNtS/PcS0Mpe4PqyXAbiZWllgelb V0MIPa/aV3HiP2FoKsK1NMEsaqdrmtMpzuCYstTluA+ssDzBoJtCS3r6JQMcXmsGwb5i 5G6S+xrsaa1fBjU6XcbtJnWJDa9E4t1VAavXEQ9luMUkRORlQ4KmziCbt9nlaKCeb4ox wiP7ruBmyBaPnKlnAPNkTlFZKvWnHZOzGYtt11WuI8zvGHlN1StGKKHDasnLzWP8kVfb Ws/ajwBrHr2s4CdkqEBjYtU+ThtVycR9clcwku9t8QPBiRLwD2akgg7VpnmGZ94AZlT5 Qp9Q== X-Gm-Message-State: AOAM531ImP0SY1Nyw9wqyRbiVksGGwPzUatqbUBgGRhEhLIHuR3EBi3D Y4UwW6GNc6eZVel9ZQkheH0JW138j7noP/e+Pv/gdQ== X-Google-Smtp-Source: ABdhPJyE/WFxKN7l8TL4Wa3j69vcRlH9KFGjr3xtL8EuhHhTDsHBh27MiWLDWLJJmhEMdv/SFhAb+MzPwgxEXYV26k4= X-Received: by 2002:a0c:a692:: with SMTP id t18mr12612979qva.56.1589746133103; Sun, 17 May 2020 13:08:53 -0700 (PDT) MIME-Version: 1.0 References: <20200515213138.8E0F72D2D71E@macaroni.inf.ed.ac.uk> <077a01d62b08$e696bee0$b3c43ca0$@ronnatalie.com> <20200515233427.31Vab%steffen@sdaoden.eu> <5DB09C5A-F5DA-4375-AAA5-0711FC6FB1D9@ronnatalie.com> <20200516232607.nLiIx%steffen@sdaoden.eu> <065a01d62c68$59b7d890$0d2789b0$@ronnatalie.com> In-Reply-To: From: Clem Cole Date: Sun, 17 May 2020 16:08:26 -0400 Message-ID: To: Paul Winalski Content-Type: multipart/alternative; boundary="0000000000008c09d605a5dd9d2d" Subject: Re: [TUHS] v7 K&R C X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: The Eunuchs Hysterical Society Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --0000000000008c09d605a5dd9d2d Content-Type: text/plain; charset="UTF-8" On Sun, May 17, 2020 at 12:38 PM Paul Winalski wrote: > Well, the function in question is called getchar(). And although > these days "byte" is synonymous with "8 bits", historically it meant > "the number of bits needed to store a single character". > Yep, I think that is the real crux of the issue. 
If you grew up with systems that used a 5-, 6-, or even 7-bit byte, you have an appreciation of the difference. Remember, B, like BCPL and BLISS, has only a 'word' as the storage unit. But by the late 1960s, a byte had been declared (thanks to Fred Brooks shutting down Gene Amdahl's desires) to be 8 bits, at least at IBM.** Of course, the issue was that ASCII was using only 7 bits to store a character.

DEC was still sort of transitioning from word-oriented hardware (a lesson, Paul, that you and I lived through being forgotten a few years later with Alpha), but the PDP-11, unlike the 18/36-bit or 12-bit systems, followed IBM's lead and used the 8-bit byte and byte addressing. But that nasty 7-bit ASCII thing messed it up a little bit. When C was created (for the 8-bit, byte-addressed PDP-11), Dennis, unlike in B, introduced different types. As he says, "C is quirky," and one of those quirks is that he created a "char" type, which was naturally 8 bits on the PDP-11 but was storing 7-bit ASCII data with a bit left over.

As previously said in this discussion, to me the issue is that it was called a *char*, not a *byte*. But I wonder: if Dennis and team had had that foresight, would it in practice have made that much difference? It took many years, many lines of code, and many attempts to encode the glyphs of many different natural languages to get to ideas like UTF.

As someone else pointed out, one of the other quirks of C was trying to encode the return value of a function into a single 'word.' But like many things in the world, we have to build it first and let it succeed before we can find the real flaws. C was incredibly successful and, as I said before, I'll not trade it for any other language yet, given what it has allowed me and my peers to do over the years. I am humbled by what Dennis did; I doubt many of us would have done as well. That doesn't make C perfect, or mean that we cannot strive to do better, and maybe time will show Rust or Go to be that.
But I suspect that may still be a long time in the future. All my CMU professors in the 1970s said Fortran was dead then. However, remember that it still pays my salary, and my company makes a ton of money building hardware that runs Fortran codes - it's not even close when you look at number one [check out the application usage on one of the bigger HPC sites in Europe -- I offer it because it's easy to find the data and the graphics make it obvious what is happening: https://www.archer.ac.uk/status/codes/ - other sites have similar stats, but finding them is harder].

Clem

** As my friend Russ Robeolen (who was the chief designer of the S/360 Model 50) tells the story, Amdahl was madder than a hornet about it, but Brooks pulled rank and kicked him out of his office. The S/360 was supposed to be an ASCII machine - Amdahl thought the extra bit for a byte was a waste -- Brooks told him that if it wasn't a power of 2, don't come back -- that is, "if a byte was not a power of two he did not know how to program for it efficiently, and SW being efficient was more important than Amdahl's HW implementation!" (imagine that). Amdahl did get a 24-bit word type, but Brooks made him define it so that 32 bits stored everything, which again Amdahl thought was a waste of HW. Bell would later note that it was the single greatest design choice in the computer industry.

--0000000000008c09d605a5dd9d2d
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable


--0000000000008c09d605a5dd9d2d--