From: Clem Cole
Date: Mon, 5 Jul 2021 17:29:02 -0400
To: Dan Stromberg
Cc: The Unix Heritage Society mailing list
Subject: Re: [TUHS] Is C obsolete? (was Re: [tuhs] The Unix shell: a 50-year view)
List-Id: The Unix Heritage Society mailing list

On Mon, Jul 5, 2021 at 4:16 PM Dan Stromberg wrote:

> A null-terminated array of char is a petri dish. A proper string type is
> more like a disinfectant.

Hrrmpt.... maybe (in theory), but I can say that I have never seen it really work in practice -- bwk in Why Pascal is Not My Favorite Programming Language describes much of the practical realities of this sort of choice:

2.1. The size of an array is part of its type

If one declares

     var     arr10 : array [1..10] of integer;
             arr20 : array [1..20] of integer;

then arr10 and arr20 are arrays of 10 and 20 integers respectively. Suppose we want to write a procedure 'sort' to sort an integer array. Because arr10 and arr20 have different types, it is not possible to write a single procedure that will sort them both.

The place where this affects Software Tools particularly, and I think programs in general, is that it makes it difficult indeed to create a library of routines for doing common, general-purpose operations like sorting.

The particular data type most often affected is 'array of char', for in Pascal a string is an array of characters. Consider writing a function 'index(s,c)' that will return the position in the string s where the character c first occurs, or zero if it does not. The problem is how to handle the string argument of 'index'. The calls 'index('hello',c)' and 'index('goodbye',c)' cannot both be legal, since the strings have different lengths. (I pass over the question of how the end of a constant string like 'hello' can be detected, because it can't.) The next try is

     var     temp : array [1..10] of char;
     temp := 'hello';

     n := index(temp,c);

but the assignment to 'temp' is illegal because 'hello' and 'temp' are of different lengths.

The only escape from this infinite regress is to define a family of routines with a member for each possible string size, or to make all strings (including constant strings like 'define') of the same length.

The latter approach is the lesser of two great evils. In 'Tools', a type called 'string' is declared as

     type    string = array [1..MAXSTR] of char;

where the constant 'MAXSTR' is ``big enough,'' and all strings in all programs are exactly this size. This is far from ideal, although it made it possible to get the programs running. It does not solve the problem of creating true libraries of useful routines.

There are some situations where it is simply not acceptable to use the fixed-size array representation. For example, the 'Tools' program to sort lines of text operates by filling up memory with as many lines as will fit; its running time depends strongly on how full the memory can be packed.

Thus for 'sort', another representation is used, a long array of characters and a set of indices into this array:

     type    charbuf = array [1..MAXBUF] of char;
             charindex = array [1..MAXINDEX] of 0..MAXBUF;

But the procedures and functions written to process the fixed-length representation cannot be used with the variable-length form; an entirely new set of routines is needed to copy and compare strings in this representation. In Fortran or C the same functions could be used for both.

As suggested above, a constant string is written as

     'this is a string'

and has the type 'packed array [1..n] of char', where n is the length. Thus each string literal of different length has a different type. The only way to write a routine that will print a message and clean up is to pad all messages out to the same maximum length:

     error('short message                    ');
     error('this is a somewhat longer message');

Many commercial Pascal compilers provide a 'string' data type that explicitly avoids the problem; 'string's are all taken to be the same type regardless of size. This solves the problem for this single data type, but no other. It also fails to solve secondary problems like computing the length of a constant string; another built-in function is the usual solution.

Pascal enthusiasts often claim that to cope with the array-size problem one merely has to copy some library routine and fill in the parameters for the program at hand, but the defense sounds weak at best:(12)

``Since the bounds of an array are part of its type (or, more exactly, of the type of its indexes), it is impossible to define a procedure or function which applies to arrays with differing bounds. Although this restriction may appear to be a severe one, the experiences we have had with Pascal tend to show that it tends to occur very infrequently. [...] However, the need to bind the size of parametric arrays is a serious defect in connection with the use of program libraries.''

This botch is the biggest single problem with Pascal. I believe that if it could be fixed, the language would be an order of magnitude more usable. The proposed ISO standard for Pascal(13) provides such a fix (``conformant array schemas''), but the acceptance of this part of the standard is apparently still in doubt.


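For contrast with the Pascal case quoted above, here is a minimal C sketch of the "same functions could be used for both" point. It is not taken from the message or from bwk's paper; the names index_of and sort_ints are made up for illustration (index_of rather than index, to stay clear of the legacy POSIX index() function). In C, the array length is not part of the parameter type, so one routine serves every size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper modelled on the 'index(s,c)' example in the quoted text.
   Returns the 1-based position of the first occurrence of c in s,
   or 0 if c does not occur. Relies on s being NUL-terminated. */
size_t index_of(const char *s, char c)
{
    assert(s != NULL);
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == c)
            return i + 1;
    }
    return 0;
}

/* Hypothetical insertion sort over n ints. The length n is passed at
   run time; it is not part of the type of a, so arrays of 10 and of 20
   elements go through the same function. */
void sort_ints(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        /* Shift larger elements one slot right to make room for key. */
        while (j > 0 && a[j - 1] > key) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}
```

Calls such as index_of("hello", 'l') and index_of("goodbye", 'b') both compile against the one definition, as do sort_ints(arr10, 10) and sort_ints(arr20, 20). The cost is the one Dan points at: nothing checks that n matches the real array size or that s is actually terminated.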