From: Dan Cross
Date: Fri, 16 Dec 2022 11:10:31 -0500
To: Dan Halbert
CC: tuhs@tuhs.org
Subject: [TUHS] Re: origin of null-terminated strings
List-Id: The Unix Heritage Society mailing list
In-Reply-To: <514474eb-5ef5-9c78-0a42-73c7d82e9a65@halwitz.org>
References: <78A69F72-788E-4A31-B750-A39C97F77C75@csp-partnership.co.uk> <514474eb-5ef5-9c78-0a42-73c7d82e9a65@halwitz.org>

On Fri, Dec 16, 2022 at 8:42 AM Dan Halbert wrote:
> ASCIZ was an assembler directive used for a number of different DEC
> computers, and also the name for null-terminated strings. I learned
> it for the PDP-10, but I'm sure it existed on other machines. It is
> in some PDP-10 documentation I am looking at right now.
> Anyone who used DEC and did assembly programming would have known
> about it. Various system calls took ASCIZ strings.

This raises something I've always been curious about. To what extent
were the Unix folks at Bell Labs already familiar with DEC systems
before the PDP-7?

It strikes me that much of the published work was centered around IBM
and GE systems (e.g., Ken's wonderful paper on regular expressions,
and of course the Multics work). Were there other Digital machines
floating around? I know a proposal was written to get a PDP-10 for
operating systems research, but it wasn't approved. Relatedly, was any
thought given to trying to get a 360 system?

On 12/16/22 04:13, Dr Iain Maoileoin wrote:
> ASCIZ
> Lost in the mists of time in my mind.

Origin, perhaps, but it exists in contemporary assemblers. Like most
sane people I try to avoid being in assembler for too long, but when
you're first turning on a machine it is useful to be able to squirt a
message out of the UART if something goes dramatically wrong, and the
directive is handy for that.

It seems to have made its way into Research assembler via BSD; it's in
locore.s in 8th Edition, for instance, but doesn't appear before that.
The "UNIX Assembler Manual" describes "String Statements" for the 7th
Edition assembler; strings are sequences of ASCII characters between
'<' and '>'. But it doesn't say that they're NUL terminated, and they
are not: adding the terminator was manual via the familiar `\0` escape
sequence.

        - Dan C.

> I remember running into a .asciz directive in the 70s “somewhere”.
> It was an assembler directive in one of the RT11 systems??? or
> perhaps the unix bootstrap and/or “.s” files - when I get some time
> I will go read some old code/manuals.
>
> I
>
> Yes, it put a null byte at the end of a string.
>
> On 16 Dec 2022, at 03:14, Ken Thompson wrote:
>
> asciz -- this is the first time i heard of it.
> doug -- yes.
>
> On Thu, Dec 15, 2022 at 7:04 PM Douglas McIlroy wrote:
>>
>> I think this cited quote from
>> https://www.joelonsoftware.com/2001/12/11/ is urban legend.
>>
>> Why do C strings [have a terminating NUL]? It’s because the PDP-7
>> microprocessor, on which UNIX and the C programming language were
>> invented, had an ASCIZ string type. ASCIZ meant “ASCII with a Z (zero)
>> at the end.”
>>
>> This assertion seems unlikely since neither C nor the library string
>> functions existed on the PDP-7. In fact the "terminating character" of
>> a string in the PDP-7 language B was the pair '*e'. A string was a
>> sequence of words, packed two characters per word. For odd-length
>> strings half of the final one-character word was effectively
>> NUL-padded as described below.
>>
>> One might trace null termination to the original (1965) proposal for
>> ASCII, https://dl.acm.org/doi/10.1145/363831.363839. There the only
>> role specifically suggested for NUL is to "serve to accomplish time
>> fill or media fill." With character-addressable hardware (not the
>> PDP-7), it is only a small step from using NUL as terminal padding to
>> the convention of null termination in all cases.
>>
>> Ken would probably know for sure whether there's any truth in the
>> attribution to ASCIZ.
>>
>> Doug