From: Dan Cross <crossd@gmail.com>
To: Brantley Coile <brantley@coraid.com>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: Re: [TUHS] v7 K&R C
Date: Sat, 16 May 2020 12:14:28 -0400
Message-ID: <CAEoi9W5dUYSk5=L=-VujSzQkVks6B157T5r+HAMefGosvi+J7Q@mail.gmail.com>
In-Reply-To: <0A2C62EF-E43E-43E4-8C53-CE4C99BC5B32@coraid.com>
On Fri, May 15, 2020 at 8:58 PM Brantley Coile <brantley@coraid.com> wrote:
> I always kept local, single characters in ints. This avoided the problem
> with loading a character being signed or unsigned. The reason for not
> specifying is obvious. Today, you can pick the move-byte-into-word
> instruction that either sign extends or doesn't. But when C was defined
> that wasn't the case. Some machines sign extended when a byte was loaded
> into a register and some filled the upper bits with zero. For machines that
> filled with zero, a char was unsigned. If you forced the language to do one
> or the other, it would be expensive on the opposite kind of machine.
>
Not only that, but if one used an exactly `char`-width value to hold, er,
character data as returned from `getchar` et al, then one would necessarily
give up the possibility of handling whatever character value was chosen for
the sentinel marking end-of-input stream. `getchar` et al are defined to
return EOF on end of input; if they didn't return a wider type than `char`,
there would be data that could not be read. On probably every machine I am
ever likely to use again in my lifetime, byte value 255 would be -1 as a
signed char, but it is also a perfectly valid value for a byte.
The details of whether char is signed or unsigned aside, use of a wider
type is necessary for correctness and ability to completely represent the
input data.
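
A minimal sketch of the point above, assuming a typical machine with 8-bit
bytes, two's-complement signed chars, and EOF defined as -1. The helper names
are hypothetical and exist only for illustration; each one takes a value as
returned by `getchar` and reports whether it would be mistaken for EOF after
being stored in a variable of the given type.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* getchar() returns either EOF or a byte value in the range 0..UCHAR_MAX. */
static void check_getchar_result(int r)
{
    assert(r == EOF || (r >= 0 && r <= UCHAR_MAX));
}

/* Stored in an int: every byte value stays distinct from EOF. */
int eof_seen_via_int(int r)
{
    check_getchar_result(r);
    int c = r;
    return c == EOF;
}

/* Stored in a signed char: byte 255 usually becomes -1 and looks like EOF.
   (Converting an out-of-range value is implementation-defined before C23.) */
int eof_seen_via_signed_char(int r)
{
    check_getchar_result(r);
    signed char c = (signed char)r;
    return c == EOF;
}

/* Stored in an unsigned char: EOF becomes 255, which promotes to the int
   255 in the comparison and so never equals a negative EOF. */
int eof_seen_via_unsigned_char(int r)
{
    check_getchar_result(r);
    unsigned char c = (unsigned char)r;
    return c == EOF;
}
```

Under those assumptions, `eof_seen_via_signed_char(255)` reports a false end
of input, `eof_seen_via_unsigned_char(EOF)` never reports one, and only the
`int` version distinguishes all 256 byte values from EOF.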
> It's one of the things that made C a good choice on a wide variety of
> machines.
>
> I guess I always "saw" the return value of getchar() as being in an
> int-sized register, at first namely R0, so kept the character values returned
> as ints. The actual EOF indication from a read is a return value of zero
> for the number of characters read.
>
That's certainly true. Had C supported multiple return values or some kind
of option type from the outset, it might have been that `getchar`, `read`,
etc, returned a pair with some useful value (e.g., for `getchar` the value
of the byte read; for `read` a length) and some indication of an
error/EOF/OK value etc. Notably, both Go and Rust support essentially this:
in Go, the `Read` method of `io.Reader` returns an `(int, error)` pair, and
the error is `io.EOF` on end-of-input; in Rust, the `read` method of the
`Read` trait returns a `Result<usize, io::Error>`, where `Ok(n)` with
`n == 0` indicates EOF.
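
For illustration only, here is a sketch of what such a pair-returning
interface could look like in C itself. The names `byte_status`,
`byte_result`, and `read_byte` are hypothetical and not part of any standard
library; the sketch simply wraps `getc` and keeps the status separate from
the data, so the byte can live in an `unsigned char` without colliding with
an in-band sentinel.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes for one attempted read. */
enum byte_status { BYTE_OK, BYTE_EOF, BYTE_ERROR };

/* Hypothetical "pair" result: a status plus the byte that was read. */
struct byte_result {
    enum byte_status status;
    unsigned char byte;   /* meaningful only when status == BYTE_OK */
};

/* Read one byte from fp, reporting EOF and errors out of band. */
struct byte_result read_byte(FILE *fp)
{
    struct byte_result r = { BYTE_OK, 0 };
    int c;

    assert(fp != NULL);
    c = getc(fp);               /* still an int at this level */
    if (c == EOF) {
        r.status = ferror(fp) ? BYTE_ERROR : BYTE_EOF;
        return r;
    }
    r.byte = (unsigned char)c;  /* safe: c is in 0..UCHAR_MAX here */
    return r;
}
```

A caller would loop with something like
`for (r = read_byte(fp); r.status == BYTE_OK; r = read_byte(fp))`, and a byte
of value 255 comes back as ordinary data with status `BYTE_OK`.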
> But I'm just making noise because I'm sure everyone knows all this.
>
I think it's worthwhile stating these things explicitly, sometimes.
- Dan C.
> On May 15, 2020, at 4:18 PM, ron@ronnatalie.com wrote:
> >
> > EOF is defined to be -1.
> > getchar() returns int, but if c is an unsigned char, the value of
> > (c = getchar()) will be 255. This will never compare equal to -1.
> >
> >
> >
> > Ron,
> >
> > Hmmm... getchar/getc are defined as returning int in the man page and C
> > is traditionally defined as an int in this code..
> >
> > On Fri, May 15, 2020 at 4:02 PM <ron@ronnatalie.com> wrote:
> >> Unfortunately, if c is char on a machine with unsigned chars, or it’s
> >> of type unsigned char, the EOF will never be detected.
> >>
> >>
> >>
> >>> • while ((c = getchar()) != EOF) if (c == '\n') { /* entire record
> >>> is now there */
>
>