From: Larry McVoy <lm@mcvoy.com>
To: Dave Horsfall <dave@horsfall.org>
Cc: The Eunuchs Hysterical Society <tuhs@tuhs.org>
Subject: [TUHS] Re: Maximum Array Sizes in 16 bit C
Date: Fri, 20 Sep 2024 08:30:44 -0700
Message-ID: <20240920153044.GE8905@mcvoy.com>
In-Reply-To: <11d46ab4-b90c-83fe-131a-ee399eebf342@horsfall.org>
On Sat, Sep 21, 2024 at 01:07:11AM +1000, Dave Horsfall wrote:
> On Fri, 20 Sep 2024, Paul Winalski wrote:
>
> > On Thu, Sep 19, 2024 at 7:52 PM Rich Salz <rich.salz@gmail.com> wrote:
> >
> > In my first C programming job I saw the source to V7 grep which
> > had a "foo[-2]" construct.
> >
> > That sort of thing is very dangerous with modern compilers.  Does K&R C
> > require that variables be allocated in the order that they are declared?  If
> > not, you're playing with fire.  To get decent performance out of modern
> > processors, the compiler must perform data placement to maximize cache
> > efficiency, and that practically guarantees that you can't rely on
> > out-of-bounds array references.
>
> [...]
>
> Unless I'm mistaken (quite possible at my age), the OP was referring to
> that in C, pointers and arrays are pretty much the same thing i.e.
> "foo[-2]" means "take the pointer 'foo' and go back two things" (whatever
> a "thing" is).
Yes, but that was a stack variable. Let me see if I can say it more clearly.
    foo()
    {
        int a = 1, b = 2;
        int alias[5];

        alias[-2] = 0;  // try and set a to 0.
    }
In v7 days, the stack would look like
[stuff]
[2 bytes for a]
[2 bytes for b]
[2 bytes for the alias address, which I think points forward]
[10 bytes for alias contents]
I'm hazy on how the space for alias[] is allocated, so I made that up. It's
probably something like I said but Paul (or someone) will correct me.
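A minimal sketch for checking what a given compiler actually does with locals like these, without writing out of bounds. The helper name report_layout is hypothetical and not from the thread. It only formats the addresses of a, b and alias into a caller-supplied buffer. Taking an address forces each variable into memory, and the result will likely differ between compilers and optimization levels.

```c
#include <stdio.h>

/* Hypothetical helper, not from the thread: formats the addresses of the
 * locals into buf so the layout this compiler chose can be inspected.
 * Nothing is read or written out of bounds.
 * Returns whatever snprintf returns. */
int report_layout(char *buf, size_t len)
{
    int a = 1, b = 2;
    int alias[5] = {0};

    return snprintf(buf, len,
                    "a=%p b=%p alias=%p (a=%d b=%d alias[0]=%d)\n",
                    (void *)&a, (void *)&b, (void *)alias,
                    a, b, alias[0]);
}
```

Printing the buffer at -O0 and again at -O2 is one way to see whether the declared order survives on your toolchain.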
When using a negative index for alias[], the coder is assuming that the stack
variables are placed in the order they were declared. Paul tried to explain
that _might_ be true but is not always true. Modern compilers will look to see
which variables are used the most in the function, and place them next to
each other so that if you have the cache line for one heavily used variable,
the other one is right there next to it. Like so:
int heavy1 = 1;
int rarely1 = 2;
int spacer[10];
int heavy2 = 3;
int rarely2 = 4;
The compiler might figure out that heavy{1,2} are used a lot and lay out the
stack like so:
[2 bytes (or 4 or 8 these days) for heavy1]
[bytes for heavy2]
[bytes for rarely1]
[bytes for spacer[10]]
[bytes for rarely2]
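If declaration order really matters, one hedged alternative to trusting the stack layout is to put the variables in a struct, since C lays out struct members in declaration order (padding aside). The struct name frame and the function clear_a are made up for this sketch. Note that alias[-2] would still be out of bounds as far as the language is concerned, so the sketch names the member instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, named here for illustration only. */
struct frame {
    int a;
    int b;
    int alias[5];
};

/* Struct members keep their declaration order, unlike separate locals. */
static_assert(offsetof(struct frame, a) < offsetof(struct frame, b),
              "a precedes b");
static_assert(offsetof(struct frame, b) < offsetof(struct frame, alias),
              "b precedes alias");

/* Name the member directly instead of reaching it through alias[-2]. */
int clear_a(struct frame *f)
{
    f->a = 0;
    return f->a;
}
```

The static_assert lines need a C11 or later compiler; they cost nothing at run time and fail the build if the ordering assumption were ever wrong.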
Paul was saying that using a negative index in the array creates an alias,
or another name, for the scalar integer on the stack (his description made
me understand, for the first time in decades, why compiler writers hate
aliases and I get it now). Aliases mess hard with optimizers. Optimizers
may reorder the stack for better cache line usage and what you think
array[-2] means doesn't work any more unless the optimizer catches that
you made an alias and preserves it.
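For contrast, a small sketch of a case where a negative index is well defined: the pointer points into the middle of a single array, so "going back two things" stays inside that array. The names storage and negative_index_demo are hypothetical, and nothing in the thread establishes that V7 grep did it this way.

```c
/* Hypothetical example: the negative index never leaves the array. */
int negative_index_demo(void)
{
    int storage[7] = {1, 2, 0, 0, 0, 0, 0};
    int *alias = storage + 2;   /* alias[0] is storage[2] */

    alias[-2] = 0;              /* writes storage[0], which is in bounds */
    return storage[0] + storage[1];
}
```

Because alias and storage name the same object, the optimizer has to honor the store; there is no separate scalar whose placement it is free to change.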
Paul, how did I do? I'm not a compiler guy, just had to learn enough to
walk the stack when the kernel panics.