From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,HTML_MESSAGE,MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 3086 invoked from network); 24 Aug 2020 16:00:18 -0000 Received: from minnie.tuhs.org (45.79.103.53) by inbox.vuxu.org with ESMTPUTF8; 24 Aug 2020 16:00:18 -0000 Received: by minnie.tuhs.org (Postfix, from userid 112) id 569E29CABB; Tue, 25 Aug 2020 02:00:14 +1000 (AEST) Received: from minnie.tuhs.org (localhost [127.0.0.1]) by minnie.tuhs.org (Postfix) with ESMTP id B61BD9CAB3; Tue, 25 Aug 2020 01:59:28 +1000 (AEST) Authentication-Results: minnie.tuhs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="HJSO2IPx"; dkim-atps=neutral Received: by minnie.tuhs.org (Postfix, from userid 112) id 548599CAB3; Tue, 25 Aug 2020 01:59:25 +1000 (AEST) Received: from mail-qk1-f196.google.com (mail-qk1-f196.google.com [209.85.222.196]) by minnie.tuhs.org (Postfix) with ESMTPS id 407619CAB1 for ; Tue, 25 Aug 2020 01:59:24 +1000 (AEST) Received: by mail-qk1-f196.google.com with SMTP id x69so7829694qkb.1 for ; Mon, 24 Aug 2020 08:59:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=+fWLZIVSL2x3nkXnAvp3kvR3MmizcJsltO0xAsQ4/68=; b=HJSO2IPxgavHBPa7MHynxpR9tE+a4+vgLLMJn3trXRAzfBvFVb+tt9yNx8Er/xo5F/ nYAwakWZkoCmxUTlJpqc/Xg31zEkN27LwFR2BRvlrvJD5oLMFSiEn1u2qB7zTOXEwIJh P1tVBJ1WKDH4UDV7hxbhq7YfWDVGIvGyk2S3N6SzbYdJVDes/tyhan+ORl+78LLUePV9 aS/ZO8qcj/VfTkHy1oiJl9sCO5zBp65UTA4vW7BnmmZ248n5UFUhDRp17hotT6eMn8Tr cQC96Ya6Bb4z25yRF5hPYGj+ANT1eQiCDr3mvzzr3xitgNnbsWe0k8X17CaDb+rmJ94s v24A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=+fWLZIVSL2x3nkXnAvp3kvR3MmizcJsltO0xAsQ4/68=; b=LQ3Z2XBl6FS9O7H7Ngnd+zkRG0Eeg6+4GC+a2M2gkteh072TPREf1Din1rkS+Kfe4f zcrcop0QT5zcU1jtaz9IsZe9OWcnbm/W1L9Ovwm+QPG2vscwBb2c9SvoRN9suowpyxPA WWnD3IGAGS+2F/jJmNPwCungaOukBRcWI4ljl7gc5PRxvajFzLza0wSsVzJ4LGusUfgG Plp81CTqNV0/SpDSkkRb652TfHt+yXw9ggziiS2pZcIOHnNeUzMCINIaXCGhxWT5CUlO xswVfC9R75ziyOT9pp8S1Z80mg0LVf9pGXM+SKbnPpY2b5P9nRuAhiO6XqBDDkM/2m69 jVHw== X-Gm-Message-State: AOAM531Eu/qewtf+DNIq+lGOoMOf4rUH/OLdRHl7BdE2twaMkVnGpWDt f5p5LRwbfmAb5w9ihhWh67z/+xTlLweXjGK8wnoLwraGlB8= X-Google-Smtp-Source: ABdhPJzqgt3aaP3fV8058utJa2X2xbawfxvu2O6LIO1/IJgHAqEQDgdyl43ZSwLE7FHlUzGynxsL8loCZqaJZVmeZyQ= X-Received: by 2002:a05:620a:9d4:: with SMTP id y20mr5059666qky.14.1598284763008; Mon, 24 Aug 2020 08:59:23 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Dan Cross Date: Mon, 24 Aug 2020 11:58:46 -0400 Message-ID: To: Dibyendu Majumdar Content-Type: multipart/alternative; boundary="0000000000008c963705ada1ab5b" Subject: Re: [TUHS] Memory management in Dennis Ritchie's C Compiler X-BeenThere: tuhs@minnie.tuhs.org X-Mailman-Version: 2.1.26 Precedence: list List-Id: The Unix Heritage Society mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: The TUHS Errors-To: tuhs-bounces@minnie.tuhs.org Sender: "TUHS" --0000000000008c963705ada1ab5b Content-Type: text/plain; charset="UTF-8" On Mon, Aug 17, 2020 at 4:16 PM Dibyendu Majumdar wrote: > On Mon, 17 Aug 2020 at 17:13, Dan Cross wrote: > > From my light skimming of V10 sources, it appears that the various > components of the default C compiler (that is, not LCC) either use > malloc/free or call `sbrk` directly. > > Yes, it only uses sbrk(). No, that's not true; or at least it's only partially true. Note that I'm talking specifically about 10th Edition here. The compiler is implemented as several cooperating processes, some of which employ different memory management strategies. Notice, for example, that `ccom` uses `malloc` but not `sbrk`, while `c2` does the opposite and uses `sbrk` but not `malloc`. On the other hand, `cpp` does no dynamic memory allocation and instead has fixed-size symbol tables for preprocessor definitions etc. Finally, the assembler uses `sbrk` but `ld` uses `malloc`. > One consequence I think is that sbrk() > expands the process memory without invalidating existing use of memory > - so the code is able to periodically expand heap while retaining all > existing allocations. > A simple workaround I used was to preallocate a heap and just stub out > sbrk() calls - so that works. So in essence given a single chunk of > memory (if large enough - which is still quite small by today's > standards) the compiler manages fine. > > However I find this unsatisfactory and would like to improve it. But > it is a bit difficult to understand how the memory is being used. > So in a nutshell, `sbrk` works by setting the "break" pointer: that is, the highest usable virtual address in the process's data segment. Stacks may be at the top of the user portion of the address space; but I'd have to double check the details. When the process starts up and gets going, one can discover the initial value of that pointer by executing `sbrk(0)`: since `sbrk` takes a delta and returns the old value of the break, the return value will be (basically) the current end of the data segment. By manipulating the break pointer via `sbrk`, one can cause the kernel to allocate memory to a process. That memory can then be managed internally (that is, within the process) via e.g. malloc _or_ a hand-rolled allocator (e.g., a "bump" allocator): when you want to put something in memory, you know where the next available pointer is (after the last one you allocated), you know how much space is available on the heap (since you allocated it using `sbrk`) and you presumably know how big the thing you want to store is; if the current pointer + alignment + size of next object is > break, you allocate more memory using `sbrk` and continue. There are a couple of approaches you can take to modernize this. The first you've already discovered: preallocate a heap that you manage with a stub `sbrk` and be done with it. Another might be to try and find some otherwise unallocated region of the address space and using `mmap()` to allocate successive pages to that portion of the address space to simulate `sbrk`; this may fail if you run into something that's already allocated in one of those regions. The most robust way might be the hardest, which is to replace the `sbrk`-based allocator with calls to e.g. `malloc`. However, looking a little more closely at 10th Ed's `c2`, for example`, this doesn't seem like it would be all that bad: `sbrk` is mostly wrapped by a function called `alloc` that takes an integer number of bytes to allocate and returns a `struct node *`; the return value is cast to whatever type is needed though. The use of `struct node *` as the return value is almost certainly because the compiler was written before `void *` had achieved primacy in the ANSI C world. Anyway, I suspect if you simply replaced the guts of `alloc` with `return malloc(an);` you'd get approximately what you want. Memory can be used for declarations, trees (for expressions) and > strings as far as I can tell. Strings actually use the tree > allocation, and just pretend that a node is a string. > Not exactly. The return type of `alloc` is `struct node *`, but that's just a detail; in the dialect of C that e.g. 10th Ed's PCC is written in, the return type doesn't matter: it's just a pointer. You can cast it to `char *` and copy string data there if you want to. I don't know why `struct node *` was chosen for `alloc`, but I suspect simply convenience. It seems that tree memory is allocated in a stack discipline. But what > puzzled me is that when a tree starts, about 512 bytes of memory are > left as gap for declarations to use. I have been trying to think in > what circumstances would you encounter a declaration while parsing an > expression - perhaps cast expressions? Anyway - if a declaration > occurs inside an expression (i.e. tree) then it only has 512 bytes > available. Of course this could be made bigger ... but at the least I > would like to have separate heaps for declarations, trees and strings. > This sounds like it's something independent of the allocator and more about the program's strategy around allocation in general. It's a common trick in C programs to e.g., allocate the space for some structs plus things that may be pointed to by those structs in a single allocation; not only does this cut down on overhead with respect to calling into the allocator, it also gives you locality. I don't think you want separate _heaps_ for the things you mention, though you may want to allocate them separately using independent calls to e.g. `malloc`. That's fine. I guess no one really dug into this code - as presumably once PCC came > along the original compiler by Dennis stopped being used. > 10th Edition `cc` _is_ `pcc`. I haven't looked at the 7th Ed compiler in detail. - Dan C. --0000000000008c963705ada1ab5b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Mon, Aug 17, 2020 at 4:16 PM Dibyendu = Majumdar <mobile@majumdar.org.= uk> wrote:
On Mon, 17 Aug 2020 at 17:13, Dan Cross <crossd@gmail.com>= ; wrote:
> From my light skimming of V10 sources, it appears that the various com= ponents of the default C compiler (that is, not LCC) either use malloc/free= or call `sbrk` directly.

Yes, it only uses sbrk().

No, that's no= t true; or at least it's only partially true. Note that I'm talking= specifically about 10th Edition here.

The compile= r is implemented as several cooperating processes, some of which employ dif= ferent memory management strategies. Notice, for example, that `ccom` uses = `malloc` but not `sbrk`, while `c2` does the opposite and uses `sbrk` but n= ot `malloc`. On the other hand, `cpp` does no dynamic memory allocation and= instead has fixed-size symbol tables for preprocessor definitions etc. Fin= ally, the assembler uses `sbrk` but `ld` uses `malloc`.
=C2=A0
One consequence I thin= k is that sbrk()
expands the process memory without invalidating existing use of memory
- so the code is able to periodically expand heap while retaining all
existing allocations.
A simple workaround I used was to preallocate a heap and just stub out
sbrk() calls - so that works. So in essence given a single chunk of
memory (if large enough - which is still quite small by today's
standards) the compiler manages fine.

However I find this unsatisfactory and would like to improve it. But
it is a bit difficult to understand how the memory is being used.

So in a nutshell, `sbrk` works by setting the &q= uot;break" pointer: that is, the highest usable virtual address in the= process's data segment. Stacks may be at the top of the user portion o= f the address space; but I'd have to double check the details. When the= process starts up and gets going, one can discover the initial value of th= at pointer by executing `sbrk(0)`: since `sbrk` takes a delta and returns t= he old value of the break, the return value will be (basically) the current= end of the data segment.

By manipulating the brea= k pointer via `sbrk`, one can cause the kernel to allocate memory to a proc= ess. That memory can then be managed internally (that is, within the proces= s) via e.g. malloc _or_ a hand-rolled allocator (e.g., a "bump" a= llocator): when you want to put something in memory, you know where the nex= t available pointer is (after the last one you allocated), you know how muc= h space is available on the heap (since you allocated it using `sbrk`) and = you presumably know how big the thing you want to store is; if the current = pointer=C2=A0+ alignment=C2=A0+ size of next object is > break, you allocate more memory us= ing `sbrk` and continue.

There are a couple of app= roaches you can take to modernize this. The first you've already discov= ered: preallocate a heap that you manage with a stub `sbrk` and be done wit= h it. Another might be to try and find some otherwise unallocated region of= the address space and using `mmap()` to allocate successive pages to that = portion of the address space to simulate `sbrk`; this may fail if you run i= nto something that's already allocated in one of those regions.

The most robust way might be the hardest, which is to rep= lace the `sbrk`-based allocator with calls to e.g. `malloc`. However, looki= ng a little more closely at 10th Ed's `c2`, for example`, this doesn= 9;t seem like it would be all that bad: `sbrk` is mostly wrapped by a funct= ion called `alloc` that takes an integer number of bytes to allocate and re= turns a `struct node *`; the return value is cast to whatever type is neede= d though. The use of `struct node *` as the return value is almost certainl= y because the compiler was written before `void *` had achieved primacy in = the ANSI C world.

Anyway, I suspect if you simply = replaced the guts of `alloc` with `return malloc(an);` you'd get approx= imately what you want.

Memory can be used for declarations, trees (for expressions) and
strings as far as I can tell. Strings actually use the tree
allocation, and just pretend that a node is a string.
=
Not exactly. The return type of `alloc` is `struct node *`, = but that's just a detail; in the dialect of C that e.g. 10th Ed's P= CC is written in, the return type doesn't matter: it's just a point= er. You can cast it to=C2=A0`char *` and copy string data there if you want= to. I don't know why `struct node *` was chosen for=C2=A0`alloc`, but = I suspect simply convenience.

It seems that tree memory is allocated in a stack discipline. But what
puzzled me is that when a tree starts, about 512 bytes of memory are
left as gap for declarations to use. I have been trying to think in
what circumstances would you encounter a declaration while parsing an
expression - perhaps cast expressions? Anyway - if a declaration
occurs inside an expression (i.e. tree) then it only has 512 bytes
available. Of course this could be made bigger ... but at the least I
would like to have separate heaps for declarations, trees and strings.
<= /blockquote>

This sounds like it's something indepen= dent of the allocator and more about the program's strategy around allo= cation in general. It's a common trick in C programs to e.g., allocate = the space for some structs plus things that may be pointed to by those stru= cts in a single allocation; not only does this cut down on overhead with re= spect to calling into the allocator, it also gives you locality.
=
I don't think you want separate _heaps_ for the things y= ou mention, though you may want to allocate them separately using independe= nt calls to e.g. `malloc`. That's fine.

I guess no one really dug into this code - as presumably once PCC came
along the original compiler by Dennis stopped being used.
<= div>
10th Edition `cc` _is_ `pcc`. I haven't looked at th= e 7th Ed compiler in detail.

=C2=A0 =C2=A0 =C2=A0 = =C2=A0 - Dan C.

--0000000000008c963705ada1ab5b--