On Tue, Jul 31, 2018, 3:49 AM Markus Wichmann, <nullplan@gmx.net> wrote:
On Mon, Jul 30, 2018 at 11:09:05AM -0400, Christopher Friedt wrote:
> I wrote it originally for using on the kernel-side of an rtos, but it could
> easily be extended to make a syscall when a process runs out of ram.
>
> Obviously, a shortcoming is that the memory blocks must be PoT and there is
> the potential for fragmentation. Otherwise though, the meta-information is
> intrinsic within the pointer, which obliviates the need for external
> storage.
>

That does sound interesting, but I don't really follow. The meta-info
which we need to save is the length of the object returned. How can you
encode that in the pointer without any external storage? Using internal
storage? Yeah, that is where we are right now, and that means there's
only one buffer overflow between us and oblivion at any given time.

Or do you essentially have a large block of memory, with one area for
the 16-byte objects, one for the 32-byte ones, &c. But then what do you
do if your given area runs out and the application still wants more?

https://en.wikipedia.org/wiki/Buddy_memory_allocation

The 'bins' do overlap (e.g. a 64 byte chunk can also be used as 2 32-byte chunks or "buddies"), but their size (when using a bitmap for metadata) is encoded by the position of the bit in the map. This also encodes if a parent (or child) of a specific memory region is already in use or free.

My original implementation was for an rtos with a compile-time-fixed memory size. In user space (since doubling ram usage is obviously bad) it would be best to use some container (e.g. an interval tree) to contain buddy-malloc nodes and general slabs of mmap. Current usage, a configurable fragmentation threshold, and the system pagesize, would determine if an allocation would use an existing buddy-malloc node, a new buddy-malloc node (PoT # of pages) or a sequence of pages. Pages would, of course, be allocated with mmap.

I've personally not implemented malloc debugging for a buddy allocator though, since bits are precious enough, and there is no fundamental requirement for it, although technically, it would mean adding some padding with a computable bit pattern. I suppose if checking is enabled, it can be run before free(3) returns or before realloc(3) does anything. A global check on every function call would be more thorough, but obviously more expensive.