I haven't looked at omalloc, but I wrote a dead simple buddy allocator for Bionic some time ago with support for malloc(3), calloc(3), realloc(3), free(3), and posix_memalign(3). It would obviously also support aligned_alloc(3) for C11.

It ran well on everything from an ARM Cortex-M0 to an Intel Core i7. The code compiled to 504 bytes of .text on the Cortex-M0, iirc.

I originally wrote it for use on the kernel side of an RTOS, but it could easily be extended to make a syscall when a process runs out of RAM.

Obviously, a shortcoming is that block sizes must be powers of two (PoT), and there is the potential for fragmentation. Otherwise though, the meta-information is intrinsic to the pointer, which obviates the need for external storage.
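
To make the power-of-two point concrete, here is a minimal sketch (not the code I wrote back then) of the two calculations a buddy allocator typically leans on: rounding a request up to a power-of-two order, and finding a block's buddy from nothing but its offset and order. The names order_for, buddy_of, and MIN_ORDER are hypothetical, and offsets are assumed to be relative to the start of a power-of-two-aligned heap.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical minimum block size of 2^4 = 16 bytes (an assumption). */
#define MIN_ORDER 4u

/* Smallest order k >= MIN_ORDER such that (1 << k) >= n.
 * The k bound only prevents an over-wide shift; a real allocator
 * would reject requests larger than the heap before getting here. */
static unsigned order_for(size_t n)
{
    unsigned k = MIN_ORDER;
    while (k < sizeof(size_t) * CHAR_BIT - 1 && ((size_t)1 << k) < n)
        k++;
    return k;
}

/* Offset of the buddy of the block that starts at `off` and has
 * size (1 << order). Flipping that single bit gives the sibling,
 * so no lookup table is needed to find it. */
static size_t buddy_of(size_t off, unsigned order)
{
    return off ^ ((size_t)1 << order);
}
```

For example, a 100-byte request lands in a 128-byte block (order 7), which is also where the wasted space inside a block comes from.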

Every call takes approximately the same time on average, on the order of the tree depth, which is about log2(n) for a binary tree with n leaf blocks.
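
As a rough illustration of that bound, the sketch below counts the levels between the whole heap and the smallest block, which is the most splits or merges a single call could need. The function name tree_depth is hypothetical, and both sizes are assumed to be nonzero powers of two.

```c
#include <stddef.h>

/* Number of split levels between the whole heap and the smallest
 * block, i.e. log2(heap_size / min_block).
 * Assumes both arguments are nonzero powers of two and
 * min_block <= heap_size. */
static unsigned tree_depth(size_t heap_size, size_t min_block)
{
    unsigned depth = 0;
    while (min_block < heap_size) {
        min_block <<= 1;
        depth++;
    }
    return depth;
}
```

With a 64 KiB heap and 16-byte minimum blocks, tree_depth(65536, 16) is 12, so an allocation or free walks at most 12 levels.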

Objectively speaking though, when choosing a new malloc implementation, the decision should be based on algorithmic complexity as well as on size and speed metrics.
