Other than compiler warnings, the main pain point I ran into porting a subset of Musl into a resource-constrained environment was the resource usage of stdio.  I don't expect any of these modifications to make it upstream.  Talking out loud as an FYI / user feedback.  Also curious to see if there's any wisdom out there.

Stack usage of stdio was an issue.  On arm64, printf takes 8k of stack, which is rough when you only have 4-12k of stack.  This is because fmt_fp allocates stack space proportional to log(MAX_LONG_DOUBLE), i.e. to the exponent range of long double.  It also gets inlined into printf, so you always take the hit.  (Marking fmt_fp noinline is a Faustian bargain that makes stack usage worse in the worst case... hmmm.)  On arm64, long double is defined as 128 bits, which not only increases stack size because of the larger mantissa, but also pulls in software emulation for fp128.  In terms of spec compliance, Musl is doing the right thing.  But as a practical matter, none of the programs I care about will ever use long double.  So my rough first pass was to reduce the max float size from long double to double.  In a later pass, I'll also add a knob to remove floating point formatting entirely.
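To put rough numbers on that, here is a small sketch that computes the size of the scratch buffer for a given float format.  The sizing expression is how I remember the declaration of the buffer in fmt_fp in Musl's vfprintf.c, so treat it as an assumption and check it against your tree; the function names are made up for this example.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <stdio.h>

/* Number of 32-bit words in fmt_fp's scratch buffer for a float format with
 * the given mantissa width and maximum binary exponent.
 * ASSUMPTION: mirrors the sizing expression as I recall it from vfprintf.c. */
size_t fmt_fp_big_words(int mant_dig, int max_exp)
{
	assert(mant_dig > 0 && max_exp > 0);
	size_t mantissa_words = (size_t)(mant_dig + 28) / 29 + 1;
	size_t exponent_words = (size_t)(max_exp + mant_dig + 28 + 8) / 9;
	return mantissa_words + exponent_words;
}

/* Same quantity in bytes. */
size_t fmt_fp_big_bytes(int mant_dig, int max_exp)
{
	return fmt_fp_big_words(mant_dig, max_exp) * sizeof(uint32_t);
}

/* Print the buffer size for this platform's long double and double. */
void print_fmt_fp_buffer_sizes(void)
{
	printf("long double: %zu bytes\n",
	       fmt_fp_big_bytes(LDBL_MANT_DIG, LDBL_MAX_EXP));
	printf("double:      %zu bytes\n",
	       fmt_fp_big_bytes(DBL_MANT_DIG, DBL_MAX_EXP));
}
```

Under that assumption, IEEE binary128 (113-bit mantissa, max exponent 16384) needs 7368 bytes for the buffer alone, while double (53-bit mantissa, max exponent 1024) needs 504 bytes.  That lines up with the 8k figure above and is why capping the format at double helps so much.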

%m calls strerror, which pulls in a string table, so removing support for %m lets static linking and DCE work their magic.  I also eliminated %n for security hardening reasons.
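As an illustration of the idea (not the actual patch), here is a toy formatter with a compile-time knob.  MINI_ENABLE_M, put_str and mini_format are hypothetical names invented for this sketch.  With the knob off, nothing references strerror, so a static link can drop it and its string table; %n is simply never accepted.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time knob, not a Musl option: set to 1 to keep %m. */
#ifndef MINI_ENABLE_M
#define MINI_ENABLE_M 0
#endif

/* Append s to out (capacity cap, kept NUL-terminated); return new length. */
size_t put_str(char *out, size_t cap, size_t len, const char *s)
{
	while (*s && len + 1 < cap)
		out[len++] = *s++;
	out[len] = '\0';
	return len;
}

/* Toy formatter: supports %s and %%, plus %m when enabled.
 * Returns the output length, or -1 on an unsupported conversion. */
int mini_format(char *out, size_t cap, const char *fmt, ...)
{
	va_list ap;
	size_t len = 0;
	int rc = 0;

	assert(out != NULL && fmt != NULL && cap > 0);
	out[0] = '\0';
	va_start(ap, fmt);
	for (; *fmt; fmt++) {
		if (*fmt != '%') {
			char lit[2] = { *fmt, '\0' };
			len = put_str(out, cap, len, lit);
			continue;
		}
		switch (*++fmt) {
		case 's':
			len = put_str(out, cap, len, va_arg(ap, const char *));
			break;
		case '%':
			len = put_str(out, cap, len, "%");
			break;
#if MINI_ENABLE_M
		case 'm': /* the only reference to strerror in this file */
			len = put_str(out, cap, len, strerror(errno));
			break;
#endif
		default: /* rejects %n always, and %m when the knob is off */
			rc = -1;
			goto done;
		}
	}
done:
	va_end(ap);
	return rc ? rc : (int)len;
}
```

Rejecting the conversion outright, rather than silently skipping it, keeps the argument list from getting out of sync with the format string.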

The "states" structure is sparse and takes a little more memory than I'd like: 464 bytes of rodata.  I don't see any workarounds without deeper changes, so for now I am living with it.
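For reference, the 464 figure is consistent with a byte table that has one row per length-modifier state and one column per character from 'A' to 'z'.  The sketch below just reproduces that arithmetic; the row count of 8 and the names are my assumptions for illustration, not copied from the source.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: eight prefix states (bare, l, ll, h, hh, L, z/t, j). */
enum { N_PREFIX_STATES = 8 };
/* One column per character in the range 'A'..'z' (58 in ASCII). */
enum { N_COLUMNS = 'z' - 'A' + 1 };

/* Stand-in with the same shape as the table; the contents don't matter here. */
const unsigned char table_shape[N_PREFIX_STATES][N_COLUMNS] = {{0}};

/* Size of the table in bytes. */
size_t states_table_bytes(void)
{
	return sizeof table_shape;
}
```

Most of those columns are punctuation or letters that are not conversion specifiers, which is where the sparseness comes from.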