Another issue is that 0l/vl seems to emit the wrong bits for single-precision floats in little-endian mode, for a similar reason: it uses bytes 4-7 instead of bytes 0-3. This seems to fix it:
% diff /sys/src/cmd/vl/asm.c asm.c
672c672,675
< buf.dbuf[l] = cast[fnuxi8[i+4]];
---
> if(little)
> buf.dbuf[l] = cast[fnuxi8[i]];
> else
> buf.dbuf[l] = cast[fnuxi8[i+4]];
An alternative fix would be to simply use fnuxi4 instead of fnuxi8, so that both BE and LE would work (I guess; I don't have a BE machine to test on).
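
To illustrate why the +4 offset only works for one byte order, here is a small standalone sketch. It is not code from vl: the tables le8, be8, le4 and be4 and the helpers pick4 and same4 are hypothetical stand-ins, and their contents are my assumption of what fnuxi8 and fnuxi4 would hold when the linker runs on a little-endian host. The source buffer holds 1.0f (0x3f800000) in little-endian host order, followed by four 0xAA filler bytes standing in for whatever lies past the float.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the linker's byte-order tables, as they are
 * assumed to look on a little-endian host (not copied from vl's source). */
static const int le8[8] = {0, 1, 2, 3, 4, 5, 6, 7};	/* little-endian target */
static const int be8[8] = {7, 6, 5, 4, 3, 2, 1, 0};	/* big-endian target */
static const int le4[4] = {0, 1, 2, 3};
static const int be4[4] = {3, 2, 1, 0};

/* 1.0f as stored on a little-endian host, then four filler bytes
 * representing unrelated memory beyond the 4-byte float. */
static const unsigned char src[8] = {
	0x00, 0x00, 0x80, 0x3f, 0xAA, 0xAA, 0xAA, 0xAA
};

/* Copy four bytes of src into out, in the order given by tab[off..off+3].
 * This mirrors the shape of: buf.dbuf[l] = cast[table[i+off]]. */
static void
pick4(const int *tab, int off, unsigned char out[4])
{
	int i;

	assert(off == 0 || off == 4);
	for(i = 0; i < 4; i++)
		out[i] = src[tab[i + off]];
}

/* Return 1 if the two 4-byte sequences are identical. */
static int
same4(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, 4) == 0;
}
```

Under these assumptions, pick4(be8, 4, out) yields 3f 80 00 00, the correct big-endian image, while pick4(le8, 4, out) yields only the 0xAA filler, i.e. bytes 4-7. Both pick4(le8, 0, out) and pick4(le4, 0, out) give the expected 00 00 80 3f, and pick4(be4, 0, out) gives 3f 80 00 00, which is why a 4-byte table indexed from 0 should cover both byte orders, provided the linker actually fills in fnuxi4 for both.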