Consider what `stalin' does in about 3300 lines of Scheme
code. It translates R4RS Scheme to C and takes a lot of time
doing so, but the code it generates is blazingly fast. The
kind of globally optimized C code you or I wouldn't have the
patience to write, or the ability to keep all that context in
our heads to do as good a job. Stalin compiles itself to
over 660K lines of C code! Then you give this C code to gcc
and it munches away for many minutes and finally dies on a
2GB system! If gcc were only capable of peephole
optimization, it would have been able to generate code much more
quickly and without needing gigabytes of memory.

Ha! Just tried to compile Stalin on my 4GB laptop... it quickly became a laptop fryer... OUCH!

I might try 6c or 8c in a bit for comparison.

-joe