> Well, I can see several reasons:
>   * Processors like powers of two, especially when it comes to the size
>     of a memory address, because of cache issues, so you'd better make
>     it 32 or 64 words than 33 or 65.
>   * If the tag bit can be anywhere in a word you have to spend extra
>     time to extract it, whereas when it is at a fixed place, especially
>     LSB or MSB, it is very cheap and easy.
>   * You would need two registers to access a value and its tag instead
>     of one, and registers are very precious, at least on IA-32
>     architectures.
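
For reference, the fixed-position case from the second point might look
roughly like this in C. This is only a sketch: the convention (LSB set
means immediate integer, LSB clear means aligned pointer) and the names
`value`, `is_immediate`, `box_int` and `unbox_int` are assumptions made
for illustration, not taken from any particular runtime.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed convention for this sketch: the least significant bit is the tag.
   LSB = 1: the word holds an immediate integer.
   LSB = 0: the word holds an aligned pointer. */
typedef uintptr_t value;

/* Testing the tag is a single AND with a constant mask. */
static bool is_immediate(value v) {
    return (v & 1u) != 0;
}

/* Shift the integer left by one and set the tag bit. */
static value box_int(intptr_t n) {
    return ((uintptr_t)n << 1) | 1u;
}

/* Drop the tag bit again. Right-shifting a negative signed value is
   implementation-defined in C; this assumes an arithmetic shift. */
static intptr_t unbox_int(value v) {
    return (intptr_t)v >> 1;
}
```

The tag lives in the same word (and register) as the value, which is the
point made above about not needing a second register.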

But who needs the tag bit? Only the garbage collector. Maybe it's
an advantage to see 32 tag bits as a whole, e.g. the question
"does the block contain any pointer" can be answered in a bit-parallel way.
Anyway, the garbage collector could work on blocks and would need
just one additional memory access per block.
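
To make the idea concrete, here is a minimal sketch in C. It assumes a
hypothetical layout in which one separate tag word describes a block of
32 data words, with bit i set when word i holds a pointer. The type
`tagged_block` and the three functions are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical layout: one 32-bit tag word per block of 32 data words.
   Assumption for this sketch: bit i set means words[i] is a pointer. */
typedef struct {
    uint32_t  tags;
    uintptr_t words[32];
} tagged_block;

/* "Does the block contain any pointer?" is one load and one comparison,
   covering all 32 tag bits at once. */
static bool block_has_pointer(const tagged_block *b) {
    return b->tags != 0;
}

/* Looking at a single word's tag costs a shift and a mask. */
static bool word_is_pointer(const tagged_block *b, unsigned i) {
    assert(i < 32);
    return ((b->tags >> i) & 1u) != 0;
}

/* Count the pointers in the block by clearing the lowest set bit
   until no tag bits remain. */
static unsigned count_pointers(const tagged_block *b) {
    uint32_t t = b->tags;
    unsigned n = 0;
    while (t != 0) {
        t &= t - 1;
        n++;
    }
    return n;
}
```

A collector scanning such a block would read `tags` once and could skip
the whole block when `block_has_pointer` returns false.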

Regards,
Christoph Bauer