Hi all, I noticed something while reading code today: Near the end of qsort(), we have this gem: shl(p, 2); pshift -= 2; p[0] ^= 7; shr(p, 1); Now, I don't know if I am missing something, but don't the shl and the shr partially cancel out? Isn't this the same as shl(p, 1); p[0] ^= 3; As it is, it isn't wrong, just weird. Ciao, Markus

```
On Wed, Jul 01, 2020 at 08:50:26PM +0200, Markus Wichmann wrote:
> Hi all,
>
> I noticed something while reading code today: Near the end of qsort(),
> we have this gem:
>
> shl(p, 2);
> pshift -= 2;
> p[0] ^= 7;
> shr(p, 1);
>
> Now, I don't know if I am missing something, but don't the shl and the
> shr partially cancel out? Isn't this the same as
>
> shl(p, 1);
> p[0] ^= 3;
>
> As it is, it isn't wrong, just weird.
Assuming non-overflow, I think they're equivalent (also assuming you
keep the pshift-=2). I've CC'd the original author but we've not been
in touch for a long time so I don't know whether to expect a response.
I don't have any insight on why it was done this way.
Rich
```

```
On Wed, Jul 01, 2020 at 04:44:56PM -0400, Rich Felker wrote:
> On Wed, Jul 01, 2020 at 08:50:26PM +0200, Markus Wichmann wrote:
> > Hi all,
> >
> > I noticed something while reading code today: Near the end of qsort(),
> > we have this gem:
> >
> > shl(p, 2);
> > pshift -= 2;
> > p[0] ^= 7;
> > shr(p, 1);
> >
> > Now, I don't know if I am missing something, but don't the shl and the
> > shr partially cancel out? Isn't this the same as
> >
> > shl(p, 1);
> > p[0] ^= 3;
> >
> > As it is, it isn't wrong, just weird.
>
> Assuming non-overflow, I think they're equivalent (also assuming you
> keep the pshift-=2).
Yes, it looks that way. I'm afraid I don't have any further insight -
it's been quite a while since I thought about the qsort code, and I've
not been doing much work on algorithms over the last couple of years.
The only thing I can think of is that one could figure out which
behaviour with regard to overflow in shl() should be the valid one. I
suspect that replacing it would be valid and this is some transformation
I did while implementing smoothsort without realizing that this can be
simplified.
Cheers,
Valentin
```

```
On Wed, Jul 01, 2020 at 11:23:09PM +0200, Valentin Ochs wrote:
> On Wed, Jul 01, 2020 at 04:44:56PM -0400, Rich Felker wrote:
> > On Wed, Jul 01, 2020 at 08:50:26PM +0200, Markus Wichmann wrote:
> > > Hi all,
> > >
> > > I noticed something while reading code today: Near the end of qsort(),
> > > we have this gem:
> > >
> > > shl(p, 2);
> > > pshift -= 2;
> > > p[0] ^= 7;
> > > shr(p, 1);
> > >
> > > Now, I don't know if I am missing something, but don't the shl and the
> > > shr partially cancel out? Isn't this the same as
> > >
> > > shl(p, 1);
> > > p[0] ^= 3;
> > >
> > > As it is, it isn't wrong, just weird.
> >
> > Assuming non-overflow, I think they're equivalent (also assuming you
> > keep the pshift-=2).
>
> Yes, it looks that way. I'm afraid I don't have any further insight -
> it's been quite a while since I thought about the qsort code, and I've
> not been doing much work on algorithms over the last couple of years.
> The only thing I can think of is that one could figure out which
> behaviour with regard to overflow in shl() should be the valid one. I
> suspect that replacing it would be valid and this is some transformation
> I did while implementing smoothsort without realizing that this can be
> simplified.
>
> Cheers,
> Valentin
>
Overflow on shl() is completely impossible. To overflow a shl(p, 2), we
would need the penultimate bit in p to be set. Every bit in p stands in for
a tree of that order, so if bit n is set, the heap contains a tree with
a number of elements equal to the n'th Leonardo number.
I don't know how big the Leonardo number corresponding to the
penultimate bit is, but I do know that halfway through the upper word
(wasn't the factor something like 1.44 or so?), the Leonardo numbers get
too big to be contained in a machine word. Meaning that tree would be
way too large to be addressed.
I concur that this looks like a missed optimization opportunity, and not
a deliberate design decision.
Ciao,
Markus
```

```
On Thu, Jul 02, 2020 at 04:44:47PM +0200, Markus Wichmann wrote:
> On Wed, Jul 01, 2020 at 11:23:09PM +0200, Valentin Ochs wrote:
> > On Wed, Jul 01, 2020 at 04:44:56PM -0400, Rich Felker wrote:
> > > On Wed, Jul 01, 2020 at 08:50:26PM +0200, Markus Wichmann wrote:
> > > > Hi all,
> > > >
> > > > I noticed something while reading code today: Near the end of qsort(),
> > > > we have this gem:
> > > >
> > > > shl(p, 2);
> > > > pshift -= 2;
> > > > p[0] ^= 7;
> > > > shr(p, 1);
> > > >
> > > > Now, I don't know if I am missing something, but don't the shl and the
> > > > shr partially cancel out? Isn't this the same as
> > > >
> > > > shl(p, 1);
> > > > p[0] ^= 3;
> > > >
> > > > As it is, it isn't wrong, just weird.
> > >
> > > Assuming non-overflow, I think they're equivalent (also assuming you
> > > keep the pshift-=2).
> >
> > Yes, it looks that way. I'm afraid I don't have any further insight -
> > it's been quite a while since I thought about the qsort code, and I've
> > not been doing much work on algorithms over the last couple of years.
> > The only thing I can think of is that one could figure out which
> > behaviour with regard to overflow in shl() should be the valid one. I
> > suspect that replacing it would be valid and this is some transformation
> > I did while implementing smoothsort without realizing that this can be
> > simplified.
> >
> > Cheers,
> > Valentin
> >
>
> Overflow on shl() is completely impossible. To overflow a shl(p, 2), we
> would need the penultimate bit in p to be set. Every bit in p stands in for
> a tree of that order, so if bit n is set, the heap contains a tree with
> a number of elements equal to the n'th Leonardo number.
>
> I don't know how big the Leonardo number corresponding to the
> penultimate bit is, but I do know that halfway through the upper word
> (wasn't the factor something like 1.44 or so?), the Leonardo numbers get
> too big to be contained in a machine word. Meaning that tree would be
> way too large to be addressed.
>
> I concur that this looks like a missed optimization opportunity, and not
> a deliberate design decision.
Indeed, I don't believe overflow is possible here; I just mentioned it
for completeness. I think the change proposed here is correct but I'll
hold off on touching it til after release since it's not fixing a bug,
just a minor missed simplification.
Rich
```