mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Morten Welinder <mwelinder@gmail.com>
To: musl@lists.openwall.com
Subject: Re: Possible infinite loop in qsort()
Date: Sun, 10 Jan 2016 11:35:04 -0500	[thread overview]
Message-ID: <CANv4PN=+v50LQdYM5TjJ-ejdWwrzz3PpXrWF-o9mZfhiNhG87g@mail.gmail.com> (raw)
In-Reply-To: <20160110040516.GQ238@brightrain.aerifal.cx>

> Note also that if you do want to use this code on an
> implementation without such a guarantee, only the case where
> the member size is 1 can possibly have >SIZE_MAX/2
> members.

That's correct.

> In that case, you can massively optimize out the whole sort
> by just counting the number of times each byte appears [...]

That isn't.  sizeof(char) is guaranteed to be 1.  Is it not, however,
required to represent bytes.  You could have sizeof(pointer) be
1 also and if you so, you really cannot do sorting by counting.

M.



On Sat, Jan 9, 2016 at 11:05 PM, Rich Felker <dalias@libc.org> wrote:
> On Sat, Jan 09, 2016 at 10:07:19AM +0100, Felix Janda wrote:
>> Markus Wichmann wrote:
>> > Hi all,
>> >
>> > This is the Leonardo number precompute loop in qsort():
>> >
>> >    for(lp[0]=lp[1]=width, i=2; (lp[i]=lp[i-2]+lp[i-1]+width) < size; i++);
>> >
>> > I haven't actually tested this, but is it possible that this can become
>> > infinite on x32? My reasoning is this:
>> >
>> > This loop calculates all Leonardo numbers (scaled by width) until one
>> > comes along that is greater than the array length. However, that number
>> > is never actually needed, we only need to calculate all Leonardo numbers
>> > smaller than array size. And there is another problem: What if that
>> > smallest Leonardo number greater than array size isn't representable in
>> > size_t? In that case, the final addition step will overflow and the
>> > inequation will never become false. So if an array is entered that has
>> > more elements than the largest representable Leonardo number scaled by
>> > width (for instance, an array with more than 866,988,873 ints (size 4)),
>> > the above loop becomes infinite: The next Leonardo number is
>> > 1,402,817,465, multiplied by 4 that is larger than 2^32, so on a 32-bit
>> > architecture, this will overflow.
>> >
>> > Then I thought more about this: Such an array would be just over 3GB
>> > long. You don't have that much address space available on most 32-bit
>> > archs because Linux selfishly hogs a whole GB of address space for the
>> > kernel. On 64-bit archs, Linux hogs half the address space, so no
>> > userspace array can be larger than the largest Leonardo number
>> > representable in 64 bits, so it looks like we're safe, right?
>> >
>> > Except there's x32: 4GB of address space and no kernel infringes on it
>> > (x32 is basically x86_64, but we keep the userspace pointers down to 32
>> > bits, so the kernel is way beyond what we're looking at).
>> >
>> > But as I said, we don't actually need the smallest Leonardo number
>> > greater than size, we only need the largest Leonardo numer smaller than
>> > size. So this problem could be solved by either of:
>> >
>> > 1. Checking for overflow.
>> > 2. Putting an absolute limit on i.
>> >
>> > Did I miss anything?
>>
>> musl enforces that object sizes should not be greater than PTRDIFF_MAX.
>> See for example the discussion at
>>
>> http://www.openwall.com/lists/musl/2013/06/27/7
>>
>> So there will not be objects of size 3GB with musl on x32. Since the
>> Leonardo numbers grow slower than 2^n in general no overflow should
>> happen if "size" is valid. Otherwise, UB was invoked.
>
> Note also that if you do want to use this code on an implementation
> without such a guarantee, only the case where the member size is 1 can
> possibly have >SIZE_MAX/2 members. In that case, you can massively
> optimize out the whole sort by just counting the number of times each
> byte appears (in size_t[UCHAR_MAX+1] space which is tiny), sorting the
> pairs (value,count) using the comparison function, then writing out
> each value the appropriate number of times.
>
> Rich


  parent reply	other threads:[~2016-01-10 16:35 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-09  8:21 Markus Wichmann
2016-01-09  9:07 ` Felix Janda
2016-01-10  4:05   ` Rich Felker
2016-01-10 10:33     ` Szabolcs Nagy
2016-01-10 11:38     ` Alexander Monakov
2016-01-10 11:38     ` Markus Wichmann
2016-01-10 12:15       ` Szabolcs Nagy
2016-01-12 12:25       ` Alexander Cherepanov
2016-01-12 12:48         ` Szabolcs Nagy
2016-01-12 14:31           ` Alexander Cherepanov
2016-01-12 16:22             ` Szabolcs Nagy
2016-01-14 22:21               ` Rich Felker
2016-01-14 22:17         ` Rich Felker
2016-01-10 16:35     ` Morten Welinder [this message]
2016-01-10 16:45       ` Jens Gustedt
2016-01-12 10:30 ` Alexander Cherepanov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CANv4PN=+v50LQdYM5TjJ-ejdWwrzz3PpXrWF-o9mZfhiNhG87g@mail.gmail.com' \
    --to=mwelinder@gmail.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).