From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/2417 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: Fix strverscmp Date: Wed, 5 Dec 2012 21:14:38 -0500 Message-ID: <20121206021438.GP20323@brightrain.aerifal.cx> References: <20121205110959.87b6111a.idunham@lavabit.com> <20121205193520.GN20323@brightrain.aerifal.cx> <20121205164329.c5cd3a20.idunham@lavabit.com> <20121206010000.GO20323@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1354760092 4530 80.91.229.3 (6 Dec 2012 02:14:52 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 6 Dec 2012 02:14:52 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-2418-gllmg-musl=m.gmane.org@lists.openwall.com Thu Dec 06 03:15:06 2012 Return-path: Envelope-to: gllmg-musl@plane.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1TgQzP-0005R8-5J for gllmg-musl@plane.gmane.org; Thu, 06 Dec 2012 03:15:03 +0100 Original-Received: (qmail 26529 invoked by uid 550); 6 Dec 2012 02:14:51 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 26520 invoked from network); 6 Dec 2012 02:14:50 -0000 Content-Disposition: inline In-Reply-To: <20121206010000.GO20323@brightrain.aerifal.cx> User-Agent: Mutt/1.5.21 (2010-09-15) Xref: news.gmane.org gmane.linux.lib.musl.general:2417 Archived-At: On Wed, Dec 05, 2012 at 08:00:00PM -0500, Rich Felker wrote: > > This should not be equal, but not for the reason I'd expected: a > > leading 0 is supposed to be interpreted as "0.0". Decimal points are > > not factored in... > > My understanding of the code is that, after it hits the first place > where the strings differ, it switches to reading a digit string as a > decimal number, and the result is the difference of those decimal > numbers. I just used the decimal point as an example because it > terminates the loop. > > I also don't understand what you mean about leading zero. If leading > zeros are not considered equal to the number without leading zeros, > then a simple algorithm would be to, on the first mismatching byte, > remember the difference, then count the number of consecutive digits > starting at that position in both strings. If this count is the same > (possibly zero) for both strings, return the saved byte difference. If > the count is different, consider the string with fewer digits to be > less than the one with more digits. This is trivial to implement with > no arithmetic, but I'm not sure it matches the original semantics. OK, I read the "spec" in the man page, and this seems to be the simplest implementation: 1. Find the first mismatching bytes. During this search, keep 1 bit of state that's set when a '0' byte is encountered now following a digit, and cleared when any non-digit byte is encountered. 2. Upon finding the first mismatching bytes, if either is a non-digit or '0' or if above-described flag is set, return the difference of the mismatching bytes. 3. Otherwise, count the number of consecutive digits beginning with the first mismatching bytes. If the count differs for the two strings, the string with the lower count compares less. Otherwise, return the difference of the first mismatching bytes. The rules in step 2 cover both the leading-zeros rule and non-numeric cases. The rules in step 3 cover the simple numeric comparison rule that the longer number is greater, with ties broken by comparing the most-significant position that differs. Does this sound correct to you? I like the concept of the flag in the first phase, which we could easily extend to treat other cases as non-numeric if desired. Rich