From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/2807 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: strverscmp Date: Sun, 17 Feb 2013 13:39:44 -0500 Message-ID: <20130217183944.GF20323@brightrain.aerifal.cx> References: <20130201071053.GA14593@brightrain.aerifal.cx> <20130201182340.c60061b6.idunham@lavabit.com> <20130202023833.GW20323@brightrain.aerifal.cx> <20130201200452.26687bed.idunham@lavabit.com> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1361126397 17319 80.91.229.3 (17 Feb 2013 18:39:57 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sun, 17 Feb 2013 18:39:57 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-2808-gllmg-musl=m.gmane.org@lists.openwall.com Sun Feb 17 19:40:17 2013 Return-path: Envelope-to: gllmg-musl@plane.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1U799t-0001CX-As for gllmg-musl@plane.gmane.org; Sun, 17 Feb 2013 19:40:17 +0100 Original-Received: (qmail 11327 invoked by uid 550); 17 Feb 2013 18:39:57 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 11319 invoked from network); 17 Feb 2013 18:39:56 -0000 Content-Disposition: inline In-Reply-To: <20130201200452.26687bed.idunham@lavabit.com> User-Agent: Mutt/1.5.21 (2010-09-15) Xref: news.gmane.org gmane.linux.lib.musl.general:2807 Archived-At: Hey, sorry it's taken me so long to get to reading and reviewing this well. Here are some comments, mostly stylistic, but one's about a corner-case bug. See below. With that said, the algorithm looks right to me, and I think we can commit soon. Rich On Fri, Feb 01, 2013 at 08:04:52PM -0800, Isaac Dunham wrote: > On Fri, 1 Feb 2013 21:38:33 -0500 > Rich Felker wrote: > > > > Fix strverscmp (patch same as the last time I sent it) > > > > I'm not sure whether we got it into a fully-working state or not; the > > conversation kinda died out last time. I'll review it again too. I > > remember it didn't look quite like the algorithm I described/proposed, > > but that doesn't mean it's wrong. It looked like it could at least use > > some streamlining though. > > The last review was just before I got it working. Here's the final version. > > Probably somewhere it could be optimized, though... > As long as it doesn't end up looking like the GNU one. > > -- > Isaac Dunham > diff --git a/src/string/strverscmp.c b/src/string/strverscmp.c > index 7054967..8f3f11f 100644 > --- a/src/string/strverscmp.c > +++ b/src/string/strverscmp.c > @@ -1,7 +1,41 @@ > +#define _GNU_SOURCE > +#include > #include > > int strverscmp(const char *l, const char *r) > { > - /* FIXME */ > - return strcmp(l, r); > + int haszero=1; > + while (*l && *r && l[0]==r[0]){ I used to do a lot of while loops like this, but now I feel like it makes it hard to follow the end-of-string logic -- a lot more code runs after the null terminator is hit, and it's nontrivial to follow through it all and make sure the right thing happens, as the subsequent code after the loop is now handling two very-distinct loop-exit conditions: 1. Loop exited due to mismatch. 2. Loop exited due to hitting end-of-string with no mismatch. What about instead doing something like this: while (*l==*r) { if (!*l) return 0; /* ... */ That way, the "equal strings" case is trivially correct. It also might generate better code. Note that I also made a couple stylistic changes here; unless you object, I prefer *p to p[0] when p is just being used as a moving pointer within a string, always accessed with offset 0. It's just more concise/less cluttered and avoids suggesting it's used as a string in itself. Also added a space before the open brace, for consistency with style elsewhere. > + if (l[0]=='0'){ > + if (haszero==1) { > + haszero=0; > + } > + } else if (isdigit(l[0])) { > + if (haszero==1) { > + haszero=2; > + } > + } else { > + haszero=1; > + } > + l++; r++; > + } > + if (haszero==1 && (l[0]=='0' || r[0]=='0')) { > + haszero=0; > + } > + if ((isdigit(l[0]) && isdigit(r[0]) ) && haszero) { > + int lenl=0, lenr=0, firstl=l[0], firstr=r[0]; These variables have the wrong types. int is not sufficient to store a character count in a string. You need size_t. > + while (isdigit(l++[0]) ) { > + lenl++; > + } > + while (isdigit(r++[0]) ) { > + lenr++; > + } > + if (lenl==lenr) { > + return (firstl - firstr); > + } else { > + return (lenl - lenr); > + } I see what's going on here, but I find saving the first mismatching characters (btw, you could have just saved the difference) then continuing to advance the l/r pointers a bit confusing at first. Why not instead do: while (isdigit(l[lenl])) lenl++;