From: Sergey Dmitrouk
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] conforming strverscmp() implementation
Date: Tue, 3 Mar 2015 21:27:26 +0200
Message-ID: <20150303192726.GA11305@zx-spectrum.accesssoftek.com>
References: <20150303104507.GA5094@zx-spectrum> <20150303155411.GH23507@brightrain.aerifal.cx>
In-Reply-To: <20150303155411.GH23507@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: "musl@lists.openwall.com"
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="7AUc2qLy4jB3hD7Z"
Content-Disposition: inline

--7AUc2qLy4jB3hD7Z
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Tue, Mar 03, 2015 at 07:54:11AM -0800, Rich Felker wrote:
> Have you run libc-test against it and checked that it
> fixes all the test failures there?
> I would guess it does since they're
> based on the man page examples but it would be good to double-check
> anyway if you haven't.

Yes, the related libc-test regression tests pass now. I also compared the
results with the glibc implementation and they seem to be correct for
numbers, but I just ran the check again in a directory with some non-ASCII
file names and got:

glibc:
  . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени

musl (new):
  русские буквы в имени . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11

musl (old):
  русские буквы в имени . .. 0 00 000 01 010 02 03 04 05 06 07 08 09 1 3 9 10 11

You can see that the sorting of Cyrillic file names differs from glibc in a
UTF-8 locale. I believe it's another bug, and this one is related to:

    return (*l - *r);

which should be changed (maybe even for 1.1.6) to something equivalent to:

    return ((unsigned char)*l - (unsigned char)*r);

in which case the results become:

glibc:
  . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени

musl (new):
  . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени

musl (old):
  . .. 0 00 000 01 010 02 03 04 05 06 07 08 09 1 3 9 10 11 русские буквы в имени

--
Sergey

--7AUc2qLy4jB3hD7Z
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: attachment; filename="strverscmp.c"

#define _GNU_SOURCE
#include <ctype.h>
#include <string.h>

int strverscmp(const char *l, const char *r)
{
	/* ln/rn: start of the digit run currently being scanned, or NULL */
	const char *ln=(isdigit(*l) ? l : NULL), *rn=(isdigit(*r) ? r : NULL);

	/* skip the common prefix, remembering where the current digit run began */
	while (*l==*r) {
		if (!*l) return 0;
		if (isdigit(*l)) {
			if (ln == NULL) {
				ln = l;
				rn = r;
			}
		} else {
			ln = NULL;
			rn = NULL;
		}
		l++;
		r++;
	}

	/* if either mismatching byte is neither a digit nor NUL, forget the digit run */
	if ((*l != '\0' && !isdigit(*l)) || (*r != '\0' && !isdigit(*r))) {
		ln = NULL;
		rn = NULL;
	}

	if (ln != NULL) {
		/* nonzero unless the run starts with '0' followed by another digit */
		int intl=(*ln != '0' || !isdigit(*(ln + 1)));
		int intr=(*rn != '0' || !isdigit(*(rn + 1)));
		if (intl ^ intr) {
			return intl ?
				1 : -1;
		} else if (intl) {
			/* neither run starts with '0' + digit: more remaining digits means larger */
			size_t lenl=0, lenr=0;
			while (isdigit(l[lenl])) lenl++;
			while (isdigit(r[lenr])) lenr++;
			if (lenl==lenr) {
				return ((unsigned char)*l - (unsigned char)*r);
			} else if (lenl>lenr) {
				return 1;
			} else {
				return -1;
			}
		} else {
			/* both runs start with '0' + digit: more leading zeros sorts first */
			size_t zl=0, zr=0;
			while (ln[zl]=='0') zl++;
			while (rn[zr]=='0') zr++;
			if (zl>zr) {
				return -1;
			} else if (zl