From: Sergey Dmitrouk <sdmitrouk@accesssoftek.com>
To: "musl@lists.openwall.com" <musl@lists.openwall.com>
Subject: Re: [PATCH] conforming strverscmp() implementation
Date: Tue, 3 Mar 2015 21:27:26 +0200
Message-ID: <20150303192726.GA11305@zx-spectrum.accesssoftek.com>
In-Reply-To: <20150303155411.GH23507@brightrain.aerifal.cx>
[-- Attachment #1: Type: text/plain, Size: 1499 bytes --]
On Tue, Mar 03, 2015 at 07:54:11AM -0800, Rich Felker wrote:
> Have you run libc-test against it and checked that it
> fixes all the test failures there? I would guess it does since they're
> based on the man page examples but it would be good to double-check
> anyway if you haven't.
Yes, the related libc-test regression tests pass now. I also compared
results with the glibc implementation and it seems to be correct for
numbers, but I just ran the check again in a directory with some non-ASCII
file names and got:
glibc: . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени
musl (new): русские буквы в имени . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11
musl (old): русские буквы в имени . .. 0 00 000 01 010 02 03 04 05 06 07 08 09 1 3 9 10 11
You can see that the sorting of Cyrillic file names differs from glibc in
a UTF-8 locale. I believe it's another bug, and this one is related to:
return (*l - *r);
which should be changed (maybe even for 1.1.6) to something equivalent to:
return ((unsigned char)*l - (unsigned char)*r);
in which case results become:
glibc: . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени
musl (new): . .. 000 00 01 010 02 03 04 05 06 07 08 09 0 1 3 9 10 11 русские буквы в имени
musl (old): . .. 0 00 000 01 010 02 03 04 05 06 07 08 09 1 3 9 10 11 русские буквы в имени
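To illustrate why the cast matters, here is a minimal sketch, not part of the patch. The helper names byte_diff_signed and byte_diff_unsigned are hypothetical. The first mimics what `*l - *r` yields on a platform where plain char is signed; the second is the proposed form. The first UTF-8 byte of "р" is 0xD1, which is negative as a signed char and therefore sorts before "." (0x2E), while as an unsigned char it is 209 and sorts after it.

```c
/* Hypothetical helper: behaves like `*l - *r` where plain char is signed.
 * Bytes >= 0x80 become negative values (on common implementations). */
int byte_diff_signed(const char *l, const char *r)
{
	return (signed char)*l - (signed char)*r;
}

/* Hypothetical helper: the proposed form, comparing bytes as 0..255. */
int byte_diff_unsigned(const char *l, const char *r)
{
	return (unsigned char)*l - (unsigned char)*r;
}
```

With "\xd1\x80" (the UTF-8 encoding of "р") on the left and "." on the right, the signed variant gives a negative result and the unsigned variant a positive one, which matches the two orderings shown above.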
--
Sergey
[-- Attachment #2: strverscmp.c --]
[-- Type: text/plain, Size: 1090 bytes --]
#define _GNU_SOURCE
#include <ctype.h>
#include <string.h>
int strverscmp(const char *l, const char *r)
{
	const char *ln=(isdigit(*l) ? l : NULL), *rn=(isdigit(*r) ? r : NULL);
	while (*l==*r) {
		if (!*l) return 0;
		if (isdigit(*l)) {
			if (ln == NULL) {
				ln = l; rn = r;
			}
		} else {
			ln = NULL; rn = NULL;
		}
		l++; r++;
	}
	if ((*l != '\0' && !isdigit(*l)) || (*r != '\0' && !isdigit(*r))) {
		ln = NULL; rn = NULL;
	}
	if (ln != NULL) {
		int intl=(*ln != '0' || !isdigit(*(ln + 1)));
		int intr=(*rn != '0' || !isdigit(*(rn + 1)));
		if (intl ^ intr) {
			return intl ? 1 : -1;
		} else if (intl) {
			size_t lenl=0, lenr=0;
			while (isdigit(l[lenl])) lenl++;
			while (isdigit(r[lenr])) lenr++;
			if (lenl==lenr) {
				return ((unsigned char)*l - (unsigned char)*r);
			} else if (lenl>lenr) {
				return 1;
			} else {
				return -1;
			}
		} else {
			size_t zl=0, zr=0;
			while (ln[zl]=='0') zl++;
			while (rn[zr]=='0') zr++;
			if (zl>zr) {
				return -1;
			} else if (zl<zr) {
				return 1;
			}
		}
	}
	return ((unsigned char)*l - (unsigned char)*r);
}
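For readers who want to reproduce a comparison like the directory listings above, the following is a rough sketch of sorting names with strverscmp() through qsort(). It is not the script used in this message. The names cmp_version and sort_names are hypothetical, and the sketch assumes a libc that declares strverscmp() under _GNU_SOURCE (glibc and musl do).

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>

/* qsort() comparator. The array elements are `const char *`, so each
 * argument points at a string pointer and must be dereferenced once. */
int cmp_version(const void *a, const void *b)
{
	const char *l = *(const char *const *)a;
	const char *r = *(const char *const *)b;
	return strverscmp(l, r);
}

/* Hypothetical helper: sort n names in place in version order. */
void sort_names(const char **names, size_t n)
{
	qsort(names, n, sizeof *names, cmp_version);
}
```

Calling sort_names() on the numeric names from the listings should put names with more leading zeros first, such as "000" before "00" before "01", in line with the glibc output quoted earlier.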