From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH] Byte-based C locale, draft 1
Date: Sat, 6 Jun 2015 21:17:38 -0400
Message-ID: <20150607011738.GB17573@brightrain.aerifal.cx>
In-Reply-To: <20150606214007.GA17398@brightrain.aerifal.cx>

[-- Attachment #1: Type: text/plain, Size: 1130 bytes --]

On Sat, Jun 06, 2015 at 05:40:07PM -0400, Rich Felker wrote:
> Before applying this I should probably overhaul fnmatch.c again. I
> believe it has some hard-coded UTF-8 processing code in it for the
> useless "check the tail before middle" step that I've been wanting to
> eliminate. Alternatively I could just apply a quick fix to make it
> work right without any invasive changes.
> 
> Other than possible weird cases with fnmatch (which are largely
> harmless but might inhibit matching high bytes in non-UTF-8 mode),
> this code should be ready for testing. I'd appreciate some feedback
> from anyone interested in the feature.

On further review, the special last-component handling fnmatch does is
not wrong, just wrongly ordered. It should take place after the "sea
of stars" component is processed, rather than before, to avoid an O(n)
operation (essentially strlen) when an early failure could be
detected. But since only the ordering is wrong, I think fixing it is
orthogonal to the bytelocale work, and a single-line patch to add a
case for MB_CUR_MAX==1 should just be added to this proposed patch
(see attached).

Rich

[-- Attachment #2: bytelocale_v1_fnmatch.diff --]
[-- Type: text/plain, Size: 717 bytes --]

diff --git a/src/regex/fnmatch.c b/src/regex/fnmatch.c
index 7f6b65f..978fff8 100644
--- a/src/regex/fnmatch.c
+++ b/src/regex/fnmatch.c
@@ -18,6 +18,7 @@
 #include <stdlib.h>
 #include <wchar.h>
 #include <wctype.h>
+#include "locale_impl.h"
 
 #define END 0
 #define UNMATCHABLE -2
@@ -229,7 +230,7 @@ static int fnmatch_internal(const char *pat, size_t m, const char *str, size_t n
 	 * On illegal sequences we may get it wrong, but in that case
 	 * we necessarily have a matching failure anyway. */
 	for (s=endstr; s>str && tailcnt; tailcnt--) {
-		if (s[-1] < 128U) s--;
+		if (s[-1] < 128U || MB_CUR_MAX==1) s--;
 		else while ((unsigned char)*--s-0x80U<0x40 && s>str);
 	}
 	if (tailcnt) return FNM_NOMATCH;
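
For readers who want to try the backward scan from the hunk above in
isolation, here is a minimal standalone sketch. The helper name
tail_start and the single_byte flag are hypothetical and not part of
musl; single_byte merely stands in for the MB_CUR_MAX==1 test, and the
surrounding fnmatch_internal logic is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a musl function): step back over tailcnt
 * characters from the end of str[0..n) and return a pointer to where
 * that tail begins, or NULL if the string holds fewer characters.
 * single_byte is an assumed stand-in for MB_CUR_MAX==1. */
static const char *tail_start(const char *str, size_t n,
                              size_t tailcnt, int single_byte)
{
	const char *s = str + n;

	assert(str != NULL);
	for (; s > str && tailcnt; tailcnt--) {
		/* ASCII byte, or every byte is a character: one step back. */
		if ((unsigned char)s[-1] < 128U || single_byte) s--;
		/* Otherwise skip UTF-8 continuation bytes (0x80..0xBF)
		 * until a non-continuation byte or the start is reached. */
		else while ((unsigned char)*--s - 0x80U < 0x40 && s > str);
	}
	return tailcnt ? NULL : s;
}
```

With the 4-byte string "a\xc3\xa9" "b" and tailcnt 2, the UTF-8 path
treats the two-byte sequence as one character and stops at offset 1,
while the single-byte path counts each byte and stops at offset 2.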
