From: Stephane Chazelas <stephane.chazelas@gmail.com>
To: Peter Stephenson <p.stephenson@samsung.com>,
	Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: [PATCH v4] [[:blank:]] only matches on SPC and TAB
Date: Wed, 16 May 2018 22:02:51 +0100
Message-ID: <20180516210250.GC1433@chaz.gmail.com>
In-Reply-To: <20180516163119.GB1433@chaz.gmail.com>

2018-05-16 17:31:19 +0100, Stephane Chazelas:
[...]
> > Is iswblank() guaranteed to be available?  It's covered by an extra set
> > of #ifdef's compared with the isblank() case but none of them is forcing
> > it to use C99 standard headers.
[...] 

I have to admit I'm not sure what you mean by that, and this is
the kind of thing I'm not very familiar with. AFAICT, the
AC_CHECK_FUNCS() checks that the iswblank symbol is available in
the libc. And Src/zsh_system.h looks like it should enable
enough of the feature test macros for the system headers to
expose it, but I may very well misunderstand things.
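
To make the point about feature test macros concrete, here is a minimal sketch, not taken from zsh: it assumes that defining _XOPEN_SOURCE to 600 before any #include is enough for the system headers to declare iswblank(). The helper name wide_is_blank() is hypothetical, and zsh itself sets its feature test macros in Src/zsh_system.h rather than like this.

```c
/*
 * Request C99 / POSIX.1-2001 declarations from the system headers.
 * This must appear before any #include. (Assumption: this macro is
 * sufficient on the target system; zsh does the equivalent elsewhere.)
 */
#define _XOPEN_SOURCE 600

#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper: returns 1 if wc is a blank character in the
 * current locale, 0 otherwise. In the default "C" locale only space
 * and tab are blank.
 */
static int wide_is_blank(wchar_t wc)
{
    return iswblank((wint_t)wc) != 0;
}
```

If the prototype were not exposed, a strict C99 compiler would reject the call to iswblank() even though the symbol exists in the libc, which is the gap between what AC_CHECK_FUNCS() verifies and what the headers declare.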

> In that v3 patch, I've added iswblank() in the list of functions
> to check before enabling "unicode support". Maybe we should do
> like for isblank() so that we can still have unicode support if
> iswalpha()... are present but not iswblank() (and have
> iswblank() check for spc and tab only then).
> 
> OK, I'll send a v4 patch tonight.
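
Taken in isolation, the fallback idea described above looks roughly like the sketch below. HAVE_ISWBLANK would normally come from the configure-generated config.h; here it is simply left undefined unless the compiler is given -DHAVE_ISWBLANK, and pattern_blank_match() is a hypothetical stand-in for the PP_BLANK case of the matcher.

```c
#include <wchar.h>
#include <wctype.h>

/*
 * If configure did not find iswblank() and the headers do not provide
 * it as a macro, approximate it with space and tab only. The macro
 * argument is parenthesized so that expressions expand safely.
 */
#if !defined(HAVE_ISWBLANK) && !defined(iswblank)
# define iswblank(c) ((c) == L' ' || (c) == L'\t')
#endif

/*
 * Hypothetical stand-in for the PP_BLANK test: returns 1 if ch counts
 * as blank, 0 otherwise. Uses the locale-aware iswblank() when it is
 * available and the SPC+TAB approximation when it is not.
 */
static int pattern_blank_match(wchar_t ch)
{
    return iswblank(ch) ? 1 : 0;
}
```

Either branch gives the same answer for space and tab; they only differ for locale-specific blanks, which the approximation never matches.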


diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 8b447e2..c791097 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2004,7 +2004,7 @@ The character is 7-bit, i.e. is a single-byte character without
 the top bit set.
 )
 item(tt([:blank:]))(
-The character is either space or tab
+The character is a blank character
 )
 item(tt([:cntrl:]))(
 The character is a control character
diff --git a/NEWS b/NEWS
index 1db9da6..1786897 100644
--- a/NEWS
+++ b/NEWS
@@ -4,7 +4,14 @@ CHANGES FROM PREVIOUS VERSIONS OF ZSH
 
 Note also the list of incompatibilities in the README file.
 
-Changes from 5.5 to 5.5.1
+Changes from 5.5.1 to FIXME
+---------------------------
+
+In shell patterns, [[:blank:]] now honours the locale instead of
+matching only space and tab, as the other POSIX character classes
+and extended regular expressions already do.
+
+Changes from 5.5 to 5.5.1
 -------------------------
 
 Apart from a fix for a configuration problem finding singal names from
diff --git a/Src/pattern.c b/Src/pattern.c
index fc7c737..737f5cd 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -3605,7 +3605,15 @@ mb_patmatchrange(char *range, wchar_t ch, int zmb_ind, wint_t *indptr, int *mtp)
 		    return 1;
 		break;
 	    case PP_BLANK:
-		if (ch == L' ' || ch == L'\t')
+#if !defined(HAVE_ISWBLANK) && !defined(iswblank)
+/*
+ * iswblank() is GNU and C99. There's a remote chance that some
+ * systems still don't support it (but would support the other ones
+ * if MULTIBYTE_SUPPORT is enabled).
+ */
+#define iswblank(c) ((c) == L' ' || (c) == L'\t')
+#endif
+		if (iswblank(ch))
 		    return 1;
 		break;
 	    case PP_CNTRL:
@@ -3840,7 +3848,14 @@ patmatchrange(char *range, int ch, int *indptr, int *mtp)
 		    return 1;
 		break;
 	    case PP_BLANK:
-		if (ch == ' ' || ch == '\t')
+#if !defined(HAVE_ISBLANK) && !defined(isblank)
+/*
+ * isblank() is GNU and C99. There's a remote chance that some
+ * systems still don't support it.
+ */
+#define isblank(c) ((c) == ' ' || (c) == '\t')
+#endif
+		if (isblank(ch))
 		    return 1;
 		break;
 	    case PP_CNTRL:
diff --git a/configure.ac b/configure.ac
index 4329afb..00c7318 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1304,6 +1304,7 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
 	       memcpy memmove strstr strerror strtoul \
 	       getrlimit getrusage \
 	       setlocale \
+	       isblank iswblank \
 	       uname \
 	       signgam tgamma \
 	       scalbn \
@@ -2564,6 +2565,12 @@ AC_HELP_STRING([--enable-multibyte], [support multibyte characters]),
 [AC_CACHE_VAL(zsh_cv_c_unicode_support,
   AC_MSG_NOTICE([checking for functions supporting multibyte characters])
   [zfuncs_absent=
+dnl
+dnl Note that iswblank is not included and checked separately.
+dnl As iswblank() was added to C long after the others, we still
+dnl want to enable unicode support even if iswblank is not available
+dnl (we then just do the SPC+TAB approximation)
+dnl
    for zfunc in iswalnum iswcntrl iswdigit iswgraph iswlower iswprint \
 iswpunct iswspace iswupper iswxdigit mbrlen mbrtowc towupper towlower \
 wcschr wcscpy wcslen wcsncmp wcsncpy wcrtomb wcwidth wmemchr wmemcmp \

-- 
Stephane

