From: Stephane Chazelas
To: Peter Stephenson, Zsh hackers list
Date: Wed, 16 May 2018 22:02:51 +0100
Subject: Re: [PATCH v4] [[:blank:]] only matches on SPC and TAB
Message-ID: <20180516210250.GC1433@chaz.gmail.com>
In-Reply-To: <20180516163119.GB1433@chaz.gmail.com>
References: <20180514064431.GB7263@chaz.gmail.com>
 <20180514094733.308bff1a@camnpupstephen.cam.scsc.local>
 <20180514123425.GA19631@chaz.gmail.com>
 <20180514145056.3eedaea9@camnpupstephen.cam.scsc.local>
 <20180514155131.GC7263@chaz.gmail.com>
 <18720.1526411161@thecus>
 <20180516131547.GA1433@chaz.gmail.com>
 <20180516144026.7c21e073@camnpupstephen.cam.scsc.local>
 <20180516163119.GB1433@chaz.gmail.com>
List-Id: Zsh Workers List
X-Seq: 42790

2018-05-16 17:31:19 +0100, Stephane Chazelas:
[...]
> > Is iswblank() guaranteed to be available?  It's covered by an extra set
> > of #ifdef's compared with the isblank() case but none of them is forcing
> > it to use C99 standard headers.
[...]

I have to admit I'm not sure what you mean by that. And those
are the kind of thing I'm not very familiar with.

AFAICT, the AC_CHECK_FUNCS() checks that the iswblank symbol is
available in the libc.
And Src/zsh_system.h looks like it should enable enough of the
feature test macros for the system headers to expose it, but I
may very well misunderstand things.

> In that v3 patch, I've added iswblank() in the list of functions
> to check before enabling "unicode support". Maybe we should do
> like for isblank() so that we can still have unicode support if
> iswalpha()... are present but not iswblank() (and have
> iswblank() check for spc and tab only then).
>
> OK, I'll send a v4 patch tonight.

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 8b447e2..c791097 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -2004,7 +2004,7 @@ The character is 7-bit, i.e. is a single-byte character without
 the top bit set.
 )
 item(tt([:blank:]))(
-The character is either space or tab
+The character is a blank character
 )
 item(tt([:cntrl:]))(
 The character is a control character
diff --git a/NEWS b/NEWS
index 1db9da6..1786897 100644
--- a/NEWS
+++ b/NEWS
@@ -4,7 +4,14 @@ CHANGES FROM PREVIOUS VERSIONS OF ZSH
 
 Note also the list of incompatibilities in the README file.
 
-Changes from %.5 to 5.5.1
+Changes from 5.5.1 to FIXME
+---------------------------
+
+In shell patterns, [[:blank:]] now honours the locale instead of
+matching exclusively on space and tab, like for the other POSIX
+character classes or for extended regular expressions.
+
+Changes from 5.5 to 5.5.1
 -------------------------
 
 Apart from a fix for a configuration problem finding singal names from
diff --git a/Src/pattern.c b/Src/pattern.c
index fc7c737..737f5cd 100644
--- a/Src/pattern.c
+++ b/Src/pattern.c
@@ -3605,7 +3605,15 @@ mb_patmatchrange(char *range, wchar_t ch, int zmb_ind, wint_t *indptr, int *mtp)
                     return 1;
                 break;
             case PP_BLANK:
-                if (ch == L' ' || ch == L'\t')
+#if !defined(HAVE_ISWBLANK) && !defined(iswblank)
+/*
+ * iswblank() is GNU and C99. There's a remote chance that some
+ * systems still don't support it (but would support the other ones
+ * if MULTIBYTE_SUPPORT is enabled).
+ */
+#define iswblank(c) (c == L' ' || c == L'\t')
+#endif
+                if (iswblank(ch))
                     return 1;
                 break;
             case PP_CNTRL:
@@ -3840,7 +3848,14 @@ patmatchrange(char *range, int ch, int *indptr, int *mtp)
                     return 1;
                 break;
             case PP_BLANK:
-                if (ch == ' ' || ch == '\t')
+#if !defined(HAVE_ISBLANK) && !defined(isblank)
+/*
+ * isblank() is GNU and C99. There's a remote chance that some
+ * systems still don't support it.
+ */
+#define isblank(c) (c == ' ' || c == '\t')
+#endif
+                if (isblank(ch))
                     return 1;
                 break;
             case PP_CNTRL:
diff --git a/configure.ac b/configure.ac
index 4329afb..00c7318 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1304,6 +1304,7 @@ AC_CHECK_FUNCS(strftime strptime mktime timelocal \
                memcpy memmove strstr strerror strtoul \
                getrlimit getrusage \
                setlocale \
+               isblank iswblank \
                uname \
                signgam tgamma \
                scalbn \
@@ -2564,6 +2565,12 @@ AC_HELP_STRING([--enable-multibyte], [support multibyte characters]),
 [AC_CACHE_VAL(zsh_cv_c_unicode_support,
   AC_MSG_NOTICE([checking for functions supporting multibyte characters])
   [zfuncs_absent=
+dnl
+dnl Note that iswblank is not included and checked separately.
+dnl As iswblank() was added to C long after the others, we still
+dnl want to enable unicode support even if iswblank is not available
+dnl (we then just do the SPC+TAB approximation)
+dnl
 for zfunc in iswalnum iswcntrl iswdigit iswgraph iswlower iswprint \
 iswpunct iswspace iswupper iswxdigit mbrlen mbrtowc towupper towlower \
 wcschr wcscpy wcslen wcsncmp wcsncpy wcrtomb wcwidth wmemchr wmemcmp \

-- 
Stephane
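
For anyone who wants to try the fallback from the Src/pattern.c hunks
in isolation, here is a minimal standalone sketch. It is not part of
the patch. HAVE_ISBLANK and HAVE_ISWBLANK are assumed to be defined by
a configure-generated header (as AC_CHECK_FUNCS would do); the
sketch_* names are hypothetical and only there to avoid redefining the
library names.

```c
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Assumption: HAVE_ISBLANK / HAVE_ISWBLANK would normally come from
 * config.h.  No config.h is included here, so unless the system
 * headers provide isblank/iswblank as macros, the SPC+TAB fallbacks
 * below are the ones that get used.
 */
#if defined(HAVE_ISBLANK) || defined(isblank)
# define sketch_isblank(c) isblank(c)
#else
# define sketch_isblank(c) ((c) == ' ' || (c) == '\t')
#endif

#if defined(HAVE_ISWBLANK) || defined(iswblank)
# define sketch_iswblank(c) iswblank(c)
#else
# define sketch_iswblank(c) ((c) == L' ' || (c) == L'\t')
#endif

/* Single-byte test; ch must be representable as unsigned char (or EOF). */
int sketch_narrow_is_blank(int ch)
{
    return sketch_isblank(ch) != 0;
}

/* Wide-character test, mirroring the PP_BLANK case for wchar_t input. */
int sketch_wide_is_blank(wchar_t ch)
{
    return sketch_iswblank(ch) != 0;
}
```

In the "C" locale both branches should agree: space and tab are blank,
everything else is not. The library versions may accept more
characters once setlocale() has selected another locale, which is the
behaviour change the NEWS entry describes. The fallback macros
parenthesise their argument so they stay safe when given an expression.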