X-Seq: 42775
Date: Mon, 14 May 2018 16:51:31 +0100
From: Stephane Chazelas
To: Peter Stephenson
Cc: Zsh hackers list
Subject: Re: [PATCH] [[:blank:]] only matches on SPC and TAB
Message-ID: <20180514155131.GC7263@chaz.gmail.com>
In-Reply-To: <20180514145056.3eedaea9@camnpupstephen.cam.scsc.local>
References: <20180513212553.GA29028@chaz.gmail.com>
 <20180514063611.GA7263@chaz.gmail.com>
 <20180514064431.GB7263@chaz.gmail.com>
 <20180514094733.308bff1a@camnpupstephen.cam.scsc.local>
 <20180514123425.GA19631@chaz.gmail.com>
 <20180514145056.3eedaea9@camnpupstephen.cam.scsc.local>

2018-05-14 14:50:56 +0100, Peter Stephenson:
[...]
> It wouldn't be ridiculous to change the documentation for this case and
> require "unsetopt multibyte" for strict byte-by-byte comparisons, which
> is already how it works in the vast majority of other cases.
[...]

But note that here it's not about multibyte vs singlebyte but
whether [:blank:] honours the locale like the other POSIX
character classes (alpha, punct...) do.

There are locales on some systems (such as NetBSD, already
mentioned) that use a single-byte charset in which more than SPC
and TAB are classified as "blank" (for example 0xA0 (nbsp) in
locales using iso8859-x charsets, or 0x9A in KOI8-R on NetBSD).

IMO, without the "multibyte" option, we should still call
isblank(), which on most systems and in most locales will match
only SPC and TAB but is not guaranteed to (and in practice does
not, as on NetBSD).

I just noticed that on NetBSD, in locales using UTF-8 or
GB18030, isblank() returns true for \v (vertical TAB), and not
in any other locale! So does iswblank(). So out goes my claim
that "blank" should be for horizontal spaces.

On OpenBSD (where only UTF-8 charsets are supported in locales
other than C/POSIX), iswblank() matches \v and \f.

What a mess!

-- 
Stephane
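
The isblank() behaviour described above can be checked on any given
system with something like the following. It is only a minimal sketch
in C, assuming a C99 libc: blank_bytes() and the BLANK_DEMO_MAIN macro
are names made up for this illustration and are not part of zsh. It
lists every byte value for which isblank() is true in a chosen locale.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of zsh): switch LC_CTYPE to
 * `locale_name` and store in `out` every byte value for which
 * isblank() is true. `out` must have room for 256 entries.
 * Returns the number of bytes stored, or -1 if the locale is not
 * available on this system.
 */
static int blank_bytes(const char *locale_name, unsigned char out[256])
{
    int n = 0;

    assert(locale_name != NULL && out != NULL);
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    /* isblank() takes an int holding an unsigned char value. */
    for (int c = 0; c <= 255; c++) {
        if (isblank(c))
            out[n++] = (unsigned char)c;
    }
    return n;
}

#ifdef BLANK_DEMO_MAIN
int main(int argc, char **argv)
{
    /* "" selects the locale from the environment (LC_ALL, LC_CTYPE, LANG). */
    const char *name = argc > 1 ? argv[1] : "";
    unsigned char found[256];
    int n = blank_bytes(name, found);

    if (n < 0) {
        fprintf(stderr, "locale not available: %s\n", name);
        return 1;
    }
    for (int i = 0; i < n; i++)
        printf("0x%02X\n", found[i]);
    return 0;
}
#endif
```

Built with -DBLANK_DEMO_MAIN and run with a locale name as its argument
(locale names are system-specific), it prints one byte value per line.
In the C locale only 0x09 and 0x20 are reported. What it prints in other
locales depends entirely on the system's locale data, which is the point
being made above.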