List-Id: Zsh Workers List
X-Seq: 42779
Date: Mon, 14 May 2018 21:52:14 +0200
From: Daniel Tameling
To: zsh-workers@zsh.org
Subject: Re: [PATCH] [[:blank:]] only matches on SPC and TAB
Message-ID: <20180514195214.6lunkmugxwcatm7y@Daniels-MacBook-Air.local>
References: <20180514063611.GA7263@chaz.gmail.com>
 <20180514064431.GB7263@chaz.gmail.com>
 <20180514094733.308bff1a@camnpupstephen.cam.scsc.local>
 <20180514123425.GA19631@chaz.gmail.com>
 <20180514145056.3eedaea9@camnpupstephen.cam.scsc.local>
 <20180514155131.GC7263@chaz.gmail.com>

Stephane already quoted some man pages, but here is what the C99/C11
standards say:

"The isblank function tests for any character that is a standard blank
character or is one of a locale-specific set of characters for which
isspace is true and that is used to separate words within a line of
text. The standard blank characters are the following: space (' '),
and horizontal tab ('\t'). In the "C" locale, isblank returns true
only for the standard blank characters."

And POSIX seems to say the same: it defines blank for the C locale and
states that in other locales it should at least encompass space and
tab. So in other locales it seems to be totally undefined what a blank
is, and everybody does what they think is a good choice.
Thus the mess Stephane observed. In fact, I looked at the musl library
and found this code:

    int isblank(int c)
    {
        return (c == ' ' || c == '\t');
    }

    int __isblank_l(int c, locale_t l)
    {
        return isblank(c);
    }

So they completely ignore the locale and just use the bare minimum
required by the standard. So after the patch, zsh would not only
behave differently on different platforms but would also change its
behavior if you link it against a different libc.

Nevertheless, I'm slightly in favour of the patch. While defining our
own :blank: for other locales might give us consistency across
platforms, I think it will end up being different from what everybody
else does and will thus lead to unexpected results for users -- in
particular if the libcs start to agree on isblank for different
locales. And at that point, it might be difficult to change the
behavior if it breaks backward compatibility. In fact, it's the hope
that the situation will improve in the future that sways me towards
the patch compared to the status quo. But seeing the mess Stephane
uncovered made it a very tight race.

Finally, whether the patch gets applied or not, the documentation
should definitely be updated to reflect the issues around :blank:.

--
Daniel
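
For anyone who wants to see how far a given libc strays from the bare
minimum, something like the following could serve as a starting point.
It is only a sketch and not part of the patch: the names
minimal_isblank and count_isblank_disagreements are made up for
illustration, and it only looks at single-byte values, so in a
multibyte locale such as UTF-8 it says nothing about non-ASCII
characters (iswblank would be needed for those).

```c
#include <ctype.h>
#include <limits.h>

/* Locale-independent definition: only the two standard blank
 * characters, i.e. the same test musl's isblank() performs.
 * (Hypothetical helper name, not taken from zsh or musl.) */
int minimal_isblank(int c)
{
    return c == ' ' || c == '\t';
}

/* Count the single-byte values for which the libc's isblank(), under
 * whatever LC_CTYPE locale is currently selected, disagrees with the
 * minimal definition above. (Hypothetical helper name.) */
int count_isblank_disagreements(void)
{
    int count = 0;

    for (int c = 0; c <= UCHAR_MAX; c++) {
        /* isblank() may return any nonzero value for "true",
         * so normalise it to 0/1 before comparing. */
        if ((isblank(c) != 0) != minimal_isblank(c))
            count++;
    }
    return count;
}
```

Without a call to setlocale() the program runs in the "C" locale, where
the standard requires the count to be 0. Calling
setlocale(LC_CTYPE, "") first and printing the count again would show
whether the libc adds locale-specific blanks in a single-byte locale;
with musl's implementation quoted above, the count stays 0 regardless.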