List-Id: Zsh Workers List
X-Seq: 41314
Date: Sat, 17 Jun 2017 07:31:28 +0100
From: Stephane Chazelas
To: Bart Schaefer
Cc: zsh-workers@zsh.org
Subject: Re: [PATCH] PCRE/NUL: pass NUL in for text, handle NUL out
Message-ID: <20170617063128.GA5478@chaz.gmail.com>
References: <20170615204050.GA27003@breadbox.private.spodhuis.org>
 <20170616064129.GA19469@chaz.gmail.com>
 <170616201049.ZM28016@torch.brasslantern.com>
In-Reply-To: <170616201049.ZM28016@torch.brasslantern.com>

2017-06-16 20:10:49 -0700, Bart Schaefer:
> On Jun 16, 7:41am, Stephane Chazelas wrote:
> }
> } Solution for now in zsh is to escape like:
> }
> } [[ $x =~ "\b\Q${word//\\E/\\E\\\\E\\Q}\E" ]]
>
> Hmm, wouldn't "\b\Q${(b)word}\E" be sufficient there? In fact if
> you've applied ${(b)word} do you even need \E and \Q ?

Not really.

Inside \Q...\E in PCRE, only \E is special, and there is no escaping
you can do. It's like strong quotes. Changing ? to \? would change the
meaning of the regexp, and it wouldn't help for \E.

Outside of \Q...\E, where what needs to be escaped depends on whether
the regexp has (?x), there are things like . or $ (or blanks with
(?x)) that it would still leave unescaped.

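To make the \Q...\E escaping concrete, here is a minimal sketch of it
as a helper function. It uses sed rather than the ${word//...}
expansion quoted above so that it behaves the same in sh, bash and
zsh. The name pcre_literal is made up for illustration and is not part
of zsh.

```sh
# pcre_literal: hypothetical helper (not a zsh builtin) that wraps its
# argument in \Q...\E so that PCRE treats it as literal text.
# The only sequence special inside \Q...\E is \E, so each \E in the
# input is rewritten as:  \E  (close the span)
#                         \\E (a literal backslash, then a literal E)
#                         \Q  (reopen the span)
# Caveat: command substitution drops trailing newlines from the input.
pcre_literal() {
  printf '\\Q%s\\E' "$(printf '%s' "$1" | sed 's/\\E/\\E\\\\E\\Q/g')"
}
```

In zsh, assuming the REMATCH_PCRE option is set so that =~ uses PCRE,
it could be used as [[ $x =~ "\b$(pcre_literal "$word")" ]]. A word
such as a.b? comes back as \Qa.b?\E with the ? untouched, which is why
applying ${(b)word} inside the span would change what is matched.
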
PCREs (as opposed to some ERE implementations that have things like
\<, \=) are good though in that, AFAICT, the only \x operators are
those where x is an ASCII alnum, so adding a \ in front of every ASCII
non-alnum should be enough, I would think (as long as we're not inside
[...] or things like \g{...}).

So an equivalent of ${(b)var} for PCRE should not be too difficult.

Quoting for both ERE and PCRE is a problem in theory for (?x) and
blanks, as "\ " is unspecified in ERE, but in practice I don't think
any ERE implementation would ever have "\ " as a special operator.

So I think it should be a matter of quoting only (and not more than):

ASCII [[:space:]]
$^*()+[]{}.?\|

(and again, from a security standpoint at least, that quoting could be
fooled in some locales, like those that have BIG5-HKSCS or GB18030 as
the charset, where the encoding of some characters contains the
encoding of other characters, including ASCII ones).

-- 
Stephane