From: Bart Schaefer
Date: Fri, 26 Mar 2021 11:15:47 -0700
Subject: UTF-8 non-breaking spaces
To: Zsh hackers list
X-Seq: 48250

> If you're copy-pasting from an edit in browser gmail, for example, it
> has a tendency to insert non-breaking spaces whenever there is more
> than one consecutive space, which the shell interprets as
> non-whitespace and attempts to execute as commands.

Non-breaking space in this case is (bindkey syntax) "\M-B\M- ". The error message is equally confusing because you still can't see the non-breaking spaces when "not found" is reported.

Handling this is complicated by bracketed-paste, which protects the non-breaking spaces from (for example) { bindkey -s '\M-B\M- ' ' ' }.

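
To make the bindkey notation above concrete, here is a minimal sketch, assuming a POSIX-style shell with printf and od available and a C or UTF-8 locale. The variable nbsp and the helper count_words are names invented for this illustration. It dumps the two bytes behind "\M-B\M- " and asks the shell parser how many words it sees with and without a non-breaking space.

```shell
# U+00A0 NO-BREAK SPACE is the byte pair 0xC2 0xA0 in UTF-8.
# bindkey's \M-x means x with the high bit set:
#   \M-B   = 'B' (0x42) | 0x80 = 0xC2
#   "\M- " = ' ' (0x20) | 0x80 = 0xA0
nbsp=$(printf '\302\240')   # hypothetical variable name

# Show the raw bytes in hex.
printf '%s' "$nbsp" | od -An -tx1

# Hypothetical helper: let the shell parser split a command line
# and report how many words it found.
count_words() {
  eval "set -- $1"
  echo "$#"
}

count_words "echo hello"          # ASCII space separates two words
count_words "echo${nbsp}hello"    # NBSP is not a separator: one word
```

The second call is the situation described in the quoted text: the whole string is a single word, so the shell looks for a command literally named "echo<nbsp>hello".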
"unsetopt multibyte" does not affect this but LANG=C results in (for example) (In gmail editor) echo " " " " (Pasted at shell prompt) % echo " " " " That's totally a ZLE display thing, the actual nbsp is output when the command executes, but at least you can see what's going on. (The non-breaking spaces go back to normal spaces in sent email, I believe, or at least do so when the message is displayed in gmail; this is just a "thing" in the browser text editor.) Similar goofiness can result when copy-pasting from other "smart" multibyte editors when zsh has a UTF-8 variant in $LANG. Any good suggestions how to deal with this in a non-confusing fashion? Everything I've thought of (short of hacking up the lexer) risks corrupting parts of the input that aren't intended to be word separators (the bindkey -s above has that problem, for example, if bracketed-paste is disabled).