Date: Fri, 23 Feb 2024 19:27:17 +0000
From: Stephane Chazelas
To: Bart Schaefer
Cc: zsh workers
Subject: Re: Metafication in error messages (Was: [PATCH] unmetafy Re: $var not expanded in ${x?$var})
Message-ID: <20240223192717.tczrbc63fei7d4m2@chazelas.org>

2024-02-22 14:31:12 -0800, Bart Schaefer:
[...]
> Opinions?

There are two separate things here:

1. Whether user-supplied text in messages should be output raw or
   with the non-printable characters given some common visible
   representation (like \n for newline, ^C for 0x03, \M-^C for
   0x83, ^@ for NUL, ^[ for ESC, \uffff for the encoding of
   U+FFFF...) by going through nicezputs().

2. Whether internally in the code the data should be passed
   "metafied" or not to zerr* functions.

For 1, IMO, when the error message is generated by zsh, it
should go through nicezputs(). zsh should decide on the
formatting; having it pass escape sequences as-is would make the
error hard to understand and diagnose. For instance,

$ printf 'uname\r\n' | zsh
zsh: command not found: uname^M

is more useful than

zsh: command not found: uname

where CR, when sent to a terminal, moves the cursor to the left
column, so we don't see that the problem is caused by that extra
bogus character.

The only case where it should be passed raw is when the error
message is constructed by the user, where the user is expected
to be able to decide the formatting. Like in:

syserror -p $'\e[1;31mERROR\e[m: '
echo ${1?$'multiline\nerror\nmessage'} ${DISPLAY:?$'\e[1;31mNo graphics'}

(syserror likely doesn't use zerrmsg anyway).

For 2, it looks like zerrmsg expects its input metafied and, as
you say, that input in most cases is likely to be metafied
already. Not metafied would mean either we couldn't pass text
containing NUL, or we'd need to pass it as ptr+len.

So what makes most sense to me:

%s remains passed metafied and is output nicezputs'ed
%l same, truncated to the given number of bytes (though
   truncating to a number of characters, or at least not cutting
   in the middle of a character, would be nicer, but maybe
   overkill).
%S also passed metafied, but no nicezputs.

Now, my previous message was showing there were quite a few
issues with the metafication and possibly with the nicezputs'ing
and/or multibyte handling.

> $ printf '%d\n' $'1+|a\x83 c'
> zsh: bad math expression: operand expected at `|a^@c'

Should have been:

zsh: bad math expression: operand expected at `|aM-^C c'

The text was passed to zerrmsg without being metafied; the 0x83
0x20 sequence was then incorrectly unmetafied to NUL and
rendered by nicezputs as ^@.

[...]
> $ printf '%d\n' '1+|ÃÃÃÃÃÃ'
> zsh: bad math expression: operand expected at `|\M-C\M-c\M-c\M-c\M-c\M-c\M-^C...'

I picked Ã because it's a letter from the latin script, so you
can even use it in variable names:

$ zsh -c '(( ÃÃÃÃÃÃ ++ )); typeset -p ÃÃÃÃÃÃ'
typeset -i ÃÃÃÃÃÃ=1

But its UTF-8 encoding happens to contain the 0x83 byte used in
metafication.

$ printf %s Ã | od -An -vtx1
 c3 83
$ printf %s Ã | cat -v
M-CM-^C

Again above, we see the effect of a missing metafication. The
error should have been:

zsh: bad math expression: operand expected at `|ÃÃÃÃÃÃ'

Like in:

~$ printf '%d\n' '1+|AAAAAA'
zsh: bad math expression: operand expected at `|AAAAAA'

> 0
> $ ((1+|ÃÃÃÃÃÃ))
> zsh: bad math expression: operand expected at `|ÃÃÃÃ\M-C...'

In that case, metafication OK, but character cut in the middle.

2024-02-22 16:49:20 -0800, Bart Schaefer:
> The changes for that are minimal. With them, Stephane's math-garbles
> handle the ellipsis more cleanly:
>
> % printf '%d\n' '1+|ÃÃÃÃÃÃ'
> zsh: bad math expression: operand expected at `|\M-C\M-c\...'
> 0
> % ((1+|ÃÃÃÃÃÃ))
> zsh: bad math expression: operand expected at `|Ã?Ã?Ã?...'

It seems rather worse to me.

-- 
Stephane
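
For readers less familiar with metafication, the 0x83 0x20 case above can be reproduced with a minimal sketch. It is not zsh's actual code: the names META, needs_meta, sketch_metafy and sketch_unmetafy are hypothetical, and the set of bytes treated as needing the escape (NUL plus 0x83..0xa2) is an assumption standing in for zsh's own imeta() test. The encoding itself, the Meta byte 0x83 followed by the original byte XOR 32, is what the examples in this message rely on.

```c
#include <stddef.h>

/* Meta byte: a metafied byte is stored as META followed by (byte ^ 32). */
#define META 0x83

/* Assumption for this sketch: bytes needing the escape are NUL and
 * 0x83..0xa2. The real check in zsh is imeta(). */
int needs_meta(unsigned char c)
{
    return c == 0 || (c >= 0x83 && c <= 0xa2);
}

/* Encode len raw bytes from in into out (out needs room for 2*len bytes).
 * Returns the metafied length. */
size_t sketch_metafy(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (needs_meta(in[i])) {
            out[n++] = META;
            out[n++] = in[i] ^ 32;
        } else {
            out[n++] = in[i];
        }
    }
    return n;
}

/* Decode len metafied bytes from in into out (out needs room for len bytes).
 * Returns the raw length. A trailing lone META is copied through unchanged. */
size_t sketch_unmetafy(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == META && i + 1 < len)
            out[n++] = in[++i] ^ 32;
        else
            out[n++] = in[i];
    }
    return n;
}
```

Feeding the raw bytes 61 83 20 63 ("a", 0x83, space, "c") straight to sketch_unmetafy gives 61 00 63, since 0x20 ^ 32 is 0: that is the `|a^@c' symptom. Metafying first gives 61 83 a3 20 63, which unmetafies back to the original four bytes.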