Date: Thu, 22 Apr 2021 16:28:27 +0200
From: Vincent Lefevre
To: zsh-workers@zsh.org
Subject: Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
Message-ID: <20210422142827.GB154089@zira.vinc17.org>
References: <7FD930F4-37CD-402B-9A06-893818856199@dana.is>
 <20210411175726.hxnm33mxoska2tsm@chazelas.org>
 <20210411194205.e7mr2wx33wlkq3rs@chazelas.org>
 <20210422135934.GA154089@zira.vinc17.org>
In-Reply-To: <20210422135934.GA154089@zira.vinc17.org>
X-Seq: 48654

On 2021-04-22 15:59:34 +0200, Vincent Lefevre wrote:
> I would think that's intentional, at least for the precision
> (e.g. %.4s) in order to prevent buffer overflow.

The behavior with incomplete UTF-8 sequences (the one with "\x84\x9d")
is rather ugly:

zira% printf "%3s\n" $(printf "\xe2\x84\x9d") | hd
00000000  20 20 e2 84 9d 0a                                 |  ....|
00000006
zira% printf "%3s\n" $(printf "\x84\x9d") | hd
00000000  20 84 9d 0a                                       | ...|
00000004
zira% printf "%.1s\n" $(printf "\xe2\x84\x9d") | hd
00000000  e2 84 9d 0a                                       |....|
00000004
zira% printf "%.1s\n" $(printf "\x84\x9d") | hd
00000000  84 9d 0a                                          |...|
00000003

I think that only the POSIX spec makes sense, unless you consider
that %s must handle valid characters, in which case it should fail
with an error on any invalid sequence. But I would say that a
different conversion specifier should be used, as an extension.

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
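
For comparison, the byte-oriented reading of %s discussed above can be sketched as follows. This is only an illustration and not part of the original message: bytes_of is a hypothetical helper, the locale is forced to C on the assumption that width and precision then count bytes in whichever shell runs it, and octal escapes are used because \x is not among the escapes POSIX specifies for printf.

```sh
# U+211D in UTF-8 is the three bytes e2 84 9d, as in the transcript above.
r=$(printf '\342\204\235')

# Hypothetical helper: print, as hex, the bytes that printf emits for
# format $1 and argument $2. The body runs in a subshell with the C
# locale (an assumption made here so that one unit is one byte).
bytes_of() (
    LC_ALL=C
    export LC_ALL
    printf "$1" "$2" | od -An -tx1 | tr -d ' \n'
    echo
)

bytes_of '%3s\n' "$r"     # → e2849d0a  (already 3 bytes wide, no padding)
bytes_of '%.1s\n' "$r"    # → e20a      (cut after 1 byte, mid-character)
```

Under this reading the precision can split a multibyte character, which is the trade-off against the character-based output shown in the transcript.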