Date: Tue, 13 Apr 2021 19:03:49 +0100
From: Stephane Chazelas
To: Daniel Shahaf
Cc: Zsh hackers list
Subject: Re: sh emulation POSIX non-conformances (printf %10s and bytes vs character)
Message-ID: <20210413180349.bo4bbvro6dk634qg@chazelas.org>
In-Reply-To: <20210413155744.GS6819@tarpaulin.shahaf.local2>
X-Seq: 48540

2021-04-13 15:57:44 +0000, Daniel Shahaf:
> Stephane Chazelas wrote on Sun, Apr 11, 2021 at 20:42:05 +0100:
> > Another POSIX bug fixed by zsh (but which makes it non-compliant):
> >
> > With multibyte characters:
> >
> > $ printf '|%10s|\n' Stéphane Chazelas
> > |  Stéphane|
> > |  Chazelas|
> >
> > POSIX requires:
> >
> > | Stéphane|
> > |  Chazelas|
> >
> > (with a UTF-8 é encoded on 2 bytes
>
> Note that e-with-acute has two encodings in Unicode:
>
> é, one codepoint, two UTF-8 bytes
> é, two codepoints, three UTF-8 bytes
>
> https://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms

That was shown already in the part of my message you didn't quote,
where I pointed out how ksh93 addresses it with its %Ls (zsh also
has ${(ml[10])var} for that though).

See also:
https://unix.stackexchange.com/questions/350240/why-is-printf-shrinking-umlaut

Cheers,
Stephane
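
For illustration, the byte vs character difference discussed above can
be reproduced in plain sh without depending on how a given shell's
printf or locale behaves. This is only a rough sketch: byte_len,
char_len and pad_chars are hypothetical helper names, not part of any
shell, and char_len counts UTF-8 code points, not display columns, so
the decomposed form of é still counts as two (display width is what
%Ls and the (ml[10]) form are for).

```shell
# Hypothetical helpers (assumed names, not provided by any shell).

# Number of bytes in $1. The arithmetic expansion strips any padding
# that some wc implementations add around the number.
byte_len() {
    echo "$(( $(printf '%s' "$1" | wc -c) ))"
}

# Number of UTF-8 code points in $1: delete continuation bytes
# (0x80-0xBF) so that every remaining byte starts one code point.
char_len() {
    echo "$(( $(printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c) ))"
}

# Right-align $2 in a field of $1 code points (not bytes).
# Note: plain sh has no "local", so width/str/n are global here.
pad_chars() {
    width=$1
    str=$2
    n=$(( width - $(char_len "$str") ))
    if [ "$n" -gt 0 ]; then
        printf "%${n}s" ''      # emit n spaces of padding
    fi
    printf '%s' "$str"
}

# Demo with a precomposed é (U+00E9, bytes 0xC3 0xA9):
name=$(printf 'St\303\251phane')
byte_len "$name"                            # 9
char_len "$name"                            # 8
printf '|%s|\n' "$(pad_chars 10 "$name")"   # |  Stéphane|
```

With the decomposed spelling (e followed by U+0301) byte_len gives 10
and char_len gives 9, so padding by code points is still off by one
column there.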