Date: Wed, 15 Feb 2012 10:10:44 +0100
From: Vincent Lefevre
To: zsh-workers@zsh.org
Subject: Re: printf %s in UTF-8 is not always POSIX-compliant
Message-ID: <20120215091044.GB19525@xvii.vinc17.org>
References: <20120215021519.GA19525@xvii.vinc17.org> <120215001413.ZM22585@torch.brasslantern.com>
In-Reply-To: <120215001413.ZM22585@torch.brasslantern.com>

On 2012-02-15 00:14:12 -0800, Bart Schaefer wrote:
> On Feb 15, 3:15am, Vincent Lefevre wrote:
> }
> } In UTF-8 locales:
> }
> } xvii% printf ".%2s.\n" é
> } .é.
>
> Am I understanding correctly that the intent here is that é is a two-
> byte character so %2s should print the two literal bytes, rather than
> print the single logical character in a field two logical characters
> wide?

Yes, the number is the size in bytes, not in characters. I think that
the intent is to deal with internal structures (e.g. with file formats
where some fields have a fixed or limited size, and the same syntax
can be used in C to avoid buffer overflows).

Note that there's the same problem with:

xvii% printf ".%.3s.\n" éabcd
.éab.
xvii% emulate ksh
xvii% printf ".%.3s.\n" éabcd
.éab.
xvii% emulate sh
xvii% printf ".%.3s.\n" éabcd
.éa.

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)
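
For readers who want to see the two interpretations of the %s width and
precision side by side, here is a small sketch in Python. It only models
the counting rule; it is not how zsh or any other shell implements
printf. The function names posix_s and char_s are hypothetical, and the
input is assumed to be encoded as UTF-8.

```python
from typing import Optional


def posix_s(arg: str, width: int = 0, precision: Optional[int] = None) -> bytes:
    """Model %s with width and precision counted in bytes (assumes UTF-8)."""
    data = arg.encode("utf-8")
    if precision is not None:
        # Keep at most `precision` bytes; this may cut a multibyte character.
        data = data[:precision]
    # Pad on the left with spaces up to at least `width` bytes.
    return data.rjust(width, b" ")


def char_s(arg: str, width: int = 0, precision: Optional[int] = None) -> str:
    """Model %s with width and precision counted in characters."""
    text = arg if precision is None else arg[:precision]
    return text.rjust(width)


if __name__ == "__main__":
    # "é" is two bytes in UTF-8, so a byte width of 2 adds no padding,
    # while a character width of 2 adds one space.
    print("." + posix_s("é", width=2).decode("utf-8", errors="replace") + ".")
    print("." + char_s("é", width=2) + ".")
    # A byte precision of 3 keeps "é" (2 bytes) plus "a";
    # a character precision of 3 keeps "éab".
    print("." + posix_s("éabcd", precision=3).decode("utf-8", errors="replace") + ".")
    print("." + char_s("éabcd", precision=3) + ".")
```

With the byte rule, the last example yields ".éa.", which is what the
"emulate sh" run above prints; with the character rule it yields ".éab.",
as in the zsh and "emulate ksh" runs.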