Date: Tue, 4 Mar 2008 02:29:17 +0100
From: Vincent Lefevre
To: zsh-workers@sunsite.dk
Subject: printf %s in UTF-8 is not POSIX-compliant
Message-ID: <20080304012917.GA15833@prunille.vinc17.org>

Hi,

Under UTF-8 locales:

vin:~> zsh-beta -f
vin% emulate sh
vin% printf ".%2s.\n" é
. é.
vin% /usr/bin/printf ".%2s.\n" é
.é.
vin%

As you can see, the zsh printf builtin doesn't behave like the coreutils
printf, and it is zsh that is wrong. Indeed, the precision is the number
of bytes, not the number of characters.

http://www.opengroup.org/onlinepubs/009695399/utilities/printf.html

says (in the extended description) that the "file format notation"
shall be used for the format (and %s isn't an exception).

http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap05.html

(file format notation) says:

  s   The argument shall be taken to be a string and bytes from the
      string shall be written until the end of the string or the
      number of bytes indicated by the precision specification of
      the argument is reached. If the precision is omitted from the
      argument, it shall be taken to be infinite, so all bytes up to
      the end of the string shall be written.

Note: ksh93 has the same bug, but pdksh and bash do not. However, bash
may change its behavior when not in POSIX compatibility mode; see

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=459413

-- 
Vincent Lefèvre - Web:
100% accessible validated (X)HTML - Blog:
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)
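
For illustration, here is a minimal sketch of byte-based padding, which is what the coreutils output above corresponds to. It assumes that the minimum field width in "%2s" is counted in bytes, in the same way as the precision in the quoted text. The helper names byte_len and pad_left_bytes are hypothetical and are not part of zsh or coreutils.

```shell
#!/bin/sh
# Sketch only: left-pad a string to a minimum width counted in BYTES.
# byte_len and pad_left_bytes are hypothetical helper names.

# Print the number of bytes in $1. wc -c counts bytes in every locale.
byte_len() {
    # The arithmetic expansion drops the leading blanks some wc versions print.
    echo $(( $(printf '%s' "$1" | wc -c) ))
}

# Print $2, padded on the left with spaces to at least $1 bytes.
pad_left_bytes() {
    width=$1
    len=$(byte_len "$2")
    while [ "$len" -lt "$width" ]; do
        printf ' '
        len=$((len + 1))
    done
    printf '%s' "$2"
}

# "é" spelled as its two UTF-8 bytes, so the script does not depend on
# the encoding of this file or on the current locale.
e_acute=$(printf '\303\251')

# é already occupies 2 bytes, so no padding is added.
printf '.'; pad_left_bytes 2 "$e_acute"; printf '.\n'
# A plain "e" is 1 byte, so one space is added.
printf '.'; pad_left_bytes 2 "e"; printf '.\n'
```

A character-based implementation would instead treat é as length 1 and add one space, which is the zsh result shown in the transcript.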