From: Bart Schaefer
Date: Sun, 09 Dec 2007 14:49:53 -0800
To: 451382@bugs.debian.org, zsh-workers@sunsite.dk
Subject: Re: Bug#451382: i18n is NOT so easy!
Message-id: <071209144953.ZM14121@torch.brasslantern.com>
In-reply-to: <20071209180127.d955eb4f.p.w.stephenson@ntlworld.com>
X-Seq: 24201

On Dec 9, 6:01pm, Peter Stephenson wrote:
} Subject: Re: Bug#451382: i18n is NOT so easy!
}
} This scheme has various merits: (i) it is robust about changes to
} the English text (ii) the explicit msgid serves as a visual cue that
} there's something here that shouldn't be monkeyed with without good
} reason (and that even if you change the English text it should mean
} the same thing) (iii) the msgid in the catalogues is compact.

This is close to the same scheme that I [*] adopted for localization of
zmail twelve years ago, except that we used a two-argument C macro with
the msgid and English text, rather than a delimited string.  We also
had a number of tools that massaged the C source to add any new msgid
where a programmer had forgotten to use one, and to extract and build
the default English catalog file, which could then be turned over to
translators.

It'd be pretty easy, I expect, to write a perl script to find $"..."
strings in shell scripts and extract them.

I'd be cautious about treating everything up to the first colon in a
$"..." string as a msgid key, though.  Error messages are going to look
like

    $"thing that failed: reason it failed"

a lot of the time.  Or would that have to be written

    "thing that failed: "$"reason it failed"

for this to work in the first place?  Anyway, it might be better to
adopt something like

    $"{msgid}original text"

and treat both $"{message}" and $"message" the same when only one of
the two parts is found.

An additional issue that zsh may or may not have to address is that you
need entirely separate strings for things like plurals.  You can't
localize something like:

    There %s %d thing%s in the bucket

where the %s get replaced by "are" and "s" when the %d is not 1, and
"is" and "" otherwise.  You must instead have two strings (sometimes
three for the zero case):

    There are %d things in the bucket
    There is 1 thing in the bucket
    There is nothing in the bucket

There are gobs of other niggling details that I'm sure I've forgotten.

} However, it seems like we can get something better by interfacing to
} the library at a lower level, in particular to catopen() (strictly
} this is a different family of interfaces).  That accepts an absolute
} path to a catalogue and also uses the environment variable NLSPATH to
} search for files.

This is also what I did back then in zmail -- gettext() didn't really
even exist yet at that point, at least not in a fully-developed form.
The POSIX cat*() interfaces work just fine, though NLSPATH searching
has some pretty nasty bugs on older operating systems.

[*] That's sort of the royal "I" as actually there was a whole team of
people working for me on it.
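
As a rough illustration of the $"{msgid}original text" idea above, here
is a minimal sketch of an extractor, written in Python rather than perl
purely for illustration.  The names split_msgid and extract_messages,
the regular expression, the sample msgid "E001", and the rule that a
lone part serves as both key and text are all assumptions made for this
sketch; nothing here reflects what zsh actually implements.

```python
import re

# Matches $"..." and allows backslash-escaped characters inside the quotes.
# (Assumption: this ignores nesting and other shell quoting subtleties.)
DOLLAR_STRING = re.compile(r'\$"((?:[^"\\]|\\.)*)"')


def split_msgid(body):
    """Split the inside of a $"..." string into (msgid, text).

    Hypothetical rule: a leading {msgid} is the key; if only one of the
    two parts is present, that part is used as both key and text.
    """
    if body.startswith("{"):
        end = body.find("}")
        if end != -1:
            msgid, text = body[1:end], body[end + 1:]
            # $"{message}" has no trailing text: treat it like $"message".
            return (msgid, text) if text else (msgid, msgid)
    # No {msgid} prefix: the text doubles as its own key, so a colon in
    # an error message is never mistaken for a key delimiter.
    return (body, body)


def extract_messages(source):
    """Return a list of (msgid, text) pairs for every $"..." in source."""
    return [split_msgid(m.group(1)) for m in DOLLAR_STRING.finditer(source)]


if __name__ == "__main__":
    script = '''
print $"{E001}thing that failed: reason it failed"
print $"thing that failed: reason it failed"
print $"{hello}"
'''
    for msgid, text in extract_messages(script):
        print(f"{msgid}\t{text}")
```

Because the key is delimited by braces rather than by the first colon,
a message such as "thing that failed: reason it failed" passes through
intact whether or not it carries an explicit msgid.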