From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA01335; Thu, 18 Apr 2002 23:09:02 +0200 (MET DST) Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA01319 for caml-list@pauillac.inria.fr; Thu, 18 Apr 2002 23:08:58 +0200 (MET DST) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id JAA13015 for ; Thu, 18 Apr 2002 09:57:03 +0200 (MET DST) Received: from laurelin.dementia.org ([208.167.88.73]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g3I7v1b13449; Thu, 18 Apr 2002 09:57:01 +0200 (MET DST) Received: (from jprevost@localhost) by laurelin.dementia.org (8.11.6/8.11.6) id g3I81qJ06244; Thu, 18 Apr 2002 04:01:52 -0400 (EDT) (envelope-from visigoth@cs.cmu.edu) X-Authentication-Warning: laurelin.dementia.org: jprevost set sender to visigoth@cs.cmu.edu using -f To: Daniel de Rauglaudre Cc: caml-list@inria.fr Subject: Re: [Caml-list] Printf and i18n References: <87ads2rzu8.fsf@marant.org> <20020418084753.A11351@verdot.inria.fr> From: John Prevost Date: 18 Apr 2002 04:01:51 -0400 In-Reply-To: <20020418084753.A11351@verdot.inria.fr> Message-ID: <86y9fljuz4.fsf@laurelin.dementia.org> User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk >>>>> "dr" == Daniel de Rauglaudre writes: dr> For the translations, I use a lexicon file, containing a list dr> of entries which I want to be i18n. It is an idea I took from dr> Caml Light. The first string, without language, is the format dr> to be translated, in a vehicular language (actually English), dr> and I give the translation in each language I want. E.g. an dr> entry can be: dr> hello, my name is %s\n dr> en: hello, my name is %s\n dr> es: buenos dias, me llamo %s\n dr> fr: bonjour, je m'appelle %s\n A good try, but unfortunately it doesn't allow for the more significant problem, which is word order variation. Imagine a simple format like: Printf.printf "%s's %s" "John" "document" The problem here is that we have the following translations (sort of, no character sets here): en: X's Y ja: X no Y ru: Y u X And of course, with more arguments you get more variety. One solution is to treat your formats as functions, rather than formats. This cuts down on the utility somewhat, since you have to sprintf instead of printing directly, but: let owns = [En, fun owner owned -> Printf.sprintf "%s's %s" owner owned; Ja, fun owner owned -> Printf.sprintf "%s no %s" owner owned; Ru, fun owner owned -> Printf.sprintf "%s u %s" owned owner] sort of gets you moving in the right direction. Of course, you need an automated system to handle things reasonably. This is why many modern printf-style systems, especially those for handling internationalization, include numbered or named slots to be filled in. For example, in Python: fmt = "%(owner)s's %(owned)s" fmt = "%(owner)s no %(owned)s" fmt = "%(owned) u %(owner)" can all be used as: fmt % {'owner':'John', 'owned':'document'} In Java: fmt = new MessageFormat("{0}''s {1}"); fmt = new MessageFormat("{0} no {1}"); fmt = new MessageFormat("{1} u {0}"); can all be used as: fmt.format(new Object[] {"John", "document"}) Either of these mechanisms allow reading a format from a catalog and using it in that way. The only way I can see doing this with something like O'Caml's current mechanism is introducing a new format function like printf which takes a format like: ("%s%s:{0}'s {1}" : ('a,'b,'c) format) ("%s%s:{0} no {1}" : ('a,'b,'c) format) ("%s%s:{1} u {0}" : ('a,'b,'c) format) Use the % model up front to indicate the argument types in argument order, then indicate what order the arguments are to be used separately. This is, of course, rather ugly. A richer format language would be preferable. If keyword arguments can't be made to work right when included in the type (as seems likely), perhaps a format language like Python's, but using numbered argument positions instead of names, so: ("%s %d" : (string -> int -> 'a,'b,'a) format) but also ("%(1)d %(0)s" : (string -> int -> 'a, 'b, 'a) format) John. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners