From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <3a96057432a34fc38f4577b0b7b468a9@terzarima.net>
From: Charles Forsyth
Date: Thu, 3 Jun 2004 09:49:57 +0100
To: lucio@proxima.alt.za, 9fans@cse.psu.edu
Subject: Re: [9fans] GNU Make
In-Reply-To:
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Cc:
Topicbox-Message-UUID: 92360db8-eacd-11e9-9e20-41e7f4b1d025

>>The former is a continuing "failure of vision" that will eventually be
>>resolved out of necessity. The latter may be solved with the former

i wonder if a message got lost.

in another commercial product a group of us years ago handled messages in many natural languages in programs without fuss, in a serious commercial product that was, and i believe still is, widely used with many Western European languages. it did not need the use of message codes. strings worked well. the text of the message in the program was its own `index'. still seems straightforward to me. the strings are anyway the subject of translation by the translators! you can't get away from them.

all those strcmps? hash. works for messages in files, too, though there are other searching techniques.

more recently, an even simpler scheme for messages was successfully used in Limbo applications. i used a hash table, but as it happens, i accidentally produced a degenerate one. i still didn't notice for two years, even with profiling; it's just not that much of a bottleneck in many cases. it's the least of your worries

as i said before: working out how to arrange the strings to cope with the differing requirements of various natural languages for word order, or `dictionary order' (which can vary across dialects of the same language), and several other things, all require extra mechanisms.

for each string in a program, one needs to decide whether the text is essentially `program' or whether it's `speech'.
it's helpful to have a conventional form for the text of such messages so they can be extracted automatically for translation (assuming the compiler can't assist). in many cases, system diagnostics should not be translated (or indeed must not be translated) because they are input to other programs, or internal diagnostics that should remain as-is. actually, given the extent of program tools when scripting, it's probably true that most existing messages wouldn't be translated anyway. one of the nice things about the change to largely GUI-oriented interfaces is that it's more obvious which bits of text are intended to be understood by users of an application. most output of programs such as sed, file, etc. would not be seen directly (by a `real' end user).

the failure of vision here is to think that just using integers will solve anything. first, it complicates distributed development (who assigns the integers? shall we have the usual hack of a `user-defined range'? what happens when i federate two systems?), as the varying assignment within Unix systems shows, and they were dealing with a largely fixed set of system calls and outcomes (so in principle one could enumerate most of the possibilities). even then, quite a few things settle for EIO. that's a great help. to get round the problem of differing errno assignments in Unix, one could use EAGAIN (or is it EWOULDBLOCK?), ENOENT, etc. but hang on: that's just a string and you'll still need to put it through a map. EBAHGUM.

more important, with user-level file servers the set of possible diagnostics is unbounded, because the range of application is not limited as it was (at least until recently) in Unix.

there is a smaller `failure of vision' though: Lucio is right that a little more discipline in forming the text of the strings might help. file servers that really do serve up real files could use the same text for the same errors.
it's a rather tedious job to go round the source to do it, but it's no more tedious than collecting messages for translation in any case.
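
a rough illustration of two points made above: the message text serving as its own index into a hash table, and a conventional form that lets `speech' strings be collected automatically. this is only a sketch in Python, not the code from either product mentioned; the T("...") marking convention, the Catalogue class and the sample strings are all assumptions invented for the example.

```python
import re

# hypothetical convention: translatable (`speech') strings are written T("...");
# unmarked strings are `program' text and are left alone.
MARKED = re.compile(r'\bT\("((?:[^"\\]|\\.)*)"\)')

def extract(source):
    """Collect the marked strings from program text, in order, without repeats."""
    seen = {}
    for m in MARKED.finditer(source):
        seen.setdefault(m.group(1))
    return list(seen)

class Catalogue:
    """Translations keyed by the original message text; no message codes."""
    def __init__(self, translations):
        # a dict is a hash table, so lookup is not a chain of string compares
        self.table = dict(translations)

    def T(self, text):
        # fall back to the original text when there is no translation
        return self.table.get(text, text)

# sample program text (made up for the example)
source = '''
    print(T("file does not exist"))
    log("i/o error")              # internal diagnostic: deliberately unmarked
    print(T("permission denied"))
'''

french = Catalogue({"file does not exist": "le fichier n'existe pas"})
```

extract(source) gives the translators their list of strings, and french.T(...) returns the original text unchanged for anything they have not yet translated. it does nothing about word order or collation, which need the extra mechanisms mentioned earlier.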