From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <3a96057432a34fc38f4577b0b7b468a9@terzarima.net>
From: Charles Forsyth <forsyth@terzarima.net>
Date: Thu,  3 Jun 2004 09:49:57 +0100
To: lucio@proxima.alt.za, 9fans@cse.psu.edu
Subject: Re: [9fans] GNU Make
In-Reply-To: <df41879105515b8c3643ceff29d03796@proxima.alt.za>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Cc: 
Topicbox-Message-UUID: 92360db8-eacd-11e9-9e20-41e7f4b1d025

>>The former is a continuing "failure of vision" that will eventually be
>>resolved out of necessity.  The latter may be solved with the former

i wonder if a message got lost.  years ago a group of us handled messages
in many natural languages in programs without fuss, in a serious commercial
product that was, and i believe still is, widely used with many Western
European languages.  it did not need the use of message codes.
strings worked well.
the text of the message in the program was its own `index'.  still seems
straightforward to me.  the strings are anyway the subject
of translation by the translators!  you can't get away from them.
all those strcmps?  hash.  works for messages in files, too,
though there are other searching techniques.  more recently, an even
simpler scheme for messages was successfully used in Limbo applications.
i used a hash table, but as it happens, i accidentally produced
a degenerate one.  i still didn't notice for two years, even with profiling;
it's just not that much of a bottleneck in many cases.
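
a minimal sketch of the `string is its own index' idea, in Python rather
than the original language, and not the code of either product mentioned
above.  the catalogue contents and the names CATALOGUE_FR and translate are
made up for illustration.  the dictionary is the hash table; a message
with no entry falls back to the text written in the program.

```python
# hypothetical catalogue: the text in the program is the key
CATALOGUE_FR = {
    "file not found": "fichier introuvable",
    "disk full": "disque plein",
}

def translate(text, catalogue):
    """return the translation of text, or text itself if none is known"""
    return catalogue.get(text, text)

print(translate("file not found", CATALOGUE_FR))
print(translate("disk on fire", CATALOGUE_FR))   # untranslated: comes back as-is
```

no integer is ever assigned: adding a message to the program adds its key.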

it's the least of your worries as i said before: working out how to arrange
the strings to cope with the differing requirements of various natural languages
for word order, or `dictionary order' (which can vary across dialects of the
same language), and several other things, all require extra mechanisms.
for each string in a program, one needs to decide whether the text is essentially
`program' or whether it's `speech'.
it's helpful to have a conventional form for the text of such messages so
they can be extracted automatically for translation (assuming the compiler can't assist).
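
one possible conventional form, purely as an assumption for illustration:
wrap each `speech' string in a marker such as X("...") and leave `program'
strings bare.  the marker name X, the regular expression and the sample
source below are all hypothetical; the sketch only shows that extraction
then becomes mechanical.

```python
import re

# matches X("...") and captures the string body, allowing backslash escapes
MARKED = re.compile(r'\bX\(\s*"((?:[^"\\]|\\.)*)"')

def extract_messages(source):
    """return each marked string once, in order of first appearance"""
    seen = []
    for m in MARKED.finditer(source):
        text = m.group(1)
        if text not in seen:
            seen.append(text)
    return seen

SAMPLE = '''
    fprint(2, X("cannot open %s: %r\\n"), name);
    fprint(2, "debug: fd=%d\\n", fd);      /* program text: not marked */
    error(X("out of memory"));
    error(X("out of memory"));
'''

for text in extract_messages(SAMPLE):
    print(text)
```

the unmarked debug string is left alone, which is the decision between
`program' and `speech' made visible in the source.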

in many cases, system diagnostics
should not be translated (or indeed must not be translated) because they are input
to other programs, or internal diagnostics that should remain as-is.
actually, given the extent to which programs are used as tools in scripts,
it's probably true that most existing messages wouldn't be translated anyway.
one of the nice things about the change to largely GUI-oriented interfaces
is that it's more obvious which bits of text are intended to be understood
by users of an application.  most output of programs such as sed, file, etc.
would not be seen directly (by a `real' end user).

the failure of vision here is to think that just using integers will solve anything.
first, it complicates distributed development (who assigns the integers?
shall we have the usual hack of a `user-defined range'?  what happens when
i federate two systems?), as the varying assignment
within Unix systems shows, and they were dealing with a largely fixed set of
system calls and outcomes (so in principle one could enumerate most of
the possibilities).  even then, quite a few things settle for EIO.
that's a great help.
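
a toy illustration of the federation question, with two invented systems
that each assigned their own small integers.  the tables and the function
name conflicts are hypothetical, not taken from any real system.

```python
# two independently developed systems, each numbering its own errors
SYSTEM_A = {1: "file not found", 2: "permission denied"}
SYSTEM_B = {1: "connection refused", 2: "file not found"}

def conflicts(a, b):
    """return the codes that both systems use with different meanings"""
    return sorted(n for n in a.keys() & b.keys() if a[n] != b[n])

print(conflicts(SYSTEM_A, SYSTEM_B))

# the strings themselves merge without anyone assigning anything
merged = set(SYSTEM_A.values()) | set(SYSTEM_B.values())
print(sorted(merged))
```

every shared code clashes, so someone must renumber; the union of the
strings needs no coordination at all.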

to get round the problem of differing errno assignments in Unix, one could
use EAGAIN (or is it EWOULDBLOCK?), ENOENT, etc. but hang on: that's
just a string and you'll still need to put it through a map.  EBAHGUM.
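
to make the point concrete, a small sketch using Python's standard errno
and os modules: the portable thing is the name, and the name is a string
that goes through a map to reach the local integer.  the helper describe
is invented for the example; the numbers and texts printed depend on the
system it runs on.

```python
import errno
import os

def describe(name):
    """map a symbolic name such as 'ENOENT' to (local number, local text)"""
    number = getattr(errno, name)    # the integer can differ between systems
    return number, os.strerror(number)

for name in ("ENOENT", "EAGAIN", "EWOULDBLOCK", "EIO"):
    print(name, describe(name))
```

on some systems EAGAIN and EWOULDBLOCK come out as the same number, on
others they do not, which is rather the problem.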

more important, with user-level file servers
the set of possible diagnostics is unbounded, because the range of application
is not limited as it was (at least until recently) in Unix.

there is a smaller `failure of vision' though: Lucio is right that a little
more discipline in forming the text of the strings might help.
file servers that really do serve up real files could use the same text
for the same errors.  it's a rather tedious job to go round the source to do it,
but it's no more tedious than collecting messages for translation in any case.