I'd like to make a few comments concerning what you say below.

1. I've been involved with Unicode, both in UTC and as a representative to WG2, and I can confidently affirm that there is no Unicode God. No one has ever said "There is no Code but Unicode, and UTC/WG2 is its prophet," or anything like that. If you have a reference to the Unicode Standard where I can read in black and white what you are referring to, I will happily look at it. (This is not intended as a smart remark; I'm quite seriously interested in understanding the facts of this issue.)

2. Everyone involved in Unicode, including the inner core members of UTC and so on, recognizes that it is far from perfect. There is acknowledgement that a number of things could have been handled differently, but weren't. The Stability Policy may seem like a problematic restriction to some in cases like this, but it guarantees backward compatibility, so there is wisdom to it.

3. Whatever views one may have on Unicode, for better or worse, it is what it is. As you said yourself, it is a means and not an end.... One is free to use it, or not, or to devise alternatives. (But more on alternatives below.)

4. You suggested in an earlier email that you'd like to think the whole thing through carefully in advance, rather than implement things in stages, as others do who then never get to the advanced stages. To me this raises the question of whether that is always the case. In particular, if anyone, or any group, tried to implement all of what Unicode proposes to be or become (UCS, the Universal Character Set), the sheer magnitude of the task (which of course grows over time, since scripts, in themselves or as a set, are not static) would mean they would never get the thing off the ground. This is in part why there are (arguably) flaws in Unicode. In any case, I seriously doubt that even someone attempting to "redo" it "the right way this time" would manage. That is simply not within the grasp of human endeavour.
The mistakes would simply be different, or in different areas. Likewise, there are plenty of objections one could bring against the process by which Unicode endorses proposals, i.e. the inherent politics of interested groups, but that again is always a reality.

5. All that being said: Plan 9, as far as I can see, intentionally supports Unicode (see http://plan9.bell-labs.com/plan9/about.html). So to me, it is a non-starter to want to port *TeX to Plan 9 but rail against Unicode, whether justifiably or through misunderstanding.

6. Unicode isn't Eternal, any more than any other encoding standard. (I'm sure there were, and perhaps still are, those who thought that BCD, no wait! EBCDIC, no wait! ASCII, no wait...! was the be-all and end-all.) In time, something else will develop in response to developing needs.

7. But at present, the recognized standard out there, for most practical intents and purposes (in particular, to serve the needs of something other than just North American anglophone techie society), is Unicode, with whatever blemishes it may have. So it seems to me that, in keeping with your principle alluded to above, and given that we're talking about a Plan 9 environment here, you ought to be talking UTF-8 right off the bat. As I said, "seems to me". It could be that I'm seriously misunderstanding the discussion... but then again, the dwindling number of participants in the dialogue suggests to me that there may be at least *some* truth in what I'm thinking....

Please don't think this is intended as a rant, either due to the way I've formatted this or on account of the content. I'm interested in following what you're doing; I'm just a bit puzzled, and I sincerely wish you the best in your efforts with this project.
K

>>> 06/28/11 7:19 AM >>>
On Mon, Jun 27, 2011 at 07:45:34PM -0400, Karljurgen Feuerherm wrote:
> Thierry,
>
> > I only say that:
> >
> > 1) Forcing, as this was written in the XeTeX FAQ, the user to use a
> > special codepoint for the fi ligature since, white eyes, scornful
> > wave of the hand: "this is the way this is done with Unicode" is
> > sheer stupidity.
>
> I don't know who told you that... just because there is a codepoint
> for something does not mean that one has to access that codepoint
> directly in all cases. Software at various levels can render a
> ligature on the basis of various actual character sequences (e.g.
> f + i, or f, i when ligatures are forced, etc.).
>
> It's simply a level of what support one wishes to offer....

This is exactly what I'm trying to say. If one enters \'e, \' is just the "charname" or macro command to access the acute accent in the font. One can enter directly the code for the acute accent. Or one can enter directly the é (if the CID entered is classified as "other" [literal], and the font has something at the corresponding index). BUT the documentation found told that with "modern" fonts, one has the absolute obligation, threatened by Thy Unicode GOD, to enter the codepoint, and that ligatures were deprecated.

TeX is absolutely agnostic. It is an engine, a compiler/interpreter. Even tex(1) is just the name of an instance of TeX with a special convention: D. E. Knuth's plain TeX.

> KF

--
Thierry Laronde http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C