* utf-based lang-* files?
From: Christopher Creutzig @ 2005-09-22 12:17 UTC (permalink / raw)
Salvete,
while I am aware that my Japanese is ages away from creating anything
releasable, I thought about creating a lang-jap.tex file for my personal
use (and maybe for having it corrected by someone actually speaking the
language). Now, checking lang-chi.tex, I find it is encoded in a way I
don't really want to copy. I'd much rather write the whole file in
“proper” utf-8. Is it possible to simply enclose the file in a
\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
(Should I start with this project, I'll have more questions, such as:
How do I make a Unicode character such as 。 active, for good line breaks?)
Christopher
* Re: utf-based lang-* files?
From: Adam Lindsay @ 2005-09-22 12:27 UTC (permalink / raw)
Christopher Creutzig said this at Thu, 22 Sep 2005 14:17:57 +0200:
> Is it possible to simply enclose the file in a
>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
Well, if you're using a regime, it still (usually) depends on symbolic
character names being defined under the hood. Also, such an approach
(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
could be (XeTeX is happiest if you just pass through Unicode characters.
Regimes imply ConTeXt processing.)
This isn't a definitive answer, just a couple of issues off the top of
my head.
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Adam T. Lindsay, Computing Dept. atl@comp.lancs.ac.uk
Lancaster University, InfoLab21 +44(0)1524/510.514
Lancaster, LA1 4WA, UK Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
* Re: utf-based lang-* files?
From: Christopher Creutzig @ 2005-09-22 16:25 UTC (permalink / raw)
Adam Lindsay wrote:
>>Is it possible to simply enclose the file in a
>>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
>
>
> Well, if you're using a regime, it still (usually) depends on symbolic
> character names being defined under the hood. Also, such an approach
Sure. But editing the file is oh so much easier when I can just type
\def\japChapterNumber#1{第#1章}
than if I have to look up the unicode numbers first and type
\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
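(For anyone checking the numbers: a small Python sketch, not part of ConTeXt, that assumes the two arguments of \uchar are simply the high and low byte of the code point, as the example above suggests. The helper name uchar_to_char is made up for illustration.)

```python
def uchar_to_char(high, low):
    """Combine a (high byte, low byte) pair into the character it denotes.

    Assumption: the pair encodes a code point in the Basic Multilingual
    Plane as high * 256 + low.
    """
    if not (0 <= high <= 255 and 0 <= low <= 255):
        raise ValueError("both arguments must be in the range 0..255")
    return chr(high * 256 + low)


# The two pairs from the macro definition above.
for pair in [(123, 44), (122, 224)]:
    ch = uchar_to_char(*pair)
    print(pair, "U+%04X" % ord(ch), ch)
```

Under that assumption, the pairs come out as U+7B2C and U+7AE0, the two characters typed directly in the first definition.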
> (explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
> could be (XeTeX is happiest if you just pass through Unicode characters.
That implies that ConTeXt should switch off all conversions when
running in XeTeX and seeing \startregime[utf], right? (I certainly want
to use the whole thing in XeTeX, if I ever do start it. I would prefer
not to make the code depend on that. I could live with some \if...
switches at the beginning and end, sure.)
Christopher
* Re: utf-based lang-* files?
From: Adam Lindsay @ 2005-09-22 16:40 UTC (permalink / raw)
Christopher Creutzig said this at Thu, 22 Sep 2005 18:25:01 +0200:
>Sure. But editing the file is oh so much easier when I can just type
>\def\japChapterNumber#1{第#1章}
>than if I have to look up the unicode numbers first and type
>\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
True, but this is scriptable...
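A rough sketch of what such a script could look like, in Python. It assumes \uchar takes the high and low byte of a BMP code point, as in the example quoted above; the function name to_uchar is invented here and nothing like it ships with ConTeXt. ASCII is passed through untouched so the TeX markup survives.

```python
def to_uchar(text):
    """Rewrite every non-ASCII character in text as \\uchar{high}{low}.

    Assumption: \\uchar expects the code point split into two bytes,
    so only characters up to U+FFFF can be expressed this way.
    """
    pieces = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 128:
            pieces.append(ch)  # plain ASCII, including TeX control sequences
        elif code_point <= 0xFFFF:
            high, low = divmod(code_point, 256)
            pieces.append(r"\uchar{%d}{%d}" % (high, low))
        else:
            raise ValueError("U+%X does not fit into two bytes" % code_point)
    return "".join(pieces)


if __name__ == "__main__":
    print(to_uchar(r"\def\japChapterNumber#1{第#1章}"))
```

Fed the UTF-8 definition, this prints the \uchar form, so the file could be edited in UTF-8 and converted before use.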
> That implies that ConTeXt should switch off all conversions when
>running in XeTeX and seeing \startregime[utf], right?
That's a good point, actually. In fact, XeTeX now has its own regime-like
mechanism (\XeTeXinputencoding "charset-name"), that someone could/should
address, either bypassing ConTeXt's existing regimes, or
supplementing them. I'm not really active in that space right now, so I
can't do it, but I'd be willing to give hints to someone who wants this
feature.
adam
* Re: utf-based lang-* files?
From: Hans Hagen @ 2005-09-22 21:23 UTC (permalink / raw)
Christopher Creutzig wrote:
>Adam Lindsay wrote:
>>>Is it possible to simply enclose the file in a
>>>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
>>Well, if you're using a regime, it still (usually) depends on symbolic
>>character names being defined under the hood. Also, such an approach
> Sure. But editing the file is oh so much easier when I can just type
>\def\japChapterNumber#1{第#1章}
>than if I have to look up the unicode numbers first and type
>\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
>>(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
>>could be (XeTeX is happiest if you just pass through Unicode characters.
If XeTeX handles UTF-8 by just looking at whether a character has the
letter catcode, you don't need a regime; you just have to make sure that,
when the file is loaded, the chars 128 to 255 have the right catcode:
\dostepwiserecurse{128}{255}{1}{\catcode\recurselevel=11\relax}
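(Background on the range 128 to 255, as a hedged illustration only: in UTF-8 every byte of a non-ASCII character is 128 or higher, so an engine that reads the file byte by byte sees such characters entirely within that range. The Python check below is not part of ConTeXt and says nothing about how XeTeX itself reads input; the helper name utf8_bytes is made up.)

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of one character as a list of byte values."""
    return list(ch.encode("utf-8"))


# The characters mentioned earlier in this thread.
for ch in "第章。":
    data = utf8_bytes(ch)
    # Every byte of a multi-byte UTF-8 sequence lies in 128..255.
    assert all(128 <= b <= 255 for b in data)
    print("U+%04X" % ord(ch), data)
```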
Hans
-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------