ntg-context - mailing list for ConTeXt users
* utf-based lang-* files?
@ 2005-09-22 12:17 Christopher Creutzig
  2005-09-22 12:27 ` Adam Lindsay
  0 siblings, 1 reply; 5+ messages in thread
From: Christopher Creutzig @ 2005-09-22 12:17 UTC (permalink / raw)


Salvete,

 while I am aware that my Japanese is ages away from creating anything
releasable, I thought about creating a lang-jap.tex file for my personal
use (and maybe for having it corrected by someone actually speaking the
language).  Now, checking lang-chi.tex, I find it is encoded in a way I
don't really want to copy.  I'd much rather write the whole file in
“proper” utf-8.  Is it possible to simply enclose the file in a
\startregime[utf]...\stopregime pair or do I risk havoc by doing this?

 (Should I start with this project, I'll have more questions, such as:
How do I make a Unicode character such as 。 active, for good line breaks?)


Christopher

* Re: utf-based lang-* files?
  2005-09-22 12:17 utf-based lang-* files? Christopher Creutzig
@ 2005-09-22 12:27 ` Adam Lindsay
  2005-09-22 16:25   ` Christopher Creutzig
  0 siblings, 1 reply; 5+ messages in thread
From: Adam Lindsay @ 2005-09-22 12:27 UTC (permalink / raw)


Christopher Creutzig said this at Thu, 22 Sep 2005 14:17:57 +0200:

> Is it possible to simply enclose the file in a
>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?

Well, if you're using a regime, it still (usually) depends on symbolic
character names being defined under the hood. Also, such an approach
(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
could be (XeTeX is happiest if you just pass through Unicode characters.
Regimes imply ConTeXt processing.)

This isn't a definitive answer, just a couple of issues off the top of
my head.
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Adam T. Lindsay, Computing Dept.     atl@comp.lancs.ac.uk
 Lancaster University, InfoLab21        +44(0)1524/510.514
 Lancaster, LA1 4WA, UK             Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

* Re: utf-based lang-* files?
  2005-09-22 12:27 ` Adam Lindsay
@ 2005-09-22 16:25   ` Christopher Creutzig
  2005-09-22 16:40     ` Adam Lindsay
  2005-09-22 21:23     ` Hans Hagen
  0 siblings, 2 replies; 5+ messages in thread
From: Christopher Creutzig @ 2005-09-22 16:25 UTC (permalink / raw)


Adam Lindsay wrote:
>>Is it possible to simply enclose the file in a
>>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
> 
> 
> Well, if you're using a regime, it still (usually) depends on symbolic
> character names being defined under the hood. Also, such an approach

 Sure.  But editing the file is oh so much easier when I can just type
\def\japChapterNumber#1{第#1章}
than if I have to look up the unicode numbers first and type
\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}

> (explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
> could be (XeTeX is happiest if you just pass through Unicode characters.

 That implies that ConTeXt should switch off all conversions when
running in XeTeX and seeing \startregime[utf], right?  (I certainly want
 to use the whole thing in XeTeX, if I ever do start it.  I would prefer
not to make the code depend on that.  I could live with some \if...
switches at the beginning and end, sure.)


Christopher

* Re: utf-based lang-* files?
  2005-09-22 16:25   ` Christopher Creutzig
@ 2005-09-22 16:40     ` Adam Lindsay
  2005-09-22 21:23     ` Hans Hagen
  1 sibling, 0 replies; 5+ messages in thread
From: Adam Lindsay @ 2005-09-22 16:40 UTC (permalink / raw)


Christopher Creutzig said this at Thu, 22 Sep 2005 18:25:01 +0200:

>Sure.  But editing the file is oh so much easier when I can just type
>\def\japChapterNumber#1{第#1章}
>than if I have to look up the unicode numbers first and type
>\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}

True, but this is scriptable...
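
As a rough illustration of "scriptable", here is a minimal Python sketch that rewrites every non-ASCII character in a line as a \uchar pair. It assumes that \uchar takes the high and low byte of the code point in decimal, which is what the numbers in the quoted example suggest; the function names uchar and to_uchar are made up for this sketch and are not part of ConTeXt.

```python
# Sketch only: assumes \uchar{high}{low} takes the two bytes of a
# 16-bit code point in decimal, as the quoted example suggests.

def uchar(ch: str) -> str:
    # Turn one character into its two-byte \uchar form.
    cp = ord(ch)
    if cp > 0xFFFF:
        # Two decimal bytes cannot express code points above U+FFFF.
        raise ValueError("character outside the 16-bit range: %r" % ch)
    return "\\uchar{%d}{%d}" % (cp >> 8, cp & 0xFF)

def to_uchar(text: str) -> str:
    # Leave ASCII untouched, convert everything else.
    return "".join(ch if ord(ch) < 128 else uchar(ch) for ch in text)

print(to_uchar(r"\def\japChapterNumber#1{第#1章}"))
```

This way the definitions could be kept in a readable UTF-8 master file, with the \uchar version generated from it whenever it is needed.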

> That implies that ConTeXt should switch off all conversions when
>running in XeTeX and seeing \startregime[utf], right? 

That's a good point, actually. In fact, XeTeX now has its own regime-like
mechanism (\XeTeXinputencoding "charset-name") that someone could/should
address, either bypassing ConTeXt's existing regimes, or
supplementing them. I'm not really active in that space right now, so I
can't do it, but I'd be willing to give hints to someone who wants this
feature.

adam
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Adam T. Lindsay, Computing Dept.     atl@comp.lancs.ac.uk
 Lancaster University, InfoLab21        +44(0)1524/510.514
 Lancaster, LA1 4WA, UK             Fax:+44(0)1524/510.492
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

* Re: utf-based lang-* files?
  2005-09-22 16:25   ` Christopher Creutzig
  2005-09-22 16:40     ` Adam Lindsay
@ 2005-09-22 21:23     ` Hans Hagen
  1 sibling, 0 replies; 5+ messages in thread
From: Hans Hagen @ 2005-09-22 21:23 UTC (permalink / raw)


Christopher Creutzig wrote:

>Adam Lindsay wrote:
>  
>
>>>Is it possible to simply enclose the file in a
>>>\startregime[utf]...\stopregime pair or do I risk havoc by doing this?
>>>      
>>>
>>Well, if you're using a regime, it still (usually) depends on symbolic
>>character names being defined under the hood. Also, such an approach
>>    
>>
>
> Sure.  But editing the file is oh so much easier when I can just type
>\def\japChapterNumber#1{第#1章}
>than if I have to look up the unicode numbers first and type
>\def\japChapterNumber#1{\uchar{123}{44}#1\uchar{122}{224}}
>
>  
>
>>(explicitly calling \startregime[utf]) doesn't make XeTeX as happy as it
>>could be (XeTeX is happiest if you just pass through Unicode characters.
>>
If XeTeX handles UTF-8 by just looking at the catcode (letter), you don't
need a regime; you just have to make sure that when the file is loaded
the chars 128->255 have the right catcode (11, i.e. letter):

\dostepwiserecurse{128}{255}{1}{\catcode\recurselevel=11\relax}

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------

end of thread, other threads:[~2005-09-22 21:23 UTC | newest]
