ntg-context - mailing list for ConTeXt users
* Re: XeTeX, ConTeXt, and utf-8 hyphenation patterns.
       [not found] ` <8764j6rpms.fsf-Ia0+UBIDZBm+PENguQupYdBc4/FLrbF6@public.gmane.org>
@ 2006-06-13  7:25   ` Hans Hagen
       [not found]     ` <448E685C.1020007-42P/b7yZOt0@public.gmane.org>
  2006-06-13 11:41     ` Peter Heslin
  0 siblings, 2 replies; 6+ messages in thread
From: Hans Hagen @ 2006-06-13  7:25 UTC


Peter Heslin wrote:
> A little while ago, I said that I hoped to convert Dimitrios Filippou's
> ancient Greek hyphenation patterns (the elhyphen package) to utf-8, in
> order to use them with xetex.  Before thinking about starting this work,
> I decided to look to see if anyone else had done it, and I came across
> something interesting in ConTeXt, which is not a package I normally use.
>
> There appears to be a whole subdirectory in the ConTeXt distribution
> that is full of utf-8 hyphenation patterns, including Filippou's ancient
> Greek ones, but also including German, French, etc.  They are in the
> file: http://www.pragma-ade.com/context/current/cont-tmf.zip, in the
> tex/context/patterns directory.
>
> Can anyone who knows about ConTeXt explain about where these patterns
> come from and how it is that context manages to use these patterns?  (I
> thought that non-xetex TeX could only use single-byte encoded patterns.)
>   
some time ago i decided to ship patterns with context because

(1) there is no sound infrastructure in the tex world for managing patterns
(2) i need encoding neutral patterns [most patterns are ec only]
(3) i want control over what gets loaded in context
(4) i wanted to get rid of patterns that disappear, get renamed, or 
change every year
(5) i wanted patterns that were not, in a sense, hard wired latex 
patterns
> If there is a script that was used to convert these from the source to
> utf-8, is it available?  A quick glance at the ancient greek patterns
> (in the file lang-agr.pat) shows that there is a bug in the conversion
> that I'd like to report and fix.
>   
ctxtools --pat             [en nl agr ...]
ctxtools --pat --utf    [en nl agr ...]

the greek conversions were done with the help of greek language users 
on the context list, so in case of trouble i cc that list; bugs need to 
be fixed indeed

in ctxtools.rb you can grep for 'agr' and see what conversions take 
place for greek
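
As a rough illustration of the kind of conversion involved (this is not the code in ctxtools.rb, and Python is used only for the sketch), a pattern written in an ASCII transliteration can be mapped to UTF-8 by translating letters, turning accent prefixes into combining marks, and then composing the result. The tables LETTERS and MARKS and the function to_utf8_pattern are hypothetical and deliberately tiny; the actual source encoding of the Greek patterns may differ.

```python
import unicodedata

# Hypothetical, deliberately tiny transliteration table (ASCII -> Greek).
LETTERS = {"a": "α", "e": "ε", "o": "ο", "l": "λ", "g": "γ", "s": "σ"}

# Hypothetical accent prefixes -> Unicode combining marks.
MARKS = {"'": "\u0301", "`": "\u0300"}


def to_utf8_pattern(legacy):
    """Convert one transliterated pattern to a precomposed Unicode string."""
    out = []
    pending = []  # marks seen so far; they attach to the next letter
    for ch in legacy:
        if ch in MARKS:
            pending.append(MARKS[ch])
        elif ch in LETTERS:
            out.append(LETTERS[ch] + "".join(pending))
            pending.clear()
        else:
            # Pattern digits and word-boundary dots pass through unchanged.
            out.append(ch)
    # Compose base letter + combining mark into a single code point.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    # Encoding to UTF-8 bytes is the final step when writing the file.
    print(to_utf8_pattern("l'o1g").encode("utf-8"))
```

Under these assumptions, to_utf8_pattern("l'o1g") yields the pattern λό1γ with the accented vowel stored as the single code point U+03CC.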

more info can be found in:

http://www.pragma-ade.com/general/manuals/mpattern.pdf

(also published in tugboat)

there is a file lang-all.xml in the context distribution

> On a more general level, if both ConTeXt and XeTeX are engaged in
> converting legacy TeX hyphenation patterns to utf-8, should they be
> coordinated in order to avoid duplication of effort?
>
>   
anyone can use the patterns; of course bugs need to be sorted out, but 
given my experiences with pattern maintenance i will not drop them from 
context; too much has gone wrong in the past; but you can consider them 
to be generic so indeed we can avoid duplication of work.

Hans

-----------------------------------------------------------------
                                          Hans Hagen | PRAGMA ADE
              Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                             | www.pragma-pod.nl
-----------------------------------------------------------------


* Re: XeTeX, ConTeXt, and utf-8 hyphenation patterns.
       [not found]     ` <448E685C.1020007-42P/b7yZOt0@public.gmane.org>
@ 2006-06-13  8:25       ` Jonathan Kew
  2006-06-13  9:49         ` [XeTeX] " Hans Hagen
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Kew @ 2006-06-13  8:25 UTC
  Cc: c

On 13 Jun 2006, at 8:25 am, Hans Hagen wrote:

>> On a more general level, if both ConTeXt and XeTeX are engaged in
>> converting legacy TeX hyphenation patterns to utf-8, should they be
>> coordinated in order to avoid duplication of effort?
>>
>>
> anyone can use the patterns; of course bugs need to be sorted out, but
> given my experiences with pattern maintenance i will not drop them  
> from
> context; too much has gone wrong in the past; but you can consider  
> them
> to be generic so indeed we can avoid duplication of work.

Indeed.... I have no desire to duplicate work. :)

My main concern at this point relates to packaging and co-ordination  
between the different macro packages that load patterns; we can't  
expect latex users to be dependent on having context installed, or  
vice versa. Patterns belong in a base tex installation, where they  
can be available to any higher-level macro package.

This needs to be sorted out among a wider group than this mailing  
list....

JK


* Re: [XeTeX] XeTeX, ConTeXt, and utf-8 hyphenation patterns.
  2006-06-13  8:25       ` Jonathan Kew
@ 2006-06-13  9:49         ` Hans Hagen
  0 siblings, 0 replies; 6+ messages in thread
From: Hans Hagen @ 2006-06-13  9:49 UTC
  Cc: c

Jonathan Kew wrote:
> On 13 Jun 2006, at 8:25 am, Hans Hagen wrote:
>
>   
>>> On a more general level, if both ConTeXt and XeTeX are engaged in
>>> converting legacy TeX hyphenation patterns to utf-8, should they be
>>> coordinated in order to avoid duplication of effort?
>>>
>>>
>>>       
>> anyone can use the patterns; of course bugs need to be sorted out, but
>> given my experiences with pattern maintenance i will not drop them  
>> from
>> context; too much has gone wrong in the past; but you can consider  
>> them
>> to be generic so indeed we can avoid duplication of work.
>>     
>
> Indeed.... I have no desire to duplicate work. :)
>
> My main concern at this point relates to packaging and co-ordination  
> between the different macro packages that load patterns; we can't  
> expect latex users to be dependent on having context installed, or  
> vice versa. Patterns belong in a base tex installation, where they  
> can be available to any higher-level macro package.
>   
well, the problem is that until now, most pattern files were basically 
latex oriented files; the same is kind of true with fonts: changes in 
related files and names take place, are synced with latex, and then 
bite context users; i've kind of given up on that
> This needs to be sorted out among a wider group than this mailing  
> list....
>   
well, installing only the lang-* pat files is an option, as is adding 
tex/context/patterns to the xetex input path variable (although i 
believe that the tree is searched anyway);

also, given what people have to install nowadays, installing the context 
package is not that big a burden (xetex binaries and associated libs 
are pretty big themselves anyway)

Hans



* Re: XeTeX, ConTeXt, and utf-8 hyphenation patterns.
  2006-06-13  7:25   ` XeTeX, ConTeXt, and utf-8 hyphenation patterns Hans Hagen
       [not found]     ` <448E685C.1020007-42P/b7yZOt0@public.gmane.org>
@ 2006-06-13 11:41     ` Peter Heslin
       [not found]       ` <87ejxtmgel.fsf-Ia0+UBIDZBm+PENguQupYdBc4/FLrbF6@public.gmane.org>
  2006-06-13 12:43       ` XeTeX, ConTeXt, and utf-8 hyphenation patterns Peter Heslin
  1 sibling, 2 replies; 6+ messages in thread
From: Peter Heslin @ 2006-06-13 11:41 UTC
  Cc: ntg-context-wvrSQK3plZs

Hans Hagen <pragma-42P/b7yZOt0@public.gmane.org> writes:

> ctxtools --pat             [en nl agr ...]
> ctxtools --pat --utf    [en nl agr ...]
>
> the greek conversions were done with the help of greek language users 
> on the context list, so in case of trouble i cc that list; bugs need to 
> be fixed indeed

Thanks for the tips.  I have taken a closer look at the Greek patterns,
and it seems as though they have not only small problems, but also major
problems.  (They will fail to find most hyphenation points before
accented vowels.)  I will try to come up with a patch, but I don't know
any Ruby, so it will be an interesting challenge -- the changes required
go beyond tweaking the existing code.
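
To make the failure mode concrete, here is a minimal Python sketch of pattern-based (Liang-style) hyphenation. It is not a diagnosis of lang-agr.pat: the patterns, the word, and the function names parse_pattern and hyphenate are made up for illustration, and real TeX also applies minimum fragment lengths that are ignored here. The point is only that patterns match exact character sequences, so a pattern written with an unaccented vowel does not fire on a word that contains the accented code point.

```python
def parse_pattern(pat):
    """Split a pattern such as 'ο1γ' into its letters and inter-letter values."""
    letters, values = [], [0]
    for ch in pat:
        if ch in "0123456789":
            values[-1] = int(ch)      # value for the gap before the next letter
        else:
            letters.append(ch)
            values.append(0)          # gap after this letter, 0 until set
    return "".join(letters), values


def hyphenate(word, patterns):
    """Return the word split at every position where an odd value wins."""
    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + word + "."         # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)  # scores[i] is the gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            values = table.get(padded[start:end])
            if values:
                for k, v in enumerate(values):
                    scores[start + k] = max(scores[start + k], v)
    pieces, last = [], 0
    for j in range(1, len(word)):     # word[j] is padded[j + 1]
        if scores[j + 1] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces


if __name__ == "__main__":
    accented = "\u03bb\u03cc\u03b3\u03bf\u03c2"          # λόγος, precomposed ό
    print(hyphenate("λογος", ["ο1γ"]))                   # pattern matches
    print(hyphenate(accented, ["ο1γ"]))                  # no match, no break
    print(hyphenate(accented, ["ο1γ", "\u03cc1\u03b3"])) # accented variant added
```

With only the unaccented pattern, the accented word comes back unsplit; adding a variant of the pattern that contains the accented vowel restores the break after it.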

The characters in the file lang-agr.pat are precomposed, Unicode
normalization form C.  But I'd like to support both normalization forms
C and D, if possible, in the same pattern file.  Is that goal compatible
with Context?
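
For reference, normalization form C (NFC) stores an accented letter as one precomposed code point where one exists, while form D (NFD) stores the base letter followed by combining marks. A minimal Python sketch of emitting both spellings of a pattern, so that one file could cover either kind of input, might look like the following. The function name both_forms is hypothetical, and whether a given TeX engine accepts combining marks inside patterns is a separate question that this sketch does not answer.

```python
import unicodedata


def both_forms(pattern):
    """Return the NFC spelling of a pattern, plus the NFD one if it differs."""
    nfc = unicodedata.normalize("NFC", pattern)
    nfd = unicodedata.normalize("NFD", pattern)
    # Pattern digits are ordinary characters, so they keep their position
    # relative to the letters; only the accented letters change shape.
    return [nfc] if nfc == nfd else [nfc, nfd]


if __name__ == "__main__":
    for pat in ["\u03ac1\u03bb", "\u03b11\u03bb"]:   # ά1λ and α1λ
        for variant in both_forms(pat):
            # Show the code points so the two forms can be told apart.
            print(" ".join(f"U+{ord(c):04X}" for c in variant))
```

A pattern without accented letters is identical in both forms and is emitted once; an accented one is emitted twice, with the digit following the combining mark in the decomposed variant.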

-- 
Peter Heslin (http://www.dur.ac.uk/p.j.heslin)


* Re: XeTeX, ConTeXt, and utf-8 hyphenation patterns. / GREEK
       [not found]       ` <87ejxtmgel.fsf-Ia0+UBIDZBm+PENguQupYdBc4/FLrbF6@public.gmane.org>
@ 2006-06-13 12:29         ` Hans Hagen
  0 siblings, 0 replies; 6+ messages in thread
From: Hans Hagen @ 2006-06-13 12:29 UTC
  Cc: ntg-context-wvrSQK3plZs

Peter Heslin wrote:
> Hans Hagen <pragma-42P/b7yZOt0@public.gmane.org> writes:
>
>   
>> ctxtools --pat             [en nl agr ...]
>> ctxtools --pat --utf    [en nl agr ...]
>>
>> the greek conversions were done with the help of greek language users 
>> on the context list, so in case of trouble i cc that list; bugs need to 
>> be fixed indeed
>>     
>
> Thanks for the tips.  I have taken a closer look at the Greek patterns,
> and it seems as though they have not only small problems, but also major
> problems.  (They will fail to find most hyphenation points before
> accented vowels.)  I will try to come up with a patch, but I don't know
> any Ruby, so it will be an interesting challenge -- the changes required
> go beyond tweaking the existing code.
>
> The characters in the file lang-agr.pat are precomposed, Unicode
> normalization form C.  But I'd like to support both normalization forms
> C and D, if possible, in the same pattern file.  Is that goal compatible
> with Context?
>
>   
this is more related to (xe)tex than to context; i leave that to the 
greek experts on the context list

Hans



* Re: XeTeX, ConTeXt, and utf-8 hyphenation patterns.
  2006-06-13 11:41     ` Peter Heslin
       [not found]       ` <87ejxtmgel.fsf-Ia0+UBIDZBm+PENguQupYdBc4/FLrbF6@public.gmane.org>
@ 2006-06-13 12:43       ` Peter Heslin
  1 sibling, 0 replies; 6+ messages in thread
From: Peter Heslin @ 2006-06-13 12:43 UTC
  Cc: ntg-context-wvrSQK3plZs

Peter Heslin <pj-Ia0+UBIDZBm+PENguQupYdBc4/FLrbF6@public.gmane.org> writes:

> (They will fail to find most hyphenation points before accented
> vowels.)  

Sorry, of course I meant to say "after accented vowels".

P.


