ntg-context - mailing list for ConTeXt users
From: Hans Hagen <pragma@wxs.nl>
To: mailing list for ConTeXt users <ntg-context@ntg.nl>
Subject: Re: Support for Thai in ConTeXt
Date: Wed, 15 May 2013 17:20:58 +0200
Message-ID: <5193A7DA.3070203@wxs.nl>
In-Reply-To: <CALBOmsZ2uDTKH22m_X6CVCYAR1dixhV8=wzvO6-9f=j4JJ0ptA@mail.gmail.com>

On 5/15/2013 4:09 PM, Mojca Miklavec wrote:
> On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
>> On 5/14/2013 6:07 PM, luigi scarso wrote:
>>
>>> I Hope  that someone can help here
>>
>>
>> as Mojca mentioned thai at bachotex i'll add the patterns as a start
>>
>> given specs, examples and time, adding support for thai to context shouldn't
>> be too hard (assuming that there are users)
>
> But it's not trivial either.

It depends ... we're using a dictionary to determine word boundaries, 
aren't we? I'm pretty sure that I've done more complex coding.

> There's an opensource project implementing word segmentation:
>      http://linux.thai.net/projects/swath
> The specification (someone's thesis) can be found here:
>      http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf

Ok, so there are some text files there with words.

> The ugly part of pdfTeX approach is that it requires an external text
> processor to digest an input TeX document and return a copy with word
> segmentation. Then pdfTeX is run on the resulting file. XeTeX can use
> ICU library to do the segmentation.
>
> In LuaTeX one would have to plug the word segmentation somewhere (but
> writing that part is slightly non-trivial).

I just did a quick test using those dictionaries (abusing some code that 
I already had on my machine). Quite doable. It all depends on having the 
dictionaries available (on the garden or in the distribution).
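
To make the idea concrete, here is a minimal sketch of dictionary-based 
segmentation. It is not the code from the quick test above, and it is 
written in Python rather than the Lua one would use inside LuaTeX. The 
function name `segment` and the tiny word set are hypothetical, and greedy 
longest matching is just one simple strategy (an assumption, not 
necessarily what swath does).

```python
def segment(text, words):
    """Split text into pieces using greedy longest-match lookup.

    `words` is a set of known words (hypothetical here; a real one would
    be loaded from the dictionary files). Characters that start no known
    word are emitted one at a time.
    """
    # Length of the longest dictionary entry, at least 1 so we always advance.
    longest = max(max((len(w) for w in words), default=1), 1)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, fall back to a single character.
        for n in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if n == 1 or piece in words:
                result.append(piece)
                i += n
                break
    return result


# Hypothetical mini dictionary, for illustration only.
words = {"ภาษา", "ไทย", "สวัสดี"}
print(segment("ภาษาไทย", words))
```

Each boundary found this way would be a candidate break point for the 
line breaker; how such points get injected in LuaTeX is not shown here.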

Anyhow, it's not that much font related, just language / script support, 
and we already have that for some languages, so adding Thai to it 
doesn't hurt. Of course we'd need some testing. It doesn't make much 
sense to add features to ConTeXt that no one ends up using.

But ... Luigi is already teaching himself Thai, so ...

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


Thread overview: 5+ messages
2013-05-14 16:07 luigi scarso
2013-05-14 16:17 ` Hans Hagen
2013-05-15 14:09   ` Mojca Miklavec
2013-05-15 15:20     ` Hans Hagen [this message]
2013-05-15 15:33       ` luigi scarso
