ntg-context - mailing list for ConTeXt users
* Support for Thai in ConTeXt
@ 2013-05-14 16:07 luigi scarso
  2013-05-14 16:17 ` Hans Hagen
  0 siblings, 1 reply; 5+ messages in thread
From: luigi scarso @ 2013-05-14 16:07 UTC (permalink / raw)
  To: Theppitak Karoonboonyanan, mailing list for ConTeXt users



On Tue, May 14, 2013 at 5:59 PM, Theppitak Karoonboonyanan <
theppitak@gmail.com> wrote:

> On Tue, May 14, 2013 at 9:58 PM, luigi scarso <luigi.scarso@gmail.com>
> wrote:
> >
> > On Tue, May 14, 2013 at 4:16 PM, Mojca Miklavec
> > <mojca.miklavec.lists@gmail.com> wrote:
> >>
> >> I could also ask differently: suppose that a motivated Thai programmer
> >> would be willing to work on solving the problem properly. What would
> >> be the suggested solution?
> >
> > You can also post on the ConTeXt mailing list; maybe there is a Thai
> > user there.
>
> I am a Thai developer who works on Thai word segmentation tools and the
> thailatex package, so you can send suggestions to me. (Please Cc: me, I'm
> not on the mailing list.)
>
> I'm totally new to LuaTeX and the Lua programming language, but I can
> learn whatever is necessary to get it done.
>
> With a quick search, I found the "linebreak_filter" callback in the LuaTeX
> reference. Is that relevant to the problem? Or is using an external filter
> already acceptable?
>
> Regards,
> --
> Theppitak Karoonboonyanan
> http://linux.thai.net/~thep/
>

I hope that someone can help here.

-- 
luigi


___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : http://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: Support for Thai in ConTeXt
  2013-05-14 16:07 Support for Thai in ConTeXt luigi scarso
@ 2013-05-14 16:17 ` Hans Hagen
  2013-05-15 14:09   ` Mojca Miklavec
  0 siblings, 1 reply; 5+ messages in thread
From: Hans Hagen @ 2013-05-14 16:17 UTC (permalink / raw)
  To: ntg-context

On 5/14/2013 6:07 PM, luigi scarso wrote:

> I hope that someone can help here.

As Mojca mentioned Thai at BachoTeX, I'll add the patterns as a start.

Given specs, examples and time, adding support for Thai to ConTeXt
shouldn't be too hard (assuming that there are users).

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Support for Thai in ConTeXt
  2013-05-14 16:17 ` Hans Hagen
@ 2013-05-15 14:09   ` Mojca Miklavec
  2013-05-15 15:20     ` Hans Hagen
  0 siblings, 1 reply; 5+ messages in thread
From: Mojca Miklavec @ 2013-05-15 14:09 UTC (permalink / raw)
  To: mailing list for ConTeXt users; +Cc: Theppitak Karoonboonyanan

On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
> On 5/14/2013 6:07 PM, luigi scarso wrote:
>
>> I hope that someone can help here.
>
>
> As Mojca mentioned Thai at BachoTeX, I'll add the patterns as a start.
>
> Given specs, examples and time, adding support for Thai to ConTeXt
> shouldn't be too hard (assuming that there are users).

But it's not trivial either.

There's an open-source project implementing word segmentation:
    http://linux.thai.net/projects/swath
The specification (someone's thesis) can be found here:
    http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf

The ugly part of the pdfTeX approach is that it requires an external text
processor to digest the input TeX document and return a copy with word
boundaries marked. pdfTeX is then run on the resulting file. XeTeX can use
the ICU library to do the segmentation.

In LuaTeX one would have to plug the word segmentation in somewhere (but
writing that part is slightly non-trivial).
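
To make the preprocessing idea concrete, here is a minimal sketch in
Python. It is not how swath is implemented; the function names, the
break marker and the toy segmenter are all assumptions chosen for
illustration. It finds runs of characters from the Thai Unicode block,
hands each run to a segmenter, and joins the returned words with a
marker, copying everything else (TeX commands, Latin text) unchanged.

```python
import re
from typing import Callable, List

# Runs of characters from the Thai Unicode block (U+0E00..U+0E7F).
THAI_RUN = re.compile(r"[\u0E00-\u0E7F]+")


def mark_breaks(text: str,
                segment: Callable[[str], List[str]],
                marker: str = "\u200b") -> str:
    """Insert `marker` between the words of every Thai run in `text`.

    Everything outside the Thai runs is copied through unchanged.
    The default marker (zero width space) is an assumption; a real
    filter would emit whatever the typesetting engine expects.
    """
    def replace(match):
        return marker.join(segment(match.group(0)))

    return THAI_RUN.sub(replace, text)


# Hypothetical stand-in for a real segmenter: it knows one phrase only.
def toy_segment(run: str) -> List[str]:
    known = {"ภาษาไทย": ["ภาษา", "ไทย"]}
    return known.get(run, [run])


source = "\\section{ภาษาไทย} and Latin text"
marked = mark_breaks(source, toy_segment, marker="|")
```

The visible "|" marker is used here only so the result can be inspected;
the segmenter is passed in as a parameter so that a dictionary-based or
library-based one could be swapped in without touching the filter.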

Mojca

* Re: Support for Thai in ConTeXt
  2013-05-15 14:09   ` Mojca Miklavec
@ 2013-05-15 15:20     ` Hans Hagen
  2013-05-15 15:33       ` luigi scarso
  0 siblings, 1 reply; 5+ messages in thread
From: Hans Hagen @ 2013-05-15 15:20 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On 5/15/2013 4:09 PM, Mojca Miklavec wrote:
> On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
>> On 5/14/2013 6:07 PM, luigi scarso wrote:
>>
>>> I hope that someone can help here.
>>
>>
>> As Mojca mentioned Thai at BachoTeX, I'll add the patterns as a start.
>>
>> Given specs, examples and time, adding support for Thai to ConTeXt
>> shouldn't be too hard (assuming that there are users).
>
> But it's not trivial either.

It depends ... we're using a dictionary to determine word boundaries, 
aren't we? I'm pretty sure that I've done more complex coding.
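
For illustration, the simplest form of dictionary-based segmentation is
greedy longest matching: at each position, take the longest dictionary
word that starts there. The Python sketch below shows only that idea. It
is not the ConTeXt code, the word list is a tiny hypothetical sample, and
a usable segmenter would need a full dictionary and some strategy for
ambiguous splits.

```python
from typing import Iterable, List


def segment(text: str, words: Iterable[str]) -> List[str]:
    """Split `text` greedily, always taking the longest dictionary word.

    Characters that start no dictionary word are emitted one at a time,
    so the function always terminates and loses no input.
    """
    dictionary = set(words)
    longest = max((len(w) for w in dictionary), default=1)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(longest, len(text) - i), 0, -1):
            if text[i:i + length] in dictionary:
                break
        else:
            length = 1  # no dictionary word here: pass one character through
        result.append(text[i:i + length])
        i += length
    return result


# Tiny hypothetical word list, for illustration only.
SAMPLE_WORDS = ["ภาษา", "ไทย", "คน"]
pieces = segment("คนไทยภาษาไทย", SAMPLE_WORDS)
```

Greedy matching is easy to write but can pick a wrong split when a long
word overlaps the start of the next one, which is one reason the quality
of the word list matters so much.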

> There's an open-source project implementing word segmentation:
>      http://linux.thai.net/projects/swath
> The specification (someone's thesis) can be found here:
>      http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf

Ok, so there are some text files there with words.

> The ugly part of the pdfTeX approach is that it requires an external text
> processor to digest the input TeX document and return a copy with word
> boundaries marked. pdfTeX is then run on the resulting file. XeTeX can use
> the ICU library to do the segmentation.
>
> In LuaTeX one would have to plug the word segmentation in somewhere (but
> writing that part is slightly non-trivial).

I just did a quick test using those dictionaries (abusing some code that
I already had on my machine). Quite doable. It all depends on having the
dictionaries available (on the garden or in the distribution).

Anyhow, it's not that much font related, just language / script support,
and we already have that for some languages, so adding Thai to it
doesn't hurt. Of course we'd need some testing. It doesn't make much
sense to add features to ConTeXt that no one ends up using.

But ... Luigi is already teaching himself Thai, so ...

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
     tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Support for Thai in ConTeXt
  2013-05-15 15:20     ` Hans Hagen
@ 2013-05-15 15:33       ` luigi scarso
  0 siblings, 0 replies; 5+ messages in thread
From: luigi scarso @ 2013-05-15 15:33 UTC (permalink / raw)
  To: mailing list for ConTeXt users



On Wed, May 15, 2013 at 5:20 PM, Hans Hagen <pragma@wxs.nl> wrote:

>
> But ... Luigi is already teaching himself Thai, so ...
>
No no, I'm just connecting people on different mailing lists.
Currently I'm working in a completely different area.
-- 
luigi
