From: Mojca Miklavec
To: mailing list for ConTeXt users
Cc: Theppitak Karoonboonyanan
Subject: Re: Support for Thai in ConTeXt
Date: Wed, 15 May 2013 16:09:25 +0200

On Tue, May 14, 2013 at 6:17 PM, Hans Hagen wrote:
> On 5/14/2013 6:07 PM, luigi scarso wrote:
>
>> I Hope that someone can help here
>
> as Mojca mentioned thai at bachotex i'll add the patterns as a start
>
> given specs, examples and time, adding support for thai to context shouldn't
> be too hard (assuming that there are users)

But it's not trivial either.

There's an open-source project implementing word segmentation:
    http://linux.thai.net/projects/swath
The specification (someone's thesis) can be found here:
    http://www.cs.cmu.edu/~paisarn/papers/thesis99.pdf

The ugly part of the pdfTeX approach is that it requires an external
text processor to digest the input TeX document and return a copy with
word segmentation applied. pdfTeX is then run on the resulting file.
XeTeX can use the ICU library to do the segmentation. In LuaTeX one
would have to plug the word segmentation in somewhere (but writing that
part is slightly non-trivial).

Mojca
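
To make the preprocessing idea above concrete, here is a minimal sketch of dictionary-based word segmentation in Python. It is not swath's algorithm and not what ICU does internally; it is a plain greedy longest-match pass over a toy word list. The names `TOY_DICTIONARY`, `segment` and `mark_breaks`, as well as the choice of a zero-width space as break marker, are assumptions made for illustration only.

```python
# Hypothetical toy word list; a real segmenter would load a full Thai dictionary.
TOY_DICTIONARY = {"ภาษา", "ไทย", "สวัสดี", "ครับ"}


def segment(text, dictionary=TOY_DICTIONARY):
    """Split text into words by greedy longest match against the dictionary.

    Characters that start no dictionary word become one-character tokens.
    """
    max_len = max(map(len, dictionary))
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in dictionary:
                break
        else:
            # No dictionary word starts here: emit a single character.
            j = i + 1
        words.append(text[i:j])
        i = j
    return words


def mark_breaks(text, marker="\u200b"):
    """Return text with a break marker (assumed: zero-width space) between words."""
    return marker.join(segment(text))


if __name__ == "__main__":
    print(segment("ภาษาไทย"))
    print(mark_breaks("สวัสดีครับ", marker="|"))
```

A preprocessor along these lines would be run over the text portions of the source before the TeX run, with the marker replaced by whatever break command the macro package expects. Greedy matching gets ambiguous cases wrong, and because this sketch also splits non-dictionary text character by character, it would have to be restricted to Thai runs; both are reasons a real tool needs more than this.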