From: Simo Ojala <smsojala@gmail.com>
To: ntg-context@ntg.nl
Subject: Re: Problem with ConTeXt (MkIV), Hebrew and ligatures
Date: Mon, 01 Oct 2012 18:16:14 +0300
Message-ID: <5069B3BE.7060301@gmail.com>
In-Reply-To: <5066DCFD.7040106@wxs.nl>
On 09/29/2012 02:35 PM, Hans Hagen wrote:
> On 29-9-2012 01:41, Simo Ojala wrote:
>> Hans Hagen <pragma@wxs.nl>
>>
>> On 09/28/2012 11:46 AM, Hans Hagen wrote:
>>> On 27-9-2012 21:27, Simo Ojala wrote:
>>>> This is a problem originally posted in TeX/StackExchange. However,
>>>> since
>>>> I have not had any luck in finding a solution I post it here too. I am
>>>> confident that somebody here should know the answer.
>>>>
>>>>
>>>> http://tex.stackexchange.com/questions/73970/problem-with-context-mkiv-hebrew-and-ligatures
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> "Since I last played with the latest ConTeXt MkIV, there has been
>>>> introduced this new feature. It now seems to combine Hebrew characters
>>>> automatically when possible to ligatures. So for example. If I have a
>>>> word with following two characters:
>>>>
>>>> U+05D5 (HEBREW LETTER VAV)
>>>> U+05BC (HEBREW POINT DAGESH OR MAPIQ)
>>>>
>>>> ConTeXt will combine these to:
>>>>
>>>> U+FB35 (HEBREW LETTER VAV WITH DAGESH)
>>>>
>>>> However, I would need to disable this feature for a number of reasons.
>>>> For example, this breaks my little database query, because the query
>>>> key
>>>> is changed before(?) macro gets it.
>>>>
>>>> So if somebody would know how to turn this off and maybe also that what
>>>> has changed."
>>>
>>> It depends on the font ... normally you can disable this by *not* using
>>> the mark and mkmk features
>>>
>>> Hans
>>>
>>
>> Ok, I have now tried turning off all kinds of features without luck. So,
>> I tried putting together minimal test case. I suspect that there should
>> be done something more than just turn off some font features. However,
>> my ConTeXt skills are very limited so I can be wrong.
>>
>> The goal is that the word passed from ConTeXt file remains as it is
>> written and gives unicode characters U+5e1, U+5d5, U+5bc and U+5e1. This
>> is what already happens when the word is in the lua file.
>>
>> Simo
>>
>> PS: In case this matters. My ConTeXt MkIV version is "2012.09.23 12:40".
>> It should be the latest for Ubuntu 12.04 LTS Precise Pangolin that is in
>> the Adam Reviczky's PPA.
>>
>>
>> %% testcase.tex
>>
>> \definefontfeature[hebrew][arabic][script=hebr]
>> \definefont[dejavusans][name:dejavusans*hebrew at 26pt]
>> \setupdirections[bidi=global]
>>
>> \starttext
>> \dejavusans
>>
>> \def\Macro#1{\directlua{
>> dofile(resolvers.findfile("testcase.lua"))
>> userdata.testfunction("#1")
>> }}
>>
>> \Macro{סוּס}
>>
>> \blank[1cm]however, we can still color these independently\blank[0.5cm]
>>
>> \color[red]{ס}\color[green]{ו}\color[blue]{ּ}\color[yellow]{ס}
>>
>> \stoptext
>>
>>
>> -- testcase.lua
>>
>> userdata = userdata or {}
>>
>> function userdata.testfunction(word)
>>
>> tex.sprint("\\blank[1cm]word passed by macro\\blank[0.5cm]")
>>
>> for i = 1, unicode.utf8.len(word) do
>> tex.sprint("U+" ..
>> string.format("%x",unicode.utf8.byte(word,i)) .. ": " ..
>> unicode.utf8.sub(word,i,i) .. "\\par" )
>> end
>>
>> tex.sprint("\\blank[1cm]word written in lua file\\blank[0.5cm]")
>>
>> word = "סוּס"
>>
>> for i = 1, unicode.utf8.len(word) do
>> tex.sprint("U+" ..
>> string.format("%x",unicode.utf8.byte(word,i)) .. ": " ..
>> unicode.utf8.sub(word,i,i) .. "\\par" )
>> end
>> end
>
> I see three characters next to each other so what exactly is the problem?
>
> (BTW, take a look at goodies-002.tex in the test suite ... you can
> define colored glyphs as a feature)
>
> Hans
>
Sorry for being unclear; let me try to clarify. The problem is:

1. I have a TeX file that calls a macro with an argument containing the
characters U+05D5 and U+05BC.
2. The macro passes the argument on to Lua code. By the time it gets
there, the two characters have been replaced by the single character
U+FB35.
3. When the Lua code then compares U+FB35 with the XML file, which has
the original form U+05D5 U+05BC, the comparison of course fails.

So the problem is step 2, which did not happen in earlier versions. If
possible, I would like to turn it off. Of course I could write some
workaround code to undo this substitution (or whatever it should be
called), but that could get complicated and lead to more problems.
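
For reference, a minimal sketch of what such a workaround might look like in plain Lua. It assumes that U+FB35 is the only composed form that shows up, and it works on the raw UTF-8 bytes so it does not depend on any ConTeXt library. The names `decompositions` and `decompose` are hypothetical, not ConTeXt API; more entries would be needed if other presentation forms appear.

```lua
-- Hypothetical helper, not part of ConTeXt: undo the composition before
-- comparing the macro argument against the XML data.
-- Keys and values are UTF-8 byte sequences written as decimal escapes.
local decompositions = {
    -- U+FB35 (VAV WITH DAGESH) -> U+05D5 (VAV) + U+05BC (DAGESH)
    ["\239\172\181"] = "\215\149\214\188",
}

local function decompose(word)
    -- U+FB00..U+FB7F are encoded as EF AC xx or EF AD xx in UTF-8.
    -- gsub looks each match up in the table; a match without an entry
    -- is left unchanged. The parentheses drop gsub's second result.
    return (word:gsub("\239[\172\173][\128-\191]", decompositions))
end

-- samekh, U+FB35, samekh (what the Lua code receives from the macro)
local fromtex = "\215\161\239\172\181\215\161"
-- samekh, vav, dagesh, samekh (what the XML file contains)
local fromxml = "\215\161\215\149\214\188\215\161"

assert(decompose(fromtex) == fromxml)
```

Calling `decompose` on the argument inside `userdata.testfunction` would restore the four original characters for the test word, but it only treats the symptom and has to be kept in sync with whatever ConTeXt composes.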
Simo
PS: I have attached the output of my test case, in case this is a
problem with my setup. Compiled with ConTeXt MkIV 2012.09.25 21:44.
[-- Attachment #2: testcase.pdf --]
[-- Type: application/pdf, Size: 20530 bytes --]