ntg-context - mailing list for ConTeXt users
* Greek in luatex
@ 2007-09-01 10:56 Thomas A. Schmitz
  2007-09-11  6:47 ` Thomas A. Schmitz
  0 siblings, 1 reply; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-01 10:56 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi all,

I've been experimenting with my Greek stuff in luatex, and I think  
I'm making nice progress. Things pretty much work with Unicode input,  
and as soon as the kerning problem is solved, I'm very optimistic.  
Two questions came up for me; I assume the answers are  
straightforward, but couldn't find anything:

1. How can I remap single characters? Let's say that we have a  
Unicode character in the input stream that maps to 0x03c3, but I want  
it remapped to 0x3f2, how can this be achieved?

2. Similarly: if I want to support the legacy input method babel, I  
need to remap the input stream to the Greek characters (question 1)  
and also need to feed the font some ligature rules, such as: the  
combination >a needs to be combined into the character 0x1f00. What  
would be the syntax and the way to do this?

All best

Thomas
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________



* Re: Greek in luatex
  2007-09-01 10:56 Greek in luatex Thomas A. Schmitz
@ 2007-09-11  6:47 ` Thomas A. Schmitz
  2007-09-11 10:12   ` Hans Hagen
  2007-09-13  1:15   ` Arthur Reutenauer
  0 siblings, 2 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-11  6:47 UTC (permalink / raw)
  To: mailing list for ConTeXt users

OK, the message below didn't get too many responses, so maybe I can  
rephrase my questions in a more precise manner:

1. For otftotfm, there's the "unicoding" command where you can  
replace a character in a certain slot with another unicode character,  
so you could say
unicoding "A = uni03D1"
Is anything like this possible in luatex?

2. I see this code in font-otf.lua:
fonts.otf.features.data.tex = {
     { "endash", "hyphen hyphen" },
     { "emdash", "hyphen hyphen hyphen" },
     { "quotedblleft", "quoteleft quoteleft" },
     { "quotedblright", "quoteright quoteright" },
     { "quotedblleft", "grave grave" },
     { "quotedblright", "quotesingle quotesingle" },
     { "quotedblbase", "comma comma" }
}

and this list is used in the function
fonts.initializers.base.otf.texligatures(tfm,value)

How is it possible to write a similar list and function for just a  
single font or fonts in a specific typescript?

Thanks a lot!

Thomas


On Sep 1, 2007, at 12:56 PM, Thomas A. Schmitz wrote:

> Hi all,
>
> I've been experimenting with my Greek stuff in luatex, and I think
> I'm making nice progress. Things pretty much work with Unicode input,
> and as soon as the kerning problem is solved, I'm very optimistic.
> Two questions came up for me; I assume the answers are
> straightforward, but couldn't find anything:
>
> 1. How can I remap single characters? Let's say that we have a
> Unicode character in the input stream that maps to 0x03c3, but I want
> it remapped to 0x3f2, how can this be achieved?
>
> 2. Similarly: if I want to support the legacy input method babel, I
> need to remap the input stream to the Greek characters (question 1)
> and also need to feed the font some ligature rules, such as: the
> combination >a needs to be combined into the character 0x1f00. What
> would be the syntax and the way to do this?
>
> All best
>
> Thomas
>



* Re: Greek in luatex
  2007-09-11  6:47 ` Thomas A. Schmitz
@ 2007-09-11 10:12   ` Hans Hagen
  2007-09-13  1:15   ` Arthur Reutenauer
  1 sibling, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-11 10:12 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:
> OK, the message below didn't get too many responses, so maybe I can  
> rephrase my questions in a more precise manner:
> 
> 1. For otftotfm, there's the "unicoding" command where you can  
> replace a character in a certain slot with another unicode character,  
> so you could say
> unicoding "A = uni03D1"
> Is anything like this possible in luatex?

sure, but it depends a bit on what level ... font driven or not

if you have open type fonts, you can add features on the fly ...

\starttext

\installfontfeature[otf][verb]

\definefontfeature
   [test]
   [mode=node,language=dflt,script=latn,
    verb=yes,featurefile=verbose-digits.fea]

{\font\test=name:lmroman10regular*test at 20pt \test 1 2 3 4}

\ctxlua{characters.context.show(\number"00AB)}

\stoptext

this replaces 1 by one and 2 by two ...

the file verbose-digits.fea is in the distribution and is an example of a 
fontforge specification file

> 2. I see this code in font-otf.lua:
> fonts.otf.features.data.tex = {
>      { "endash", "hyphen hyphen" },
>      { "emdash", "hyphen hyphen hyphen" },
>      { "quotedblleft", "quoteleft quoteleft" },
>      { "quotedblright", "quoteright quoteright" },
>      { "quotedblleft", "grave grave" },
>      { "quotedblright", "quotesingle quotesingle" },
>      { "quotedblbase", "comma comma" }
> }

that's ligatures and there for backward compatibility (hm, makes me 
wonder if it makes more sense to do that using feature files)
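
A rough illustration of what such a ligature table does, written in Python rather than Lua so that it can be run on its own. This is only a sketch under stated assumptions, not the code in font-otf.lua: the function name apply_ligatures is hypothetical, and the choice to try longer sequences first is an assumption made here so that three hyphens give an emdash.

```python
# Illustrative sketch only: this is NOT the ConTeXt implementation.
# It mimics how a table of (ligature, components) pairs, like the one
# quoted above, could be applied to a list of glyph names.

TEX_LIGATURES = [
    ("endash", "hyphen hyphen"),
    ("emdash", "hyphen hyphen hyphen"),
    ("quotedblleft", "quoteleft quoteleft"),
    ("quotedblright", "quoteright quoteright"),
    ("quotedblleft", "grave grave"),
    ("quotedblright", "quotesingle quotesingle"),
    ("quotedblbase", "comma comma"),
]


def apply_ligatures(glyphs, table=TEX_LIGATURES):
    """Replace component sequences in a list of glyph names by ligatures."""
    # Assumption: try longer component sequences first, so that three
    # hyphens become an emdash rather than an endash plus a hyphen.
    rules = sorted(
        ((components.split(), ligature) for ligature, components in table),
        key=lambda rule: len(rule[0]),
        reverse=True,
    )
    out, i = [], 0
    while i < len(glyphs):
        for components, ligature in rules:
            if glyphs[i:i + len(components)] == components:
                out.append(ligature)
                i += len(components)
                break
        else:
            # No rule matched at this position: keep the glyph as it is.
            out.append(glyphs[i])
            i += 1
    return out
```

A per-font variant would simply pass a different table for that font, which is the kind of thing the question about typescripts is after.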

> and this list is used in the function
> fonts.initializers.base.otf.texligatures(tfm,value)
> 
> How is it possible to write a similar list and function for just a  
> single font or fonts in a specific typescript?

in principle you can add lua code in typescripts and then register that 
as a feature (so, texligatures or tlig is one of them, as is lineheight)

it all depends on how generic things are; we can think of features like 
remap=name-of-remap-vector (keep in mind that this operates on node 
lists then; re-encoding the input, i.e. regimes, is done differently)

so .. just write down detailed specs -)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------



* Re: Greek in luatex
  2007-09-11  6:47 ` Thomas A. Schmitz
  2007-09-11 10:12   ` Hans Hagen
@ 2007-09-13  1:15   ` Arthur Reutenauer
  2007-09-13  7:03     ` Taco Hoekwater
  2007-09-13  9:45     ` Thomas A. Schmitz
  1 sibling, 2 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-13  1:15 UTC (permalink / raw)
  To: mailing list for ConTeXt users

[-- Attachment #1: Type: text/plain, Size: 4628 bytes --]

	Hello Thomas,

  I was waiting for someone else to answer your questions because I
had no clue how to address them even if I was interested; but now I do,
thanks to Hans' reply:

  For your general problem you need to define a new regime that will
map each relevant character sequence to the corresponding Unicode
character.  That is, you inform ConTeXt that the character stream it sees
is actually a way of coding another set of characters and that it can
forget the original stream.  This treatment should be done before any sort
of font property intervenes, because it does not depend on the
appearance of the typeset text.  That's what regimes are for.

  Now I turn to Hans to give us guidelines on how to define an
advanced regime in Mark IV: Hans, what we need here is to replace
sequences of characters by other characters, so the mapping is not
one-to-one and it's more complicated than simple regimes defined by a
table lookup; but I guess all we have to do is write a lua function that
we could plug into the input stream reading routine (just like other
regimes work).
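
  The kind of many-to-one mapping meant here can be sketched as follows. The sketch is in Python purely for illustration (a real regime in Mark IV would be written in Lua and hooked into ConTeXt, and no ConTeXt API is shown or implied); the table BABEL_TO_UNICODE and the function babel_to_unicode are hypothetical names, and the table holds only a handful of entries.

```python
# Hypothetical sketch of a "sequence of characters -> one character"
# input mapping. Only a few Babel sequences are listed.
BABEL_TO_UNICODE = {
    ">a": "\u1F00",   # alpha with psili
    "<a": "\u1F01",   # alpha with dasia
    ">'a": "\u1F04",  # alpha with psili and oxia
    "a": "\u03B1",    # alpha
    "b": "\u03B2",    # beta
    "g": "\u03B3",    # gamma
}


def babel_to_unicode(text, table=BABEL_TO_UNICODE):
    """Rewrite Babel-style ASCII input as Unicode Greek, longest match first."""
    longest = max(len(key) for key in table)
    out, i = [], 0
    while i < len(text):
        # Prefer the longest sequence that matches at this position, so
        # that ">a" is not read as ">" followed by a plain alpha.
        for length in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + length]
            if chunk in table:
                out.append(table[chunk])
                i += length
                break
        else:
            out.append(text[i])  # pass unknown characters through unchanged
            i += 1
    return "".join(out)
```

  Because the result is a plain Unicode string, everything downstream (fonts, hyphenation, the text layer of the PDF) sees Greek characters and never the ASCII transcription.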

  As far as the rest of Hans' reply is concerned (Opentype features and
such), I would like to add that it is a very interesting and fascinating
thing to do, but definitely not what you want here, for a lot of
reasons: Opentype features can be used to alter the appearance of the
text, but not the nature of the characters themselves.  That is, if you did
the transformation of your input stream at the font level, you would
actually tell ConTeXt that you are handling Latin characters with a
special appearance (that the font takes care of), so for example, the
underlying text in a PDF would be a stream of Latin characters, and
copying-and-pasting would yield Latin characters, not Greek.  That is
not what you want here: you want your "a" to be understood as "alpha"
and your "less-than acute-sign w vertical-bar" to be considered an
"omega with dasia, oxia and subscribed iota".  Nor should you think of
these transformations as a collection of ligatures (which act at the
font level), but rather as a text encoding, just like UTF-8 is an
encoding of the Unicode characters: in UTF-8 the byte sequence
"hexadecimal byte E1, hexadecimal byte BC, hexadecimal byte 80" is the
coding for the Unicode character U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI,
and in the Babel input scheme for Ancient Greek the same character is
encoded with the byte sequence "hexadecimal byte 3E [ASCII '>'],
hexadecimal byte 61 [ASCII 'a']".
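
  The two byte sequences can be checked with a few lines of Python (used here only as a neutral scripting language). The Babel sequence is taken to be ">a", which is the sequence that the attached greek-babel.fea maps to uni1F00.

```python
import unicodedata

alpha_psili = "\u1F00"

# Name of the character according to the Unicode character database.
name = unicodedata.name(alpha_psili)

# The same character in two encodings: UTF-8, and the Babel ASCII scheme.
utf8_bytes = alpha_psili.encode("utf-8")
babel_bytes = ">a".encode("ascii")


def as_hex(data):
    """Format a bytes object as space-separated two-digit hex values."""
    return " ".join(f"{byte:02X}" for byte in data)


print(name)
print("UTF-8:", as_hex(utf8_bytes))
print("Babel:", as_hex(babel_bytes))
```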

  Of course in the past, these transformations were handled at the font
level and sequences like "< a" were actually ligatures, because that was
all we had (and copypasting from a PDF was, mostly, doomed to fail); but
we should not persist in that use now we can treat them as real Unicode
characters.

  As for your other question in your original message from September 1st
(remapping single characters, for example U+03C3 to U+03F2), I have to
say first that I'm not very comfortable commenting on it since I'm not
quite sure what the issues are here; it may be that you have a simple
variant of some character, and this you should handle at font level
(some glyph being transformed into some other one); but if I am to judge
by the very example you gave, I would deem this should be a part of your
input regime: indeed, if every sigma is to be mapped to lunate sigma,
then it probably means that the lunate sigmas are part of your character
stream (even if you didn't input it directly).  But I really can't give
any general advice here, especially because I don't actually know what a
lunate sigma really is ;-)  You would have to decide for yourself as a
specialist of Greek if you're dealing with really different characters
or simple font variants; in the former case you should handle the
transformation as a part of your regime; in the latter, by defining a
font feature like Hans demonstrated.
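
  If the lunate sigma is treated as a character in its own right, the remapping is a plain one-to-one replacement on the text. A minimal Python sketch of that case follows; the names SIGMA_TO_LUNATE and remap_sigma are hypothetical, and only U+03C3 is remapped, as in the original question (whether final sigma U+03C2 should follow is a decision for the Greek specialist).

```python
# One-to-one character remapping, done on the text and not in the font.
SIGMA_TO_LUNATE = str.maketrans({
    "\u03C3": "\u03F2",  # GREEK SMALL LETTER SIGMA -> GREEK LUNATE SIGMA SYMBOL
})


def remap_sigma(text):
    """Return text with every U+03C3 replaced by U+03F2."""
    return text.translate(SIGMA_TO_LUNATE)
```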

  But for now, as long as it is understood that font tricks aren't the
general solution for the problem at stake, I would like to demonstrate
that it is still possible to do everything at font level :-)

  If you have a look at the attached greek-babel.tex (and the features
definition file greek-babel.fea) you will see that (almost) everything
is taken care of using Opentype substitutions.  You need Bosporos and
GFS Baskerville to compile the file; by the way, the line with GFS
Baskerville is a further proof that you shouldn't handle the
transformation at font level: can you explain why it doesn't work here?
As a complement, I also attach the Perl script which I wrote to generate
the .fea file.
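
  The script builds code points by ORing masks and notes that some of the results do not exist in Unicode. A small Python sketch (not part of the attachments) can list them; the mask values are copied from the first part of the Perl script, and "unassigned" is taken to mean that Python's unicodedata module has no name for the code point.

```python
import unicodedata

# Masks copied from the first loop of the attached Perl script.
BREATHINGS = {"greater": 0, "less": 1}
ACCENTS = {"": 0, "grave": 2, "quotesingle": 4, "asciitilde": 6}
VOWELS = {"a": 0x1F00, "e": 0x1F10, "h": 0x1F20, "i": 0x1F30,
          "o": 0x1F40, "u": 0x1F50, "w": 0x1F60}
UPPERCASE_SHIFT = 8


def candidate_code_points():
    """Yield every code point the breathing/accent/vowel loop would emit."""
    for breathing in BREATHINGS.values():
        for accent in ACCENTS.values():
            for vowel in VOWELS.values():
                yield breathing | accent | vowel
                yield breathing | accent | vowel | UPPERCASE_SHIFT


# Code points without a Unicode name are treated as unassigned.
unassigned = sorted(
    cp for cp in candidate_code_points()
    if unicodedata.name(chr(cp), None) is None
)
print(["U+%04X" % cp for cp in unassigned])
```

  Rules whose target appears in this list point at code points that no font can be expected to cover, so they are candidates for removal from the generated .fea file.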

	Arthur

[-- Attachment #2: greek-babel.tex --]
[-- Type: text/x-tex, Size: 1161 bytes --]

% For Thomas Schmitz.
% Define a new Opentype feature to replace new Babel input scheme and use it
% with some polytonic Greek fonts

% Not quite complete; some rhos with breathings and accents are missing from
% the .fea file (where are they?) and the final sigma isn't accounted for.
\installfontfeature[otf][grbl]

\definefontfeature
   [greek-babel]
   [mode=node,language=dflt,script=latn,
    grbl=yes,featurefile=greek-babel.fea]

\font\grbask=name:GFSBaskerville*greek-babel at 20pt
\font\bosphoros=name:BosporosU*greek-babel at 20pt

\starttext

\catcode`\~=11

\bosphoros
Peis'istratis m'en o>~un >egkateg'hrase t~h|
>arq~h| ka`i >ap'ejane nos'hsas >ep`i Fil'onew >'rqontos,
af' o<~ou m`en kat'esth t`o pr~wton t'urannos >'eth tri'akonta
ka`i tr'ia Bi'wsas, <`a d' >en t~h| >arq~h| di'emeinen
<enos d'eonta e>'ikosi; >'efeuge g`ap t`a loip`a.

% Don't do that!
\grbask
Peis'istratis m'en o>~un >egkateg'hrase t~h|
>arq~h| ka`i >ap'ejane nos'hsas >ep`i Fil'onew >'rqontos,
af' o<~ou m`en kat'esth t`o pr~wton t'urannos >'eth tri'akonta
ka`i tr'ia Bi'wsas, <`a d' >en t~h| >arq~h| di'emeinen
<enos d'eonta e>'ikosi; >'efeuge g`ap t`a loip`a.

\stoptext

[-- Attachment #3: greek-babel.fea --]
[-- Type: text/plain, Size: 8471 bytes --]

# An Opentype feature to replace the Babel input scheme

# Not quite complete; some rhos with breathings and accents are missing (where
# are they?) and the final sigma isn't accounted for.

lookup GreekBabelLookupSimple {
    lookupflag 0 ;
	sub a	by alpha ;
	sub b	by beta ;
	sub g	by gamma ;
	sub d	by delta ;
	sub e	by epsilon ;
	sub z	by zeta ;
	sub h	by eta ;
	sub j	by theta ;
	sub i	by iota ;
	sub k	by kappa ;
	sub l	by lambda ;
	sub m	by mu ;
	sub n	by nu ;
	sub x	by xi ;
	sub o	by omicron ;
	sub p	by pi ;
	sub r	by rho ;
	sub c	by sigmafinal ;
	sub s	by sigma ;
	sub t	by tau ;
	sub u	by upsilon ;
	sub f	by phi ;
	sub q	by chi ;
	sub y	by psi ;
	sub w	by omega ;
	sub A	by Alpha ;
	sub B	by Beta ;
	sub G	by Gamma ;
	sub D	by Delta ;
	sub E	by Epsilon ;
	sub Z	by Zeta ;
	sub H	by Eta ;
	sub J	by Theta ;
	sub I	by Iota ;
	sub K	by Kappa ;
	sub L	by Lambda ;
	sub M	by Mu ;
	sub N	by Nu ;
	sub X	by Xi ;
	sub O	by Omicron ;
	sub P	by Pi ;
	sub R	by Rho ;
	sub C	by uni03C2 ;
	sub S	by Sigma ;
	sub T	by Tau ;
	sub U	by Upsilon ;
	sub F	by Phi ;
	sub Q	by Chi ;
	sub Y	by Psi ;
	sub W	by Omega ;
	sub semicolon	by periodcentered ;
} GreekBabelLookupSimple ;

lookup GreekBabelLookupMultiple {
    lookupflag 1 ;
	# sub s 'space by sigmafinal ;
	sub greater  a by uni1F00 ;
	sub greater  A by uni1F08 ;
	sub greater  e by uni1F10 ;
	sub greater  E by uni1F18 ;
	sub greater  h by uni1F20 ;
	sub greater  H by uni1F28 ;
	sub greater  i by uni1F30 ;
	sub greater  I by uni1F38 ;
	sub greater  o by uni1F40 ;
	sub greater  O by uni1F48 ;
	sub greater  u by uni1F50 ;
	# sub greater  U by uni1F58 ;
	sub greater  w by uni1F60 ;
	sub greater  W by uni1F68 ;
	sub greater grave a by uni1F02 ;
	sub greater grave A by uni1F0A ;
	sub greater grave e by uni1F12 ;
	sub greater grave E by uni1F1A ;
	sub greater grave h by uni1F22 ;
	sub greater grave H by uni1F2A ;
	sub greater grave i by uni1F32 ;
	sub greater grave I by uni1F3A ;
	sub greater grave o by uni1F42 ;
	sub greater grave O by uni1F4A ;
	sub greater grave u by uni1F52 ;
	# sub greater grave U by uni1F5A ;
	sub greater grave w by uni1F62 ;
	sub greater grave W by uni1F6A ;
	sub greater quotesingle a by uni1F04 ;
	sub greater quotesingle A by uni1F0C ;
	sub greater quotesingle e by uni1F14 ;
	sub greater quotesingle E by uni1F1C ;
	sub greater quotesingle h by uni1F24 ;
	sub greater quotesingle H by uni1F2C ;
	sub greater quotesingle i by uni1F34 ;
	sub greater quotesingle I by uni1F3C ;
	sub greater quotesingle o by uni1F44 ;
	sub greater quotesingle O by uni1F4C ;
	sub greater quotesingle u by uni1F54 ;
	sub greater quotesingle U by uni1F5C ;
	sub greater quotesingle w by uni1F64 ;
	sub greater quotesingle W by uni1F6C ;
	sub greater asciitilde a by uni1F06 ;
	sub greater asciitilde A by uni1F0E ;
	sub greater asciitilde e by uni1F16 ;
	sub greater asciitilde E by uni1F1E ;
	sub greater asciitilde h by uni1F26 ;
	sub greater asciitilde H by uni1F2E ;
	sub greater asciitilde i by uni1F36 ;
	sub greater asciitilde I by uni1F3E ;
	sub greater asciitilde o by uni1F46 ;
	sub greater asciitilde O by uni1F4E ;
	sub greater asciitilde u by uni1F56 ;
	sub greater asciitilde U by uni1F5E ;
	sub greater asciitilde w by uni1F66 ;
	sub greater asciitilde W by uni1F6E ;
	sub less  a by uni1F01 ;
	sub less  A by uni1F09 ;
	sub less  e by uni1F11 ;
	sub less  E by uni1F19 ;
	sub less  h by uni1F21 ;
	sub less  H by uni1F29 ;
	sub less  i by uni1F31 ;
	sub less  I by uni1F39 ;
	sub less  o by uni1F41 ;
	sub less  O by uni1F49 ;
	sub less  u by uni1F51 ;
	sub less  U by uni1F59 ;
	sub less  w by uni1F61 ;
	sub less  W by uni1F69 ;
	sub less grave a by uni1F03 ;
	sub less grave A by uni1F0B ;
	sub less grave e by uni1F13 ;
	sub less grave E by uni1F1B ;
	sub less grave h by uni1F23 ;
	sub less grave H by uni1F2B ;
	sub less grave i by uni1F33 ;
	sub less grave I by uni1F3B ;
	sub less grave o by uni1F43 ;
	sub less grave O by uni1F4B ;
	sub less grave u by uni1F53 ;
	sub less grave U by uni1F5B ;
	sub less grave w by uni1F63 ;
	sub less grave W by uni1F6B ;
	sub less quotesingle a by uni1F05 ;
	sub less quotesingle A by uni1F0D ;
	sub less quotesingle e by uni1F15 ;
	sub less quotesingle E by uni1F1D ;
	sub less quotesingle h by uni1F25 ;
	sub less quotesingle H by uni1F2D ;
	sub less quotesingle i by uni1F35 ;
	sub less quotesingle I by uni1F3D ;
	sub less quotesingle o by uni1F45 ;
	sub less quotesingle O by uni1F4D ;
	sub less quotesingle u by uni1F55 ;
	sub less quotesingle U by uni1F5D ;
	sub less quotesingle w by uni1F65 ;
	sub less quotesingle W by uni1F6D ;
	sub less asciitilde a by uni1F07 ;
	sub less asciitilde A by uni1F0F ;
	sub less asciitilde e by uni1F17 ;
	sub less asciitilde E by uni1F1F ;
	sub less asciitilde h by uni1F27 ;
	sub less asciitilde H by uni1F2F ;
	sub less asciitilde i by uni1F37 ;
	sub less asciitilde I by uni1F3F ;
	sub less asciitilde o by uni1F47 ;
	sub less asciitilde O by uni1F4F ;
	sub less asciitilde u by uni1F57 ;
	sub less asciitilde U by uni1F5F ;
	sub less asciitilde w by uni1F67 ;
	sub less asciitilde W by uni1F6F ;
	sub grave a by uni1F70 ;
	sub quotesingle a by uni1F71 ;
	sub grave e by uni1F72 ;
	sub quotesingle e by uni1F73 ;
	sub grave h by uni1F74 ;
	sub quotesingle h by uni1F75 ;
	sub grave i by uni1F76 ;
	sub quotesingle i by uni1F77 ;
	sub grave o by uni1F78 ;
	sub quotesingle o by uni1F79 ;
	sub grave u by uni1F7A ;
	sub quotesingle u by uni1F7B ;
	sub grave w by uni1F7C ;
	sub quotesingle w by uni1F7D ;
	sub grave A by uni1FBA ;
	sub quotesingle A by uni1FBB ;
	sub grave E by uni1FC8 ;
	sub quotesingle E by uni1FC9 ;
	sub grave H by uni1FCA ;
	sub quotesingle H by uni1FCB ;
	sub grave I by uni1FDA ;
	sub quotesingle I by uni1FDB ;
	sub grave U by uni1FEA ;
	sub quotesingle U by uni1FEB ;
	sub grave W by uni1FFA ;
	sub quotesingle W by uni1FFB ;
	sub greater  a bar by uni1F80 ;
	sub greater  A bar by uni1F88 ;
	sub greater  h bar by uni1F90 ;
	sub greater  H bar by uni1F98 ;
	sub greater  w bar by uni1FA0 ;
	sub greater  W bar by uni1FA8 ;
	sub greater grave a bar by uni1F82 ;
	sub greater grave A bar by uni1F8A ;
	sub greater grave h bar by uni1F92 ;
	sub greater grave H bar by uni1F9A ;
	sub greater grave w bar by uni1FA2 ;
	sub greater grave W bar by uni1FAA ;
	sub greater quotesingle a bar by uni1F84 ;
	sub greater quotesingle A bar by uni1F8C ;
	sub greater quotesingle h bar by uni1F94 ;
	sub greater quotesingle H bar by uni1F9C ;
	sub greater quotesingle w bar by uni1FA4 ;
	sub greater quotesingle W bar by uni1FAC ;
	sub greater asciitilde a bar by uni1F86 ;
	sub greater asciitilde A bar by uni1F8E ;
	sub greater asciitilde h bar by uni1F96 ;
	sub greater asciitilde H bar by uni1F9E ;
	sub greater asciitilde w bar by uni1FA6 ;
	sub greater asciitilde W bar by uni1FAE ;
	sub less  a bar by uni1F81 ;
	sub less  A bar by uni1F89 ;
	sub less  h bar by uni1F91 ;
	sub less  H bar by uni1F99 ;
	sub less  w bar by uni1FA1 ;
	sub less  W bar by uni1FA9 ;
	sub less grave a bar by uni1F83 ;
	sub less grave A bar by uni1F8B ;
	sub less grave h bar by uni1F93 ;
	sub less grave H bar by uni1F9B ;
	sub less grave w bar by uni1FA3 ;
	sub less grave W bar by uni1FAB ;
	sub less quotesingle a bar by uni1F85 ;
	sub less quotesingle A bar by uni1F8D ;
	sub less quotesingle h bar by uni1F95 ;
	sub less quotesingle H bar by uni1F9D ;
	sub less quotesingle w bar by uni1FA5 ;
	sub less quotesingle W bar by uni1FAD ;
	sub less asciitilde a bar by uni1F87 ;
	sub less asciitilde A bar by uni1F8F ;
	sub less asciitilde h bar by uni1F97 ;
	sub less asciitilde H bar by uni1F9F ;
	sub less asciitilde w bar by uni1FA7 ;
	sub less asciitilde W bar by uni1FAF ;
	sub grave a bar by uni1FB2 ;
	sub a bar by uni1FB3 ;
	sub quotesingle a bar by uni1FB4 ;
	sub grave h bar by uni1FC2 ;
	sub h bar by uni1FC3 ;
	sub quotesingle h bar by uni1FC4 ;
	sub grave w bar by uni1FF2 ;
	sub w bar by uni1FF3 ;
	sub quotesingle w bar by uni1FF4 ;
	sub asciitilde a by uni1FB6 ;
	sub asciitilde a bar by uni1FB7 ;
	sub asciitilde h by uni1FC6 ;
	sub asciitilde h bar by uni1FC7 ;
	sub asciitilde w by uni1FF6 ;
	sub asciitilde w bar by uni1FF7 ;
	sub greater r by uni1FE4 ;
	sub less r by uni1FE5 ;
	sub less R by uni1FEC ;
} GreekBabelLookupMultiple ;

feature grbl {

    script DFLT ;
	language dflt ;
	    lookup GreekBabelLookupMultiple ;
	    lookup GreekBabelLookupSimple ;

    script latn;
	language dflt ;
	    lookup GreekBabelLookupMultiple ;
	    lookup GreekBabelLookupSimple ;
} grbl ;


[-- Attachment #4: greek-babel.pdf --]
[-- Type: application/pdf, Size: 12656 bytes --]

[-- Attachment #5: babelify --]
[-- Type: text/plain, Size: 4965 bytes --]

#!/usr/bin/perl -W
# Outputs GSUB rules for replacing Babel-inputted greek characters with their
# Unicode value.
# In Adobe Feature Language, suitable for use in Fontlab's .fea files.

use strict ;
use utf8 ;

# Character types: breathings, accents, vowels
# The void string is considered an accent for convenience with breathings
my %charmask ;
my $charshift = 8 ;
my @breathings = ('greater', 'less') ;
my @accents = ('', 'grave', 'quotesingle', 'asciitilde') ;
my @vowels = ('a', 'e', 'h', 'i', 'o', 'u', 'w') ;

# Unicode masks for characters with breathings
$charmask{''} = 0 ;
$charmask{'greater'} = 0 ;
$charmask{'less'} = 1 ;
$charmask{'grave'} = 2 ;
$charmask{'quotesingle'} = 4 ;
$charmask{'asciitilde'} = 6 ;
$charmask{'a'} = 0x1F00 ;
$charmask{'e'} = 0x1F10 ;
$charmask{'h'} = 0x1F20 ;
$charmask{'i'} = 0x1F30 ;
$charmask{'o'} = 0x1F40 ;
$charmask{'u'} = 0x1F50 ;
$charmask{'w'} = 0x1F60 ;

# Local variables
my $breathing ; my $accent ; my $vowel ;
my $uchar ;

# First the U+1F00–U+1F6F sequence: breathing accent vowel
# We compile the Unicode code points by simply ORing the mask of each element
# Note that some of these characters actually don't exist!
# But it was easier this way (we can always edit the output afterward)
foreach $breathing (@breathings)
{
  foreach $accent (@accents)
  {
    foreach $vowel (@vowels)
    {
      # Space cadet input scheme ;-)
      $uchar = $charmask{$breathing} | $charmask{$accent} | $charmask{$vowel} ;
      printf "sub $breathing $accent $vowel by uni%04X ;\n", $uchar ;

      # Uppercase characters: the same shifted 8.
      $uchar = $charmask{$breathing} | $charmask{$accent}
        | $charmask{$vowel} | $charshift ;
      printf "sub $breathing $accent %s by uni%04X ;\n", uc($vowel), $uchar ;
    }
  }
}

# The U+1F7x range: lowercase vowels with only one accent.
# I have no idea why Unicode decided to put them there ... (especially seen as
# the uppercase vowels are somewhere else, and in an even more clumsy
# arrangement).

# We have to change the masks
$charmask{'grave'} = 0 ;
$charmask{'quotesingle'} = 1 ;
$charmask{'a'} = 0x1F70 ;
$charmask{'e'} = 0x1F72 ;
$charmask{'h'} = 0x1F74 ;
$charmask{'i'} = 0x1F76 ;
$charmask{'o'} = 0x1F78 ;
$charmask{'u'} = 0x1F7A ;
$charmask{'w'} = 0x1F7C ;

foreach $vowel (@vowels)
{
  foreach $accent ('grave', 'quotesingle')
  {
    $uchar = $charmask{$accent} | $charmask{$vowel} ;
    printf "sub $accent $vowel by uni%04X ;\n", $uchar ;
  }
}

# As announced before, the uppercase counterparts of these 14 characters are in
# a delightfully crappy mess. Simply output them one by one.
print "sub grave A by uni1FBA ;\n" ;
print "sub quotesingle A by uni1FBB ;\n" ;
print "sub grave E by uni1FC8 ;\n" ;
print "sub quotesingle E by uni1FC9 ;\n" ;
print "sub grave H by uni1FCA ;\n" ;
print "sub quotesingle H by uni1FCB ;\n" ;
print "sub grave I by uni1FDA ;\n" ;
print "sub quotesingle I by uni1FDB ;\n" ;
print "sub grave U by uni1FEA ;\n" ;
print "sub quotesingle U by uni1FEB ;\n" ;
print "sub grave W by uni1FFA ;\n" ;
print "sub quotesingle W by uni1FFB ;\n" ;

# U+1F80–U+1FAF: characters with subscribed iotas and breathings.
# We have to change the masks once again.
$charmask{'grave'} = 2 ;
$charmask{'quotesingle'} = 4 ;
$charmask{'a'} = 0x1F80 ;
$charmask{'h'} = 0x1F90 ;
$charmask{'w'} = 0x1FA0 ;
foreach $breathing (@breathings)
{
  foreach $accent (@accents)
  {
    foreach $vowel ('a', 'h', 'w') # Only these three vowels!
    {
      $uchar = $charmask{$breathing} | $charmask{$accent} | $charmask{$vowel} ;
      printf "sub $breathing $accent $vowel bar by uni%04X ;\n", $uchar ;

      # Uppercase counterparts
      $uchar = $charmask{$breathing} | $charmask{$accent}
        | $charmask{$vowel} | $charshift ;
      printf "sub $breathing $accent %s bar by uni%04X ;\n", uc($vowel), $uchar ;
    }
  }
}

# And finally, the characters with subscribed iotas but without breathings.
# Only nine of them, write them one by one.
print "sub grave a bar by uni1FB2 ;\n" ;
print "sub a bar by uni1FB3 ;\n" ;
print "sub quotesingle a bar by uni1FB4 ;\n" ;
print "sub grave h bar by uni1FC2 ;\n" ;
print "sub h bar by uni1FC3 ;\n" ;
print "sub quotesingle h bar by uni1FC4 ;\n" ;
print "sub grave w bar by uni1FF2 ;\n" ;
print "sub w bar by uni1FF3 ;\n" ;
print "sub quotesingle w bar by uni1FF4 ;\n" ;

# And some more with perispomeni ...
print "sub asciitilde a by uni1FB6 ;\n" ;
print "sub asciitilde a bar by uni1FB7 ;\n" ;
print "sub asciitilde h by uni1FC6 ;\n" ;
print "sub asciitilde h bar by uni1FC7 ;\n" ;
print "sub asciitilde w by uni1FF6 ;\n" ;
print "sub asciitilde w bar by uni1FF7 ;\n" ;

# Rhos
print "sub greater r by uni1FE4 ;\n" ;
print "sub less r by uni1FE5 ;\n" ;
print "sub less R by uni1FEC ;\n" ;

# We leave some over but that should already be useful. Enjoy!



* Re: Greek in luatex
  2007-09-13  1:15   ` Arthur Reutenauer
@ 2007-09-13  7:03     ` Taco Hoekwater
  2007-09-13 10:24       ` Arthur Reutenauer
  2007-09-13 17:42       ` Hans Hagen
  2007-09-13  9:45     ` Thomas A. Schmitz
  1 sibling, 2 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-13  7:03 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Arthur Reutenauer wrote:
> 	Hello Thomas,
> 
>   I was waiting for someone else to answer your questions because I
> had no clue how to address them even if I was interested; but now I do,
> thanks to Hans' reply:
> 
>   For your general problem you need to define a new regime that will
> map each relevant character sequence to the corresponding Unicode
> character.  That is, you inform ConTeXt that the character stream it sees
> is actually a way of coding another set of characters and that it can
> forget the original stream.  This treatment should be done before any sort
> of font property intervenes, because it does not depend on the
> appearance of the typeset text.  That's what regimes are for.

Yes, except that we need a more powerful version (almost like OTPs) if
we want to handle transcriptions properly. The vital point is that it
should operate on tokens, not on nodes. I am not sure if Hans already
has a hook there that can be extended.

>   If you have a look at the attached greek-babel.tex (and the features
> definition file greek-babel.fea) you will see that (almost) everything
> is taken care of using Opentype substitutions.  You need Bosporos and
> GFS Baskerville to compile the file; by the way, the line with GFS
> Baskerville is a further proof that you shouldn't handle the
> transformation at font level: can you explain why it doesn't work here?

Possibly because a single one of the glyphs has a different name in
GFS Baskerville, or because a previous gsub rule has e.g. replaced
F;i; => Fi; (your own gsub rules are always executed last, after
everything defined by the font)
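
The first hypothesis (a glyph name that the font does not have) can be checked mechanically. The following Python sketch assumes that the set of glyph names in the font has already been obtained with some font tool and is passed in as plain strings; the function names glyphs_in_rules and missing_glyphs are hypothetical, and only simple one-line "sub ... by ... ;" rules like those in greek-babel.fea are recognised.

```python
import re

# Matches one-line rules of the form "sub <glyphs> by <glyphs> ;".
RULE = re.compile(r"sub\s+(.+?)\s+by\s+(.+?)\s*;")


def glyphs_in_rules(fea_text):
    """Collect every glyph name used in the substitution rules."""
    names = set()
    for line in fea_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        match = RULE.fullmatch(line)
        if match:
            names.update(match.group(1).split())
            names.update(match.group(2).split())
    return names


def missing_glyphs(fea_text, font_glyph_names):
    """Return the glyph names the rules need but the font lacks."""
    return sorted(glyphs_in_rules(fea_text) - set(font_glyph_names))
```

An empty result would rule out the naming explanation for that font and leave the interaction with the font's own gsub rules as the more likely cause.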

As you say, .fea's are definitely not the right way to handle this,
even if they would work flawlessly.





* Re: Greek in luatex
  2007-09-13  1:15   ` Arthur Reutenauer
  2007-09-13  7:03     ` Taco Hoekwater
@ 2007-09-13  9:45     ` Thomas A. Schmitz
  2007-09-13 10:49       ` Arthur Reutenauer
  2007-09-13 17:51       ` Hans Hagen
  1 sibling, 2 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-13  9:45 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi Arthur,

first of all: thank you so much for your time and your expertise!  
Your reply and your scripts really make things a lot clearer for me;  
this is a huge step forward! I'll have to experiment and think more  
about it, here are just a few reactions to some of your remarks:

On Sep 13, 2007, at 3:15 AM, Arthur Reutenauer wrote:

> 	Hello Thomas,
>
>   I was waiting for someone else to answer your questions because I
> had no clue how to address them even if I was interested; but now I do,
> thanks to Hans' reply:
>
>   For your general problem you need to define a new regime that will
> map each relevant character sequence to the corresponding Unicode
> character.  That is, you inform ConTeXt that the character stream it sees
> is actually a way of coding another set of characters and that it can
> forget the original stream.  This treatment should be done before any sort
> of font property intervenes, because it does not depend on the
> appearance of the typeset text.  That's what regimes are for.

I agree that this would probably be the cleanest solution: since  
luatex has unicode support, map everything to the corresponding  
Unicode characters. This would also make hyphenation easier to achieve.
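
To make the "map every sequence to its Unicode character" idea concrete,
here is a minimal Python sketch of such a conversion. It is not the
ConTeXt regime mechanism and not code from this thread; the table
BABEL_GREEK and the function babel_to_unicode are hypothetical names, and
the table is only a tiny excerpt chosen for illustration. The one point it
shows is that the mapping has to try the longest sequence first, so that
">a" is not consumed as a bare "a".

```python
# Hypothetical, deliberately tiny excerpt of a Babel-to-Unicode table.
BABEL_GREEK = {
    ">a": "\u1f00",  # alpha with psili
    "<a": "\u1f01",  # alpha with dasia
    "a": "\u03b1",   # alpha
    "b": "\u03b2",   # beta
    "g": "\u03b3",   # gamma
}
MAX_KEY = max(len(key) for key in BABEL_GREEK)


def babel_to_unicode(text: str) -> str:
    """Replace Babel-style ASCII sequences by Unicode, longest match first."""
    out = []
    i = 0
    while i < len(text):
        # Try the widest candidate first so ">a" wins over a bare "a".
        for width in range(MAX_KEY, 0, -1):
            chunk = text[i:i + width]
            if chunk in BABEL_GREEK:
                out.append(BABEL_GREEK[chunk])
                i += len(chunk)
                break
        else:
            # Characters without a table entry pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(babel_to_unicode(">ab <ag"))
```

Because the result is ordinary Unicode text, anything that runs later
(hyphenation, font features) sees Greek characters and no longer needs to
know about the ASCII spelling.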

>
>   Now I turn to Hans to give us guidelines on how to define an
> advanced regime in Mark IV: Hans, what we need here is to replace
> sequences of characters by other characters, so the mapping is not
> one-to-one and it's more complicated than simple regimes defined by a
> table lookup; but I guess all we have to do is write a lua function
> that we could plug into the input stream reading routine (just like
> other regimes work).
>
>   As far as the rest of Hans' reply is concerned (Opentype features
> and such), I would like to add that it is a very interesting and
> fascinating thing to do, but definitely not what you want here, for a
> lot of reasons: Opentype features can be used to alter the appearance
> of the text, but not the nature of the characters themselves.  That is,
> if you did the transformation of your input stream at the font level,
> you would actually tell ConTeXt that you are handling Latin characters
> with a special appearance (that the font takes care of), so for
> example, the underlying text in a PDF would be a stream of Latin
> characters, and copying-and-pasting would yield Latin characters, not
> Greek.

The question of copy-and-paste is one of the big mysteries, and I
have no clue why it works in some cases, but not in others. Right
now, on my system (OS X 10.4), only Adobe Reader 8.0 does copy-paste
correctly, and it does it correctly no matter if I use babel or
Unicode input. Never touch a running system: I just take this as
some sort of divine favor and leave it at that...

> That is
> not what you want here: you want your "a" to be understood as "alpha"
> and your "less-than grave-sign w vertical-bar" to be considered an
> "omega with dasia, varia and subscribed iota".  Nor should you think of
> these transformations as a collection of ligatures (which act at the
> font level), but rather as a text encoding, just like UTF-8 is an
> encoding of the Unicode characters: in UTF-8 the byte sequence
> "hexadecimal byte E1, hexadecimal byte BC, hexadecimal byte 80" is the
> coding for the Unicode character U+1F00 GREEK SMALL LETTER ALPHA WITH
> PSILI, and in the Babel input scheme for Ancient Greek the same
> character is encoded with the byte sequence "hexadecimal byte 3E
> [ASCII '>'], hexadecimal byte 61 [ASCII 'a']".

Yes, that's crystal clear. It would also take care of another  
problem: in the input stream, you know exactly which character  
sequence translates to what. On the font level, legacy fonts  
sometimes have their own ideas about where to put certain glyphs.
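
The encoding analogy can be checked in a few lines of Python. This is only
an illustrative sketch, not code from the thread: it uses the ">a"
spelling given in the first message for U+1F00, and the one-entry table
BABEL_TO_UNICODE is a hypothetical stand-in for a full Babel table.

```python
# U+1F00 GREEK SMALL LETTER ALPHA WITH PSILI, written in two encodings.
alpha_psili = "\u1f00"

# As UTF-8 the character occupies three bytes.
utf8_bytes = alpha_psili.encode("utf-8")
print(utf8_bytes.hex(" "))   # e1 bc 80

# In the Babel-style ASCII spelling it occupies two bytes.
babel_bytes = ">a".encode("ascii")
print(babel_bytes.hex(" "))  # 3e 61

# UTF-8 decoding is built in; the Babel spelling needs a lookup table.
BABEL_TO_UNICODE = {">a": "\u1f00"}  # hypothetical one-entry table
decoded_from_utf8 = utf8_bytes.decode("utf-8")
decoded_from_babel = BABEL_TO_UNICODE[babel_bytes.decode("ascii")]
print(decoded_from_utf8 == decoded_from_babel)  # True
```

Both byte sequences decode to the same character; only the decoder
differs, which is why the conversion belongs with input handling and not
with the font.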

>
>   Of course in the past, these transformations were handled at the
> font level and sequences like "< a" were actually ligatures, because
> that was all we had (and copypasting from a PDF was, mostly, doomed to
> fail); but we should not persist in that use now we can treat them as
> real Unicode characters.

Well yes, but see above.

>
>   As for your other question in your original message from September
> 1st (remapping single characters, for example U+03C3 to U+03F2), I have
> to say first that I'm not very comfortable commenting on it since I'm
> not quite sure what the issues are here; it may be that you have a
> simple variant of some character, and this you should handle at font
> level (some glyph being transformed into some other one); but if I am
> to judge by the very example you gave, I would deem this should be a
> part of your input regime: indeed, if every sigma is to be mapped to
> lunate sigma, then it probably means that the lunate sigmas are part of
> your character stream (even if you didn't input it directly).  But I
> really can't give any general advice here, especially because I don't
> actually know what a lunate sigma really is ;-)  You would have to
> decide for yourself as a specialist of Greek if you're dealing with
> really different characters or simple font variants; in the former case
> you should handle the transformation as a part of your regime; in the
> latter, by defining a font feature like Hans demonstrated.

I guess that different sorts of users would respond differently. In
Unicode, there's a different slot for some alternate characters, so
the Unicode standard really considers them different characters. For
the classicist, a sigma is a sigma, and the fact that it can be
rendered as a "lunate" or a "normal" sigma is irrelevant. For me,
this makes more sense, so I would support this on the font level.
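
For comparison, this is roughly what the other option (remapping at the
character level, before any font is involved) would amount to. It is a
Python sketch under my own assumptions, not ConTeXt code: the thread only
mentions U+03C3 to U+03F2; mapping the final sigma and the capital sigma
as well is an addition made here for illustration, and the names LUNATE
and to_lunate are hypothetical.

```python
# Character-level remap: every sigma becomes a lunate sigma.
LUNATE = str.maketrans({
    "\u03c3": "\u03f2",  # small sigma   -> lunate sigma symbol
    "\u03c2": "\u03f2",  # final sigma   -> lunate sigma symbol (assumption)
    "\u03a3": "\u03f9",  # capital sigma -> capital lunate sigma symbol (assumption)
})


def to_lunate(text: str) -> str:
    """Return text with all sigmas replaced by their lunate counterparts."""
    return text.translate(LUNATE)


word = "\u03c3\u03bf\u03c6\u03cc\u03c2"  # sigma, omicron, phi, omicron with tonos, final sigma
print(to_lunate(word))
```

With this approach the stored text itself changes, so searching or
copy-and-paste would yield U+03F2, whereas a font-level substitution keeps
U+03C3 in the text and changes only the glyph.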

>
>   But for now, as long as it is understood that font tricks aren't the
> general solution for the problem at stake, I would like to demonstrate
> that it is still possible to do everything at font level :-)
>
>   If you have a look at the attached greek-babel.tex (and the features
> definition file greek-babel.fea) you will see that (almost) everything
> is taken care of using Opentype substitutions.  You need Bosporos and
> GFS Baskerville to compile the file; by the way, the line with GFS
> Baskerville is a further proof that you shouldn't handle the
> transformation at font level: can you explain why it doesn't work here?
> As a complement, I also attach the Perl script which I wrote to
> generate the .fea file.

Wonderful! I will look carefully at these files. I've been playing  
with perl and python all day yesterday for another problem, so I'm  
very much looking forward to studying your script.

Thanks so much, and all best

Thomas

* Re: Greek in luatex
  2007-09-13  7:03     ` Taco Hoekwater
@ 2007-09-13 10:24       ` Arthur Reutenauer
  2007-09-13 11:38         ` Taco Hoekwater
  2007-09-13 17:42       ` Hans Hagen
  1 sibling, 1 reply; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-13 10:24 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> Yes, except that we need a more powerful version (almost like OTPs) if
> we want to handle transcriptions properly. The vital point is that it
> should operate on tokens, not on nodes.

  Yes, sure. OTP would work fine here, but I thought Mark IV had already
something handy.

> Possibly because a single one of the glyphs has a different name in
> GFS Baskerville, or because a previous gsub rule has e.g. replaced
> F;i; => Fi;

  No, simply because GFS Baskerville has no glyphs for Latin characters,
so they're dropped by the token reader and can't be transformed afterwards!

	Arthur

* Re: Greek in luatex
  2007-09-13  9:45     ` Thomas A. Schmitz
@ 2007-09-13 10:49       ` Arthur Reutenauer
  2007-09-13 12:51         ` Thomas A. Schmitz
  2007-09-13 14:25         ` Taco Hoekwater
  2007-09-13 17:51       ` Hans Hagen
  1 sibling, 2 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-13 10:49 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> Right now, on my system (OS X 10.4), only Adobe Reader 8.0 does
> copy-paste correctly, and it does it correctly no matter if I use babel
> or Unicode input.

  You mean with LuaTeX? Copypasting isn't supported yet in LuaTeX so
it's no surprise that it wouldn't work (for me Adobe Reader and Preview
fail in two different ways).  As for pdfTeX I leave that to Taco and
others to answer.

  But hyphenation is another important issue, maybe even clearer.

> I guess that different sorts of users would respond differently. In  
> Unicode, there's a different slot for some alternate characters, so  
> the Unicode standard really considers them different characters.

  Actually, now I think about it, the name for U+03F2 has "symbol" in
it, and that's a clear indication that the character is intended for
"technical use", not for inputting Greek text; so your choice is
consistent with the intent of the Standard.
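
  The character names can be looked up with Python's standard
unicodedata module; this is just a quick check, and the variable names
are mine:

```python
import unicodedata

# Small sigma, final sigma, and the lunate sigma discussed above.
code_points = (0x03C3, 0x03C2, 0x03F2)
names = {cp: unicodedata.name(chr(cp)) for cp in code_points}

for cp, name in names.items():
    print(f"U+{cp:04X} {name}")
# U+03C3 GREEK SMALL LETTER SIGMA
# U+03C2 GREEK SMALL LETTER FINAL SIGMA
# U+03F2 GREEK LUNATE SIGMA SYMBOL
```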

> Wonderful! I will look carefully at these files. I've been playing  
> with perl and python all day yesterday for another problem, so I'm  
> very much looking forward to studying your script.

  Somewhere in the middle of writing it, I realized that I should have
written it in Lua :-)  It wouldn't have been much different.

	Arthur

* Re: Greek in luatex
  2007-09-13 10:24       ` Arthur Reutenauer
@ 2007-09-13 11:38         ` Taco Hoekwater
  2007-09-13 12:54           ` Thomas A. Schmitz
  2007-09-13 18:36           ` Arthur Reutenauer
  0 siblings, 2 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-13 11:38 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Arthur Reutenauer wrote:
>> Yes, except that we need a more powerful version (almost like OTPs) if
>> we want to handle transcriptions properly. The vital point is that it
>> should operate on tokens, not on nodes.
> 
>   Yes, sure. OTP would work fine here, but I thought Mark IV had already
> something handy.

I played a bit, see attachment. Surely Hans will want to improve on this 
interface, so don't patch any of the core files just now.

Best wishes,
Taco

[-- Attachment #2: tokfilter.tex --]
[-- Type: text/x-tex, Size: 1751 bytes --]

% engine=luatex

%D First a hack to the core. two changes:
%D * don't force end_cs to be \relax
%D * don't remove end_cs from the input stream

\ctxlua{
function collectors.install(tag,end_cs)
    collectors.data[tag] = { }
    local data   = collectors.data[tag]
    local call   = token.command_id("call")
    local endcs  = token.csname_id(end_cs)
    local expand = collectors.registered
    local get    = token.get_next
    while true do
        local t = get()
        local a, b = t[1], t[3]
        if b == endcs then
            tex.print('\\' ..end_cs)
            return
        elseif a == call and expand[b] then
            token.expand()
        else
            data[\string#data+1] = t
        end
    end
end 
}

%D a small extension to the core interface, to have a 
%D nice wrapper around the lua code

\ctxlua {
function collectors.handle(tag,handle)
    collectors.data[tag] = handle(collectors.data[tag])
end
}
\def\handletokens[#1][#2]{\ctxlua{collectors.handle("#1",#2)}}

%D Here starts the document-specific code

%D Start capturing tokens in the buffer named 'babel', stop
%D at \stopbabel

\def\startbabel 
  {\ctxlua{collectors.install("babel", "stopbabel")}}


%D The lua mutation function. str is a table containing the captured
%D tokens, each itself a three-item table (this is explained in the 
%D luatex manual)

\ctxlua {
function convert_babel(str)
    local t = { }
    for k,v in ipairs(str) do
        t[\string#t+1] = tokens.other('*')
        t[\string#t+1] = v
    end
    return t
end
}

%D convert the tokens using that lua function, then 
%D flush the result
\def\stopbabel  
  {\handletokens[babel][convert_babel]
   \flushtokens[babel]}

\starttext

\startbabel%
some stuff here
\stopbabel

\stoptext


* Re: Greek in luatex
  2007-09-13 10:49       ` Arthur Reutenauer
@ 2007-09-13 12:51         ` Thomas A. Schmitz
  2007-09-13 14:25         ` Taco Hoekwater
  1 sibling, 0 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-13 12:51 UTC (permalink / raw)
  To: Mailing list for ConTeXt users


On Sep 13, 2007, at 12:49 PM, Arthur Reutenauer wrote:
>
>   You mean with LuaTeX? Copypasting isn't supported yet in LuaTeX so
> it's no surprise that it wouldn't work (for me Adobe Reader and Preview
> fail in two different ways).  As for pdfTeX I leave that to Taco and
> others to answer.
>
>   But hyphenation is another important issue, maybe even clearer.
>

Yes, I meant in pdfTeX; sorry for being imprecise.

>   Actually, now I think about it, the name for U+03F2 has "symbol" in
> it, and that's a clear indication that the character is intended for
> "technical use", not for inputting Greek text; so your choice is
> consistent with the intent of the Standard.
>
OK, good to hear that. I now realize that much of the stuff that I
hacked together for use with pdfTeX worked by dumb luck; with LuaTeX,
I'll be forced to adhere to standards more closely. I guess that's
a good thing...

>
>   Somewhere in the middle of writing it, I realized that I should have
> written it in Lua :-)  It wouldn't have been much different.
>
Yes, I'm hoping to look into lua as well.

Thanks so much!

Thomas

* Re: Greek in luatex
  2007-09-13 11:38         ` Taco Hoekwater
@ 2007-09-13 12:54           ` Thomas A. Schmitz
  2007-09-13 18:36           ` Arthur Reutenauer
  1 sibling, 0 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-13 12:54 UTC (permalink / raw)
  To: mailing list for ConTeXt users


On Sep 13, 2007, at 1:38 PM, Taco Hoekwater wrote:

> Arthur Reutenauer wrote:
>>> Yes, except that we need a more powerful version (almost like OTPs)
>>> if we want to handle transcriptions properly. The vital point is that
>>> it should operate on tokens, not on nodes.
>>   Yes, sure. OTP would work fine here, but I thought Mark IV had
>> already something handy.
>
> I played a bit, see attachment. Surely Hans will want to improve on
> this interface, so don't patch any of the core files just now.
>
> Best wishes,
> Taco

Taco,

it almost feels like today's my birthday - thanks again! Will look at  
it more closely soonish!

Best

Thomas



* Re: Greek in luatex
  2007-09-13 10:49       ` Arthur Reutenauer
  2007-09-13 12:51         ` Thomas A. Schmitz
@ 2007-09-13 14:25         ` Taco Hoekwater
  1 sibling, 0 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-13 14:25 UTC (permalink / raw)
  To: Mailing list for ConTeXt users



Arthur Reutenauer wrote:
>> Right now, on my system (OS X 10.4), only Adobe Reader 8.0 does
>> copy-paste correctly, and it does it correctly no matter if I use babel
>> or Unicode input.
> 
>   You mean with LuaTeX? Copypasting isn't supported yet in LuaTeX so
> it's no surprise that it wouldn't work (for me Adobe Reader and Preview
> fail in two different ways).  As for pdfTeX I leave that to Taco and
> others to answer.

The next luatex release will finally have support for cut&paste when
using opentype and truetype fonts. In pdftex, cut&paste for traditional
type1 fonts was already present, and that will continue to work as
it did (at least for the immediate future).

Best wishes,
Taco

* Re: Greek in luatex
  2007-09-13  7:03     ` Taco Hoekwater
  2007-09-13 10:24       ` Arthur Reutenauer
@ 2007-09-13 17:42       ` Hans Hagen
  1 sibling, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 17:42 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Taco Hoekwater wrote:

> Yes, except that we need a more powerful version (almost like OTPs) if
> we want to handle transcriptions properly. The vital point is that it
> should operate on tokens, not on nodes. I am not sure if Hans already
> has a hook there that can be extended.

there are hooks, but i want to avoid token processing as much as
possible because it's slow (so, unlike with nodes, it can definitely not
be done on all the data); i must give it some thought ..

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Greek in luatex
  2007-09-13  9:45     ` Thomas A. Schmitz
  2007-09-13 10:49       ` Arthur Reutenauer
@ 2007-09-13 17:51       ` Hans Hagen
  1 sibling, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 17:51 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:

>>   For your general problem you need to define a new regime that will
>> map each relevant character sequence to the corresponding Unicode
>> character.  That is, you inform ConTeXt that the character stream it
>> sees is actually a way of coding another set of characters and that it
>> can forget the original stream.  This treatment should be done before
>> any sort of font property intervenes, because it does not depend on the
>> appearance of the typeset text.  That's what regimes are for.

regimes are a solution, but what solution is best depends on the input
stream ... whole document? partial document? also written to external
files? eventually everything can become unicode (private areas) and
as such travel through the system; or we can misuse virtual fonts ...

>> we could plug into the input stream reading routine (just like other
>> regimes work).

there are mechanisms for that (because that's what i played a lot with
last year); there was (maybe even is) a mechanism for chained processing
of input etc

>> actually tell ConTeXt that you are handling Latin characters with a
>> special appearance (that the font takes care of), so for example, the
>> underlying text in a PDF would be a stream of Latin characters, and
>> copying-and-pasting would yield Latin characters, not Greek.

not entirely true ... we can (and do) intercept the node stream ... ok,
at that point we're dealing with a font/char pair, but we can change the
char (or node) to whatever we like ... depends on the problem

> The question of copy-and-paste is one of the big mysteries, and I
> have no clue why it works in some cases, but not in others. Right
> now, on my system (OS X 10.4), only Adobe Reader 8.0 does copy-paste
> correctly, and it does it correctly no matter if I use babel or
> Unicode input. Never touch a running system: I just take this as
> some sort of divine favor and leave it at that...

that's a matter of associating tounicode points; of course, no unicode
means no copy/paste :-)

>> That is
>> not what you want here: you want your "a" to be understood as "alpha"
>> and your "less-than grave-sign w vertical-bar" to be considered an
>> "omega with dasia, varia and subscribed iota".  Nor should you think of
>> these transformations as a collection of ligatures (which act at the
>> font level), but rather as a text encoding, just like UTF-8 is an
>> encoding of the Unicode characters: in UTF-8 the byte sequence
>> "hexadecimal byte E1, hexadecimal byte BC, hexadecimal byte 80" is the
>> coding for the Unicode character U+1F00 GREEK SMALL LETTER ALPHA WITH
>> PSILI, and in the Babel input scheme for Ancient Greek the same
>> character is encoded with the byte sequence "hexadecimal byte 3E
>> [ASCII '>'], hexadecimal byte 61 [ASCII 'a']".
> 
> Yes, that's crystal clear. It would also take care of another  
> problem: in the input stream, you know exactly which character  
> sequence translates to what. On the font level, legacy fonts  
> sometimes have their own ideas about where to put certain glyphs.

depends ... the input char becomes a node; now, if (probably controlled
by attributes) a certain char is seen (say 'a') and you want it to be an
alpha, well, we can change that char then in the node

>>   Of course in the past, these transformations were handled at the
>> font level and sequences like "< a" were actually ligatures, because
>> that was all we had (and copypasting from a PDF was, mostly, doomed to
>> fail); but we should not persist in that use now we can treat them as
>> real Unicode characters.

those hard coded mechanisms were indeed not sufficient

Hans


* Re: Greek in luatex
  2007-09-13 11:38         ` Taco Hoekwater
  2007-09-13 12:54           ` Thomas A. Schmitz
@ 2007-09-13 18:36           ` Arthur Reutenauer
  2007-09-13 18:49             ` Hans Hagen
  2007-09-13 19:24             ` Hans Hagen
  1 sibling, 2 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-13 18:36 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> I played a bit, see attachment. Surely Hans will want to improve on this 
> interface, so don't patch any of the core files just now.

  Fantastic!

  Now I played a bit with your file myself, and compared it with the
behaviour of an OTP which has the same action: you can see that macro
arguments between square brackets are preserved by the OTP, whereas your
function (obviously) converts everything unconditionally. How difficult
would it be to program the same behaviour, that is, make
collectors.handle pass to convert_babel only contiguous ranges of
characters that are situated outside matching brackets?
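
  To pin down the behaviour being asked for, here is a small Python
sketch of the logic. It works on a plain string, not on LuaTeX tokens,
so it says nothing about how collectors.handle would really do it; the
names convert_outside_brackets and add_stars are hypothetical, and
add_stars merely mimics what convert_babel does in the attachment (a star
in front of every item).

```python
from typing import Callable


def convert_outside_brackets(text: str, convert: Callable[[str], str]) -> str:
    """Apply convert to each contiguous run outside matching [...] pairs."""
    out = []
    run = []    # characters collected outside any bracket
    depth = 0   # current nesting depth of square brackets
    for ch in text:
        if ch == "[":
            if depth == 0 and run:
                # Flush the run collected so far through the converter.
                out.append(convert("".join(run)))
                run = []
            depth += 1
            out.append(ch)
        elif ch == "]" and depth > 0:
            depth -= 1
            out.append(ch)
        elif depth > 0:
            out.append(ch)   # inside brackets: copy untouched
        else:
            run.append(ch)   # outside brackets: collect for conversion
    if run:
        out.append(convert("".join(run)))
    return "".join(out)


def add_stars(run: str) -> str:
    """Stand-in converter: put a star in front of every character."""
    return "".join("*" + ch for ch in run)


print(convert_outside_brackets("ab[medium]cd", add_stars))  # *a*b[medium]*c*d
```

Note that this character-level sketch would still put stars into a
control sequence name such as \blank; at the token level the control
sequence is a single token, so that particular problem does not arise in
the same way there.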

[-- Attachment #2: tokfilter_otp.tex --]
[-- Type: text/x-tex, Size: 2247 bytes --]

% engine=luatex

%D First a hack to the core. two changes:
%D * don't force end_cs to be \relax
%D * don't remove end_cs from the input stream

\ctxlua{
function collectors.install(tag,end_cs)
    collectors.data[tag] = { }
    local data   = collectors.data[tag]
    local call   = token.command_id("call")
    local endcs  = token.csname_id(end_cs)
    local expand = collectors.registered
    local get    = token.get_next
    while true do
        local t = get()
        local a, b = t[1], t[3]
        if b == endcs then
            tex.print('\\' ..end_cs)
            return
        elseif a == call and expand[b] then
            token.expand()
        else
            data[\string#data+1] = t
        end
    end
end 
}

%D a small extension to the core interface, to have a 
%D nice wrapper around the lua code

\ctxlua {
function collectors.handle(tag,handle)
    collectors.data[tag] = handle(collectors.data[tag])
end
}
\def\handletokens[#1][#2]{\ctxlua{collectors.handle("#1",#2)}}

%D Here starts the document-specific code

%D Start capturing tokens in the buffer named 'babel', stop
%D at \stopbabel

\def\startbabel 
  {\ctxlua{collectors.install("babel", "stopbabel")}}


%D The lua mutation function. str is a table containing the captured
%D tokens, each itself a three-item table (this is explained in the 
%D luatex manual)

\ctxlua {
function convert_babel(str)
    local t = { }
    for k,v in ipairs(str) do
        t[\string#t+1] = tokens.other('*')
        t[\string#t+1] = v
    end
    return t
end
}

%D convert the tokens using that lua function, then 
%D flush the result
\def\stopbabel  
  {\handletokens[babel][convert_babel]
   \flushtokens[babel]}

\usetypescript[palatino][ec]

\starttext

\section{With Taco's \type{\startbabel}}
\startbabel%
some stuff here \blank[medium] some other stuff

\switchtobodyfont[palatino]

\subsection{More stuff}

stuff stuff stuff
\stopbabel

\blank[big]

\section{With an \tt OTP}

% Do the same as convert_babel, with a simple OTP (stars.otp)
\ocp\stars=stars
\ocplist\StarsOCP=\addbeforeocplist1\stars\nullocplist

\pushocplist\StarsOCP

some stuff here \blank[medium] some other stuff

\switchtobodyfont[palatino]

\subsection{More stuff}

stuff stuff stuff

\stoptext

[-- Attachment #3: stars.ocp --]
[-- Type: application/octet-stream, Size: 60 bytes --]

[-- Attachment #4: stars.otp --]
[-- Type: application/vnd.oasis.opendocument.presentation-template, Size: 58 bytes --]


* Re: Greek in luatex
  2007-09-13 18:36           ` Arthur Reutenauer
@ 2007-09-13 18:49             ` Hans Hagen
  2007-09-13 19:24             ` Hans Hagen
  1 sibling, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 18:49 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Arthur Reutenauer wrote:
>> I played a bit, see attachment. Surely Hans will want to improve on this 
>> interface, so don't patch any of the core files just now.
> 
>   Fantastic!
> 
>   Now I played a bit with your file myself, and compared with the
> behaviour of an OTP which has the same action: you can see that macro
> arguments between square brackets are preserved by OTP, whereas your
> function (obviously) converts everything unconditionally. How difficult
> would it be to program the same behaviour, that is, make
> collectors.handle pass to convert_babel only contiguous ranges of
> characters that are situated outside matching brackets?

i'll wrap taco's macro up a bit

however, dealing with things like \blank[whatever] is not trivial

(1) we need to prevent expansion (register feature)
(2) but sometimes we need to expand
(3) and not all commands are treated the same

this is why otp-like things are suboptimal

also, a proper toks handling mechanism should look at its neighbours

Hans


* Re: Greek in luatex
  2007-09-13 18:36           ` Arthur Reutenauer
  2007-09-13 18:49             ` Hans Hagen
@ 2007-09-13 19:24             ` Hans Hagen
  2007-09-13 19:45               ` Arthur Reutenauer
  1 sibling, 1 reply; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 19:24 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Arthur Reutenauer wrote:

... greek ... greek ...

new beta


\defineremapper[babelgreek]

\remapcharacter[babelgreek][`a]{\alpha}
\remapcharacter[babelgreek][`b]{\beta}
\remapcharacter[babelgreek][`c]{\gamma}
\remapcharacter[babelgreek][`d]{OEPS}

\starttext

[\startbabelgreek
a b c some stuff here \blank[big] oeps b d
\stopbabelgreek]

[\babelgreek{some stuff here}]

\stoptext

i can think of a more clever mechanism (have some ideas) but not now (in 
the middle of something else)

for arthur ... [] are skipped

for mojca ... this beta also fixes your accent problem (if she's in the 
mood for source browsing ... interesting solution)

for luigi ... working on a variant xml parser ... now loading 40 meg in 
5 seconds

for taco ... i made your example into a configurable one
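
A rough model of the behaviour shown in the example above, written in
Python purely for illustration (the real mechanism lives in the MkIV
TeX/Lua code and works on tokens, not on strings). The mapping table, the
function name remap_outside_brackets and the regular expression are
assumptions made for this sketch; nested brackets are not handled.

```python
import re

# Hypothetical table mirroring the \remapcharacter lines above
# (a -> alpha, b -> beta, c -> gamma); not the real babel mapping.
BABEL_MAP = {"a": "α", "b": "β", "c": "γ"}

# A "token" here is a control sequence, a bracketed argument,
# or any single character (assumption of this sketch).
TOKEN = re.compile(r"\\[A-Za-z]+|\[[^\]]*\]|.", re.DOTALL)


def remap_outside_brackets(text, mapping=BABEL_MAP):
    """Remap single characters, leaving \\commands and [arguments] alone."""
    out = []
    for tok in TOKEN.findall(text):
        if len(tok) == 1:
            # Plain character: remap it if the table has an entry.
            out.append(mapping.get(tok, tok))
        else:
            # Control sequence or bracketed argument: copy verbatim.
            out.append(tok)
    return "".join(out)


if __name__ == "__main__":
    print(remap_outside_brackets(r"a b c \blank[big] b"))
```

With this model, the letters in \blank and in [big] are left untouched
while the loose a, b and c are replaced.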

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Greek in luatex
  2007-09-13 19:24             ` Hans Hagen
@ 2007-09-13 19:45               ` Arthur Reutenauer
  2007-09-13 20:20                 ` Hans Hagen
  2007-09-13 20:38                 ` Thomas A. Schmitz
  0 siblings, 2 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-13 19:45 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> for arthur ... [] are skipped

  Thanks! I guess there's more to it and token filtering is not the only
way to do it, but it's still great.

	Arthur

* Re: Greek in luatex
  2007-09-13 19:45               ` Arthur Reutenauer
@ 2007-09-13 20:20                 ` Hans Hagen
  2007-09-14  0:24                   ` Arthur Reutenauer
  2007-09-13 20:38                 ` Thomas A. Schmitz
  1 sibling, 1 reply; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 20:20 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Arthur Reutenauer wrote:
>> for arthur ... [] are skipped
> 
>   Thanks! I guess there's more to it and token filtering is not the only
> way to do it, but it's still great.

indeed, also, it's important to look fresh at these things and forget 
about how we do things now, else we replace hack with hack

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Greek in luatex
  2007-09-13 19:45               ` Arthur Reutenauer
  2007-09-13 20:20                 ` Hans Hagen
@ 2007-09-13 20:38                 ` Thomas A. Schmitz
  2007-09-13 21:05                   ` Hans Hagen
  2007-09-15 23:22                   ` Arthur Reutenauer
  1 sibling, 2 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-13 20:38 UTC (permalink / raw)
  To: mailing list for ConTeXt users


On Sep 13, 2007, at 9:45 PM, Arthur Reutenauer wrote:

>   Thanks! I guess there's more to it and token filtering is not the  
> only
> way to do it, but it's still great.
>
> 	Arthur


Oh boy... I'm afraid I lost you there. Hans, your remapper looks just  
like the thing I'd need for my Greek stuff. Right now, there appears  
to be a slight problem with the pdfs I produce with this code: on my
system (OS X), they freeze or crash most pdf viewers (Adobe Reader
can handle them; Preview, TeXShop and pdfview all crash or freeze).
Arthur, I also played with your fontfeatures. Most of the  
substitutions work, but there were a couple of problems that I just  
couldn't resolve, especially regarding the characters with an iota  
subscript: combinations involving accents and breathing (such as  
 >~h|) were remapped correctly; the pure vowel + iota (h|) was not  
remapped. I guess I will wait till the dust settles a bit and you  
tell me which is the best way to pursue.

Taco, one question: Hans mentioned that "wide" PostScript fonts via
afm were not supported yet. Does that mean that Type 1 fonts with a
Unicode encoding do not work yet?

Thanks so much, all best

Thomas

* Re: Greek in luatex
  2007-09-13 20:38                 ` Thomas A. Schmitz
@ 2007-09-13 21:05                   ` Hans Hagen
  2007-09-13 21:52                     ` Taco Hoekwater
  2007-09-15 23:22                   ` Arthur Reutenauer
  1 sibling, 1 reply; 31+ messages in thread
From: Hans Hagen @ 2007-09-13 21:05 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:

> Taco, one question: Hans mentioned that "wide" PostScript fonts via
> afm were not supported yet. Does that mean that Type 1 fonts with a
> Unicode encoding do not work yet?

the latest mkiv works ok with wide fonts, the latest luatex also, but 
best wait till the beginning of next week when all subsetting issues are 
resolved

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Greek in luatex
  2007-09-13 21:05                   ` Hans Hagen
@ 2007-09-13 21:52                     ` Taco Hoekwater
  0 siblings, 0 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-13 21:52 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hans Hagen wrote:
> Thomas A. Schmitz wrote:
> 
>> Taco, one question: Hans mentioned that "wide" PostScript fonts via
>> afm were not supported yet. Does that mean that Type 1 fonts with a
>> Unicode encoding do not work yet?
> 
> the latest mkiv works ok with wide fonts, the latest luatex also, but 
> best wait till the beginning of next week when all subsetting issues are 
> resolved

Like the man says.

Best wishes,
Taco

PS It is amazing how Hans manages to answer questions to me before I
even see them! All ntg-context mail arrives completely out of order
and hours late, today.




* Re: Greek in luatex
  2007-09-13 20:20                 ` Hans Hagen
@ 2007-09-14  0:24                   ` Arthur Reutenauer
  0 siblings, 0 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-14  0:24 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> indeed, also, it's important to look fresh at these things and forget 
> about how we do things now, else we replace hack with hack

  Sure, of course. I only thought this was a nice way of handling things
but I'm not settled on that.

	Arthur

* Re: Greek in luatex
  2007-09-13 20:38                 ` Thomas A. Schmitz
  2007-09-13 21:05                   ` Hans Hagen
@ 2007-09-15 23:22                   ` Arthur Reutenauer
  2007-09-16  6:56                     ` Taco Hoekwater
                                       ` (2 more replies)
  1 sibling, 3 replies; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-15 23:22 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

[-- Attachment #1: Type: text/plain, Size: 1682 bytes --]

>                         there were a couple of problems that I just  
> couldn't resolve, especially regarding the characters with an iota  
> subscript:

  Indeed. This is a problem with the Fontforge code applying the GSUB
features: the 'grbl' feature is defined by two lookups, one being a list
of single substitutions (h -> eta) and the other a list of ligature
substitutions (h bar -> eta with iota subscript).  Now, since the latter
has to take precedence to avoid conflicts, I explicitly put it before
the other, but it seems that Fontforge ignores this and applies the list
of single substitutions before the other (this is confirmed by the cache
file BosporosU@greek-babel.tma where the lookup with the single
substitutions, called "GreekBabelLookupSimple", appears first in the
gsub table).

  Note that this doesn't happen for substitutions *inside* a lookup (so
things like "greater eta bar" and "eta bar" don't conflict, since they're
both ligature substitutions and I put the former first in the list, and
the substitutions are correctly applied).  As far as I understand, this
behaviour is actually compliant with the Opentype specifications and is
quite widespread among typesetting engines and so it is not (only)
Fontforge's fault; but, needless to say, it is nevertheless annoying.
(In more crude terms: Opentype does not specify anything in that respect,
so manufacturers of typesetting software can do whatever they want ...)
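
The ordering effect described above can be modelled in a few lines of
Python. This is a simplified sketch, not Fontforge's or luatex's actual
code: it assumes each lookup is run over the whole glyph run in turn, and
the names apply_lookup, shape, SIMPLE, LIGATURE and FIXUP are made up for
the illustration.

```python
def apply_lookup(glyphs, rules):
    """Run one lookup left to right.

    rules is an ordered list of (input_sequence, output_glyph);
    the first rule that matches at the current position wins.
    """
    out, i = [], 0
    while i < len(glyphs):
        for seq, repl in rules:
            if tuple(glyphs[i:i + len(seq)]) == seq:
                out.append(repl)
                i += len(seq)
                break
        else:
            # No rule matched here: keep the glyph as it is.
            out.append(glyphs[i])
            i += 1
    return out


def shape(glyphs, lookups):
    """Apply the lookups one after the other, in the order given."""
    for rules in lookups:
        glyphs = apply_lookup(glyphs, rules)
    return glyphs


# Toy versions of the lookups discussed above (illustrative only).
SIMPLE = [(("h",), "eta")]
LIGATURE = [(("h", "bar"), "uni1FC3")]
FIXUP = [(("eta", "bar"), "uni1FC3")]   # the idea behind the extra feature

if __name__ == "__main__":
    print(shape(["h", "bar"], [SIMPLE, LIGATURE]))         # ['eta', 'bar']
    print(shape(["h", "bar"], [LIGATURE, SIMPLE]))         # ['uni1FC3']
    print(shape(["h", "bar"], [SIMPLE, LIGATURE, FIXUP]))  # ['uni1FC3']
```

When the single substitutions run first, the "h" is already gone by the
time the ligature lookup looks for "h bar"; a later lookup that matches
"eta bar" repairs the result.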

  Thomas: to solve the problem at hand, you can try the new feature
file I'm sending along with a small test (I simply define a new feature that is
to be applied after 'grbl', to deal specifically with the iota subscripts).

	Arthur

[-- Attachment #2: greek-babel-extended.fea --]
[-- Type: text/plain, Size: 8742 bytes --]

# An Opentype feature to replace the Babel input scheme

# Not quite complete; some rhos with breathings and accents are missing (where
# are they?) and the final sigma isn't accounted for.

lookup GreekBabelLookupSimple {
    lookupflag 0 ;
	sub a	by alpha ;
	sub b	by beta ;
	sub g	by gamma ;
	sub d	by delta ;
	sub e	by epsilon ;
	sub z	by zeta ;
	sub h	by eta ;
	sub j	by theta ;
	sub i	by iota ;
	sub k	by kappa ;
	sub l	by lambda ;
	sub m	by mu ;
	sub n	by nu ;
	sub x	by xi ;
	sub o	by omicron ;
	sub p	by pi ;
	sub r	by rho ;
	sub c	by sigmafinal ;
	sub s	by sigma ;
	sub t	by tau ;
	sub u	by upsilon ;
	sub f	by phi ;
	sub q	by chi ;
	sub y	by psi ;
	sub w	by omega ;
	sub A	by Alpha ;
	sub B	by Beta ;
	sub G	by Gamma ;
	sub D	by Delta ;
	sub E	by Epsilon ;
	sub Z	by Zeta ;
	sub H	by Eta ;
	sub J	by Theta ;
	sub I	by Iota ;
	sub K	by Kappa ;
	sub L	by Lambda ;
	sub M	by Mu ;
	sub N	by Nu ;
	sub X	by Xi ;
	sub O	by Omicron ;
	sub P	by Pi ;
	sub R	by Rho ;
	sub C	by Uni03C2 ;
	sub S	by Sigma ;
	sub T	by Tau ;
	sub U	by Upsilon ;
	sub F	by Phi ;
	sub Q	by Chi ;
	sub Y	by Psi ;
	sub W	by Omega ;
	sub semicolon	by periodcentered ;
} GreekBabelLookupSimple ;

lookup GreekBabelLookupMultiple {
    lookupflag 1 ;
	# sub s 'space by sigmafinal ;
	sub greater  a by uni1F00 ;
	sub greater  A by uni1F08 ;
	sub greater  e by uni1F10 ;
	sub greater  E by uni1F18 ;
	sub greater  h by uni1F20 ;
	sub greater  H by uni1F28 ;
	sub greater  i by uni1F30 ;
	sub greater  I by uni1F38 ;
	sub greater  o by uni1F40 ;
	sub greater  O by uni1F48 ;
	sub greater  u by uni1F50 ;
	# sub greater  U by uni1F58 ;
	sub greater  w by uni1F60 ;
	sub greater  W by uni1F68 ;
	sub greater grave a by uni1F02 ;
	sub greater grave A by uni1F0A ;
	sub greater grave e by uni1F12 ;
	sub greater grave E by uni1F1A ;
	sub greater grave h by uni1F22 ;
	sub greater grave H by uni1F2A ;
	sub greater grave i by uni1F32 ;
	sub greater grave I by uni1F3A ;
	sub greater grave o by uni1F42 ;
	sub greater grave O by uni1F4A ;
	sub greater grave u by uni1F52 ;
	# sub greater grave U by uni1F5A ;
	sub greater grave w by uni1F62 ;
	sub greater grave W by uni1F6A ;
	sub greater quotesingle a by uni1F04 ;
	sub greater quotesingle A by uni1F0C ;
	sub greater quotesingle e by uni1F14 ;
	sub greater quotesingle E by uni1F1C ;
	sub greater quotesingle h by uni1F24 ;
	sub greater quotesingle H by uni1F2C ;
	sub greater quotesingle i by uni1F34 ;
	sub greater quotesingle I by uni1F3C ;
	sub greater quotesingle o by uni1F44 ;
	sub greater quotesingle O by uni1F4C ;
	sub greater quotesingle u by uni1F54 ;
	sub greater quotesingle U by uni1F5C ;
	sub greater quotesingle w by uni1F64 ;
	sub greater quotesingle W by uni1F6C ;
	sub greater asciitilde a by uni1F06 ;
	sub greater asciitilde A by uni1F0E ;
	sub greater asciitilde e by uni1F16 ;
	sub greater asciitilde E by uni1F1E ;
	sub greater asciitilde h by uni1F26 ;
	sub greater asciitilde H by uni1F2E ;
	sub greater asciitilde i by uni1F36 ;
	sub greater asciitilde I by uni1F3E ;
	sub greater asciitilde o by uni1F46 ;
	sub greater asciitilde O by uni1F4E ;
	sub greater asciitilde u by uni1F56 ;
	sub greater asciitilde U by uni1F5E ;
	sub greater asciitilde w by uni1F66 ;
	sub greater asciitilde W by uni1F6E ;
	sub less  a by uni1F01 ;
	sub less  A by uni1F09 ;
	sub less  e by uni1F11 ;
	sub less  E by uni1F19 ;
	sub less  h by uni1F21 ;
	sub less  H by uni1F29 ;
	sub less  i by uni1F31 ;
	sub less  I by uni1F39 ;
	sub less  o by uni1F41 ;
	sub less  O by uni1F49 ;
	sub less  u by uni1F51 ;
	sub less  U by uni1F59 ;
	sub less  w by uni1F61 ;
	sub less  W by uni1F69 ;
	sub less grave a by uni1F03 ;
	sub less grave A by uni1F0B ;
	sub less grave e by uni1F13 ;
	sub less grave E by uni1F1B ;
	sub less grave h by uni1F23 ;
	sub less grave H by uni1F2B ;
	sub less grave i by uni1F33 ;
	sub less grave I by uni1F3B ;
	sub less grave o by uni1F43 ;
	sub less grave O by uni1F4B ;
	sub less grave u by uni1F53 ;
	sub less grave U by uni1F5B ;
	sub less grave w by uni1F63 ;
	sub less grave W by uni1F6B ;
	sub less quotesingle a by uni1F05 ;
	sub less quotesingle A by uni1F0D ;
	sub less quotesingle e by uni1F15 ;
	sub less quotesingle E by uni1F1D ;
	sub less quotesingle h by uni1F25 ;
	sub less quotesingle H by uni1F2D ;
	sub less quotesingle i by uni1F35 ;
	sub less quotesingle I by uni1F3D ;
	sub less quotesingle o by uni1F45 ;
	sub less quotesingle O by uni1F4D ;
	sub less quotesingle u by uni1F55 ;
	sub less quotesingle U by uni1F5D ;
	sub less quotesingle w by uni1F65 ;
	sub less quotesingle W by uni1F6D ;
	sub less asciitilde a by uni1F07 ;
	sub less asciitilde A by uni1F0F ;
	sub less asciitilde e by uni1F17 ;
	sub less asciitilde E by uni1F1F ;
	sub less asciitilde h by uni1F27 ;
	sub less asciitilde H by uni1F2F ;
	sub less asciitilde i by uni1F37 ;
	sub less asciitilde I by uni1F3F ;
	sub less asciitilde o by uni1F47 ;
	sub less asciitilde O by uni1F4F ;
	sub less asciitilde u by uni1F57 ;
	sub less asciitilde U by uni1F5F ;
	sub less asciitilde w by uni1F67 ;
	sub less asciitilde W by uni1F6F ;
	sub grave a by uni1F70 ;
	sub quotesingle a by uni1F71 ;
	sub grave e by uni1F72 ;
	sub quotesingle e by uni1F73 ;
	sub grave h by uni1F74 ;
	sub quotesingle h by uni1F75 ;
	sub grave i by uni1F76 ;
	sub quotesingle i by uni1F77 ;
	sub grave o by uni1F78 ;
	sub quotesingle o by uni1F79 ;
	sub grave u by uni1F7A ;
	sub quotesingle u by uni1F7B ;
	sub grave w by uni1F7C ;
	sub quotesingle w by uni1F7D ;
	sub grave A by uni1FBA ;
	sub quotesingle A by uni1FBB ;
	sub grave E by uni1FC8 ;
	sub quotesingle E by uni1FC9 ;
	sub grave H by uni1FCA ;
	sub quotesingle H by uni1FCB ;
	sub grave I by uni1FDA ;
	sub quotesingle I by uni1FDB ;
	sub grave U by uni1FEA ;
	sub quotesingle U by uni1FEB ;
	sub grave W by uni1FFA ;
	sub quotesingle W by uni1FFB ;
	sub greater  a bar by uni1F80 ;
	sub greater  A bar by uni1F88 ;
	sub greater  h bar by uni1F90 ;
	sub greater  H bar by uni1F98 ;
	sub greater  w bar by uni1FA0 ;
	sub greater  W bar by uni1FA8 ;
	sub greater grave a bar by uni1F82 ;
	sub greater grave A bar by uni1F8A ;
	sub greater grave h bar by uni1F92 ;
	sub greater grave H bar by uni1F9A ;
	sub greater grave w bar by uni1FA2 ;
	sub greater grave W bar by uni1FAA ;
	sub greater quotesingle a bar by uni1F84 ;
	sub greater quotesingle A bar by uni1F8C ;
	sub greater quotesingle h bar by uni1F94 ;
	sub greater quotesingle H bar by uni1F9C ;
	sub greater quotesingle w bar by uni1FA4 ;
	sub greater quotesingle W bar by uni1FAC ;
	sub greater asciitilde a bar by uni1F86 ;
	sub greater asciitilde A bar by uni1F8E ;
	sub greater asciitilde h bar by uni1F96 ;
	sub greater asciitilde H bar by uni1F9E ;
	sub greater asciitilde w bar by uni1FA6 ;
	sub greater asciitilde W bar by uni1FAE ;
	sub less  a bar by uni1F81 ;
	sub less  A bar by uni1F89 ;
	sub less  h bar by uni1F91 ;
	sub less  H bar by uni1F99 ;
	sub less  w bar by uni1FA1 ;
	sub less  W bar by uni1FA9 ;
	sub less grave a bar by uni1F83 ;
	sub less grave A bar by uni1F8B ;
	sub less grave h bar by uni1F93 ;
	sub less grave H bar by uni1F9B ;
	sub less grave w bar by uni1FA3 ;
	sub less grave W bar by uni1FAB ;
	sub less quotesingle a bar by uni1F85 ;
	sub less quotesingle A bar by uni1F8D ;
	sub less quotesingle h bar by uni1F95 ;
	sub less quotesingle H bar by uni1F9D ;
	sub less quotesingle w bar by uni1FA5 ;
	sub less quotesingle W bar by uni1FAD ;
	sub less asciitilde a bar by uni1F87 ;
	sub less asciitilde A bar by uni1F8F ;
	sub less asciitilde h bar by uni1F97 ;
	sub less asciitilde H bar by uni1F9F ;
	sub less asciitilde w bar by uni1FA7 ;
	sub less asciitilde W bar by uni1FAF ;
	sub grave a bar by uni1FB2 ;
	sub quotesingle a bar by uni1FB4 ;
	sub grave h bar by uni1FC2 ;
	sub quotesingle h bar by uni1FC4 ;
	sub grave w bar by uni1FF2 ;
	sub quotesingle w bar by uni1FF4 ;
	sub asciitilde a by uni1FB6 ;
	sub asciitilde a bar by uni1FB7 ;
	sub asciitilde h by uni1FC6 ;
	sub asciitilde h bar by uni1FC7 ;
	sub asciitilde w by uni1FF6 ;
	sub asciitilde w bar by uni1FF7 ;
	sub greater r by uni1FE4 ;
	sub less r by uni1FE5 ;
	sub less R by uni1FEC ;
} GreekBabelLookupMultiple ;

lookup GreekBabel2LookupMultiple {
    lookupflag 1 ;
	sub alpha bar by uni1FB3 ;
	sub eta bar by uni1FC3 ;
	sub omega bar by uni1FF3 ;
} GreekBabel2LookupMultiple ;

feature grbl {

    script DFLT ;
	language dflt ;
	    lookup GreekBabelLookupMultiple ;
	    lookup GreekBabelLookupSimple ;

    script latn;
	language dflt ;
	    lookup GreekBabelLookupMultiple ;
	    lookup GreekBabelLookupSimple ;
} grbl ;

feature grb2 {

    script DFLT ;
	language dflt ;
	    lookup GreekBabel2LookupMultiple ;

    script latn;
	language dflt ;
	    lookup GreekBabel2LookupMultiple ;
} grb2 ;


[-- Attachment #3: subscribed_iotas.tex --]
[-- Type: text/x-tex, Size: 378 bytes --]

% For Thomas Schmitz.
% Deal with iota subscripts
\installfontfeature[otf][grbl]
\installfontfeature[otf][grb2]

\definefontfeature
   [greek-babel]
   [mode=node,language=dflt,script=latn,
    grbl=yes,grb2=yes,featurefile=greek-babel-extended.fea]

\font\bosphoros=name:BosporosU*greek-babel at 20pt

\starttext

\catcode`\~=11
\catcode`\|=11

\bosphoros
a| h| w|

\stoptext


* Re: Greek in luatex
  2007-09-15 23:22                   ` Arthur Reutenauer
@ 2007-09-16  6:56                     ` Taco Hoekwater
  2007-09-16  8:22                     ` Taco Hoekwater
  2007-09-17  8:48                     ` Hans Hagen
  2 siblings, 0 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-16  6:56 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Arthur Reutenauer wrote:
>                                        As far as I understand, this
> behaviour is actually compliant with the Opentype specifications and is
> quite widespread among typesetting engines and so it is not (only)
> Fontforge's fault; but, needless to say, it is nevertheless annoying.
> (In more crude terms: Opentype does not specify anything in that respect,
> so manufacturers of typesetting software can do whatever they want ...)

The specification says that lookups should be applied in LookupList
order. Featurefiles don't have an explicit ordering command, but that
does not mean that ordering should be irrelevant. So I think this
is a bug in the version of fontforge I am using in luatex. I will do
some testing.

Best wishes,
Taco

* Re: Greek in luatex
  2007-09-15 23:22                   ` Arthur Reutenauer
  2007-09-16  6:56                     ` Taco Hoekwater
@ 2007-09-16  8:22                     ` Taco Hoekwater
  2007-09-16 13:01                       ` Thomas A. Schmitz
  2007-09-16 13:08                       ` Arthur Reutenauer
  2007-09-17  8:48                     ` Hans Hagen
  2 siblings, 2 replies; 31+ messages in thread
From: Taco Hoekwater @ 2007-09-16  8:22 UTC (permalink / raw)
  To: Mailing list for ConTeXt users


Hi guys,

Try this ordering:

> lookup GreekBabelLookupMultiple {
> ...
> } GreekBabelLookupMultiple ;
> 
> lookup GreekBabelLookupSimple {
> 	...
> } GreekBabelLookupSimple ;
> 


Best wishes,
Taco

* Re: Greek in luatex
  2007-09-16  8:22                     ` Taco Hoekwater
@ 2007-09-16 13:01                       ` Thomas A. Schmitz
  2007-09-16 23:12                         ` Hans Hagen
  2007-09-16 13:08                       ` Arthur Reutenauer
  1 sibling, 1 reply; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-16 13:01 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hi Arthur, Taco,

you're my heroes! Changing the order of the lookup tables in the .fea  
file actually took care of the problem. Thanks for looking into this,  
now I get the results I was expecting; every substitution is applied  
to the font! Once the initial lookup has been done, this is  
reasonably fast, too, so I like it. I'm eagerly waiting for the new
release next week to see if this works with copy-and-paste from pdfs.
So this appears to be one way to deal with ASCII input a la babel.  
Easy to implement, but fails on fonts that don't have the glyphs for  
the Latin characters.

One trivial question: when I want to experiment with feature files,  
the cached instance of the font seems to be in the way. Only after  
deleting the current luatex-cache, regenerating it and recompiling  
the format do I get proper results. Is there an easier/faster way to  
do this?

Will now go on and experiment some more, especially with type1/afm- 
based fonts. Thanks a lot, best wishes

Thomas

On Sep 16, 2007, at 10:22 AM, Taco Hoekwater wrote:

>
> Hi guys,
>
> Try this ordering:
>
>> lookup GreekBabelLookupMultiple {
>> ...
>> } GreekBabelLookupMultiple ;
>>
>> lookup GreekBabelLookupSimple {
>> 	...
>> } GreekBabelLookupSimple ;
>>
>
>
>

* Re: Greek in luatex
  2007-09-16  8:22                     ` Taco Hoekwater
  2007-09-16 13:01                       ` Thomas A. Schmitz
@ 2007-09-16 13:08                       ` Arthur Reutenauer
  2007-09-16 13:44                         ` Thomas A. Schmitz
  1 sibling, 1 reply; 31+ messages in thread
From: Arthur Reutenauer @ 2007-09-16 13:08 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

> Try this ordering:

  Yes, it works. So Fontforge is sensitive to the order in which the lookups
are defined in the file? Interesting ...

  Thomas, you can try this but I have made a mistake in the Unicode code
point for omega with iota subscript: it should be 1FF3 and not 1FD3.
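
A quick way to double-check such code points is to ask Python's
unicodedata module what a uniXXXX glyph name stands for. The helper
glyph_to_unicode_name is a hypothetical name for this sketch, and it only
handles names of the form "uni" plus four hex digits.

```python
import re
import unicodedata

# Matches glyph names such as uni1FF3 (assumption: exactly four hex digits).
UNI_NAME = re.compile(r"uni([0-9A-Fa-f]{4})$")


def glyph_to_unicode_name(glyph_name):
    """Return the Unicode character name for a uniXXXX glyph name.

    Returns None if the name has another form or the code point
    is unassigned.
    """
    m = UNI_NAME.match(glyph_name)
    if m is None:
        return None
    return unicodedata.name(chr(int(m.group(1), 16)), None)


if __name__ == "__main__":
    for name in ("uni1FF3", "uni1FD3"):
        print(name, "->", glyph_to_unicode_name(name))
```

Here uni1FF3 comes out as GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI,
whereas uni1FD3 is GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA.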

	Arthur

* Re: Greek in luatex
  2007-09-16 13:08                       ` Arthur Reutenauer
@ 2007-09-16 13:44                         ` Thomas A. Schmitz
  0 siblings, 0 replies; 31+ messages in thread
From: Thomas A. Schmitz @ 2007-09-16 13:44 UTC (permalink / raw)
  To: Mailing list for ConTeXt users


On Sep 16, 2007, at 3:08 PM, Arthur Reutenauer wrote:

>  Yes, it works. So Fontforge is sensitive to the order in which the  
> lookups
> are defined in the file? Interesting ...
>
>   Thomas, you can try this but I have made a mistake in the Unicode
> code point for omega with iota subscript: it should be 1FF3 and not
> 1FD3.
>
> 	Arthur

Yep, I had already fixed that (and also replied to Taco's message,  
the context list is again a bit out of order today). Arthur, while  
we're at it: could you try and insert this line into the fea-file:

sub quotedbl quotesingle i by un1FD3 ;

whenever I try anything like this with the quotedbl character (which  
produces some ligatures), I get this error:

</Users/tas/texmf/fonts/opentype/greek/bosporos/BosporosU.otf
!luaTeX error (file /Users/tas/texmf/fonts/opentype/greek/bosporos/ 
BosporosU.otf): Unexpected error: 255 != 256
  ==> Fatal error occurred, no output PDF file produced!

(Or similar errors with other fonts). The mechanism for the single  
dieresis works:

sub quotedbl i by uni03CA ;

but nothing with quotedbl + something else. Do you have any idea
what triggers this error?

Best

Thomas

* Re: Greek in luatex
  2007-09-16 13:01                       ` Thomas A. Schmitz
@ 2007-09-16 23:12                         ` Hans Hagen
  0 siblings, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-16 23:12 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Thomas A. Schmitz wrote:
> Hi Arthur, Taco,
> 
> you're my heroes! Changing the order of the lookup tables in the .fea  
> file actually took care of the problem. Thanks for looking into this,  
> now I get the results I was expecting; every substitution is applied  
> to the font! Once the initial lookup has been done, this is  
> reasonably fast, too, so I like it. I'm eagerly waiting for the new  
> release next week to see if this works with copy-and-paste from pdfs.  
> So this appears to be one way to deal with ASCII input a la babel.  
> Easy to implement, but fails on fonts that don't have the glyphs for  
> the Latin characters.

arthur mentions the final sigma in the fea file .. that can be (part of) 
a feature too (like fina)


> One trivial question: when I want to experiment with feature files,  
> the cached instance of the font seems to be in the way. Only after  
> deleting the current luatex-cache, regenerating it and recompiling  
> the format do I get proper results. Is there an easier/faster way to  
> do this?

bumping the version number of the otf handler will force this, but this 
is a bad idea; also, caching is fast because no file checking has to be 
done, so deleting cached files (just the one you test) is the price you 
pay when developing a font (fea) file
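
as a rough illustration of "just the one you test", a small python helper could remove only the cache entries for a single font. this is a sketch under assumptions: the function name remove_cached_font is hypothetical, the cache root has to be supplied by you, and it assumes the cached files carry the font's base name (compared case-insensitively), which you should check against your own cache tree first.

```python
from pathlib import Path


def remove_cached_font(cache_root, font_stem):
    """Delete files under cache_root whose base name matches font_stem.

    Hypothetical helper: it assumes cache entries are named after the font
    file (case-insensitively) and returns the list of removed paths.
    """
    wanted = font_stem.lower()
    removed = []
    for path in sorted(Path(cache_root).rglob("*")):
        # Only touch regular files whose stem equals the font's base name.
        if path.is_file() and path.stem.lower() == wanted:
            path.unlink()
            removed.append(path)
    return removed
```

calling remove_cached_font with your cache directory and "BosporosU" would then leave every other cached font alone, so only that one font is reprocessed on the next run.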

btw, the fea file can be part of the distribution (but we need to think 
of a naming scheme)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Greek in luatex
  2007-09-15 23:22                   ` Arthur Reutenauer
  2007-09-16  6:56                     ` Taco Hoekwater
  2007-09-16  8:22                     ` Taco Hoekwater
@ 2007-09-17  8:48                     ` Hans Hagen
  2 siblings, 0 replies; 31+ messages in thread
From: Hans Hagen @ 2007-09-17  8:48 UTC (permalink / raw)
  To: Mailing list for ConTeXt users

Hi Arthur and Thomas,

i've put the greek file in the distribution (fea path), do we also need 
this babel stuff for "u and such?

we should start thinking about a set of predefined features

Hans


end of thread, other threads:[~2007-09-17  8:48 UTC | newest]

Thread overview: 31+ messages
2007-09-01 10:56 Greek in luatex Thomas A. Schmitz
2007-09-11  6:47 ` Thomas A. Schmitz
2007-09-11 10:12   ` Hans Hagen
2007-09-13  1:15   ` Arthur Reutenauer
2007-09-13  7:03     ` Taco Hoekwater
2007-09-13 10:24       ` Arthur Reutenauer
2007-09-13 11:38         ` Taco Hoekwater
2007-09-13 12:54           ` Thomas A. Schmitz
2007-09-13 18:36           ` Arthur Reutenauer
2007-09-13 18:49             ` Hans Hagen
2007-09-13 19:24             ` Hans Hagen
2007-09-13 19:45               ` Arthur Reutenauer
2007-09-13 20:20                 ` Hans Hagen
2007-09-14  0:24                   ` Arthur Reutenauer
2007-09-13 20:38                 ` Thomas A. Schmitz
2007-09-13 21:05                   ` Hans Hagen
2007-09-13 21:52                     ` Taco Hoekwater
2007-09-15 23:22                   ` Arthur Reutenauer
2007-09-16  6:56                     ` Taco Hoekwater
2007-09-16  8:22                     ` Taco Hoekwater
2007-09-16 13:01                       ` Thomas A. Schmitz
2007-09-16 23:12                         ` Hans Hagen
2007-09-16 13:08                       ` Arthur Reutenauer
2007-09-16 13:44                         ` Thomas A. Schmitz
2007-09-17  8:48                     ` Hans Hagen
2007-09-13 17:42       ` Hans Hagen
2007-09-13  9:45     ` Thomas A. Schmitz
2007-09-13 10:49       ` Arthur Reutenauer
2007-09-13 12:51         ` Thomas A. Schmitz
2007-09-13 14:25         ` Taco Hoekwater
2007-09-13 17:51       ` Hans Hagen
