From: "Idris Samawi Hamid" <ishamid@colostate.edu>
Subject: Aleph in ConTeXt: A Guide to the Perplexed
Date: Wed, 18 Jan 2006 15:47:51 -0700
Message-ID: <op.s3lix1iinx1yh1@walayah1.wildblue.com>
Aleph in ConTeXt: A Guide to the Perplexed (with apologies to Maimonides)
Dear gang,
I have been helping a number of ConTeXt users off-list with getting aleph
running, along with right-to-left typesetting. The following notes are
meant to help ConTeXt users who want to do RL typesetting, particularly
Arabic script, get started with a minimum of fuss, as well as point to
more advanced applications. This is much more complete than my last post
(and hence than the wiki as well).
I hope that someone can take this and add it to the wiki. I can then edit
the wiki myself and improve the clarity, etc. Perhaps I will also write an
augmentation to Hans' Aleph manual later.
I also have a support utilities package that could be placed on the wiki or
somewhere else. Is there anyone who can upload it for me?
==============================================
I. Introduction.
Aleph is a typesetting engine derived from Omega and eTeX. Reasons for
Aleph:
1. ConTeXt depends on the eTeX extensions, and even LaTeX now defaults to
pdfeTeX;
2. Omega provides a nice foundation for multilingual typesetting with
large (>256) character sets, including large virtual fonts, but a stable,
dependable version has not been a priority with its developers.
a. In particular, the RL-LR code works excellently for the most part
(minor bugs, easy to work around).
b. Omega 1.15 was the last relatively stable bugfix version, as far as
usability is concerned.
3. Some users need a dependable LR-RL TeX engine NOW.
Aleph weds Omega 1.15 and eTeX201, removes some extraneous stuff, and
fixes a few bugs. I use it for production purposes. It uses dvipdfmx for
pdf production, and can take advantage of most of ConTeXt's capabilities.
Giuseppe Bilotta has done virtually all of the development work.
In addition to large character sets Aleph inherits the filter sequence
mechanism for script processing (extension ocp, compile from text-editable
otp). So you can script whatever input encoding you like to whatever
output font encoding you like. It is a mechanism powerful enough to do
contextual analysis of Arabic script for example, but not powerful enough
for things like vertical glyph positioning for cursive scripts and the
like.
Aleph, inheriting from Omega, provides many ready-to-go filters, using a
Times Roman like font for Latin, Greek, and Cyrillic scripts. The ConTeXt
module for this setup is called Gamma (m-gamma.tex); this is a port of the
Lambda (i.e., LaTeX) style files to ConTeXt. The font typescript is called
type-omg.
II. Installing.
This install is based on the stand-alone ConTeXt for Win32 package:
http://www.pragma-ade.com/context/install/mswincontext.zip
Users of MiKTeX and other OS's will need to adjust the following
instructions to their own setups.
1. Make sure you have a very recent version of ConTeXt that supports the
engine path mechanism. This mechanism allows texexec to manage, e.g., two
cont-en.fmt files at once, one in
C:\ConTeXt\tex\texmf-mswin\web2c\aleph
and one in
C:\ConTeXt\tex\texmf-mswin\web2c\pdfetex
How recent, you ask? Just be safe and get the latest-)
2. Some configuration points:
a. Make sure you have the following line in
ConTeXt\tex\texmf-local\context\config\texexec.ini
set to "true", viz.,
set UseEnginePath to true
b. In
texmf-local\web2c\texmf.cnf,
texmf-local\web2c\context.cnf, and
texmf\web2c\texmf.cnf,
comment out this line as follows
%extra_mem_bot.context = 2000000
otherwise aleph will crash under some conditions, like overfull boxes and
the like... The XeTeX developer found the source of this bug, and a fix;
hopefully Giuseppe will get to it-))
3. Get the omega support files:
http://www.ctan.org/get?fn=/systems/win32/fptex/0.7/package/omega.zip
http://www.ctan.org/get?fn=/systems/win32/fptex/0.7/package/omegafonts.zip
4. Remove the following directories from omega.zip (not really necessary,
but worthwhile if you want to be efficient):
texmf/eomega
texmf/omega/encodings
5. Put support files in texmf-local;
6. Compile the Aleph format:
mktexlsr
texexec --make en -tex=aleph
7. Here is a test file. Note the preamble line
% tex=aleph output=dvipdfmx
which must appear at the beginning of every Aleph file.
=================omarb.tex================
% tex=aleph output=dvipdfmx
\input m-gamma.tex
\input type-omg.tex
\setupbodyfont[omlgc,12pt]
\starttext
\startlatin
This is a test
\bf This is a test
\stoplatin
\startgreek
A B G D a b g d
{\bf A B G D a b g d}
\stopgreek
\startarab
`rby:
A b t th j H kh
{\bf A b t th j H kh}
fArsy:
A b p t th j ch H kh
{\bf A b p t th j ch H kh}
\starturdu
ArdU:
A b p t 't th j ch H kh
{\bf A b p t 't th j ch H kh}
\stopurdu
\blank
\tfc
`rby:
bsm ALLah Al-rrHmn Al-rrHym
fArsy:
bh nAm khdAwnd b-kh-sh-nde mhrbAn
\starturdu
\tfc
ArdU:
ALLah kE nAm sE jw rHmAn w rHym hE
\stopurdu
\stoparab
\stoptext
=========================================
8. For Arabic script you will probably want to use an encoding that
supports direct Arabic-script editing. There are three: utf-8, iso-8859-6
(apple-unix), and cp1256 (micro$oft). We can define the following, using
ConTeXt macros for managing filter sequences. Maybe I will add these to
m-gamma and ask Hans to distribute. In the meantime, here are some
definitions, samples of all three encodings, and an example of mixed lr-rl
text:
===============m-arabic-enc.tex================
% tex=aleph output=dvipdfmx
%\input m-gamma.tex
\input type-omg.tex
\usetypescriptfile[type-omg]
\usetypescript[OmegaArab]
\hoffset=0pt
%% Individual Filters
% Input filters (from what you type)
\definefiltersynonym [UTF8] [inutf8]
\definefiltersynonym [ISO8859-6] [in88596]
\definefiltersynonym [CP1256] [incp1256]
% Contextual filter
\definefiltersynonym [UniCUni] [uni2cuni]
% Output filters (font mapping)
\definefiltersynonym [CUniArab] [cuni2oar]
%% Filter Sequences
\definefiltersequence
[UTFArabic]
[UTF8,UniCUni,CUniArab]
\definefiltersequence
[ISOArabic]
[ISO8859-6,UniCUni,CUniArab]
\definefiltersequence
[WINArabic]
[CP1256,UniCUni,CUniArab]
% For inner paragraph control within an LR paragraph
\def\ArabicTextUTF#1{{\textdir TRT\usefiltersequence[UTFArabic]%
\switchtobodyfont[omarb]#1\textdir TLT
\clearocplists}}
\def\ArabicTextISO#1{{\textdir TRT\usefiltersequence[ISOArabic]%
\switchtobodyfont[omarb]#1\textdir TLT
\clearocplists}}
\def\ArabicTextWIN#1{{\textdir TRT\usefiltersequence[WINArabic]%
\switchtobodyfont[omarb]#1\textdir TLT
\clearocplists}}
% For global Arabic script
\def\ArabicDirGlobal{%
\pagedir TRT\bodydir TRT\textdir TRT\pardir TRT %
\hoffset=-8.88cm} % compensate for a bug in \bodydir TRT
\def\ArabicUTF{\ArabicDirGlobal\usefiltersequence[UTFArabic]
\switchtobodyfont[omarb]}
\def\ArabicISO{\ArabicDirGlobal\usefiltersequence[ISOArabic]
\switchtobodyfont[omarb]}
\def\ArabicWIN{\ArabicDirGlobal\usefiltersequence[WINArabic]
\switchtobodyfont[omarb]}
% For separate Arabic-script paragraphs
\def\ArabicDirPar{\textdir TRT\pardir TRT}
\definestartstop
[arabutf]
[commands=%
{\usefiltersequence[UTFArabic]
\switchtobodyfont[omarb]%
\ArabicDirPar}]
\definestartstop
[arabiso]
[commands=%
{\usefiltersequence[ISOArabic]
\switchtobodyfont[omarb]%
\ArabicDirPar}]
\definestartstop
[arabwin]
[commands=%
{\usefiltersequence[WINArabic]
\switchtobodyfont[omarb]%
\ArabicDirPar}]
\showframe[text]
\starttext
\startarabutf
اللَّهُمَّ صَلِّ عَلَى مُحَمَّدٍ وَ
آلِ مُحَمَّدٍ وَ ارْزُقْنِي
الْيَقِينَ وَ حُسْنَ الظَّنِّ بِكَ
وَ أَثْبِتْ رَجَاءَكَ فِي قَلْبِي
وَ اقْطَعْ رَجَائِي عَمَّنْ سِوَاكَ
حَتَّى لَا أَرْجُوَ غَيْرَكَ وَ لَا
أَثِقَ إِلَّا بِك
\stoparabutf
\blank
\startarabiso
Çääñîçïåñî Õîäñð Ùîäîé åïÍîåñîÏí èî Âäð åïÍîåñîÏí èî ÇÑòÒïâòæðê
Çäòêîâðêæî èî ÍïÓòæî ÇäØñîæñð Èðãî èî ÃîËòÈðÊò ÑîÌîÇÁîãî áðê
âîäòÈðê èî Çâò×îÙò ÑîÌîÇÆðê Ùîåñîæò ÓðèîÇãî ÍîÊñîé äîÇ ÃîÑòÌïèî
ÚîêòÑîãî èî äîÇ ÃîËðâî ÅðäñîÇ Èðã
\stoparabiso
\blank
\startarabwin
Çááøóåõãøó Õóáøö Úóáóì ãõÍóãøóÏò æó Âáö ãõÍóãøóÏò æó ÇÑúÒõÞúäöí
ÇáúíóÞöíäó æó ÍõÓúäó ÇáÙøóäøö Èößó æó ÃóËúÈöÊú ÑóÌóÇÁóßó Ýöí
ÞóáúÈöí æó ÇÞúØóÚú ÑóÌóÇÆöí Úóãøóäú ÓöæóÇßó ÍóÊøóì áóÇ ÃóÑúÌõæó
ÛóíúÑóßó æó áóÇ ÃóËöÞó ÅöáøóÇ Èößþ
\stoparabwin
\blank
Here is some mixed {\em Arabic-} (\ArabicTextUTF{عربي}) and
Latin-script. As you can see, Aleph does a very good job mixing
{\em LR} (\ArabicTextUTF{يسار-يمين}) and {\em RL}
(\ArabicTextUTF{يمين-يسار}) texts. \ArabicTextUTF{و
هنا جملة منقطعة في وسط قرينة
لاتينية}. Aleph even does a great job breaking Arabic
phrases across lines.
\stoptext
=========================================
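The sample file above exercises the start/stop environments and the inline \ArabicTextUTF macro, but not the global switch \ArabicUTF that it defines. Here is a minimal sketch of how that switch would presumably be used for a document that is right-to-left throughout. It assumes the definitions above (everything before \showframe) have been saved in a separate file; the file name my-arabic-defs.tex is hypothetical, and issuing \ArabicUTF directly after \starttext is an assumption, not tested advice.

```tex
% tex=aleph output=dvipdfmx
% Sketch only. my-arabic-defs.tex is a hypothetical file holding the
% definitions from m-arabic-enc.tex (everything before \showframe).
\input my-arabic-defs.tex

\starttext

% Assumption: switch once, right after \starttext, so that page, body,
% paragraph and text direction are RL for the rest of the document.
% \ArabicUTF also selects the UTFArabic filter sequence and the omarb font.
\ArabicUTF

عربي

\stoptext
```

Since \ArabicDirGlobal changes \hoffset to work around the \bodydir bug, the -8.88cm value will likely need adjusting for a different page layout.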
III. Going beyond.
The last example shows how to make and apply your own filter sequences
beyond the basic Gamma module. To go further you need to learn some
low-level business. You will also need some working utilities. I have put
together a Windows package that you can unzip to C:\ConTeXt. These
utilities do work, but they are cobbled together from old fpTeX and MiKTeX
versions. Just place the tree in C:\ConTeXt\
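As a taste of that low-level business: filter sequences rest on Omega's OCP primitives, which Aleph inherits. The sketch below builds the same UTF-8 chain as the UTFArabic sequence by hand. The control-sequence names (\OCPutf, \OCPcuni, \OCPoar, \MyArabicOCPList) are made up for illustration, and it is only an assumption that \definefiltersequence does something broadly similar internally. Check the Omega manual for the exact semantics of the priority numbers.

```tex
% Load three compiled filters (.ocp files).
% The control-sequence names on the left are hypothetical.
\ocp\OCPutf=inutf8      % input filter: UTF-8 -> Unicode
\ocp\OCPcuni=uni2cuni   % contextual analysis
\ocp\OCPoar=cuni2oar    % output filter: font mapping

% Build a list. Assumption: filters with lower numbers are applied first.
\ocplist\MyArabicOCPList=
  \addbeforeocplist 1 \OCPutf
  \addbeforeocplist 2 \OCPcuni
  \addbeforeocplist 3 \OCPoar
  \nullocplist

% Activate the list inside a group, typeset, then clear it again.
{\textdir TRT \pushocplist\MyArabicOCPList
 \switchtobodyfont[omarb]عربي\textdir TLT
 \clearocplists}
```

The group mirrors the pattern used by \ArabicTextUTF, so the filters and the direction change stay local to the Arabic fragment.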
1. Example: If you want to get the final Persian kaaf instead of the
default Arabic one:
Check to see if your glyph is in the Arabic font. The Arabic font is made
of 6 raw fonts: 3 regular and three bold:
C:\ConTeXt\tex\texmf-local\fonts\type1\public\omega
omsea1, omsea1b,...omsea3b
Using a font viewer or editor you will find the Persian final kaaf in
omsea2, named kafswashfin.
Now go to
C:\ConTeXt\tex\texmf-local\omega\lambda\misc
and open
omarab.cfg
you will find a line
04AA N kafswashfin
This means that 04AA is the virtual font position for kafswashfin.
Open cuni2oar.otp and add the following at line 263:
@"E343 => @"04AA;
Following this line you should see
% remaining Arabic glyphs
@"E000-@"E3FF => #(\1 - @"DF00);
Basically, in uni2cuni.otp final-kaaf gets mapped to E343. In the font, we
want it mapped to kafswashfin, so we did that. Now recompile the otp:
otp2ocp cuni2oar
Now you will get kafswashfin for the final kaaf.
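To see what the extra line changes, it helps to work through the catch-all rule by hand. Assuming (as the placement before the catch-all suggests) that the first matching rule wins, the default and the override for final kaaf come out as follows. All values are hexadecimal.

```latex
% Default: catch-all rule  @"E000-@"E3FF => #(\1 - @"DF00)
\[
  \mathtt{E343}_{16} - \mathtt{DF00}_{16} = \mathtt{0443}_{16}
\]
% Override: explicit rule  @"E343 => @"04AA
\[
  \mathtt{E343}_{16} \;\mapsto\; \mathtt{04AA}_{16} \quad (\mbox{kafswashfin})
\]
```

So without the override the final kaaf lands in slot 0443 of the virtual font; with it, in slot 04AA.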
2. Want new fonts (Arabic or Latin)? Here are the instructions:
1. Read the following two papers carefully again and again; they are
your friends:-)
http://omega.enstb.org/papers/tsukuba-methods97.pdf
http://omega.enstb.org/papers/ridt-omega98.pdf
2. Make a pfb file containing the glyphs you need, or use an existing font.
3. Make a cfg file a la texmf\omega\lambda\misc\omlgc.cfg. Make sure you list
your glyph positions in hexadecimal notation.
4. Get the following from an old TeXLive distro: \support\makeovp.zip,
containing makeovp.pl. There is a SH file with a sample of its use
using omlgc.
5. Instructions for cooking omarab.ovf follow below. You want your
own ovf, say, omlgcch.ovf (<ch> for <cherokee>). Generate an afm
file for your private glyph pfb/pfa plus the afm files that are
listed in the SH file (base files for omlgc found in
\texmf\fonts\afm\public\omega ).
Using the instructions below and the SH file (IGNORE the kernings.afm
file!) you can figure out how to make your own ovp and ovf. Before
making the ovf file, examine the ovp file created, especially the
first few lines, to see how the font-metric info from the afm's are
concatenated. Very instructive.
6. Don't forget the rest of the accounting:
a) adding lines to a map file and pointing dvips/dvipdfm to it;
b) create a typescript file;
c) edit your otp's. If you get stuck be sure to read
http://omega.enstb.org/papers/tsukuba-arabic97.pdf
==============================================
[How to cook omarab.ovf:]
[Ingredients: omarab.cfg, omseco.afm, omsea1.afm, omsea2.afm, omsea3.afm]
#perl makeovp.pl omarab.cfg omseco.afm omsea1.afm omsea2.afm omsea3.afm omarab.ovp
#pltotf omseco.pl omseco.tfm
#pltotf omsea1.pl omsea1.tfm
#pltotf omsea2.pl omsea2.tfm
#pltotf omsea3.pl omsea3.tfm
#ovp2ovf omarab.ovp omarab.ovf omarab.ofm
[If the last line does not work, try
#ovp2ovf omarab.ovp omarab.ovf omarab.tfm
rename omarab.tfm to omarab.ofm ===> ofm directory]
-----------------------------
[How to distill omarab.ovp from omarab.ovf:]
[Use a different directory or a different name for
the output ovp so that omarab.ovp created above is not overwritten]
[get omarab.ofm & rename to omarab.tfm]
#ovf2ovp omarab.ovf omarab.tfm omarab.ovp
=========================================================
[How to cook omarabb.ovf:]
[Ingredients: omarab.cfg, omsecob.afm, omsea1b.afm, omsea2b.afm,
omsea3b.afm]
#perl makeovp.pl omarab.cfg omsecob.afm omsea1b.afm omsea2b.afm omsea3b.afm omarabb.ovp
#pltotf omsecob.pl omsecob.tfm
#pltotf omsea1b.pl omsea1b.tfm
#pltotf omsea2b.pl omsea2b.tfm
#pltotf omsea3b.pl omsea3b.tfm
#ovp2ovf omarabb.ovp omarabb.ovf omarabb.ofm
[If the last line does not work, try
#ovp2ovf omarabb.ovp omarabb.ovf omarabb.tfm
rename omarabb.tfm to omarabb.ofm ===> ofm directory]
-----------------------------
[How to distill omarabb.ovp from omarabb.ovf:]
[Use a different directory or a different name for
the output ovp so that omarabb.ovp created above is not overwritten]
[get omarabb.ofm & rename to omarabb.tfm]
#ovf2ovp omarabb.ovf omarabb.tfm omarabb.ovp
==============================================
3. For more info, there is also the (mostly cryptic) Omega manual:
http://omega.enstb.org/roadmap/doc-1.12.ps
Don't ask me why it's not in pdf-(
See also
http://omega.enstb.org/papers/tsukuba-arabic97.pdf
IV. Misc.
1. Some people have gotten large opentype fonts to work in Aleph/Omega.
Probably they used FontForge to convert to CFF-enriched type1. FF can
produce ofm files (large tfms) so that's a help too.
2. Me, I'm working on an advanced Arabic-script typesetting system that
really pushes Aleph to the max. At present I don't actually use m-gamma,
etc, but my own macros. I really hope to release something this year...
3. See also
http://www.dtek.chalmers.se/~d97ost/omega-example.html
V. To the future:
1. The otp mechanism does not seem well suited to support, e.g., opentype
GPOS tables, important for really advanced Arabic (though GDEF and GSUB
should work fine with the present mechanism for most purposes). We need a
better model for horizontal and vertical glyph substitutions.
2. The low-level filtersequence mechanism needs to abstract language
processing from font mapping. Right now both are hardwired into a single
sequence, so setting up more than one font for a single language is more
of a pain than it should be.
3. The otp language is a bit cryptic. Hans has suggested switching otp's
to a new language (like lua or io) but I don't know how hard that will
be...
4. One very important feature, which may work better at the primitive/engine
level by extending the pdfetex engine, is glyph substitution that depends on
the paragraph. For example, in traditional Arabic typography one way to
compensate for "underfull" paragraphs is to substitute a "swash" version of a
letter. Another way is to stretch the cursive tie between joining characters
(which is already implemented in my own Arabic system). Combined with HZ we
can get some pretty interesting high-level options and effects for the user
to choose from.
==============================================
==============================================
Ok, there is your (almost?) complete guide to getting going with aleph.
Feel free to make suggestions for improving this document. I hope you all
find it useful. Again, we need a volunteer to edit this for the wiki, and
a place to upload the utilities.
All the Best
Idris
--
Professor Idris Samawi Hamid
Department of Philosophy
Colorado State University
Fort Collins, CO 80523