ntg-context - mailing list for ConTeXt users
* Arabic index entries
@ 2008-06-20  0:23 Khaled Hosny
  2008-06-20  1:38 ` Idris Samawi Hamid
  2008-06-20  8:00 ` Hans Hagen
  0 siblings, 2 replies; 9+ messages in thread
From: Khaled Hosny @ 2008-06-20  0:23 UTC (permalink / raw)
  To: mailing list for ConTeXt users



Arabic index entries are all listed under "unknown" instead of under their
respective Arabic letters. I'm not sure if this is a bug or a
misconfiguration on my side. See the attached example.

Regards,

-- 
 Khaled Hosny
 Arabic localizer and member of Arabeyes.org team

[-- Attachment #1.1.2: index_ar.tex --]
[-- Type: text/x-tex, Size: 2126 bytes --]

% engine=luatex 
%
% Font
\definefontfeature
   [arabic]
   [mode=node,language=dflt,script=arab,
    init=yes,medi=yes,fina=yes,isol=yes,
    liga=yes,dlig=yes,rlig=yes,clig=yes,
    mark=yes,mkmk=yes,kern=yes,curs=yes]

\starttypescript [serif] [arabic]
  \definefontsynonym [Arabic-Light]       [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold]        [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Italic]      [name:arabtype] [features=arabic]
  \definefontsynonym [Arabic-Bold-Italic] [name:arabtype] [features=arabic]
\stoptypescript

\starttypescript [serif] [arabic] [name]
  \usetypescript[serif][fallback]
  \definefontsynonym [Serif]                   [Arabic-Light]                   [features=arabic]
  \definefontsynonym [SerifItalic]             [Arabic-Italic]                  [features=arabic]
  \definefontsynonym [SerifBold]               [Arabic-Bold]                    [features=arabic]
  \definefontsynonym [SerifBoldItalic]         [Arabic-Bold-Italic]             [features=arabic]
\stoptypescript

\starttypescript [Arabic]
  \definetypeface [Arabic] [rm] [serif] [arabic] [default] 
\stoptypescript 

\usetypescript[Arabic]
\setupbodyfont[Arabic,20pt]

% directionality
\pagedir TRT\bodydir TRT\pardir TRT\textdir TRT
%
\setcharactermirroring[1]

% hyperlinks
\setupinteraction
  [state=start,
   color=red,
   style=bold]

\starttext
\startstandardmakeup
  \midaligned{كيف تكتب عربي في كنتكست}
  \midaligned{كتبه}
  \midaligned{خالد حسني}
\stopstandardmakeup
\completecontent
%\placecontent
\chapter{مقدمة}
... كلام\index{عنصر فهرس} ...
\chapter{أول فصل}
\section[firstsection]{أول باب}
... كلام ...
\section{ثاني باب}
... كلام\index{فهرس آخر} ...
\subsection{ثاني مسألة}
... كلام\index{index entry} ...
\section{ثالث باب}
... كلام ...
\chapter{فصل آخر}
... كلام ...
\chapter[lastchapter]{آخر فصل}
... كلام ...
\completeindex
\stoptext

[-- Attachment #1.1.3: index_ar.pdf --]
[-- Type: application/pdf, Size: 27537 bytes --]


___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: Arabic index entries
  2008-06-20  0:23 Arabic index entries Khaled Hosny
@ 2008-06-20  1:38 ` Idris Samawi Hamid
  2008-06-20  6:14   ` Wolfgang Schuster
  2008-06-20  7:34   ` Hans Hagen
  2008-06-20  8:00 ` Hans Hagen
  1 sibling, 2 replies; 9+ messages in thread
From: Idris Samawi Hamid @ 2008-06-20  1:38 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>  
wrote:

> Arabic index entries are all listed under "unknown" instead of under their
> respective Arabic letters. I'm not sure if this is a bug or a
> misconfiguration on my side. See the attached example.

We need to include arabic-farsi-urdu etc. databases in the distro. If Hans  
can tell us what file to emulate/edit etc....

Best wishes
Idris

-- 
Professor Idris Samawi Hamid, Editor-in-Chief
International Journal of Shi`i Studies
Department of Philosophy
Colorado State University
Fort Collins, CO 80523

* Re: Arabic index entries
  2008-06-20  1:38 ` Idris Samawi Hamid
@ 2008-06-20  6:14   ` Wolfgang Schuster
  2008-06-21  1:06     ` Idris Samawi Hamid
  2008-06-20  7:34   ` Hans Hagen
  1 sibling, 1 reply; 9+ messages in thread
From: Wolfgang Schuster @ 2008-06-20  6:14 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Fri, Jun 20, 2008 at 3:38 AM, Idris Samawi Hamid
<ishamid@colostate.edu> wrote:
> On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>
> wrote:
>
>> Arabic index entries are all listed under "unknown" instead of under their
>> respective Arabic letters. I'm not sure if this is a bug or a
>> misconfiguration on my side. See the attached example.
>
> We need to include arabic-farsi-urdu etc. databases in the distro. If Hans
> can tell us what file to emulate/edit etc....

Index sorting in MkIV currently works only for English, Dutch, Czech and German.

Take a look at "sort-lan.lua" to see what you have to do.

Wolfgang

* Re: Arabic index entries
  2008-06-20  1:38 ` Idris Samawi Hamid
  2008-06-20  6:14   ` Wolfgang Schuster
@ 2008-06-20  7:34   ` Hans Hagen
  2008-06-20 16:02     ` Khaled Hosny
  1 sibling, 1 reply; 9+ messages in thread
From: Hans Hagen @ 2008-06-20  7:34 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Idris Samawi Hamid wrote:
> On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>  
> wrote:
> 
>> Arabic index entries are all listed under "unknown" instead of under their
>> respective Arabic letters. I'm not sure if this is a bug or a
>> misconfiguration on my side. See the attached example.
> 
> We need to include arabic-farsi-urdu etc. databases in the distro. If Hans  
> can tell us what file to emulate/edit etc....

first we need to discuss the logic ... say that we have a sequence of 
chars ... do we need to erase the vowels? etc


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Arabic index entries
  2008-06-20  0:23 Arabic index entries Khaled Hosny
  2008-06-20  1:38 ` Idris Samawi Hamid
@ 2008-06-20  8:00 ` Hans Hagen
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Hagen @ 2008-06-20  8:00 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Khaled Hosny wrote:
> Arabic index entries are all listed under "unknown" instead of under their
> respective Arabic letters. I'm not sure if this is a bug or a
> misconfiguration on my side. See the attached example.

btw, some of these things have to wait till i have adapted mkiv in a
more rigorous way.

for instance i'm currently rewriting the sectioning code, which is related
to lists; in lists we need to let language and such travel with the
entries; the same is true for the index

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Arabic index entries
  2008-06-20  7:34   ` Hans Hagen
@ 2008-06-20 16:02     ` Khaled Hosny
  2008-06-20 16:38       ` Hans Hagen
  2008-06-20 17:17       ` Charles P. Schaum
  0 siblings, 2 replies; 9+ messages in thread
From: Khaled Hosny @ 2008-06-20 16:02 UTC (permalink / raw)
  To: mailing list for ConTeXt users



On Fri, Jun 20, 2008 at 09:34:40AM +0200, Hans Hagen wrote:
> Idris Samawi Hamid wrote:
> > On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>  
> > wrote:
> > 
> >> Arabic index entries are all listed under "unknown" instead of under their
> >> respective Arabic letters. I'm not sure if this is a bug or a
> >> misconfiguration on my side. See the attached example.
> > 
> > We need to include arabic-farsi-urdu etc. databases in the distro. If Hans  
> > can tell us what file to emulate/edit etc....
> 
> first we need to discuss the logic ... say that we have a sequence of 
> chars ... do we need to erase the vowels? etc

Erase vowels as in not counting them? Then yes, we should only respect
full letters. We might also need to strip the Arabic definite
article "ال", but this will be tricky since there are words that start
with it. Maybe we had better have a syntax like \index[a]{entry}, where the
entry is filed under "a", or do we already have this?

Regards,
 Khaled

-- 
 Khaled Hosny
 Arabic localizer and member of Arabeyes.org team


* Re: Arabic index entries
  2008-06-20 16:02     ` Khaled Hosny
@ 2008-06-20 16:38       ` Hans Hagen
  2008-06-20 17:17       ` Charles P. Schaum
  1 sibling, 0 replies; 9+ messages in thread
From: Hans Hagen @ 2008-06-20 16:38 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Khaled Hosny wrote:
> On Fri, Jun 20, 2008 at 09:34:40AM +0200, Hans Hagen wrote:
>> Idris Samawi Hamid wrote:
>>> On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>  
>>> wrote:
>>>
>>>> Arabic index entries are all listed under "unknown" instead of under their
>>>> respective Arabic letters. I'm not sure if this is a bug or a
>>>> misconfiguration on my side. See the attached example.
>>> We need to include arabic-farsi-urdu etc. databases in the distro. If Hans  
>>> can tell us what file to emulate/edit etc....
>> first we need to discuss the logic ... say that we have a sequence of 
>> chars ... do we need to erase the vowels? etc
> 
> Erase vowels as in not counting them? Then yes, we should only respect
> full letters. We might also need to strip the Arabic definite
> article "ال", but this will be tricky since there are words that start
> with it. Maybe we had better have a syntax like \index[a]{entry}, where the
> entry is filed under "a", or do we already have this?

you can provide an optional sort key indeed
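
A minimal sketch of how such a sort key might be used, assuming the \index[key]{entry} form asked about above. The font and right-to-left setup from index_ar.tex are omitted, and the two keys are illustrative choices only: the first leaves out the definite article, the second leaves out the vowel marks.

```tex
% engine=luatex
% Sketch only: add the font and direction setup from index_ar.tex.
\starttext
% The printed entry keeps the article; the key without "ال" is used for sorting.
... \index[كتاب]{الكتاب} ...
% The printed entry keeps the vowel marks; the key holds the bare letters.
... \index[علم]{عِلْم} ...
\completeindex
\stoptext
```

The key only controls where an entry is sorted. Whether entries then appear under Arabic letter headings rather than "unknown" still depends on sorting support for the language, which this thread reports as missing.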

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
      tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                              | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: Arabic index entries
  2008-06-20 16:02     ` Khaled Hosny
  2008-06-20 16:38       ` Hans Hagen
@ 2008-06-20 17:17       ` Charles P. Schaum
  1 sibling, 0 replies; 9+ messages in thread
From: Charles P. Schaum @ 2008-06-20 17:17 UTC (permalink / raw)
  To: mailing list for ConTeXt users

The issues of indexing, &c., probably come down to two questions:

a. Is it something European-derivative in reference to a work, or
b. Is it something entirely for "native-speaking" use and expectations?

I've been on the Ivritex list for quite a while and there have been some
long-running issues on how to deal with mixed versus pure texts and what
people ought to expect. I have seen considerable variance in Hebrew
materials from the later nineteenth century to today in which they, for
example, consider the ex-height in relation to superdiacritica and
subdiacritica from nikkud to cantillation. They have had to tackle the
issues of handling a mixed- versus non-mixed-language text. It's
nontrivial.

Just from an historical perspective, at one time Latin and other
languages concatenated the articles to the words, for example, in the
nomenclature Alcoran for the Qur'an. Today in indexing (I have used
Cindex to do quite a few book indices) one generally drops the definite
and indefinite articles of most languages. Even in contents and chapter
headings, one avoids articles except in informal literature for
entertainment consumption. That may be language-dependent, for in German
and Greek one does have to use articles more than in English. Still, I
have seldom seen an index with arthrous forms in any language.

CPS

On Fri, 2008-06-20 at 19:02 +0300, Khaled Hosny wrote:
> On Fri, Jun 20, 2008 at 09:34:40AM +0200, Hans Hagen wrote:
> > Idris Samawi Hamid wrote:
> > > On Thu, 19 Jun 2008 18:23:05 -0600, Khaled Hosny <khaledhosny@eglug.org>  
> > > wrote:
> > > 
> > >> Arabic index entries are all listed under "unknown" instead of under their
> > >> respective Arabic letters. I'm not sure if this is a bug or a
> > >> misconfiguration on my side. See the attached example.
> > > 
> > > We need to include arabic-farsi-urdu etc. databases in the distro. If Hans  
> > > can tell us what file to emulate/edit etc....
> > 
> > first we need to discuss the logic ... say that we have a sequence of 
> > chars ... do we need to erase the vowels? etc
> 
> > Erase vowels as in not counting them? Then yes, we should only respect
> > full letters. We might also need to strip the Arabic definite
> > article "ال", but this will be tricky since there are words that start
> > with it. Maybe we had better have a syntax like \index[a]{entry}, where the
> > entry is filed under "a", or do we already have this?
> 
> Regards,
>  Khaled

* Re: Arabic index entries
  2008-06-20  6:14   ` Wolfgang Schuster
@ 2008-06-21  1:06     ` Idris Samawi Hamid
  0 siblings, 0 replies; 9+ messages in thread
From: Idris Samawi Hamid @ 2008-06-21  1:06 UTC (permalink / raw)
  To: mailing list for ConTeXt users

On Fri, 20 Jun 2008 00:14:49 -0600, Wolfgang Schuster  
<schuster.wolfgang@googlemail.com> wrote:

> "sort-lan.lua"

Thanks, Wolfgang!

Best wishes
Idris

-- 
Professor Idris Samawi Hamid, Editor-in-Chief
International Journal of Shi`i Studies
Department of Philosophy
Colorado State University
Fort Collins, CO 80523

