ntg-context - mailing list for ConTeXt users
* Re: Index sorting for other languages than English (2)
@ 2006-05-30  7:23 Richard Gabriel
  2006-05-30  8:00 ` Hans Hagen
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Gabriel @ 2006-05-30  7:23 UTC (permalink / raw)




I'd suggest using the extended variant of the \index macro. It lets you specify an ASCII equivalent of the word, which is then used for sorting:

\index[soz kesmek]{s\"oz kesmek}
\index[seref]{\c{s}eref}
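
For a separate Turkish index the same idea would look roughly like the sketch below. The register name trindex is made up for this example, and I have not run this exact file. As far as I can tell the keys are compared as plain ASCII, so a key of seref would land before soz kesmek. If \c{s} should follow all the plain s words, a key such as szzeref is one possible workaround, but that is only an assumption about what fits your index.

```
\mainlanguage[nl]

% "trindex" is a hypothetical register name chosen for this sketch.
\defineregister[trindex][trindices]

\starttext

% The bracketed key is used for sorting, the braced text is what gets printed.
Some text.
\trindex[saygi]{sayg\i}
\trindex[soz kesmek]{s\"oz kesmek}
% Assumed workaround: "szz..." pushes the entry behind all plain s words.
\trindex[szzeref]{\c{s}eref}

\page
\placeregister[trindex]

\stoptext
```

The drawback is that every accented entry needs its key written by hand, so this only scales to a modest number of terms.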

-Richard


  _____  

From: "R. Ermers" [mailto:r.ermers@hccnet.nl]
To: mailing list for ConTeXt users [mailto:ntg-context@ntg.nl]
Sent: Tue, 30 May 2006 08:43:01 +0200
Subject: [NTG-context] Index sorting for other languages than English (2)

Hi all,
  
  I have a document in Dutch (\mainlanguage[nl]) in which I quote Turkish 
  items, which I want to collect in a separate index, like this:
  
  "Enkele voorbeelden zijn: \quote{oudere zus} \turkish{abla}, 
  \quote{jongere broer of zus} \turkish{karde\c{s}}, de \quote{zus van 
  vader} (\quote{tante}) \turkish{hala}, \quote{de zus van moeder} 
  \turkish{teyze}. Voor aangetrouwde familieleden gelden soms juist vagere 
  termen dan in het Nederlands, bijv. \quote{aangetrouwde tante} en 
  \quote{schoonzuster}, \turkish{yenge}."
  
  The index, however, is based on Dutch (mainlanguage). This causes two 
  problems:
  
  1. words with accents, like s\"oz, are not sorted according to any standard:
  S
  söz kesmek 76
  saygı 14
  şeref 3, 14, 24, 27
  
  2. letters with diacritics, like \c{s} (under which \c{s}eref is to be 
  placed) are not included in the alphabetical listing in the index, which 
  of course follows the Dutch alphabet.
  
  Does anyone have a solution?
  
  Regards,
  
  Robert
  
  
  _______________________________________________
  ntg-context mailing list
  ntg-context@ntg.nl
  http://www.ntg.nl/mailman/listinfo/ntg-context
    

* Re: Index sorting for other languages that English
@ 2006-05-24 10:11 Richard Gabriel
  2006-05-24 15:55 ` Hans Hagen
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Gabriel @ 2006-05-24 10:11 UTC (permalink / raw)




Thanks Hans, it works with my test file, unless I set:

\setupregister[index][expansion=xml]

which I need for correct processing of the XML files.
If I simply add this command to the test TeX file (no XML), the Czech sorting stops working and all accented characters are placed under "A".
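
A stripped-down file along these lines should be enough to compare the two cases. This is only a sketch, not the test file attached earlier in the thread: the words are arbitrary, and the accents are written as TeX commands on the assumption that this keeps the file independent of any input regime.

```
% Minimal test, no XML involved.
\mainlanguage[cz]

% Uncomment the next line to compare the sorting with and without it:
% \setupregister[index][expansion=xml]

\starttext

Test.
% Entries are deliberately not in alphabetical order.
\index{ryba}
\index{\v{c}\'ap}
\index{dub}
\index{cesta}
\index{\v{r}eka}

\page
\completeindex

\stoptext
```

In the Czech alphabet \v{c} follows c and \v{r} follows r, so a correctly sorted index would list cesta, \v{c}\'ap, dub, ryba, \v{r}eka.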

Regarding the sorting itself (sort-lan.tex): 
I found the definition of the sorting rather strange, or at least incomplete. 
It makes no sense to treat ccaron as a separate letter while all other accented letters are placed under their unaccented counterparts.
I'll update the definitions, test them and send them to you.


-Richard



  _____  

From: Hans Hagen [mailto:pragma@wxs.nl]
To: mailing list for ConTeXt users [mailto:ntg-context@ntg.nl]
Sent: Tue, 23 May 2006 17:02:53 +0200
Subject: Re: [NTG-context] Index sorting for other languages that English

Richard Gabriel wrote:
  > Hello Hans,
  >
  > after an upgrade I noticed that the index sorting works even worse 
  > than before (tested on Czech, Chinese and Japanese, but probably 
  > related to non-ASCII characters in general).
  >
  > With TeXExec 5.4.3, all words beginning with national (accented) 
  > characters were put into a separate ("symbols") group and placed 
  > before "A". This was not good but more or less acceptable.
  > With TeXExec 6.2.0, words beginning with accented characters are 
  > placed under some unaccented letter. My colleague found out that 
  > these words are sorted according to their first unaccented letter. This is 
  > unacceptable and unusable.
  >
  > As a workaround, we try to avoid indexing words beginning with 
  > accented characters, but that is impossible in many cases.
  > I'd like to ask you to improve the index sorting. Could I help or 
  > contribute in some way?
  >
  > Attached is a test file, which creates 2 indexes from various Czech 
  > words (covering the Czech alphabet). The index should be sorted 
  > in exactly the order in which the terms are written in the file.
  >
  actually the new texexec implementation does czech sorting, but it's not enabled yet in context itself (was experimental until now) 
  
  - download the latest version (i uploaded a version that enables it) 
  - don't forget \mainlanguage[cz] at the top of your document 
  - in sort-lan.tex you can see how czech sorting is defined 
  
  (context adds a lot of info to the tui file in order to get sorting done) 
  
  -----------------------------------------------------------------
                                            Hans Hagen | PRAGMA ADE
                Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
       tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
                                               | www.pragma-pod.nl
  -----------------------------------------------------------------
  
    


Thread overview: 4+ messages
2006-05-30  7:23 Index sorting for other languages than English (2) Richard Gabriel
2006-05-30  8:00 ` Hans Hagen
  -- strict thread matches above, loose matches on Subject: below --
2006-05-24 10:11 Index sorting for other languages that English Richard Gabriel
2006-05-24 15:55 ` Hans Hagen
2006-05-30  6:43   ` Index sorting for other languages than English (2) R. Ermers
2006-05-30  7:35     ` Hans Hagen
