From: Philipp Gesang <pgesang@ix.urz.uni-heidelberg.de>
To: ntg-context@ntg.nl
Subject: sort-lan.lua nitpicks and sorting
Date: Sun, 2 May 2010 15:59:53 +0200
Message-ID: <20100502135953.GA9875@aides>
Hi again,
1. In sort-lan.lua, line 101 should read «['r'] = "r"», and line 144
«['r'] = 26, -- r».
2. I have read the disclaimer about said file being “preliminary and
incomplete”, but is there some rationale behind the range of integers for
each language mapping? The mapping for English goes from 1 to 51,
interleaving 2 integers for each letter (which is odd because it should
start from index 3 with “a”, shouldn't it?), while the Czech one goes
from 1 to 40 without skipping, and the Finnish and Austrian ones from 1 to 58.
What about mapping them onto a larger but common scale that would
ease multilingual sorting, so that the alphabetical representation
of the phoneme /a/ maps to the same value across different languages?†
E.g.
["a"] = 3, -- in a Latin mapping,
["α"] = 3, -- in a Greek mapping,
["а"] = 3, -- in a Russian mapping.
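To make the idea concrete, here is a minimal sketch in plain Lua. Everything in it is hypothetical: the table `mappings`, the helpers `utfchars`, `sortkey` and `before`, and the weights themselves are made up for illustration and are not taken from sort-lan.lua. It only handles letters that are a single code point.

```lua
-- Hypothetical per-language mappings onto one shared scale:
-- letters for the same sound get the same weight.
local mappings = {
    latin   = { ["a"] = 3, ["b"] = 5 },
    greek   = { ["α"] = 3, ["β"] = 5 },
    russian = { ["а"] = 3, ["б"] = 5 },
}

-- Iterate over the UTF-8 characters of a string (no utf8 library needed).
local function utfchars(s)
    return s:gmatch("[\1-\127\194-\244][\128-\191]*")
end

-- Turn a word into a list of weights using the mapping of its language.
local function sortkey(word, language)
    local mapping, key = mappings[language], {}
    for ch in utfchars(word) do
        key[#key + 1] = mapping[ch] or 0 -- unknown characters sort first
    end
    return key
end

-- Compare two keys weight by weight; on a common prefix the shorter wins.
local function before(a, b)
    for i = 1, math.min(#a, #b) do
        if a[i] ~= b[i] then return a[i] < b[i] end
    end
    return #a < #b
end

local entries = {
    { word = "ба", language = "russian" },
    { word = "ab", language = "latin" },
    { word = "αα", language = "greek" },
}
for _, e in ipairs(entries) do e.key = sortkey(e.word, e.language) end
table.sort(entries, function(x, y) return before(x.key, y.key) end)
```

With these made-up weights the entries from three scripts interleave by sound rather than by script: "αα", then "ab", then "ба".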
3. Is it intended that the digraph “ch” resolves (temporarily) to
U+FF01 FULLWIDTH EXCLAMATION MARK
(http://www.fileformat.info/info/unicode/char/ff01/index.htm) according to
line 72?
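My guess at the purpose, as a sketch only: replacing the digraph with a single placeholder character lets it carry a weight of its own, so that “ch” can sort after “h” as in Czech. The names `placeholder`, `weights`, `sortkey` and `before` and all the numbers below are assumptions for illustration; I do not know whether sort-lan.lua actually works this way.

```lua
-- U+FF01 encoded as UTF-8 bytes, used as a stand-in for the digraph "ch".
local placeholder = "\239\188\129"

-- Hypothetical weights; the placeholder sits directly after "h".
local weights = {
    ["a"] = 1, ["c"] = 3, ["d"] = 4, ["h"] = 8,
    [placeholder] = 9,
    ["r"] = 18, ["t"] = 20,
}

-- Replace the digraph first, then map each UTF-8 character to its weight.
local function sortkey(word)
    local replaced = word:gsub("ch", placeholder)
    local key = {}
    for ch in replaced:gmatch("[\1-\127\194-\244][\128-\191]*") do
        key[#key + 1] = weights[ch] or 0 -- unknown characters sort first
    end
    return key
end

-- Compare two keys weight by weight; on a common prefix the shorter wins.
local function before(a, b)
    for i = 1, math.min(#a, #b) do
        if a[i] ~= b[i] then return a[i] < b[i] end
    end
    return #a < #b
end

local words = { "chata", "hrad", "cata" }
table.sort(words, function(x, y) return before(sortkey(x), sortkey(y)) end)
```

Under these assumptions “chata” ends up after “hrad”, whereas a plain byte-wise comparison would put it before.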
Feel free to state more general opinions on the sorting topic, as I am
playing with different ways of sorting my bibliography. I would be glad
of any advice,
Philipp
† I know this is impractical for many writing systems, and even within
the set of Latin or Greek based alphabets, how much precision you need
in sorting depends largely on the purpose.
--
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org - against proprietary attachments