From: Philipp Gesang
To: ntg-context@ntg.nl
Newsgroups: gmane.comp.tex.context
Subject: sort-lan.lua nitpicks and sorting
Date: Sun, 2 May 2010 15:59:53 +0200
Message-ID: <20100502135953.GA9875@aides>

Hi again,

1. In sort-lan.lua, line 101 should read «['r'] = "r"», and line 144
   «['r'] = 26, -- r».

2.
   I did read the disclaimer about said file being “preliminary and
   incomplete”, but is there some rationale behind the range of integers
   for each language mapping? The mapping for English goes from 1 to 51
   in steps of 2 per letter (which is odd because it should start from
   index 3 with “a”, shouldn't it?), while the Czech one goes from 1 to 40
   without skipping, and Finnish and Austrian go from 1 to 58.

   What about mapping them onto a larger but common scale that would
   simplify multilingual sorting, so that the alphabetical representation
   of the phoneme /a/ maps to the same value across different languages?†
   E.g.

       ["a"] = 3, -- in a Latin mapping,
       ["α"] = 3, -- in a Greek mapping,
       ["а"] = 3, -- in a Russian mapping.

3. Is it intended that the digraph “ch” resolves (temporarily) to
   http://www.fileformat.info/info/unicode/char/ff01/index.htm according
   to line 72?

Feel free to state more general opinions on the sorting topic, as I am
playing with different ways of sorting my bibliography.

I will be glad of any advice,
Philipp

† I know this is impractical for many writing systems, and even within
  the set of Latin- or Greek-based alphabets it largely depends on the
  given purpose how much precision you need in sorting.
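
P.S. To make the idea in point 2 concrete, here is a minimal sketch of
such a shared scale in plain Lua. Everything in it is an assumption for
illustration only: the table names (latin, greek, cyrillic), the helpers
(merge, sortkey, less) and the integer values are made up and not taken
from sort-lan.lua. Characters missing from the tables simply get weight 0.

```lua
-- Hypothetical per-script mappings onto one common integer scale.
-- The values are illustrative, not the ones used in sort-lan.lua.
local latin    = { ["a"] = 3, ["b"] = 5, ["d"] = 9 }
local greek    = { ["α"] = 3, ["β"] = 5, ["δ"] = 9 }
local cyrillic = { ["а"] = 3, ["б"] = 5, ["д"] = 9 }

-- Combine several mappings into a single lookup table.
local function merge(...)
  local combined = {}
  for _, mapping in ipairs({ ... }) do
    for char, weight in pairs(mapping) do
      combined[char] = weight
    end
  end
  return combined
end

local shared = merge(latin, greek, cyrillic)

-- Split a UTF-8 string into characters and look up each weight.
-- Unknown characters get weight 0, so they sort before known ones.
local function sortkey(word)
  local key = {}
  for char in word:gmatch("[\1-\127\194-\244][\128-\191]*") do
    key[#key + 1] = shared[char] or 0
  end
  return key
end

-- Compare two words weight by weight; a shorter prefix sorts first.
local function less(a, b)
  local ka, kb = sortkey(a), sortkey(b)
  for i = 1, math.min(#ka, #kb) do
    if ka[i] ~= kb[i] then
      return ka[i] < kb[i]
    end
  end
  return #ka < #kb
end

local words = { "да", "ba", "αβ", "ab" }
table.sort(words, less)
```

With these tables “ab” and “αβ” receive the same key and therefore
compare as equal, while “ba” sorts before “да”. Whether such ties are
acceptable is exactly the precision question raised in the footnote.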
-- 
() ascii ribbon campaign - against html e-mail
/\ www.asciiribbon.org   - against proprietary attachments