X-Msuck: nntp://news.gmane.io/gmane.comp.tex.context/42273
From: Arthur Reutenauer
Newsgroups: gmane.comp.tex.context
Subject: Re: Finish register sorting (was LuaTeX problems)
Date: Wed, 9 Jul 2008 15:52:23 +0200
Message-ID: <20080709135223.GX4532@phare.normalesup.org>
References: <115224fb0807030035s75096281ofda93c0a39ad7039@mail.gmail.com>
 <115224fb0807032351v19013d47x9e65a5ee8d340070@mail.gmail.com>
 <486DD357.2060604@wxs.nl>
 <115224fb0807090031j720a3372p81b9410051ba8e68@mail.gmail.com>
In-Reply-To: <115224fb0807090031j720a3372p81b9410051ba8e68@mail.gmail.com>
Reply-To: Mailing list for ConTeXt users
To: Mailing list for ConTeXt users
User-Agent: Mutt/1.5.17 (2007-11-01)
List-Id: mailing list for ConTeXt users
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="z6Eq5LdranGa6ru8"

--z6Eq5LdranGa6ru8
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

> [51] = "z", [53] = "Ã¥", [55] = "Ã¤", [57] = "Ã¶",

Indeed, the UTF-8 encoding has been badly interpreted as Windows-1252,
it seems (and then recoded back in UTF-8 :-)  I attach the correctly
encoded file (I also corrected “finish” :-)

	Arthur

--z6Eq5LdranGa6ru8
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: attachment; filename="sort-lan.lua"

-- filename : sort-lan.lua
-- comment  : companion to sort-lan.tex
-- author   : Hans Hagen, PRAGMA-ADE, Hasselt NL
-- copyright: PRAGMA ADE / ConTeXt Development Team
-- license  : see context related readme files

if not versions then versions = { } end versions['sort-lan'] = 1.001

-- this is a rather preliminary and incomplete file
-- maybe we should load this kind of stuff runtime

-- english

-- The next one can be more efficient when not indexed this way, but
-- other languages are sparse so for the moment we keep this one.

sorters.entries['en'] = {
    [ 1] = "a", [ 3] = "b", [ 5] = "c", [ 7] = "d", [ 9] = "e",
    [11] = "f", [13] = "g", [15] = "h", [17] = "i", [19] = "j",
    [21] = "k", [23] = "l", [25] = "m", [27] = "n", [29] = "o",
    [31] = "p", [33] = "q", [35] = "r", [37] = "s", [39] = "t",
    [41] = "u", [43] = "v", [45] = "w", [47] = "x", [49] = "y",
    [51] = "z",
    [ 2] =  1, [ 4] =  3, [ 6] =  5, [ 8] =  7, [10] =  9,
    [12] = 11, [14] = 13, [16] = 15, [18] = 17, [20] = 19,
    [22] = 21, [24] = 23, [26] = 25, [28] = 27, [30] = 29,
    [32] = 31, [34] = 33, [36] = 35, [38] = 37, [40] = 39,
    [42] = 41, [44] = 43, [46] = 45, [48] = 47, [50] = 49,
    [52] = 51,
}

sorters.mappings['en'] = {
    ["a"] =  1, ["b"] =  3, ["c"] =  5, ["d"] =  7, ["e"] =  9,
    ["f"] = 11, ["g"] = 13, ["h"] = 15, ["i"] = 17, ["j"] = 19,
    ["k"] = 21, ["l"] = 23, ["m"] = 25, ["n"] = 27, ["o"] = 29,
    ["p"] = 31, ["q"] = 33, ["r"] = 35, ["s"] = 37, ["t"] = 39,
    ["u"] = 41, ["v"] = 43, ["w"] = 45, ["x"] = 47, ["y"] = 49,
    ["z"] = 51,
    ["A"] =  2, ["B"] =  4, ["C"] =  6, ["D"] =  8, ["E"] = 10,
    ["F"] = 12, ["G"] = 14, ["H"] = 16, ["I"] = 18, ["J"] = 20,
    ["K"] = 22, ["L"] = 24, ["M"] = 26, ["N"] = 28, ["O"] = 30,
    ["P"] = 32, ["Q"] = 34, ["R"] = 36, ["S"] = 38, ["T"] = 40,
    ["U"] = 42, ["V"] = 44, ["W"] = 46, ["X"] = 48, ["Y"] = 50,
    ["Z"] = 52,
}

-- dutch

sorters.replacements['nl'] = { { "ij", 'y' }, { "IJ", 'Y' } }
sorters.entries     ['nl'] = sorters.entries ['en']
sorters.mappings    ['nl'] = sorters.mappings['en']

-- czech

local uc = unicode.utf8.char
local ub = unicode.utf8.byte

sorters.replacements['cz'] = {
    [1] = { "ch", uc(0xFF01) }
}

sorters.entries['cz'] = {
    [ 1] = "a",
    [ 2] = 1,
    [ 3] = "b",
    [ 4] = "c",
    [ 5] = uc(0x010D), -- ccaron
    [ 6] = "d",
    [ 7] = uc(0x010F), -- dcaron
    [ 8] = "e",
    [ 9] = 8,
    [10] = 8,
    [11] = "f",
    [12] = "g",
    [13] = "h",
    [14] = "ch",
    [15] = "i",
    [16] = 15,
    [17] = "j",
    [18] = "k",
    [19] = "l",
    [20] = "m",
    [21] = "n",
    [22] = uc(0x0148), -- ncaron
    [23] = "o",
    [24] = "p",
    [25] = "q",
    [26] = "r",
    [27] = uc(0x0159), -- rcaron
    [28] = "s",
    [29] = uc(0x0161), -- scaron
    [30] = "t",
    [31] = uc(0x0165), -- tcaron
    [32] = "u",
    [33] = 32,
    [34] = 32,
    [35] = "v",
    [36] = "w",
    [37] = "x",
    [38] = "y",
    [39] = "z",
    [40] = uc(0x017E), -- zcaron
}

sorters.mappings['cz'] = {
    ['a']        =  1, -- a
    [uc(0x00E1)] =  2, -- aacute
    ['b']        =  3, -- b
    ['c']        =  4, -- c
    [uc(0x010D)] =  5, -- ccaron
    ['d']        =  6, -- d
    [uc(0x010F)] =  7, -- dcaron
    ['e']        =  8, -- e
    [uc(0x00E9)] =  9, -- eacute
    [uc(0x011B)] = 10, -- ecaron
    ['f']        = 11, -- f
    ['g']        = 12, -- g
    ['h']        = 13, -- h
    [uc(0xFF01)] = 14, -- ch
    ['i']        = 15, -- i
    [uc(0x00ED)] = 16, -- iacute
    ['j']        = 17, -- j
    ['k']        = 18, -- k
    ['l']        = 19, -- l
    ['m']        = 20, -- m
    ['n']        = 21, -- n
    [uc(0x0148)] = 22, -- ncaron
    ['o']        = 23, -- o
    ['p']        = 24, -- p
    ['q']        = 25, -- q
    ['r']        = 26, -- r
    [uc(0x0159)] = 27, -- rcaron
    ['s']        = 28, -- s
    [uc(0x0161)] = 29, -- scaron
    ['t']        = 30, -- t
    [uc(0x0165)] = 31, -- tcaron
    ['u']        = 32, -- u
    [uc(0x00FA)] = 33, -- uacute
    [uc(0x016F)] = 34, -- uring
    ['v']        = 35, -- v
    ['w']        = 36, -- w
    ['x']        = 37, -- x
    ['y']        = 38, -- y
    ['z']        = 39, -- z
    [uc(0x017E)] = 40, -- zcaron
}

-- German (by Wolfgang Schuster)

-- DIN 5007-1

sorters.entries     ['DIN 5007-1'] = sorters.entries ['en']
sorters.mappings    ['DIN 5007-1'] = sorters.mappings['en']

-- DIN 5007-2

sorters.replacements['DIN 5007-2'] = {
    { "ä", 'ae' }, { "ö", 'oe' }, { "ü", 'ue' },
    { "Ä", 'Ae' }, { "Ö", 'Oe' }, { "Ü", 'Ue' }
}
sorters.entries     ['DIN 5007-2'] = sorters.entries ['en']
sorters.mappings    ['DIN 5007-2'] = sorters.mappings['en']

-- Duden

sorters.replacements['Duden'] = { { "ß", 's' } }
sorters.entries     ['Duden'] = sorters.entries ['en']
sorters.mappings    ['Duden'] = sorters.mappings['en']

-- new german

sorters.entries     ['de'] = sorters.entries ['en']
sorters.mappings    ['de'] = sorters.mappings['en']

-- old german

sorters.entries     ['deo'] = sorters.entries ['de']
sorters.mappings    ['deo'] = sorters.mappings['de']

-- german - Germany

sorters.entries     ['de-DE'] = sorters.entries ['de']
sorters.mappings    ['de-DE'] = sorters.mappings['de']

-- german - Swiss

sorters.entries     ['de-CH'] = sorters.entries ['de']
sorters.mappings    ['de-CH'] = sorters.mappings['de']

-- german - Austria

sorters.entries['de-AT'] = {
    [ 1] = "a", [ 3] =  1,  [ 5] = "b", [ 7] = "c", [ 9] = "d",
    [11] = "e", [13] = "f", [15] = "g", [17] = "h", [19] = "i",
    [21] = "j", [23] = "k", [25] = "l", [27] = "m", [29] = "n",
    [31] = "o", [33] = 31,  [35] = "p", [37] = "q", [39] = "r",
    [41] = "s", [43] = "t", [45] = "u", [47] = 45,  [49] = "v",
    [51] = "w", [53] = "x", [55] = "y", [57] = "z",
    [ 2] =  1, [ 4] =  3, [ 6] =  5, [ 8] =  7, [10] =  9,
    [12] = 11, [14] = 13, [16] = 15, [18] = 17, [20] = 19,
    [22] = 21, [24] = 23, [26] = 25, [28] = 27, [30] = 29,
    [32] = 31, [34] = 33, [36] = 35, [38] = 37, [40] = 39,
    [42] = 41, [44] = 43, [46] = 45, [48] = 47, [50] = 49,
    [52] = 51, [54] = 53, [56] = 55, [58] = 57,
}

sorters.mappings['de-AT'] = {
    ["a"] =  1, ["ä"] =  3, ["b"] =  5, ["c"] =  7, ["d"] =  9,
    ["e"] = 11, ["f"] = 13, ["g"] = 15, ["h"] = 17, ["i"] = 19,
    ["j"] = 21,
    ["k"] = 23, ["l"] = 25, ["m"] = 27, ["n"] = 29,
    ["o"] = 31, ["ö"] = 33, ["p"] = 35, ["q"] = 37, ["r"] = 39,
    ["s"] = 41, ["t"] = 43, ["u"] = 45, ["ü"] = 47, ["v"] = 49,
    ["w"] = 51, ["x"] = 53, ["y"] = 55, ["z"] = 57,
    ["A"] =  2, ["Ä"] =  4, ["B"] =  6, ["C"] =  8, ["D"] = 10,
    ["E"] = 12, ["F"] = 14, ["G"] = 16, ["H"] = 18, ["I"] = 20,
    ["J"] = 22, ["K"] = 24, ["L"] = 26, ["M"] = 28, ["N"] = 30,
    ["O"] = 32, ["Ö"] = 34, ["P"] = 36, ["Q"] = 38, ["R"] = 40,
    ["S"] = 42, ["T"] = 44, ["U"] = 46, ["Ü"] = 48, ["V"] = 50,
    ["W"] = 52, ["X"] = 54, ["Y"] = 56, ["Z"] = 58,
}

-- finnish (by Wolfgang Schuster)

sorters.entries['fi'] = {
    [ 1] = "a", [ 3] = "b", [ 5] = "c", [ 7] = "d", [ 9] = "e",
    [11] = "f", [13] = "g", [15] = "h", [17] = "i", [19] = "j",
    [21] = "k", [23] = "l", [25] = "m", [27] = "n", [29] = "o",
    [31] = "p", [33] = "q", [35] = "r", [37] = "s", [39] = "t",
    [41] = "u", [43] = "v", [45] = "w", [47] = "x", [49] = "y",
    [51] = "z", [53] = "å", [55] = "ä", [57] = "ö",
    [ 2] =  1, [ 4] =  3, [ 6] =  5, [ 8] =  7, [10] =  9,
    [12] = 11, [14] = 13, [16] = 15, [18] = 17, [20] = 19,
    [22] = 21, [24] = 23, [26] = 25, [28] = 27, [30] = 29,
    [32] = 31, [34] = 33, [36] = 35, [38] = 37, [40] = 39,
    [42] = 41, [44] = 43, [46] = 45, [48] = 47, [50] = 49,
    [52] = 51, [54] = 53, [56] = 55, [58] = 57,
}

sorters.mappings['fi'] = {
    ["a"] =  1, ["b"] =  3, ["c"] =  5, ["d"] =  7, ["e"] =  9,
    ["f"] = 11, ["g"] = 13, ["h"] = 15, ["i"] = 17, ["j"] = 19,
    ["k"] = 21, ["l"] = 23, ["m"] = 25, ["n"] = 27, ["o"] = 29,
    ["p"] = 31, ["q"] = 33, ["r"] = 35, ["s"] = 37, ["t"] = 39,
    ["u"] = 41, ["v"] = 43, ["w"] = 45, ["x"] = 47, ["y"] = 49,
    ["z"] = 51, ["å"] = 53, ["ä"] = 55, ["ö"] = 57,
    ["A"] =  2, ["B"] =  4, ["C"] =  6,
    ["D"] =  8, ["E"] = 10,
    ["F"] = 12, ["G"] = 14, ["H"] = 16, ["I"] = 18, ["J"] = 20,
    ["K"] = 22, ["L"] = 24, ["M"] = 26, ["N"] = 28, ["O"] = 30,
    ["P"] = 32, ["Q"] = 34, ["R"] = 36, ["S"] = 38, ["T"] = 40,
    ["U"] = 42, ["V"] = 44, ["W"] = 46, ["X"] = 48, ["Y"] = 50,
    ["Z"] = 52, ["Å"] = 54, ["Ä"] = 56, ["Ö"] = 58,
}

--~ sorters.test = ''
--~ sorters.test = 'nl'
--~ sorters.test = 'cz'

--~ if sorters.test == 'nl' then -- dutch test
--~     data = {
--~         { 'e', { {"ijsco",""} },2,"","","",""},
--~         { 'e', { {"ysco" ,""} },2,"","","",""},
--~         { 'e', { {"ijsco",""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"aap" ,""} },2,"","","",""},
--~         { 'e', { {"$a$"  ,""} },2,"","","",""},
--~         { 'e', { {"aap"  ,""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"aap" ,""} },6,"","","",""},
--~         { 'e', { {"hans" ,""}, {"noot",""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"mies",""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"mies",""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"mies",""}, [3] = {"oeps",""} },2,"","","",""},
--~         { 'e', { {"hans" ,""}, {"mies",""}, [3] = {"oeps",""} },4,"","","",""},
--~     }
--~     sorters.index.process({ entries = data, language = 'nl'})
--~ elseif sorters.test == 'cz' then -- czech test
--~     data = {
--~         { 'e', { {"blabla",""} },2,"","","",""},
--~         { 'e', { {"czacza",""} },2,"","","",""},
--~         { 'e', { {"albalb",""} },2,"","","",""},
--~         { 'e', { {"azcazc",""} },2,"","","",""},
--~         { 'e', { {"chacha",""} },2,"","","",""},
--~         { 'e', { {"hazzah",""} },2,"","","",""},
--~         { 'e', { {"iaccai",""} },2,"","","",""},
--~     }
--~     sorters.index.process({ entries = data, language = 'cz'})
--~ end

--~ print(table.serialize(sorters))

--z6Eq5LdranGa6ru8
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

___________________________________________________________________________________
If your question is of
interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://tex.aanhet.net
archive  : https://foundry.supelec.fr/projects/contextrev/
wiki     : http://contextgarden.net
___________________________________________________________________________________

--z6Eq5LdranGa6ru8--
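
The double encoding diagnosed in the reply can be reproduced in a few lines of plain Lua. This is only a minimal sketch and not part of sort-lan.lua or ConTeXt: `latin1_to_utf8` and `hex` are hypothetical helper names, and the sketch treats each byte as a Latin-1 character, which agrees with Windows-1252 for the bytes involved here (everything from 0xA0 upwards, plus 0xC3). Applied to the UTF-8 bytes of "å", it produces the four bytes C3 83 C2 A5 that appear in the quoted table line.

```lua
-- Hypothetical helper (not part of ConTeXt): treat every byte >= 0x80 as one
-- Latin-1 code point and re-encode it as a two-byte UTF-8 sequence.
local function latin1_to_utf8(s)
    return (s:gsub("[\128-\255]", function(c)
        local b = c:byte()
        -- lead byte 0xC2 or 0xC3, then a continuation byte with the low 6 bits
        return string.char(192 + math.floor(b / 64), 128 + b % 64)
    end))
end

-- Hypothetical helper: show a byte string as space-separated hex pairs.
local function hex(s)
    local out = {}
    for i = 1, #s do
        out[i] = string.format("%02X", s:byte(i))
    end
    return table.concat(out, " ")
end

local a_ring  = "\195\165"               -- correct UTF-8 bytes of "å"
local garbled = latin1_to_utf8(a_ring)   -- bytes misread one by one, then re-encoded

print(hex(a_ring))    -- C3 A5
print(hex(garbled))   -- C3 83 C2 A5
```

The same step turns "ä" (C3 A4) into C3 83 C2 A4 and "ö" (C3 B6) into C3 83 C2 B6, matching the other two garbled entries in the quote.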