ntg-context - mailing list for ConTeXt users
* Czech sorting rules
@ 2005-08-09 11:57 David Antos
From: David Antos @ 2005-08-09 11:57 UTC



	Hello,

I've tried to find something relevant about the terrible Czech sorting :-)

The first thing to note is that there is a standard (from 1970 or so) that
cannot really be implemented: it requires absurd orderings such as
Karel IV < Karel III, because numerals are supposed to be sorted as if they
were written out in words (ctvrty, treti) :-)

So in practice there are only more-or-less accurate approximations. A fairly
good introduction is http://www.vitsoft.info/sortkit.htm

Sorting is considered very reasonable if it conforms to the order given
in http://www.fi.muni.cz/~adelton/l10n/cssort/cssort.table . Characters on
a single line of the table are considered equivalent. Note the digraph
`ch', which counts as a single letter and is sorted between h and i. The
table also contains accented letters that are not used in Czech (such as
l with stroke and z with dot above). IMHO it should be completely OK for
Slovak as well (as they, I hope, inherited the standard).
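
To make the idea of "one rank per table line" concrete, here is a minimal
Python sketch. It is not taken from cssort.table: the alphabet below is a
hypothetical, heavily shortened stand-in, and folding every accented letter
onto its base letter is a simplifying assumption (the real table decides
which characters share a line). It only shows `ch' handled as one letter
between h and i.

```python
import unicodedata

# Hypothetical, shortened alphabet: one entry per "table line".
# 'ch' is a single letter placed between 'h' and 'i'.
ALPHABET = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
            "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def strip_accents(text):
    """Drop combining marks (assumption: accented == base letter)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def sort_key(word):
    """Turn a word into a list of ranks, reading 'ch' as one letter."""
    base = strip_accents(word.lower())
    key = []
    i = 0
    while i < len(base):
        if base.startswith("ch", i):
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters missing from the toy alphabet sort last.
            key.append(RANK.get(base[i], len(ALPHABET)))
            i += 1
    return key


words = ["cihla", "hrad", "chata", "ihned", "cesta"]
print(sorted(words, key=sort_key))
# -> ['cesta', 'cihla', 'hrad', 'chata', 'ihned']
```

A plain byte-wise sort would put "chata" between "cesta" and "cihla"; with
the digraph rule it lands after "hrad" and before "ihned".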

I think it would be completely OK to sort according to that table,
treating the characters on a single line as equivalent. The module the
table comes from implements a four-pass sorting algorithm that reflects the
rather nasty rules, see http://www.fi.muni.cz/~adelton/l10n/cssort/csort.c .
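
The general shape of a multi-pass comparison can be sketched in a few lines
of Python. This is not the algorithm from csort.c (its four passes are not
described here); it is a two-pass illustration under the assumption that
accents on vowels only break ties between otherwise equal words. The
function names are made up for the example.

```python
import unicodedata


def strip_accents(text):
    """Drop combining marks so that accented vowels match plain ones."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def two_pass_key(word):
    """Pass 1: compare with accents ignored. Pass 2: accents break ties."""
    lowered = unicodedata.normalize("NFC", word.lower())
    return (strip_accents(lowered), lowered)


words = ["drahy", "dráha", "draha"]
print(sorted(words, key=two_pass_key))
# -> ['draha', 'dráha', 'drahy']
```

A single-pass sort by code point gives draha, drahy, dráha instead, because
the accent is compared before the rest of the word has been looked at. Each
further pass would add one more element to the tuple.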

An example of sorted sequences is
http://www.fi.muni.cz/~adelton/l10n/cssort/sort.tab .

The question is whether it is reasonable to implement this internally in
ConTeXt or to use an external module. An external Perl module was once
prepared by Tom Hudec (he even modified the sorting table: he preferred all
letters with `hacek (\v{})' to sort after those without \v). If you
consider an internal ConTeXt implementation feasible, I'd be happy if you
commented the sorting macros a bit, so that I could contact native Czech
users and fine-tune it. I'd like to discuss it with our Czech TeX friends;
I don't consider myself a sorting expert (it's quite tricky, isn't it).

Thanks,
D.A.

-- 
Early to rise, early to bed, makes a man healthy, wealthy and dead.
-- Terry Pratchett, "The Light Fantastic"
