ntg-context - mailing list for ConTeXt users
* italian index: "I" and "J" under "I", "U" and "V" under "V"
@ 2017-06-26 12:48 MF
  2017-06-26 13:34 ` Hans Hagen
  0 siblings, 1 reply; 6+ messages in thread
From: MF @ 2017-06-26 12:48 UTC (permalink / raw)
  To: mailing list for ConTeXt users

Hello list,
there's a bug in the way ConTeXt groups the items of a register when
"indicator=yes" and mainlanguage is "it":
- items starting with the J letter are grouped under the I letter
- items starting with the V letter are grouped under the U letter

This is not what one would expect from an index in modern Italian.
If you browse an Italian dictionary, you will find all 26 letters.

You can test the bug with this code:
----------------------
\starttext

\mainlanguage[it]

Imbuto\index{imbuto}, Juventus\index{Juventus},
volpe\index{volpe}, Windows\index{Windows},
uovo\index{uovo}, yes\index{yes}.

\placeindex[indicator=yes,n=1]

\stoptext
----------------------

Is there a way to work around this bug and get all 26 distinct
letters in a register while keeping "it" as the main language?

Thanks in advance,
best regards,
Massi
___________________________________________________________________________________
If your question is of interest to others as well, please add an entry to the Wiki!

maillist : ntg-context@ntg.nl / http://www.ntg.nl/mailman/listinfo/ntg-context
webpage  : http://www.pragma-ade.nl / http://context.aanhet.net
archive  : https://bitbucket.org/phg/context-mirror/commits/
wiki     : http://contextgarden.net
___________________________________________________________________________________


* Re: italian index: "I" and "J" under "I", "U" and "V" under "V"
  2017-06-26 12:48 italian index: "I" and "J" under "I", "U" and "V" under "V" MF
@ 2017-06-26 13:34 ` Hans Hagen
  2017-06-26 19:59   ` mf
  2017-06-26 21:39   ` Hans Åberg
  0 siblings, 2 replies; 6+ messages in thread
From: Hans Hagen @ 2017-06-26 13:34 UTC (permalink / raw)
  To: mailing list for ConTeXt users, MF

On 6/26/2017 2:48 PM, MF wrote:
> Hello list,
> there's a bug in the way ConTeXt groups the items of a register when
> "indicator=yes" and mainlanguage is "it":
> - items starting with the J letter are grouped under the I letter
> - items starting with the V letter are grouped under the U letter
> 
> This is not what one would expect from an index in modern italian.
> If you browse an italian dictionary, you will find all the 26 letters.
> 
> You can test the bug with this code:
> ----------------------
> \starttext
> 
> \mainlanguage[it]
> 
> Imbuto\index{imbuto}, Juventus\index{Juventus},
> volpe\index{volpe}, Windows\index{Windows},
> uovo\index{uovo}, yes\index{yes}.
> 
> \placeindex[indicator=yes,n=1]
> 
> \stoptext
> ----------------------
> 
> Is there a way to get around this bug and get all the 26 distinct
> letters in a register keeping "it" as the main language?
in sort-lan.lua you can fix the table:

definitions["it"] = {
     entries = {

(not sure which Italian is responsible for it)

Hans

-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: italian index: "I" and "J" under "I", "U" and "V" under "V"
  2017-06-26 13:34 ` Hans Hagen
@ 2017-06-26 19:59   ` mf
  2017-06-26 21:39   ` Hans Åberg
  1 sibling, 0 replies; 6+ messages in thread
From: mf @ 2017-06-26 19:59 UTC (permalink / raw)
  To: mailing list for ConTeXt users

> in sort-lan.lua you can fix the table:
> 
> definitions["it"] = {
>      entries = {
> 
> (not sure which italian is responsible for it)
> 
> Hans
> 
> 

Thank you, Hans.
Looking at the code, there's a revealing comment before the definitions
for the Latin language: 

-- Treating the post-classical fricatives “j” and “v” as “i” and “u”
-- respectively.

When I saw the bug, I suspected something like that, because "U" and
"V" are both written as "V" in Latin, but not in Italian.
In Italian, too, the letter "j" used to replace "i" between vowels, or
at the beginning of a word when followed by a vowel. But that is the
Italian of a century, or at least decades, ago; in modern Italian it is
rarely used and you won't find it, e.g., in newspapers.

Quoting from Wikipedia (https://it.wikipedia.org/wiki/Alfabeto_italiano):

"Il latino classico non distingueva graficamente la U dalla V (il
latino classico aveva solo la U e scriveva parole come divvs per
/ˈdiːwus/); in epoca classica e soprattutto nel latino medievale (che è
arrivato fino a noi tramite l'uso ecclesiastico) iniziò a farsi sentire
una distinzione tra U e V e quindi la nuova consonante venne creata
modificando la V in U..."

Classical Latin did not distinguish "U" from "V" graphically
(classical Latin had only the "U" letter and wrote words like "divvs"
for /ˈdiːwus/); during classical antiquity, and even more in medieval
Latin (which has come down to us through ecclesiastical use), a
distinction between "U" and "V" started to emerge, so the new consonant
was created by modifying the letter "V" into "U"...

"La J inizia ad essere usata nel '500 fino all'inizio del XX secolo,
sia per indicare il suono semiconsonantico della I (jella), ovvero la
"i" intervocalica (grondaja, aja), e come segno tipografico per la
doppia i (principj).
Le lettere I e J erano ancora considerate equivalenti, per quanto 
riguarda l'ordine alfabetico nei dizionari e nelle enciclopedie 
italiani, fino alla metà del XX secolo."

The letter "J" was in use from the 16th century until the beginning of
the 20th century, both to indicate the semiconsonantal sound of "I"
(jella) or the "i" between vowels (grondaja, aja), and as a typographic
sign for the double "i" (principj).
The letters "I" and "J" were still considered equivalent for
alphabetical order in Italian dictionaries and encyclopedias until the
middle of the 20th century.

Getting back to sort-lan.lua, the table should look like this:
-------------------------------
definitions["it"] = {
  entries = {
      ["a"] = "a", ["á"] = "a", ["b"] = "b", ["c"] = "c", ["d"] = "d",
      ["e"] = "e", ["é"] = "e", ["è"] = "e", ["f"] = "f", ["g"] = "g",
      ["h"] = "h", ["i"] = "i", ["í"] = "i", ["ì"] = "i", ["j"] = "j",
      ["k"] = "k", ["l"] = "l", ["m"] = "m", ["n"] = "n", ["o"] = "o",
      ["ó"] = "o", ["ò"] = "o", ["p"] = "p", ["q"] = "q", ["r"] = "r",
      ["s"] = "s", ["t"] = "t", ["u"] = "u", ["ú"] = "u", ["ù"] = "u",
      ["v"] = "v", ["w"] = "w", ["x"] = "x", ["y"] = "y", ["z"] = "z",
  },
-------------------------------
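
To make the effect of the table concrete, here is a minimal sketch in plain
Lua. It is only a simplified model of how an "entries" mapping decides the
heading a word is filed under, not the actual ConTeXt sorter; the function
group_by_entry and both small tables are hypothetical names made up for
this illustration, and only the first (ASCII) letter of each word is used.

```lua
-- Illustration only: NOT the ConTeXt sorter. All names are hypothetical.

-- Mapping as described in the bug report: j files under i, v under u.
local old_entries = { i = "i", j = "i", u = "u", v = "u", w = "w", y = "y" }

-- Mapping as in the proposed table: every letter files under itself.
local new_entries = { i = "i", j = "j", u = "u", v = "v", w = "w", y = "y" }

-- Return a table mapping each heading letter to the words filed under it.
local function group_by_entry(words, entries)
    local groups = { }
    for _, word in ipairs(words) do
        local first   = string.lower(string.sub(word, 1, 1)) -- ASCII only
        local heading = entries[first] or first
        groups[heading] = groups[heading] or { }
        table.insert(groups[heading], word)
    end
    return groups
end

-- Count the distinct headings in a grouping.
local function count_headings(groups)
    local n = 0
    for _ in pairs(groups) do
        n = n + 1
    end
    return n
end

-- The index entries from the test file at the start of the thread.
local words = { "imbuto", "Juventus", "volpe", "Windows", "uovo", "yes" }

local old_groups = group_by_entry(words, old_entries)
local new_groups = group_by_entry(words, new_entries)

print("old headings:", count_headings(old_groups))
print("new headings:", count_headings(new_groups))
```

With the six sample words, the first mapping yields four headings (I, U, W,
Y) and the second yields six, one per word.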

Thank you again,
best regards,
Massi

* Re: italian index: "I" and "J" under "I", "U" and "V" under "V"
  2017-06-26 13:34 ` Hans Hagen
  2017-06-26 19:59   ` mf
@ 2017-06-26 21:39   ` Hans Åberg
  2017-06-27  7:06     ` Hans Hagen
  1 sibling, 1 reply; 6+ messages in thread
From: Hans Åberg @ 2017-06-26 21:39 UTC (permalink / raw)
  To: mailing list for ConTeXt users; +Cc: Hans Hagen


> On 26 Jun 2017, at 15:34, Hans Hagen <pragma@wxs.nl> wrote:
> 
> On 6/26/2017 2:48 PM, MF wrote:
>> 
>> there's a bug in the way ConTeXt groups the items of a register when
>> "indicator=yes" and mainlanguage is "it":
>> - items starting with the J letter are grouped under the I letter
>> - items starting with the V letter are grouped under the U letter
>> This is not what one would expect from an index in modern italian.
>> If you browse an italian dictionary, you will find all the 26 letters.

>> Is there a way to get around this bug and get all the 26 distinct
>> letters in a register keeping "it" as the main language?
> in sort-lan.lua you can fix the table:
> 
> definitions["it"] = {
>    entries = {
> 
> (not sure which italian is responsible for it)

In Swedish, "w" was originally sorted the same as "v", but this has changed lately, though there is still a recommendation to use the old style in tables of personal names, given that the two letters are phonetically identical in Swedish. So there are two different sortings in use.



* Re: italian index: "I" and "J" under "I", "U" and "V" under "V"
  2017-06-26 21:39   ` Hans Åberg
@ 2017-06-27  7:06     ` Hans Hagen
  2017-06-27  8:43       ` Hans Åberg
  0 siblings, 1 reply; 6+ messages in thread
From: Hans Hagen @ 2017-06-27  7:06 UTC (permalink / raw)
  To: Hans Åberg, mailing list for ConTeXt users

On 6/26/2017 11:39 PM, Hans Åberg wrote:
> 
>> On 26 Jun 2017, at 15:34, Hans Hagen <pragma@wxs.nl> wrote:
>>
>> On 6/26/2017 2:48 PM, MF wrote:
>>>
>>> there's a bug in the way ConTeXt groups the items of a register when
>>> "indicator=yes" and mainlanguage is "it":
>>> - items starting with the J letter are grouped under the I letter
>>> - items starting with the V letter are grouped under the U letter
>>> This is not what one would expect from an index in modern italian.
>>> If you browse an italian dictionary, you will find all the 26 letters.
> 
>>> Is there a way to get around this bug and get all the 26 distinct
>>> letters in a register keeping "it" as the main language?
>> in sort-lan.lua you can fix the table:
>>
>> definitions["it"] = {
>>     entries = {
>>
>> (not sure which italian is responsible for it)
> 
> In Swedish, originally, "w" is sorted the same as "v", but it has changed lately, though there is a recommendation to still use the old style in tables of personal names, in view they phonetically identical in Swedish. So there are two different sortings in use.
It's no problem to have several vectors, because one can configure which
vector (set) to use.
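
As a rough sketch of that idea in plain Lua: two alternative vectors are
kept side by side and one is picked by name. This does not show how ConTeXt
itself is configured; the table "vectors", the names "sv-modern" and
"sv-names", and the function heading_for are all hypothetical and exist
only for this illustration.

```lua
-- Illustration only: NOT ConTeXt code. All names are hypothetical.

local vectors = {
    -- newer convention: w is a letter of its own
    ["sv-modern"] = { v = "v", w = "w" },
    -- older convention (e.g. for lists of personal names): w sorts as v
    ["sv-names"]  = { v = "v", w = "v" },
}

-- Return the heading letter for a word under the chosen vector.
local function heading_for(word, vectorname)
    local entries = vectors[vectorname]
    if not entries then
        error("unknown vector: " .. tostring(vectorname))
    end
    local first = string.lower(string.sub(word, 1, 1)) -- ASCII only
    return entries[first] or first
end

print(heading_for("Wallin", "sv-modern"))
print(heading_for("Wallin", "sv-names"))
```

The same word lands under "w" with the first vector and under "v" with the
second, so the choice of vector is what selects the sorting convention.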

Hans


-----------------------------------------------------------------
                                           Hans Hagen | PRAGMA ADE
               Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
        tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------

* Re: italian index: "I" and "J" under "I", "U" and "V" under "V"
  2017-06-27  7:06     ` Hans Hagen
@ 2017-06-27  8:43       ` Hans Åberg
  0 siblings, 0 replies; 6+ messages in thread
From: Hans Åberg @ 2017-06-27  8:43 UTC (permalink / raw)
  To: Hans Hagen; +Cc: mailing list for ConTeXt users


> On 27 Jun 2017, at 09:06, Hans Hagen <pragma@wxs.nl> wrote:
> 
> On 6/26/2017 11:39 PM, Hans Åberg wrote:
>> 
>> In Swedish, originally, "w" is sorted the same as "v", but it has changed lately, though there is a recommendation to still use the old style in tables of personal names, in view they phonetically identical in Swedish. So there are two different sortings in use.
> it's no problem to have several vectors because one can configure what vector(set) to choose

Indeed, that is what I think might be required in general.


