The Unix Heritage Society mailing list
From: Steffen Nurpmeso <steffen@sdaoden.eu>
To: John Cowan <cowan@ccil.org>
Cc: tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Date: Mon, 20 Mar 2023 23:28:19 +0100
Message-ID: <20230320222819.tL9QW%steffen@sdaoden.eu>
In-Reply-To: <CAD2gp_TgTFL5agm8Z=immnGiMkpELL-wM_ZXos8OcKngw=2DLw@mail.gmail.com>

John Cowan wrote in
 <CAD2gp_TgTFL5agm8Z=immnGiMkpELL-wM_ZXos8OcKngw=2DLw@mail.gmail.com>:
 |On Mon, Mar 20, 2023 at 4:48 PM Steffen Nurpmeso <steffen@sdaoden.eu> \
 |wrote:
 |
 |> However note that even something like "uppercase this string"
 |> cannot be done the right way, because a truly Unicode aware
 |> operation needs to look at the entire string (sentence), because
 |> there may be interdependencies that modify the result.
 |
 |If you are talking about downcasing Greek Σ, then it's true that always
 |downcasing Σ to σ is inadequate.  Unicode specifies that if the Σ appears
 |before a space or punctuation mark, it downcases to ς instead.  But this is
 |not always correct.
 |
 |For example, if the string "ΦΙΛΟΣ." is the word "φιλος" (meaning 'beloved'
 |or 'friend') at the end of a sentence, "φιλος." is the correct downcasing.
 |But if it is the abbreviation for "φιλοσοφία", meaning "philosophy", then
 |the correct downcasing is "φιλοσ."  So getting this right is an AI-complete
 |problem which neither Unicode nor ICU can solve.

Oh, i wish i were able to speak/read/write (old) Greek.
Unfortunately, after English, i either had to go to another school
or choose between French and Latin (i would have given
everything for Chinese, Japanese, and/or Russian), so i chose
Latin.  And whereas i started out as one of the three best, i then
watched an interview with a CDU ("republican") state secretary,
conducted by the wonderful Lea Rosh, in which he talked Latin; and
even though she repeatedly said "i understand you, but what about
the audience?", you know, i as a young teenager was _so_ pissed
that "i quit", as in the book "The Tin Drum" by Günter Grass.
So this made my grade point average a bit weaker.

But yes, i think quite a lot of languages have this problem.  Even
my own native language German, for the conversion of the lowercase
sharp-s, even though for over a hundred years some have tried to
establish an uppercase variant, which the Swiss tongue has.  (Mind you, even
after WWII when that uppercase ss was forbidden, at least in some
dosage forms, like that one used by the US rock band Kiss, ..not.)
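
To make the sharp-s problem concrete, here is a minimal sketch in
Python (my choice of language for illustration only; the example
word is likewise just an assumption).  It only shows what the
default Unicode case mappings, as implemented by Python's str
methods, do with the letter; it says nothing about what is
orthographically right.

```python
# Sketch only: default Unicode case mappings for German sharp-s,
# as exposed by Python's built-in str methods.

word = "straße"                  # illustrative example word

upper = word.upper()             # ß has no one-to-one uppercase here: it becomes "SS"
round_trip = upper.lower()       # lowercasing cannot know that "SS" was once "ß"

capital_sharp_s = "\u1E9E"       # ẞ, LATIN CAPITAL LETTER SHARP S
lowered_capital = capital_sharp_s.lower()   # this direction is one-to-one

print(upper)            # STRASSE
print(round_trip)       # strasse
print(lowered_capital)  # ß
```

So the string grows by one character on the way up, and the
original spelling is lost on the way back down.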

If you ask on the Unicode mailing-list, you will be told to
only convert entire sentences.  But it seems Greek sigma is very
special, says the Unicode FAQ.
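
As a hedged illustration of the sigma case quoted above, a small
Python sketch.  The helper naive_lower is hypothetical, made up
here to stand in for a converter that sees one character at a
time; str.lower() is Python's built-in, which looks at the
neighbouring characters when it meets a capital sigma.

```python
# Sketch only: per-character lowercasing versus whole-string lowercasing
# of Greek capital sigma.  naive_lower is a hypothetical helper.

def naive_lower(s: str) -> str:
    """Lowercase each character in isolation, without any context."""
    return "".join(ch.lower() for ch in s)

text = "ΦΙΛΟΣ."

context_free = naive_lower(text)   # every Σ becomes σ
context_aware = text.lower()       # a word-final Σ becomes ς

print(context_free)                # φιλοσ.
print(context_aware)               # φιλος.
print("ΦΙΛΟΣΟΦΙΑ".lower())         # φιλοσοφια (Σ inside a word stays σ)
```

The whole-string result is right for the word meaning 'friend',
and wrong if the string is the abbreviation, since nothing in the
string itself tells the two apart.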

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
