The Unix Heritage Society mailing list
From: John Cowan <cowan@ccil.org>
To: arnold@skeeve.com, robpike@gmail.com, ralph@inputplus.co.uk,
	tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Date: Mon, 20 Mar 2023 18:01:23 -0400
Message-ID: <CAD2gp_TgTFL5agm8Z=immnGiMkpELL-wM_ZXos8OcKngw=2DLw@mail.gmail.com>
In-Reply-To: <20230320154430.DW_SS%steffen@sdaoden.eu>

On Mon, Mar 20, 2023 at 4:48 PM Steffen Nurpmeso <steffen@sdaoden.eu> wrote:

> However note that even something like "uppercase this string"
> cannot be done the right way, because a truly Unicode aware
> operation needs to look at the entire string (sentence), because
> there may be interdependencies that modify the result.


If you are talking about downcasing Greek Σ, then it's true that always
downcasing Σ to σ is inadequate.  Unicode specifies that if the Σ appears
before a space or punctuation mark, it downcases to ς instead.  But this is
not always correct.

For example, if the string "ΦΙΛΟΣ." is the word "φίλος" (meaning 'beloved'
or 'friend') at the end of a sentence, "φιλος." with a final ς is the correct
downcasing.  But if it is the abbreviation for "φιλοσοφία", meaning
"philosophy", then the correct downcasing is "φιλοσ." with a medial σ,
because the sigma is not word-final in the full word.  So getting this right
is an AI-complete problem which neither Unicode nor ICU can solve.
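
To make the difference concrete, here is a minimal sketch, assuming
CPython 3.3 or later, whose built-in str.lower() applies the Unicode
final-sigma rule.  The helper lower_per_char is a hypothetical name used
only for illustration; it lowercases one character at a time, so it never
sees the surrounding context.

```python
def lower_per_char(text: str) -> str:
    """Lowercase each character in isolation (no context available)."""
    return "".join(ch.lower() for ch in text)


word = "ΦΙΛΟΣ."

# Whole-string lowering sees that Σ follows a cased letter and is not
# followed by one, so it produces the final form ς.
whole = word.lower()

# Per-character lowering always maps Σ to the medial form σ.
isolated = lower_per_char(word)

print(whole)     # φιλος.
print(isolated)  # φιλοσ.
```

Neither call can know whether "ΦΙΛΟΣ." is the word or the abbreviation; the
whole-string result is right for the first reading and wrong for the second.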
