From: John Cowan <cowan@ccil.org>
To: arnold@skeeve.com, robpike@gmail.com, ralph@inputplus.co.uk,
tuhs@tuhs.org
Subject: [TUHS] Re: Bell Foreign-Language UNIX Efforts
Date: Mon, 20 Mar 2023 18:01:23 -0400
Message-ID: <CAD2gp_TgTFL5agm8Z=immnGiMkpELL-wM_ZXos8OcKngw=2DLw@mail.gmail.com>
In-Reply-To: <20230320154430.DW_SS%steffen@sdaoden.eu>
On Mon, Mar 20, 2023 at 4:48 PM Steffen Nurpmeso <steffen@sdaoden.eu> wrote:
> However note that even something like "uppercase this string"
> cannot be done the right way, because a truly Unicode aware
> operation needs to look at the entire string (sentence), because
> there may be interdependencies that modify the result.
If you are talking about downcasing Greek Σ, then it's true that always
downcasing Σ to σ is inadequate. Unicode specifies that if the Σ ends a
word (it follows a cased letter and no cased letter follows it, as before
a space or punctuation mark), it downcases to ς instead. But this is
not always correct.
For example, if the string "ΦΙΛΟΣ." is the word "φιλος" (meaning 'beloved'
or 'friend') at the end of a sentence, "φιλος." is the correct downcasing.
But if it is the abbreviation for "φιλοσοφία", meaning "philosophy", then
the correct downcasing is "φιλοσ." So getting this right is an AI-complete
problem which neither Unicode nor ICU can solve.
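To make the difference concrete, here is a minimal sketch in Python. It assumes CPython 3.3 or later, whose str.lower() applies the final-sigma rule when it is given the whole string. The two function names are made up for illustration and do not come from any library.

```python
# Sketch: per-character vs. whole-string lowercasing of Greek capital sigma.
# Function names are illustrative only.

def lower_per_char(s: str) -> str:
    """Lowercase one character at a time, so no character sees its context."""
    return "".join(ch.lower() for ch in s)


def lower_in_context(s: str) -> str:
    """Lowercase the whole string, letting the final-sigma rule apply."""
    return s.lower()


text = "ΦΙΛΟΣ."
print(lower_per_char(text))    # φιλοσ.  (medial sigma, right for the abbreviation)
print(lower_in_context(text))  # φιλος.  (final sigma, right for the word)
```

Each function is right for one reading and wrong for the other, and nothing in the string itself says which reading is meant.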