9fans - fans of the OS Plan 9 from Bell Labs
From: Shawn Rutledge <lists@ecloud.org>
To: 9fans <9fans@9fans.net>
Subject: Re: [9fans] Why does utfutf() exist?
Date: Thu, 18 Dec 2025 10:53:35 +0100
Message-ID: <BCC16A4B-CD61-45EB-B8D0-277D9064BDC2@ecloud.org>
In-Reply-To: <2ae07915-6e27-49f6-9424-d3eacc73e9e7@posixcafe.org>

> On Dec 17, 2025, at 22:17, Jacob Moody <moody@posixcafe.org> wrote:
> 
> I've been poking at some of the utf* functions lately and utfutf is a bit puzzling.
> At face value, strstr() should be sufficient for handling utf8 encoded strings just as strcmp() is.

Maybe normalization is the reason: the same character can have multiple representations. For example, ü might be a single code point (U+00FC, UTF-8: C3 BC), or it might be u followed by a combining diaeresis (U+0075 U+0308, UTF-8: 75 CC 88).  I would assume converting to runes would turn out the same either way: then you could compare them even if the haystack is represented one way in UTF-8 and the needle the other way.  (Disclaimer: I’m not a Unicode expert, even less so on Plan 9.)
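
The assumption about runes can be checked with a short sketch. It is written in Python rather than Plan 9 C, so everything in it is a stand-in: `in` on byte strings plays the role of strstr(), and a list of code points plays the role of a rune array. It says nothing about what utfutf() itself does; it only shows how the two spellings of ü behave under a byte-wise search, under plain decoding, and under explicit NFC normalization.

```python
import unicodedata

# Two spellings of the same visible character.
precomposed = "\u00fc"    # U+00FC, UTF-8: C3 BC
decomposed = "u\u0308"    # U+0075 + U+0308 (combining diaeresis), UTF-8: 75 CC 88

# Haystack uses the decomposed spelling, needle the precomposed one.
haystack = ("gr" + decomposed + "n").encode("utf-8")
needle = precomposed.encode("utf-8")

# Byte-wise substring search, standing in for strstr().
found_bytes = needle in haystack

# Plain decoding to code points (roughly: runes) keeps the two spellings distinct.
runes_pre = [ord(c) for c in precomposed]
runes_dec = [ord(c) for c in decomposed]

# Only an explicit normalization step makes the spellings compare equal.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
normalized_haystack = unicodedata.normalize("NFC", haystack.decode("utf-8")).encode("utf-8")
found_after_nfc = needle in normalized_haystack

print(found_bytes, runes_pre, runes_dec, nfc_equal, found_after_nfc)
```

Under these assumptions, decoding alone does not unify the two forms: one spelling is a single code point and the other is two. A search that should treat them as equal would need a separate normalization pass over both haystack and needle.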


------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8831073f8b8bb351-Mcf1aad549b2989d69b4d6347
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription

Thread overview: 6+ messages
2025-12-17 21:17 [9fans] Why does utfutf() exist? Jacob Moody
2025-12-18  9:53 ` Shawn Rutledge [this message]
2025-12-18 15:50   ` quiekaizam via 9fans
2025-12-18 17:13   ` Jacob Moody
2025-12-18 20:16     ` Rob Pike
2025-12-18 20:48       ` Jacob Moody