From: "quiekaizam via 9fans" <9fans@9fans.net>
Date: Fri, 19 Dec 2025 00:50:59 +0900
To: 9fans <9fans@9fans.net>, Shawn Rutledge
Subject: Re: [9fans] Why does utfutf() exist?
References: <2ae07915-6e27-49f6-9424-d3eacc73e9e7@posixcafe.org>

> I would assume converting to a rune would turn out the same either way:

This sounds wrong to me. IIUC Runes are just Unicode code points. Glyphs may have multiple representations in Unicode, of which your ü is a good example. Mapping these representations together is a question of Unicode normalization, however, and involves lots of fiddly questions whose answers are specific to the particular use case. As such, conversion to Runes cannot reasonably perform normalization AFAIU.

On 18 December 2025 at 18:53:35 JST, Shawn Rutledge <lists@ecloud.org> wrote:
>> On Dec 17, 2025, at 22:17, Jacob Moody <moody@posixcafe.org> wrote:
>>
>> I've been poking at some of the utf* functions lately and utfutf is a bit puzzling.
>> At face value, strstr() should be sufficient for handling utf8 encoded strings just as strcmp() is.
>
> Maybe normalization could be the reason: there can be multiple representations, for example, ü might be one code point (Unicode: U+00FC, UTF-8: C3 BC), or might be u with a combining umlaut. I would assume converting to a rune would turn out the same either way: then you can compare them even if the haystack is represented one way in utf8 and the needle is the other way. (Disclaimer: I’m not a unicode expert, even less so on 9)
>

------------------------------------------
9fans: 9fans
Permalink: https://9fans.topicbox.com/groups/9fans/T8831073f8b8bb351-Mb71f0b6c34b98f89c7952434
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
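
A small sketch of the point made in the reply above: decoding UTF-8 into code points does not merge the two spellings of ü, so any matching across them needs a separate, explicit normalization step. Python is used here only as a neutral illustration. The helper names (`code_points`, `byte_search`, `normalized_search`) and the sample string are hypothetical: `byte_search` merely stands in for what strstr() does on UTF-8 bytes, and `code_points` for conversion to runes. None of this claims to show how utfutf() itself is implemented.

```python
import unicodedata

# Two spellings of the same visible character "ü".
composed = "\u00fc"      # one code point: U+00FC (UTF-8: C3 BC)
decomposed = "u\u0308"   # two code points: U+0075 + U+0308 (combining diaeresis)


def code_points(s):
    """List the code points of s; the analogue of decoding UTF-8 into runes."""
    return [f"U+{ord(c):04X}" for c in s]


def byte_search(haystack, needle):
    """Byte-wise substring search over UTF-8 (stand-in for strstr()).

    Returns a byte offset, or -1 if the needle's bytes do not occur.
    """
    return haystack.encode("utf-8").find(needle.encode("utf-8"))


def normalized_search(haystack, needle, form="NFC"):
    """Search after normalizing both sides to the same form.

    Returns a character offset into the normalized haystack, or -1.
    """
    h = unicodedata.normalize(form, haystack)
    n = unicodedata.normalize(form, needle)
    return h.find(n)


haystack = "M\u00fcnchen"  # written with the composed form

# Decoding alone leaves the two spellings distinct.
print(code_points(composed))    # a single code point
print(code_points(decomposed))  # base letter plus combining mark

# A byte search finds the identically encoded needle but not the other spelling.
print(byte_search(haystack, composed))
print(byte_search(haystack, decomposed))

# Only after explicit normalization do the two spellings match.
print(normalized_search(haystack, decomposed))
```

Which normalization form to use (NFC here, purely as an assumption) is exactly the kind of use-case-specific choice the reply refers to, which is why it sits in its own function rather than inside the decoding step.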