From: "Jun. T"
Subject: Re: read -d $'\200' doesn't work with set +o multibyte (and [PATCH])
Date: Sun, 18 Dec 2022 19:51:22 +0900
To: zsh-workers@zsh.org

> 2022/12/16 17:29, Oliver Kiddle wrote:
>
>>> + read -ed $'\xc2'
>>> +0:read delimited by a single byte terminates if the byte is part of a multibyte character
>>> +>> +>one
>>
>> Is this really what the standard requires (or will require)?
>> Breaking in the middle of a valid multibyte character looks
>> rather odd to me.
>
> The proposed standard wording appears to only talk about the case of the
> delimiter consisting of "one single-byte character". $'\xc2' is not a
> valid UTF-8 character so my interpretation is that they are leaving this
> undefined.

I thought the "one single-byte character" etc. applies only when
C or POSIX locale is in use.

> Behaviour that treats the input as raw bytes for a raw byte delimiter
> is consistent. This retains compatibility with the way things
> work for a non-multibyte locale. Not all files are valid UTF-8 and it
> can be useful to force things to work at a raw byte level.

I was thinking it would be enough if we could do 'byte-by-byte' analysis
by using the C/POSIX locale (or by turning the MULTIBYTE option off).

On the web page Stephane mentioned:
https://austingroupbugs.net/view.php?id=243#c6091

"When the current locale is not the C or POSIX locale, pathnames can
contain bytes that do not form part of a valid character, and therefore
portable applications need to ensure that the current locale is the C or
POSIX locale when using read with arbitrary pathnames as input."

But I'm not familiar with this type of document.
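
A rough sketch of what splitting on the raw byte 0xC2 looks like in the C locale. This is an illustration only: it uses tr and od as stand-ins rather than zsh's read builtin, and the sample input (the word "one", a pound sign encoded as 0xC2 0xA3 in UTF-8, then "two") is an assumption chosen for the example, not taken from the patch.

```shell
# Illustration only: tr/od stand in for a byte-level "read -d".
# '\302\243' is octal for 0xC2 0xA3, the UTF-8 encoding of the pound sign.
# The sample string is an assumption made for this sketch.
input='one\302\243two'

# In the C locale every byte is its own character, so 0xC2 can serve as
# a delimiter. Map it to a newline and keep the first record.
first=$(printf "$input" | LC_ALL=C tr '\302' '\n' | head -n 1)
printf '%s\n' "$first"        # one

# What remains starts with the orphaned continuation byte 0xA3,
# i.e. the multibyte character has been cut in half.
rest_hex=$(printf "$input" | LC_ALL=C tr '\302' '\n' | tail -n 1 \
           | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$rest_hex"     # a374776f
```

The second output shows the cost of working at the byte level: the remainder is no longer valid UTF-8, which is presumably acceptable only when the caller has deliberately asked for byte semantics (C/POSIX locale, or MULTIBYTE off in zsh).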