From: "Jun. T"
Subject: Re: [bug] busyloop upon $=var with NULs when $IFS contains both NUL and a byte > 0x7f
Date: Tue, 29 Nov 2022 23:27:10 +0900
To: zsh-workers@zsh.org
References: <20221118142717.t4elzrigjeizjm6w@chazelas.org>
In-Reply-To: <20221118142717.t4elzrigjeizjm6w@chazelas.org>
X-Seq: 51083

> 2022/11/18 23:27, Stephane Chazelas wrote:
>
> $ LC_ALL=C zsh -c 'IFS=é$IFS; echo $=IFS'
> ^C
>
> (busy loop had to be interrupted with ^C).

This is not simple to solve. The basic question is:
What should we do if IFS contains invalid characters?

When IFS changes, ifssetfn() calls inittyptab(), which then calls
set_widearray() (at line 4172 in utils.c) to fill in the structure
ifs_wide. The origin of the problem seems to be in this function
(also in utils.c):

  95	    mblen = mb_metacharlenconv(mb_array, &wci);
  ..
  99	    /* No good unless all characters are convertible */
 100	    if (wci == WEOF)
 101		return;

mb_array is the current IFS (metafied), and it contains é = \xc3\xa9.
In the C locale (at least on Linux), \xc3 is an invalid character,
so wci is set to WEOF. The function then returns without setting
ifs_wide (ifs_wide.chars=NULL and ifs_wide.len=0).

The comment at line 99 may look reasonable, but leaving ifs_wide
empty is equally 'no good', I think. Because ifs_wide is empty,
itype_end() (and wcsitype()) do not work as expected for
characters >= \x80.

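To make the difference between "give up on the first invalid character" and "store only the valid characters" concrete, here is a minimal sketch. It is not the zsh code: struct wide_array, toy_decode(), convert_all_or_nothing() and convert_keep_valid() are all hypothetical names, and toy_decode() only assumes a C locale in which every byte above 0x7f is invalid (which is what the email reports for Linux).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#define WIDE_MAX 64

/* Hypothetical stand-in for the ifs_wide structure (chars + len). */
struct wide_array {
    wchar_t chars[WIDE_MAX];
    size_t len;
};

/*
 * Toy replacement for mb_metacharlenconv(): models a locale in which
 * only bytes <= 0x7f are valid. Stores WEOF for an invalid byte and
 * returns the number of bytes consumed (always 1 in this model).
 */
static size_t toy_decode(const char *s, wint_t *wc)
{
    unsigned char c = (unsigned char)*s;
    *wc = (c <= 0x7f) ? (wint_t)c : WEOF;
    return 1;
}

/* Behaviour described above: return at the first invalid character,
 * which leaves the array empty (len == 0). */
static void convert_all_or_nothing(const char *s, struct wide_array *wa)
{
    size_t n = 0;

    assert(s != NULL && wa != NULL);
    wa->len = 0;
    while (*s && n < WIDE_MAX) {
        wint_t wc;
        s += toy_decode(s, &wc);
        if (wc == WEOF)
            return;                 /* wa->len stays 0 */
        wa->chars[n++] = (wchar_t)wc;
    }
    wa->len = n;
}

/* Possible alternative: skip invalid characters, keep the valid ones. */
static void convert_keep_valid(const char *s, struct wide_array *wa)
{
    assert(s != NULL && wa != NULL);
    wa->len = 0;
    while (*s && wa->len < WIDE_MAX) {
        wint_t wc;
        s += toy_decode(s, &wc);
        if (wc != WEOF)
            wa->chars[wa->len++] = (wchar_t)wc;
    }
}
```

With the input "\xc3\xa9 \t\n", the first function leaves the array empty, while the second keeps the three ASCII separators. Whether silently dropping the invalid bytes is the right policy for IFS is a separate question.
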
The 'busy loop' is in wordcount() (utils.c):

3834	    for (; *s; r++) {
3835		char *ie = itype_end(s, ISEP, 1);
3836		if (ie != s) {
3837		    s = ie;
....
3840		}
3841		(void)findsep(&s, NULL, 0);
....
3845	    }

Here, the pointer s already points to an ISEP (\x83\x20 = metafied
Nul), but itype_end() cannot skip over it (ie == s) because ifs_wide
is empty, and findsep() does not move s because *s is already an
ISEP. The result is an infinite loop with the same s.

So the basic question is: What should we do if IFS contains invalid
character(s)?

I think, at least if the MULTIBYTE option is on, it would be better
to forcibly reset IFS to the default rather than leave ifs_wide
empty. Or store only the valid characters in ifs_wide.chars?

BTW, in set_widearray():

  89	if (STOUC(*mb_array) <= 0x7f) {
  90	    mb_array++;
  91	    *wcptr++ = (wchar_t)*mb_array;

mb_array is advanced before it is read, so the byte after the
intended one is stored. I think lines 90 and 91 should be
	    *wcptr++ = (wchar_t)*mb_array++;
But fixing this does not solve the current problem.
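The stall in wordcount() can be reproduced with a toy model. This is only a sketch of the mechanism described above, not the zsh functions: toy_itype_end(), toy_findsep(), toy_wordcount() and struct toy_ifs are hypothetical, the only separators modelled are space and the metafied Nul (\x83\x20), and the -1 return value is added purely so that the "no progress" case can be observed instead of spinning forever.

```c
#include <assert.h>
#include <stddef.h>

#define META 0x83   /* first byte of a metafied character */

/* Hypothetical: nonzero if the wide separator table was filled in. */
struct toy_ifs {
    int wide_table_filled;
};

/* Byte-level test: is there a separator (space or metafied Nul) at s? */
static int at_sep(const char *s)
{
    return *s == ' ' || ((unsigned char)s[0] == META && s[1] == 0x20);
}

/* Skip a run of separators. Characters >= 0x80 are recognised only
 * through the wide table, so nothing is skipped if that table is empty. */
static const char *toy_itype_end(const char *s, const struct toy_ifs *ifs)
{
    while (*s && at_sep(s)) {
        if ((unsigned char)*s >= 0x80 && !ifs->wide_table_filled)
            break;
        s += (*s == ' ') ? 1 : 2;
    }
    return s;
}

/* Advance to the next separator; stay put if already at one. */
static const char *toy_findsep(const char *s)
{
    while (*s && !at_sep(s))
        s++;
    return s;
}

/* Count words. Returns -1 when one iteration makes no progress,
 * which is the situation that loops forever in the real code. */
static int toy_wordcount(const char *s, const struct toy_ifs *ifs)
{
    int r = 0;

    assert(s != NULL && ifs != NULL);
    while (*s) {
        const char *start = s;
        s = toy_itype_end(s, ifs);
        if (!*s)
            break;
        s = toy_findsep(s);
        if (s == start)
            return -1;              /* same s again: busy loop */
        r++;
    }
    return r;
}
```

For the string "a", metafied Nul, "b", the model counts two words when the wide table is filled and reports the stall when it is empty. A plain "a b" is unaffected either way, which matches the observation that only characters >= \x80 are involved.
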