From: Stephane Chazelas
To: Zsh hackers list
Subject: read -d $'\200' doesn't work with set +o multibyte
Date: Fri, 9 Dec 2022 15:42:25 +0000
Message-ID: <20221209154225.2z3lbtf422ypnmjx@chazelas.org>
X-Seq: 51156

Even in a locale with a single-byte charmap, when multibyte is
off, I can't make read -d work when the delimiter is a byte >=
0x80.

$ LC_ALL=en_GB.iso885915 ./Src/zsh +o multibyte
$ locale charmap
ISO-8859-15
$ locale ctype-mb-cur-max
1
$ print 'a\351b' | read -rd $'\351'
$ print $?
1
$ print -r -- $REPLY | LC_ALL=C od -tc
0000000   a 351   b  \n
0000004

Without set +o multibyte, the above works (at treating 0351 (é
in that charset) as a delimiter), but not for \200 (undefined in
ISO-8859-1 which I guess is expected).

With LC_ALL=C, on GNU systems where mbrtowc() returns -1 EILSEQ
for any byte >= 0x80, I find read -d doesn't work for byte >=
0x80 used as delimiter with or without set +o multibyte.

(on Debian GNU/Linux amd64 with 5.9 or git HEAD).

I've raised a related issue against ksh93
(https://github.com/ksh93/ksh/issues/590)

It looks like POSIX are considering specifying read -d. They
would leave it unspecified if the delimiter is neither the empty
string nor a single-byte character.
https://austingroupbugs.net/view.php?id=243#c6091

-- 
Stephane