From: Roman Perepelitsa
Date: Wed, 17 Jan 2024 07:07:50 +0100
Subject: Re: A comment about "slurp" and -o multibyte
To: Bart Schaefer
Cc: Zsh Users
X-Seq: 29511

On Wed, Jan 17, 2024 at 4:46 AM Bart Schaefer wrote:
>
> On Sun, Jan 14, 2024 at 2:34 AM Roman Perepelitsa wrote:
> >
> > function slurp() {
> >   emulate -L zsh -o no_multibyte
> >   [...]
> >   typeset -g REPLY=${(j::)content}
> > }
>
> Although the function faithfully reads the input stream into $REPLY,
> later references to $REPLY with the multibyte option back in effect
> will (re-)interpret the content as multibyte characters.  This may not
> be what's desired.
>
> % slurp < =zsh
> % () {
>   print $#REPLY
>   print ${(m)#REPLY}
>   print ${(mm)#REPLY}
>   setopt localoptions nomultibyte
>   print $#REPLY
> }
> 872903  <-- number of characters
> 873259  <-- width of printable characters
> 872383  <-- number of glyphs
> 878288  <-- actual number of bytes
>
> (Of course those first three numbers are all garbage because it's just
> interpreting an executable as wide character text.)

To me this behavior looks as expected. It's consistent with `read`,
`sysread` and process substitution.

    % head -c $((1 << 20)) 1MB
    % slurp <1MB
    % IFS= read -rd '' read <1MB
    % sysread -s $((1 << 20)) sysread <1MB
    % procsubst=${"$(<1MB; print -n .)"%.}
    % () {
        print -r -- $#REPLY $#read $#sysread $#procsubst
        setopt local_options no_multibyte
        print -r -- $#REPLY $#read $#sysread $#procsubst
      }
    1008389 1008389 1008389 1008389
    1048576 1048576 1048576 1048576

Roman.
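
The same effect can be shown on a much smaller input. The following is a minimal sketch rather than anything taken from the thread: the function name `lengths` and the sample string are made up for illustration, and the character count assumes zsh runs in a UTF-8 locale. It suggests that the stored bytes stay the same and that only the option in effect when the length is expanded decides what gets counted.

```zsh
#!/usr/bin/env zsh
# Hypothetical helper, for illustration only: report the length of its
# argument twice, once with multibyte set and once with it unset.
lengths() {
  emulate -L zsh            # option changes stay local to this function
  local s=$1 chars bytes
  setopt multibyte
  chars=${#s}               # counted as characters of the current locale
  setopt no_multibyte
  bytes=${#s}               # counted as raw bytes
  REPLY="$chars $bytes"     # result goes to the global REPLY
}

# 0xC3 0xA9 is the UTF-8 encoding of "é"; with "zz" appended that is 4 bytes.
lengths $'\xc3\xa9zz'
print -r -- $REPLY
```

The second number is the byte count, 4, whatever the locale. The first is 3 when the locale is UTF-8, because the two bytes of "é" are then counted as one character. In a locale where those bytes do not form a valid character, each is counted on its own and the two numbers agree.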