zsh-workers
* print -v with multibyte characters
From: zsugabubus @ 2020-02-20 19:03 UTC
  To: zsh-workers

Hi,

  $ echo $ZSH_VERSION
  5.7.1
  $ export {LC_ALL,LANG}=en_US.UTF-8
  $ set -o multibyte && echo ok
  ok

Good:
  $ print ÖÓŐöóő
  ÖÓŐöóő
  $ printf -v var ÖÓŐöóő; echo $var
  ÖÓŐöóő

Bad:
  $ print -v var ÖÓŐ; echo $var
  öó
  $ print -v var öóő; echo $var
  öóŃ

--
zsugabubus

* Re: print -v with multibyte characters
From: Mikael Magnusson @ 2020-02-20 22:58 UTC
  To: zsugabubus; +Cc: zsh-workers

On 2/20/20, zsugabubus <zsugabubus@national.shitposting.agency> wrote:
> Hi,
>
>   $ echo $ZSH_VERSION
>   5.7.1
>   $ export {LC_ALL,LANG}=en_US.UTF-8
>   $ set -o multibyte && echo ok
>   ok
>
> Good:
>   $ print ÖÓŐöóő
>   ÖÓŐöóő
>   $ printf -v var ÖÓŐöóő; echo $var
>   ÖÓŐöóő
>
> Bad:
>   $ print -v var ÖÓŐ; echo $var
>   öó
>   $ print -v var öóő; echo $var
>   öóŃ

This patch gets closer to correct, but the result still seems to leave
out the final byte or two, or change it somehow:

diff --git i/Src/builtin.c w/Src/builtin.c
index 168bf8863b..ed26717b5b 100644
--- i/Src/builtin.c
+++ w/Src/builtin.c
@@ -4848,8 +4848,7 @@ bin_print(char *name, char **args, Options ops, int func)
            if (ret)
                free(buf);
            else
-               setsparam(OPT_ARG(ops, 'v'),
-                         metafy(buf, rcount, META_REALLOC));
+               setsparam(OPT_ARG(ops, 'v'), buf);
            unqueue_signals();
        }
        return ret;
@@ -4972,8 +4971,7 @@ bin_print(char *name, char **args, Options ops, int func)
            if (ret)
                free(buf);
            else
-               setsparam(OPT_ARG(ops, 'v'),
-                         metafy(buf, rcount, META_REALLOC));
+               setsparam(OPT_ARG(ops, 'v'), buf);
            unqueue_signals();
        }
        return ret;

Incidentally you can use print -v var -f %s ÖÓŐ; echo $var to work
around the problem (the handling for -f uses different code which
doesn't have the bug).

-- 
Mikael Magnusson

* Re: print -v with multibyte characters
From: Mikael Magnusson @ 2020-08-05 16:13 UTC
  To: zsh-workers

Just looking through some pending patches and saw this is unresolved.
Does anyone have any ideas on how this can be fixed?

-- 
Mikael Magnusson


* Re: print -v with multibyte characters
From: Daniel Shahaf @ 2020-08-06 6:51 UTC
  To: zsh-workers

Mikael Magnusson wrote on Wed, 05 Aug 2020 18:13 +0200:
> Just looking through some pending patches and saw this is unresolved.
> Does anyone have any ideas on how this can be fixed?

No idea, but if you notice other outstanding issues, feel free to add them to Etc/BUGS.


* Re: print -v with multibyte characters
From: Jun T @ 2020-08-07 10:10 UTC
  To: zsh-workers


In the case of 'print -v var', args[] are RE-metafied at line 4871
(builtin.c), but the length len[] is not updated.

At line 4959,
        fwrite(*args, *len, 1, fout);
the metafied args[] are written to buf with the stale (unmetafied) length len[].

I think args[] should not have been RE-metafied at line 4871, because
without '-v var' args[] are not metafied at this fwrite().

We need to metafy buf for setsparam(), as the current code already
does at line 4981.


diff --git a/Src/builtin.c b/Src/builtin.c
index ff84ce936..09eb3728c 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -4862,7 +4862,7 @@ bin_print(char *name, char **args, Options ops, int func)
 
     /* normal output */
     if (!fmt) {
-	if (OPT_ISSET(ops, 'z') || OPT_ISSET(ops, 'v') ||
+	if (OPT_ISSET(ops, 'z') ||
 	    OPT_ISSET(ops, 's') || OPT_ISSET(ops, 'S')) {
 	    /*
 	     * We don't want the arguments unmetafied after all.






* PATCH: Test for print -v fix
From: Mikael Magnusson @ 2020-08-07 22:29 UTC
  To: zsh-workers

---
This seems to work for me, thanks! Here's a test for it.

 Test/B03print.ztst     | 1 +
 Test/D07multibyte.ztst | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/Test/B03print.ztst b/Test/B03print.ztst
index 0ef3743ce3..5634239346 100644
--- a/Test/B03print.ztst
+++ b/Test/B03print.ztst
@@ -4,6 +4,7 @@
 #  Use of print -p to output to coprocess	A01grammar
 #  Prompt expansion with print -P		D01prompt
 #  -l, -r, -R and -n indirectly tested in various places
+#  multibyte tests in D07multibyte
 
 # Not yet tested:
 #  echo and pushln
diff --git a/Test/D07multibyte.ztst b/Test/D07multibyte.ztst
index 989f451837..345e061d94 100644
--- a/Test/D07multibyte.ztst
+++ b/Test/D07multibyte.ztst
@@ -570,6 +570,15 @@
 0:printf %q and quotestring and general metafy / token madness
 >你你
 
+ typeset -a foo
+ print -v foo 'ÖÓŐ'
+ echo $foo
+ printf -v foo 'ÖÓŐ'
+ echo $foo
+0:print and printf into a variable with multibyte text
+>ÖÓŐ
+>ÖÓŐ
+
 # This test is kept last as it introduces an additional
 # dependency on the system regex library.
   if zmodload zsh/regex 2>/dev/null; then
-- 
2.15.1



* Re: PATCH: Test for print -v fix
From: Jun T @ 2020-08-11 4:34 UTC
  To: zsh-workers


> 2020/08/08 7:29, Mikael Magnusson <mikachu@gmail.com> wrote:
> 
> --- a/Test/D07multibyte.ztst
> +++ b/Test/D07multibyte.ztst
> @@ -570,6 +570,15 @@
> 0:printf %q and quotestring and general metafy / token madness
>> 你你
> 
> + typeset -a foo

Why -a?

* Re: PATCH: Test for print -v fix
From: Mikael Magnusson @ 2020-08-12 9:45 UTC
  To: Jun T; +Cc: zsh-workers

On 8/11/20, Jun T <takimoto-j@kba.biglobe.ne.jp> wrote:
>
>> 2020/08/08 7:29, Mikael Magnusson <mikachu@gmail.com> wrote:
>>
>> --- a/Test/D07multibyte.ztst
>> +++ b/Test/D07multibyte.ztst
>> @@ -570,6 +570,15 @@
>> 0:printf %q and quotestring and general metafy / token madness
>>> 你你
>>
>> + typeset -a foo
>
> Why -a?

Ah, must have been because I copypasted from another testcase :). Will
fix before pushing it.

-- 
Mikael Magnusson


