zsh-workers
* getjobtext() gives invalid utf8, leading to segfault
@ 2021-08-08 17:14 Carl Agrell
  2021-08-09  2:10 ` Mikael Magnusson
  0 siblings, 1 reply; 3+ messages in thread
From: Carl Agrell @ 2021-08-08 17:14 UTC
  To: zsh-workers

With the powerlevel10k prompt, running either of these two commands
causes the shell to segfault:
    $ AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月光
    $ AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月
The AAAs can be replaced with anything as long as the length is
unchanged (did not test with non-ascii though). Changing the kanji at
the end usually makes it not crash, strangely enough.

A minimal zshrc creating the same crash is
    _preexec() {
        [[ $2 == "" ]]
    }
    preexec_functions=(_preexec)

If we echo $2 instead of comparing it, it is printed as
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月�
hinting that it's a multibyte error. Curiously, /bin/echo instead
gives 月元 at the end.

Looking through the source, it looks like this string is created by
getjobtext(). This hints that similar errors might be seen in other
places where jobs are displayed, and indeed:
    $ cat /dev/stdin
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月
    ^Z
    zsh: suspended  cat /dev/stdin
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA會

I am running `zsh 5.8 (x86_64-pc-linux-gnu)`, the version currently
packaged in Arch Linux.



* Re: getjobtext() gives invalid utf8, leading to segfault
  2021-08-08 17:14 getjobtext() gives invalid utf8, leading to segfault Carl Agrell
@ 2021-08-09  2:10 ` Mikael Magnusson
  2021-08-09  5:33   ` Bart Schaefer
  0 siblings, 1 reply; 3+ messages in thread
From: Mikael Magnusson @ 2021-08-09  2:10 UTC
  To: Carl Agrell; +Cc: zsh-workers

On 8/8/21, Carl Agrell <caagr98@gmail.com> wrote:
> With the powerlevel10k prompt, running either of these two commands
> causes the shell to segfault:
>     $
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月光
>     $
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月
> The AAAs can be replaced with anything as long as the length is
> unchanged (did not test with non-ascii though). Changing the kanji at
> the end usually makes it not crash, strangely enough.
>
> A minimal zshrc creating the same crash is
>     _preexec() {
>         [[ $2 == "" ]]
>     }
>     preexec_functions=(_preexec)
>
> If we echo $2 instead of comparing it, it is printed as
>     AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月�
> hinting that it's a multibyte error. Curiously, /bin/echo instead
> gives 月元 at the end.
>
> Looking through the source, it looks like this string is created by
> getjobtext(). This hints that similar errors might be seen in other
> places where jobs are displayed, and indeed:
>     $ cat /dev/stdin
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA月
>     ^Z
>     zsh: suspended  cat /dev/stdin
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA會
>
> I am running `zsh 5.8 (x86_64-pc-linux-gnu)`, the version currently
> packaged in Arch Linux.

Running in debug mode prints the message:
BUG: substring ends in the middle of a metachar in ztrsub()
and breaking here in gdb gives the following backtrace:
(gdb) bt
#0  ztrsub (t=0x7ffff7fe91fd "", s=0x7ffff7fe91fd "") at utils.c:5187
#1  0x0000000000496ac6 in patallocstr (prog=0x701320,
    string=0x7ffff7fe91b0 'A' <repeats 68 times>, "惼\203\250僥\203", stringlen=77,
    unmetalen=-1, force=0, patstralloc=0x7fffffffc9b0) at pattern.c:2138
#2  0x0000000000496ec1 in pattryrefs (prog=0x701320,
    string=0x7ffff7fe91b0 'A' <repeats 68 times>, "惼\203\250僥\203", stringlen=77,
    unmetalenin=-1, patstralloc=0x7fffffffc9b0, patoffset=0, nump=0x0, begp=0x0, endp=0x0)
    at pattern.c:2312
#3  0x0000000000496ce0 in pattry (prog=0x701320,
    string=0x7ffff7fe91b0 'A' <repeats 68 times>, "惼\203\250僥\203") at pattern.c:2214
#4  0x000000000042cbca in evalcond (state=0x7fffffffcfc0, fromtest=0x0) at cond.c:322
#5  0x000000000043c36d in execcond (state=0x7fffffffcfc0, do_exec=0) at exec.c:5122
#6  0x0000000000430dee in execsimple (state=0x7fffffffcfc0) at exec.c:1276
#7  0x000000000043126c in execlist (state=0x7fffffffcfc0, dont_change_job=1, exiting=0)
    at exec.c:1404
#8  0x0000000000430aa3 in execode (p=0x7198f0, dont_change_job=1, exiting=0,
    context=0x4c7eea "shfunc") at exec.c:1218
#9  0x000000000043ebec in runshfunc (prog=0x7198f0, wrap=0x0, name=0x7ffff7fe9170 "preexec")
    at exec.c:6066
#10 0x000000000043e41e in doshfunc (shfunc=0x719310, doshargs=0x7ffff7ff4b50, noreturnval=1)
    at exec.c:5916
#11 0x00000000004b5ce6 in callhookfunc (name=0x4ca0cb "preexec", lnklst=0x7ffff7ff4b50,
    arrayp=1, retval=0x0) at utils.c:1530
#12 0x0000000000457022 in loop (toplevel=1, justonce=0) at init.c:198
#13 0x000000000045aee1 in zsh_main (argc=2, argv=0x7fffffffd638) at init.c:1799
#14 0x000000000040f9d6 in main (argc=2, argv=0x7fffffffd638) at ./main.c:93



-- 
Mikael Magnusson



* Re: getjobtext() gives invalid utf8, leading to segfault
  2021-08-09  2:10 ` Mikael Magnusson
@ 2021-08-09  5:33   ` Bart Schaefer
  0 siblings, 0 replies; 3+ messages in thread
From: Bart Schaefer @ 2021-08-09  5:33 UTC
  To: Zsh hackers list; +Cc: Carl Agrell

On Sun, Aug 8, 2021 at 7:11 PM Mikael Magnusson <mikachu@gmail.com> wrote:
>
> Running in debug mode prints the message:
> BUG: substring ends in the middle of a metachar in ztrsub()
> and breaking here in gdb gives the following backtrace:
> (gdb) bt
> #0  ztrsub (t=0x7ffff7fe91fd "", s=0x7ffff7fe91fd "") at utils.c:5187

Does it seem odd to anyone that (!*s) is one of the cases that
triggers that warning, but in the non-DEBUG case it doesn't end the
loop?

The real problem, though:

On Sun, Aug 8, 2021 at 10:14 AM Carl Agrell <caagr98@gmail.com> wrote:
>
> Looking through the source, it looks like this string is created by
> getjobtext().

getjobtext() loads the string into a fixed-size buffer and then
unconditionally puts a '\0' in the last byte of that buffer, which
cuts off the metacharacter in the middle.

Following is the minimal fix, but nothing else in text.c refers to the
Meta constant, so perhaps someone has a better suggestion?

diff --git a/Src/text.c b/Src/text.c
index 4bf88f2e2..5cd7685fd 100644
--- a/Src/text.c
+++ b/Src/text.c
@@ -335,6 +335,8 @@ getjobtext(Eprog prog, Wordcode c)
     tlim = tptr + JOBTEXTSIZE - 1;
     tjob = 1;
     gettext2(&s);
+    if (tptr[-1] == Meta)
+       --tptr;
     *tptr = '\0';
     freeeprog(prog);           /* mark as unused */
     untokenize(jbuf);


