zsh-users
* precmd: write error: interrupted
@ 2013-04-25 16:47 Yuri D'Elia
  2013-04-25 17:44 ` Yuri D'Elia
  2013-04-25 18:16 ` Bart Schaefer
  0 siblings, 2 replies; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-25 16:47 UTC (permalink / raw)
  To: zsh-users

Hi everyone. A while ago I posted this:

   http://www.zsh.org/mla/users/2012/msg00757.html

about precmd emitting a write error on startup.
I still have the error, as I didn't find any way to silence it (braces 
around print do not work?!).

I tried hard yesterday to debug this issue, by repeatedly stracing zsh 
until I got the error.

What's happening is that the terminal is being resized immediately after 
starting, and a SIGWINCH is emitted while the actual "precmd" write is 
being done on stdout. The call is not restarted and the write fails.

I have no idea which write is actually failing in that function (I 
suppose it's some "fputs" in bin_print). Unfortunately if I try to run 
zsh through a debugger the startup is too slow and never triggers the 
issue. Also, this seems to only happen in tiling window managers, and 
the explanation is now obvious: the terminal is forced onto a specific 
geometry quickly after the window is mapped. Case in point: "spectrwm"
gives me this problem twice as often as "awesomewm" because it's snappier
:(. I really wish I had dtrace here...

I suppose SIGWINCH should be masked when writing to the terminal.
Can somebody help with this?

1) SIGWINCH should either be masked or allow write to restart.
2) Why doesn't "precmd() { { print "HELLO" } >&- 2>&-; }" suppress the
error in this case?

Thanks.



* Re: precmd: write error: interrupted
  2013-04-25 16:47 precmd: write error: interrupted Yuri D'Elia
@ 2013-04-25 17:44 ` Yuri D'Elia
  2013-04-25 18:27   ` Bart Schaefer
  2013-04-25 18:16 ` Bart Schaefer
  1 sibling, 1 reply; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-25 17:44 UTC (permalink / raw)
  To: zsh-users

On 04/25/2013 06:47 PM, Yuri D'Elia wrote:
> 2) Why doesn't "precmd() { { print "HELLO" } >&- 2>&-; }" suppress the
> error in this case?

Closest analogue I could find is:

% { { print >&2 } 2>&- } 2>/dev/null
zsh: write error

Braces here don't help. A real subshell of course works, but it's
prohibitively expensive for precmd.



* Re: precmd: write error: interrupted
  2013-04-25 16:47 precmd: write error: interrupted Yuri D'Elia
  2013-04-25 17:44 ` Yuri D'Elia
@ 2013-04-25 18:16 ` Bart Schaefer
  2013-04-25 18:38   ` Peter Stephenson
                     ` (2 more replies)
  1 sibling, 3 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-04-25 18:16 UTC (permalink / raw)
  To: zsh-users

On Apr 25,  6:47pm, Yuri D'Elia wrote:
} 
} I still have the error, as I didn't find any way to silence it (braces 
} around print do not work?!).

That's curious, but try using parentheses to force a fork and redirect
the stderr of the subshell.

} I have no idea which write is actually failing in that function (I 
} suppose it's some "fputs" in bin_print).

strace should be able to show you what bytes are being written, which
would narrow it down.

} :(. I really wish I had dtrace here...

On every host where I have dtrace, I wish I had strace.
 
} What's happening is that the terminal is being resized immediately after 
} starting, and a SIGWINCH is emitted while the actual "precmd" write is 
} being done on stdout. The call is not restarted and the write fails.

Yes, the zsh signal handler setup explicitly prevents system call restart
whenever possible, so that (among other reasons) behavior from traps is
consistent across operating systems.

} 1) SIGWINCH should either be masked or allow write to restart.

This requires some thought about the appropriate layer to handle this.
bin_print does already do some signal queuing when writing to internal
data structures (print -z, print -s), but that's deliberately isolated
to bin_print, whereas all sorts of other things might write to the
terminal -- including other error messages! -- so patching bin_print is
not covering all the bases.

On the other hand we probably don't want to build a wrapper around the
entire stdio library just to differentiate terminal writes.

} 2) Why doesn't "precmd() { { print "HELLO" } >&- 2>&-; }" suppress the
} error in this case?

It's not the same error.  Try 2>/dev/null instead of 2>&- ... with the
stderr closed, you're actually getting a second error from outside the
braces, about not being able to write the first error from inside!

torch% print -u99 "Hello"    
print: bad file number: -1
torch% { print -u99 "Hello" } 2>&-
zsh: write error
torch% { print -u99 "Hello" } 2>/dev/null
torch% 



* Re: precmd: write error: interrupted
  2013-04-25 17:44 ` Yuri D'Elia
@ 2013-04-25 18:27   ` Bart Schaefer
  0 siblings, 0 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-04-25 18:27 UTC (permalink / raw)
  To: zsh-users

On Apr 25,  7:44pm, Yuri D'Elia wrote:
}
} On 04/25/2013 06:47 PM, Yuri D'Elia wrote:
} > 2) Why doesn't "precmd() { { print "HELLO" } >&- 2>&-; }" suppress the
} > error in this case?
} 
} Closest analogue I could find is:
} 
} % { { print >&2 } 2>&- } 2>/dev/null
} zsh: write error

Of course what's happening here is in outside-in order.  Starting with
the stuff you can mostly see:

1. stderr is redirected to /dev/null (old stderr is saved)
2. stderr is closed (at this point the outer redirection is gone)
3. print generates an error

Now the stuff you can't see happens:

4. zsh gets another error writing to the closed stderr
5. zsh warns about that to the saved stderr from step 1

You avoid the whole mess if you never close stderr, or if you close it
before step 1 (e.g. by "exec 2>&-"), though the latter is impractical
in the case we're discussing.
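The five steps above can be replayed with raw file descriptors. This is only
a sketch of the ordering, not what zsh does internally; the function name
replay_closed_stderr and the message strings are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Replays the outside-in sequence on fd 2 and then restores it.
 * Returns the errno of the failed write in step 4 (EBADF is expected),
 * 0 if the write unexpectedly succeeded, or -1 if the setup failed.
 */
static int replay_closed_stderr(void)
{
    static const char msg[]  = "first error\n";
    static const char warn[] = "write error\n";
    int saved, devnull, err = 0;

    saved = dup(STDERR_FILENO);             /* step 1: save old stderr */
    if (saved == -1)
        return -1;
    devnull = open("/dev/null", O_WRONLY);
    if (devnull == -1) {
        close(saved);
        return -1;
    }
    if (dup2(devnull, STDERR_FILENO) == -1) {   /* step 1: 2>/dev/null */
        close(devnull);
        close(saved);
        return -1;
    }
    close(devnull);

    close(STDERR_FILENO);                   /* step 2: 2>&- drops /dev/null */

    /* steps 3 and 4: the error message itself cannot be written */
    if (write(STDERR_FILENO, msg, sizeof msg - 1) == -1)
        err = errno;

    /* step 5: the only usable descriptor left is the one saved in step 1 */
    if (err != 0 && write(saved, warn, sizeof warn - 1) == -1) {
        /* nothing more can be reported */
    }

    dup2(saved, STDERR_FILENO);             /* put the original stderr back */
    close(saved);
    return err;
}
```

The second message lands on the original stderr, which is why the outer
2>/dev/null appears to have no effect.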



* Re: precmd: write error: interrupted
  2013-04-25 18:16 ` Bart Schaefer
@ 2013-04-25 18:38   ` Peter Stephenson
  2013-04-25 22:18     ` Bart Schaefer
  2013-04-25 19:38   ` Yuri D'Elia
  2013-04-25 20:05   ` Yuri D'Elia
  2 siblings, 1 reply; 13+ messages in thread
From: Peter Stephenson @ 2013-04-25 18:38 UTC (permalink / raw)
  To: zsh-users

On Thu, 25 Apr 2013 11:16:46 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> } 1) SIGWINCH should either be masked or allow write to restart.
> 
> This requires some thought about the appropriate layer to handle this.
> bin_print does already do some signal queuing when writing to internal
> data structures (print -z, print -s), but that's deliberately isolated
> to bin_print, whereas all sorts of other things might write to the
> terminal -- including other error messages! -- so patching bin_print is
> not covering all the bases.

Certainly true, but I'm hesitant to do nothing except declare it's
difficult...  Explicit user output via print and error messages via
zsh's own error and warning functions are two cases that cover quite a
lot.  If there's already signal queuing in print, is it up to snuff?  Is
there ever a good reason for allowing a single print to be interrupted
at the point of output --- surely it's always going to do unhelpful
things?

I don't think we'd want to queue interrupts round all builtins, but could
we mark those that produce output but otherwise return immediately with
a flag in the builtin table and do some queueing in the builtin handler?
 
> On the other hand we probably don't want to build a wrapper around the
> entire stdio library just to differentiate terminal writes.

It certainly doesn't sound feasible to do much at the stdio level.
If we were just talking about write() it might be feasible to use a simple
wrapper in key places, but it sounds like it can't be reduced to that.
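For the plain write() case, such a wrapper might look like the sketch below.
write_retry is a hypothetical helper, not an existing zsh function, and it
does nothing for output that goes through stdio (fputs, fwrite and so on),
which is the harder part of the problem.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all `len` bytes of `buf` to `fd`.
 * Retries when write() is interrupted by a signal (EINTR) and continues
 * after short writes.  Returns 0 on success, -1 with errno set on any
 * other failure.
 */
static int write_retry(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);

        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted: try the same bytes again */
            return -1;          /* real error, let the caller report it */
        }
        buf += n;               /* short write: skip what was accepted */
        len -= (size_t)n;
    }
    return 0;
}
```

Note that retrying unconditionally also swallows interruptions that were
meant to abort the output, so a real version would have to check whether
the signal was one the shell wants to act on.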

pws



* Re: precmd: write error: interrupted
  2013-04-25 18:16 ` Bart Schaefer
  2013-04-25 18:38   ` Peter Stephenson
@ 2013-04-25 19:38   ` Yuri D'Elia
  2013-04-26  0:53     ` Bart Schaefer
  2013-04-25 20:05   ` Yuri D'Elia
  2 siblings, 1 reply; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-25 19:38 UTC (permalink / raw)
  To: zsh-users

On 04/25/2013 08:16 PM, Bart Schaefer wrote:
> On Apr 25,  6:47pm, Yuri D'Elia wrote:
> }
> } I still have the error, as I didn't find any way to silence it (braces
> } around print do not work?!).
>
> That's curious, but try using parentheses to force a fork and redirect
> the stderr of the subshell.

This, of course, works. But I wouldn't want to fork here just to ignore 
the error message.

> } I have no idea which write is actually failing in that function (I
> } suppose it's some "fputs" in bin_print).
>
> strace should be able to show you what bytes are being written, which
> would narrow it down.

Not really, apart from the "write error:" message, which I already knew
about. But there are only two possible points in bin_print, and I'm using
the "unformatted" case.

> } 2) Why doesn't "precmd() { { print "HELLO" } >&- 2>&-; }" suppress the
> } error in this case?
>
> It's not the same error.  Try 2>/dev/null instead of 2>&- ... with the
> stderr closed, you're actually getting a second error from outside the
> braces, about not being able to write the first error from inside!

You are right. I pasted one of my numerous trials in an attempt to 
exploit the EBADF errno.

But anyway:

precmd() { { print x } 2>/dev/null }

still doesn't suppress the error.

By looking at bin_print, zwarnnam is used.

In the "normal output case" (Src/builtin.c:6310) there's an explicit 
test for >&- redirection of stdout, and that's it. I see nothing in 
zwarnnam except for the noerrs global that would allow error redirection 
to work in lists.

Maybe there's a builtin to force "noerrs" instead?



* Re: precmd: write error: interrupted
  2013-04-25 18:16 ` Bart Schaefer
  2013-04-25 18:38   ` Peter Stephenson
  2013-04-25 19:38   ` Yuri D'Elia
@ 2013-04-25 20:05   ` Yuri D'Elia
  2013-04-25 20:58     ` Yuri D'Elia
                       ` (2 more replies)
  2 siblings, 3 replies; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-25 20:05 UTC (permalink / raw)
  To: zsh-users

On 04/25/2013 08:16 PM, Bart Schaefer wrote:
> } 1) SIGWINCH should either be masked or allow write to restart.
>
> This requires some thought about the appropriate layer to handle this.
> bin_print does already do some signal queuing when writing to internal
> data structures (print -z, print -s), but that's deliberately isolated
> to bin_print, whereas all sorts of other things might write to the
> terminal -- including other error messages! -- so patching bin_print is
> not covering all the bases.

I actually tried to set SA_RESTART only when installing the handler for 
SIGWINCH when debugging [1]. It works in this case, but I'm not entirely 
sure it is side-effect free (is doing an ioctl on the tty safe while 
mid-write?).

What other syscalls would be interrupted by SIGWINCH that shouldn't be 
restarted? Right now I cannot think of anything that SIGWINCH should 
interrupt.

You can actually reproduce the bug easily by doing:

while print 'x'; do; done

and resizing the terminal, or by sending SIGWINCH to zsh directly.

[1] Interestingly enough, why is "interact" tested in the SunOS 4 case?



* Re: precmd: write error: interrupted
  2013-04-25 20:05   ` Yuri D'Elia
@ 2013-04-25 20:58     ` Yuri D'Elia
  2013-04-26 15:08     ` Bart Schaefer
       [not found]     ` <130426080805.ZM18619__18102.73175729$1366989065$gmane$org@torch.brasslantern.com>
  2 siblings, 0 replies; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-25 20:58 UTC (permalink / raw)
  To: zsh-users

[-- Attachment #1: Type: text/plain, Size: 290 bytes --]

On 04/25/2013 10:05 PM, Yuri D'Elia wrote:
> I actually tried to set SA_RESTART only when installing the handler for
> SIGWINCH when debugging [1]. It works in this case, but I'm not entirely
> sure it is side-effect free (is doing an ioctl on the tty safe while
> mid-write?).

Attached.


[-- Attachment #2: restart-sigwinch.diff --]
[-- Type: text/x-patch, Size: 1654 bytes --]

--- signals.c.Orig	2013-04-25 22:34:36.342080454 +0200
+++ signals.c	2013-04-25 22:55:15.994027929 +0200
@@ -112,9 +112,18 @@
     act.sa_handler = (SIGNAL_HANDTYPE) zhandler;
     sigemptyset(&act.sa_mask);        /* only block sig while in handler */
     act.sa_flags = 0;
-# ifdef SA_INTERRUPT                  /* SunOS 4.x */
-    if (interact)
-        act.sa_flags |= SA_INTERRUPT; /* make sure system calls are not restarted */
+# if defined(SIGWINCH) && defined(SA_RESTART)
+    if (sig == SIGWINCH) {
+	act.sa_flags |= SA_RESTART;
+    } else {
+#  ifdef SA_INTERRUPT                  /* SunOS 4.x */
+	act.sa_flags |= SA_INTERRUPT;  /* make sure system calls are not restarted */
+#  endif
+    }
+# else
+#  ifdef SA_INTERRUPT                  /* SunOS 4.x */
+    act.sa_flags |= SA_INTERRUPT;      /* make sure system calls are not restarted */
+#  endif
 # endif
     sigaction(sig, &act, (struct sigaction *)NULL);
 #else
@@ -122,9 +131,13 @@
     struct sigvec vec;
  
     vec.sv_handler = (SIGNAL_HANDTYPE) zhandler;
-    vec.sv_mask = sigmask(sig);    /* mask out this signal while in handler    */
-#  ifdef SV_INTERRUPT
-    vec.sv_flags = SV_INTERRUPT;   /* make sure system calls are not restarted */
+    vec.sv_mask = sigmask(sig);      /* mask out this signal while in handler    */
+#  if defined(SIGWINCH) && defined(SV_INTERRUPT)
+    if (sig != SIGWINCH) {
+	vec.sv_flags = SV_INTERRUPT; /* make sure system calls are not restarted */
+    }
+#  elif defined(SV_INTERRUPT)
+    vec.sv_flags = SV_INTERRUPT;     /* make sure system calls are not restarted */
 #  endif
     sigvec(sig, &vec, (struct sigvec *)NULL);
 # else


* Re: precmd: write error: interrupted
  2013-04-25 18:38   ` Peter Stephenson
@ 2013-04-25 22:18     ` Bart Schaefer
  2013-04-26  0:52       ` Bart Schaefer
  0 siblings, 1 reply; 13+ messages in thread
From: Bart Schaefer @ 2013-04-25 22:18 UTC (permalink / raw)
  To: zsh-users

On Apr 25,  7:38pm, Peter Stephenson wrote:
} Subject: Re: precmd: write error: interrupted
}
} On Thu, 25 Apr 2013 11:16:46 -0700
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > } 1) SIGWINCH should either be masked or allow write to restart.
} > 
} > This requires some thought about the appropriate layer to handle this.
} > bin_print does already do some signal queuing when writing to internal
} > data structures (print -z, print -s), but that's deliberately isolated
} > to bin_print, whereas all sorts of other things might write to the
} > terminal -- including other error messages! -- so patching bin_print is
} > not covering all the bases.
} 
} Certainly true, but I'm hesitant to do nothing

I'm not suggesting doing nothing, just haven't decided yet what's the
right thing.

} Explicit user output via print and error messages via
} zsh's own error and warning functions are two cases that cover quite a
} lot.  If there's already signal queuing in print, is it up to snuff?

It's using queue_signals()/unqueue_signals() which of course queues
*all* signals.  I don't think we want to do that in the "ordinary" case,
it introduces side-effects like loops you can't interrupt with ctrl-C.

} Is there ever a good reason for allowing a single print to be
} interrupted at the point of output --- surely it's always going to do
} unhelpful things?

Consider something like:

    x=({1..1000000})
    print $x

If you can't ^C that print, you're potentially in a world of pain.  (It's
already enough of a problem that you can't ^C the assignment itself.)

} I don't think we'd want to queue interrupts round all builtins,
} but could we mark those that produce output but otherwise return
} immediately with a flag in the builtin table and do some queueing in
} the builtin handler?

I'm pretty sure SIGWINCH is an outlier case here and we should focus on
the question of when the shell SHOULD react to window size changes,
rather than trying to identify all the builtins that should NOT react.

For example, we might *always* queue the SIGWINCH signal except when the
shell is blocked in zleread (or is about to display, but hasn't yet
displayed, the prompt if ZLE is not active).  Those probably don't cover
all the cases, but you get the idea.



* Re: precmd: write error: interrupted
  2013-04-25 22:18     ` Bart Schaefer
@ 2013-04-26  0:52       ` Bart Schaefer
  0 siblings, 0 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-04-26  0:52 UTC (permalink / raw)
  To: zsh-users

On Apr 25,  3:18pm, Bart Schaefer wrote:
}
} Consider something like:
} 
}     x=({1..1000000})
}     print $x
} 
} If you can't ^C that print, you're potentially in a world of pain.

Of course this only applies in scripts/subshells already, because you
can't ^C most builtins that are run directly from the shell prompt.



* Re: precmd: write error: interrupted
  2013-04-25 19:38   ` Yuri D'Elia
@ 2013-04-26  0:53     ` Bart Schaefer
  0 siblings, 0 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-04-26  0:53 UTC (permalink / raw)
  To: zsh-workers, Yuri D'Elia, zsh-users

(Redirecting this to -workers, so Yuri Cc'd in case he's only on -users)

On Apr 25,  9:38pm, Yuri D'Elia wrote:
}
} precmd() { { print x } 2>/dev/null }
} 
} still doesn't suppress the error.

Interesting!  Something is restoring the stderr descriptors before the
error message in bin_print is written.  Here is a bit of strace from
'zsh -f' with { print -n $'x\r' } 2>/dev/null running in a loop on the
command line (not from a precmd):

open("/dev/null", O_WRONLY|O_CREAT|O_NOCTTY|O_TRUNC|O_LARGEFILE, 0666) = 3
dup2(3, 2)                              = 2
write(1, "x\r", 2)                      = ? ERESTARTSYS (To be restarted)
--- SIGWINCH (Window changed) @ 0 (0) ---
dup2(12, 2)                             = 2
dup2(11, 2)                             = 2
write(10, "\33[1m\33[7m%\33[27m\33[1m\33[0m                                     
                                           \r \r", 105) = 105
write(2, "print: write error: interrupt\n", 30) = 30

Those dup2 calls should happen only at the end of execcmd after bin_print
has already returned, but somehow they appear to be happening after the
signal handler is called but before the error message.

Attaching with a debugger blocks the interrupt so I haven't been able to
stack-trace the source of the dup2 calls.



* Re: precmd: write error: interrupted
  2013-04-25 20:05   ` Yuri D'Elia
  2013-04-25 20:58     ` Yuri D'Elia
@ 2013-04-26 15:08     ` Bart Schaefer
       [not found]     ` <130426080805.ZM18619__18102.73175729$1366989065$gmane$org@torch.brasslantern.com>
  2 siblings, 0 replies; 13+ messages in thread
From: Bart Schaefer @ 2013-04-26 15:08 UTC (permalink / raw)
  To: zsh-users

On Apr 25, 10:05pm, Yuri D'Elia wrote:
}
} I actually tried to set SA_RESTART only when installing the handler for 
} SIGWINCH when debugging [1]. It works in this case, but I'm not entirely 
} sure it is side-effect free (is doing an ioctl on the tty safe while 
} mid-write?).
} 
} What other syscalls would be interrupted by SIGWINCH that shouldn't be 
} restarted? Right now I cannot think of anything that SIGWINCH should 
} interrupt.

I've been thinking about this, and the problem with using SA_RESTART is
twofold:

(1) [Minor] Some platforms don't have restartable syscalls, so this won't
work everywhere.  But perhaps the intersection of non-restartable syscalls
and support for SIGWINCH is empty.

(2) [Potentially major] A user-defined trap can be installed for the
SIGWINCH signal.  That means arbitrary shell code might execute during
handling of the signal, so all sorts of things might happen mid-write,
not just the default ioctls.

We're not especially POSIX-beholden but it would also be best if we can
preserve the POSIX rules governing trap handlers.  There may not be any
conflict with that here, but it needs to be kept in mind.

} [1] Interestingly enough, why is "interact" tested in the SunOS 4 case?

I have a vague recollection that it has something to do with the way
STREAMS i/o drivers interoperate with terminals, but I could be entirely
wrong.



* Re: precmd: write error: interrupted
       [not found]     ` <130426080805.ZM18619__18102.73175729$1366989065$gmane$org@torch.brasslantern.com>
@ 2013-04-26 17:59       ` Yuri D'Elia
  0 siblings, 0 replies; 13+ messages in thread
From: Yuri D'Elia @ 2013-04-26 17:59 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-users

On 04/26/2013 05:08 PM, Bart Schaefer wrote:
> } What other syscalls would be interrupted by SIGWINCH that shouldn't be
> } restarted? Right now I cannot think of anything that SIGWINCH should
> } interrupt.
>
> I've been thinking about this, and the problem with using SA_RESTART is
> twofold:
>
> (1) [Minor] Some platforms don't have restartable syscalls, so this won't
> work everywhere.  But perhaps the intersection of non-restartable syscalls
> and support for SIGWINCH is empty.

(I'm somewhat curious which systems don't support SIGWINCH; they must be
particularly old.)

> (2) [Potentially major] A user-defined trap can be installed for the
> SIGWINCH signal.  That means arbitrary shell code might execute during
> handling of the signal, so all sorts of things might happen mid-write,
> not just the default ioctls.

I see, and now I also see your reasoning about queuing SIGWINCH 
everywhere except when waiting in zleread. At any rate, that's the only 
point where updating the terminal for the upcoming output makes sense. 
Updating the column count in the middle of a widget expansion, for
instance, is unlikely to help (and might even break some invariant).


