zsh-workers
 help / color / mirror / code / Atom feed
* echo > * and EMFILE
@ 2007-02-12 22:32 Stephane Chazelas
  2007-02-13 10:53 ` Peter Stephenson
  0 siblings, 1 reply; 5+ messages in thread
From: Stephane Chazelas @ 2007-02-12 22:32 UTC (permalink / raw)
  To: Zsh hackers list

Hi guys,

$ touch {1..2000}
$ : > *
$ echo *(L1) | wc
  1    1007    4921

zsh obviously couldn't open that many files, which is normal,
the number of files a process can open at the same time is
generally limited.

But the problem here is that zsh didn't output any error
message. So that the user might think he erased the content of
every file when actually he didn't.

Also:

$ zsh -c 'zmodload zsh/net/tcp; for f ({1..1020}) ztcp -l $((f+2000))'
ztcp: could not bind to port 40968: address already in use

strace gives:

bind(3, {sa_family=AF_INET, sin_port=htons(2208), sin_addr=inet_addr("0.0.0.0")}, 16) = -1 EADDRINUSE (Address already in use)
[...]
write(2, "ztcp: could not bind to port 409"..., 59ztcp: could not bind to port 40968: address already in use

There is someone indeed already listening on 2208, but not on 40968.

strace also gives:

bind(3, {sa_family=AF_INET, sin_port=htons(3014), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3015), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3016), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3017), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3018), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3019), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)
bind(3, {sa_family=AF_INET, sin_port=htons(3020), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl64(3, F_DUPFD, 10)                 = -1 EMFILE (Too many open files)

but no error message.

I get similar problems with zsocket.

Cheers,
Stéphane


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: echo > * and EMFILE
  2007-02-12 22:32 echo > * and EMFILE Stephane Chazelas
@ 2007-02-13 10:53 ` Peter Stephenson
  2007-02-13 17:11   ` Bart Schaefer
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2007-02-13 10:53 UTC (permalink / raw)
  To: Zsh hackers list

Stephane Chazelas wrote:
> Hi guys,
> 
> $ touch {1..2000}
> $ : > *
> $ echo *(L1) | wc
>   1    1007    4921
> 
> zsh obviously couldn't open that many files, which is normal,
> the number of files a process can open at the same time is
> generally limited.
> 
> But the problem here is that zsh didn't output any error
> message. So that the user might think he erased the content of
> every file when actually he didn't.

There are lots and lots of unchecked system calls; it would be very
useful to handle them better.  Fixing them up as well as consistently
handling the errors without, for example, file descriptor leaks needs a
lot of work.  (The fact we don't handle errors well at the moment is
probably actually a bonus from the latter point of view: we're just
closing a lot of file descriptors that are -1, which is useless but
harmless.)

> Also:
> 
> $ zsh -c 'zmodload zsh/net/tcp; for f ({1..1020}) ztcp -l $((f+2000))'
> ztcp: could not bind to port 40968: address already in use

The error message doesn't switch from network to host order; that's
easy to fix.

> I get similar problems with zsocket.

Which message is wrong?  There's no network order problem here, so it
must be something else.

Index: Src/Modules/tcp.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Modules/tcp.c,v
retrieving revision 1.41
diff -u -r1.41 tcp.c
--- Src/Modules/tcp.c	30 May 2006 22:35:03 -0000	1.41
+++ Src/Modules/tcp.c	13 Feb 2007 10:46:16 -0000
@@ -432,7 +432,7 @@
 	if (bind(sess->fd, (struct sockaddr *)&sess->sock.in, sizeof(struct sockaddr_in)))
 	{
 	    char buf[DIGBUFSIZE];
-	    convbase(buf, (zlong)lport, 10);
+	    convbase(buf, (zlong)ntohs(lport), 10);
 	    zwarnnam(nam, "could not bind to port %s: %e", buf, errno);
 	    tcp_close(sess);
 	    return 1;

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


To access the latest news from CSR copy this link into a web browser:  http://www.csr.com/email_sig.php

To get further information regarding CSR, please visit our Investor Relations page at http://ir.csr.com/csr/about/overview


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: echo > * and EMFILE
  2007-02-13 10:53 ` Peter Stephenson
@ 2007-02-13 17:11   ` Bart Schaefer
  2007-02-13 17:28     ` Peter Stephenson
  0 siblings, 1 reply; 5+ messages in thread
From: Bart Schaefer @ 2007-02-13 17:11 UTC (permalink / raw)
  To: Zsh hackers list

On Feb 13, 10:53am, Peter Stephenson wrote:
} Subject: Re: echo > * and EMFILE
}
} Stephane Chazelas wrote:
} > zsh obviously couldn't open that many files, which is normal,
} > 
} > But the problem here is that zsh didn't output any error
} > message.
} 
} There are lots and lots of unchecked system calls; it would be very
} useful to handle them better.

That may be true, but this is not a case of an unchecked system call.
movefd(), which actually makes the system call, does check the result
and do something appropriate, ending with returning -1 to propagate the
error to the calling scope.

Also, addfd() calls zerr() in some other circumstances, so I don't
think there's an inherent problem with calling it in this case too.

} Fixing them up as well as consistently handling the errors without,
} for example, file descriptor leaks needs a lot of work.

In fact I think the one place where addfd() does call zerr() is a file
descriptor leak, because zerr() has the side-effect of stopping the
command execution at that point and closemnodes() never gets called.

The next question is what is best from the user's point of view.
Suppose the example were something like

	print $SECONDS > *

If we call zerr() and closemnodes(), then an error message is printed
once but the command never actually executes, so the files that it was
possible to open are truncated to zero size.   If we call zwarn() and
not closemnodes(), then hundreds of error messages are printed, the
command goes ahead, some files get $SECONDS and some remain unchanged.
Which is preferable?

The patch for using zerr() would be something like this (line numbers
may be off, and the error remains a bit cryptic because all we know at
this point is the file descriptor number, not the file name):


Index: Src/exec.c
===================================================================
diff -c -r1.32 exec.c
--- Src/exec.c	1 Oct 2006 02:38:52 -0000	1.32
+++ Src/exec.c	13 Feb 2007 16:56:21 -0000
@@ -1632,11 +1674,24 @@
     } else {
 	if (mfds[fd1]->rflag != rflag) {
 	    zerr("file mode mismatch on fd %d", fd1);
+	    closemnodes(mfds);
 	    return;
 	}
 	if (mfds[fd1]->ct == 1) {	/* split the stream */
-	    mfds[fd1]->fds[0] = movefd(fd1);
-	    mfds[fd1]->fds[1] = movefd(fd2);
+	    int fdN = movefd(fd1);
+	    if (fdN < 0) {
+		zerr("multio failed for fd %d: %e", fd1, errno);
+		closemnodes(mfds);
+		return;
+	    }
+	    mfds[fd1]->fds[0] = fdN;
+	    fdN = movefd(fd2);
+	    if (fdN < 0) {
+		zerr("multio failed for fd %d: %e", fd2, errno);
+		closemnodes(mfds);
+		return;
+	    }
+	    mfds[fd1]->fds[1] = fdN;
 	    mpipe(pipes);
 	    mfds[fd1]->pipe = pipes[1 - rflag];
 	    redup(pipes[rflag], fd1);
@@ -1647,7 +1702,13 @@
 		int old = new - sizeof(int) * MULTIOUNIT;
 		mfds[fd1] = hrealloc((char *)mfds[fd1], old, new);
 	    }
-	    mfds[fd1]->fds[mfds[fd1]->ct++] = movefd(fd2);
+	    int fdN = movefd(fd2);
+	    if (fdN < 0) {
+		zerr("multio failed for fd %d: %e", fd2, errno);
+		closemnodes(mfds);
+		return;
+	    }
+	    mfds[fd1]->fds[mfds[fd1]->ct++] = fdN;
 	}
     }
     if (subsh_close >= 0 && fdtable[subsh_close] == FDT_UNUSED)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: echo > * and EMFILE
  2007-02-13 17:11   ` Bart Schaefer
@ 2007-02-13 17:28     ` Peter Stephenson
  2007-02-14  7:53       ` Bart Schaefer
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2007-02-13 17:28 UTC (permalink / raw)
  To: Zsh hackers list

Bart Schaefer wrote:
> The next question is what is best from the user's point of view.
> Suppose the example were something like
> 
> 	print $SECONDS > *
> 
> If we call zerr() and closemnodes(), then an error message is printed
> once but the command never actually executes, so the files that it was
> possible to open are truncated to zero size.   If we call zwarn() and
> not closemnodes(), then hundreds of error messages are printed, the
> command goes ahead, some files get $SECONDS and some remain unchanged.
> Which is preferable?

Probably the former.  Failing to open an output channel should in general
be a fatal error, at least for the command that's executing.  That seems
to be how it works currently:

% { echo foo; echo Done 1 >&2; } >/noaccess; echo Done 2
zsh: permission denied: /noaccess
Done 2

Multios have the inevitable property that a failure can't be atomic,
i.e. you can't necessarily have either the state before, or a success.
I think we just have to live with aborting as early as possible as a
bit better than getting further but still not doing what the user
asked.  In the case of a warning you don't even get a non-zero status,
so in a function or script---even a carefully written one designed to
bail out at the first sniff of an error---you might not notice until far
too late.  (How careful a script or function actually is if it's so
cavalier with multios is another question.)

However, what the user actually wants depends on the context, so it's
possible you can come up with cases where the other alternative is
better.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


To access the latest news from CSR copy this link into a web browser:  http://www.csr.com/email_sig.php

To get further information regarding CSR, please visit our Investor Relations page at http://ir.csr.com/csr/about/overview


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: echo > * and EMFILE
  2007-02-13 17:28     ` Peter Stephenson
@ 2007-02-14  7:53       ` Bart Schaefer
  0 siblings, 0 replies; 5+ messages in thread
From: Bart Schaefer @ 2007-02-14  7:53 UTC (permalink / raw)
  To: Zsh hackers list

On Feb 13,  5:28pm, Peter Stephenson wrote:
}
} Bart Schaefer wrote:
} > If we call zerr() and closemnodes(), then an error message is printed
} > once but the command never actually executes, so the files that it was
} > possible to open are truncated to zero size.   If we call zwarn() and
} > not closemnodes(), then hundreds of error messages are printed, the
} > command goes ahead, some files get $SECONDS and some remain unchanged.
} > Which is preferable?
} 
} Probably the former.

OK, I'll go ahead and commit the patch from 23169, along with one other
spot in addfd() where it's possible to detect movefd() failure.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-02-14  7:54 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-02-12 22:32 echo > * and EMFILE Stephane Chazelas
2007-02-13 10:53 ` Peter Stephenson
2007-02-13 17:11   ` Bart Schaefer
2007-02-13 17:28     ` Peter Stephenson
2007-02-14  7:53       ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).