From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
In-Reply-To: <21e3f8eee50170d6fcd21c43384eba04@felloff.net>
References: <8FB7CBFD-7334-4F9F-8C71-571DEF9FAD31@ar.aichi-u.ac.jp>
 <21e3f8eee50170d6fcd21c43384eba04@felloff.net>
Date: Tue, 16 Feb 2016 21:16:27 +0000
Message-ID:
From: Charles Forsyth
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Content-Type: multipart/alternative; boundary=001a11469c6284482e052be9a584
Subject: Re: [9fans] file descriptor leak
Topicbox-Message-UUID: 8544f27a-ead9-11e9-9d60-3106f5b1d025

--001a11469c6284482e052be9a584
Content-Type: text/plain; charset=UTF-8

On 16 February 2016 at 18:01, <cinap_lenrek@felloff.net> wrote:

> and the parent proc doesnt need the fd to /dev/null, it could as well just
> open it in the child like:
>
> close(0); open("/dev/null", OREAD);
>

There's no harm in making and using a more general function, even in a
specific way, so that part's ok.
The caller just needs to play its part properly.

> after spending 5 minutes writing the code fixing all these issues mentiond
> above, i'll just throw it all away and delete the whole remounting logic
> for /net.alt in 9front.

It's often better to use the Erlang fail-fast ("just fail") and restart
approach for persistent services.

More important would be to look at /proc/N/fd on a failing system.
I've a feeling that the system/outside stuff isn't actually the problem,
since I've seen the diagnostic on a system that wasn't using /net.alt.
In that case, the problem (as I remember it) was that an Internet link
further on was down, so no messages got through to remote DNS, and file
descriptors were building up in slave processes waiting for replies on
/net/udp. Once the link was up, it went back to normal.
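
To make the /proc/N/fd check concrete, here is a minimal sketch in portable C (not Plan 9 C, so it can be tried anywhere). It assumes the fd file layout described in proc(3): the first line is the current directory, followed by one line per open descriptor, ending in the file name. The function name countfds is hypothetical, and file names containing spaces are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the open files listed in the text of a /proc/N/fd file whose
 * name starts with prefix. Assumed layout (per proc(3)): the first
 * line is the current directory; each later line describes one
 * descriptor, with the file name as the last space-separated field.
 * fdtext must be NUL-terminated.
 */
int
countfds(const char *fdtext, const char *prefix)
{
	const char *line, *end, *name;
	size_t plen;
	int n;

	assert(fdtext != NULL && prefix != NULL);
	plen = strlen(prefix);
	n = 0;

	/* skip the first line, which holds the current directory */
	line = strchr(fdtext, '\n');
	if(line == NULL)
		return 0;
	line++;

	while(*line != '\0'){
		end = strchr(line, '\n');
		if(end == NULL)
			end = line + strlen(line);
		/* the name is whatever follows the last space on the line */
		name = end;
		while(name > line && name[-1] != ' ')
			name--;
		/* ignore empty lines; compare only the leading plen bytes */
		if(end > line && (size_t)(end - name) >= plen
		&& strncmp(name, prefix, plen) == 0)
			n++;
		line = (*end == '\n') ? end + 1 : end;
	}
	return n;
}
```

On a failing system, one would read /proc/N/fd for each dns process into a buffer and call countfds(buf, "/net/udp"). A count that keeps growing while the upstream link is down would fit the explanation above.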

--001a11469c6284482e052be9a584--