From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: Deadlock when calling fflush/fclose in multiple threads
Date: Fri, 2 Nov 2018 11:33:52 -0400
Message-ID: <20181102153352.GH5150@brightrain.aerifal.cx>
References: <2018110213110009300613@kooiot.com> <20181102142915.GG5150@brightrain.aerifal.cx>

On Fri, Nov 02, 2018 at 03:15:51PM +0000, John Starks wrote:
> > I think we just have to move the __ofl_lock to the top of the
> > function, before FLOCK, and the __ofl_unlock to after the
> > fflush/close. Unfortunately this makes fclose much more serializing
> > than it was before, but I don't see any way to avoid it.
>
> Perhaps you could keep a global count of FILEs that are still being
> flushed after having been removed from the list. fflush could
> perform a futex wait on this becoming 0.

I think such an approach is plausible, but involves the kind of
complex and error-prone direct use of atomics I'm actively trying to
eliminate. The same could be done without low level hacks via clever
use of rwlocks or a mutex+condvar pair, but all of these involve
namespace-safety issues and a lot more code than should be introduced
into minimal static programs using stdio.

For what it's worth, the only consumers of the open file list that can
be executed more than once are fflush(NULL), fclose, and __ofl_add
(used by fopen, etc.).

Rich
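
For illustration only, the mutex+condvar variant of the counter idea
might look roughly like the sketch below. Every name in it
(pending_closes, close_begin, close_end, wait_for_pending_closes,
pending_close_count) is hypothetical and not part of musl, and it uses
the public pthread names, which is exactly the namespace-safety problem
mentioned above. It also assumes close_begin is called while the
open file list lock is still held, so that a concurrent flush-all
cannot slip in between the unlink and the increment.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical names throughout; none of these exist in musl. */
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pending_zero = PTHREAD_COND_INITIALIZER;
static int pending_closes; /* FILEs unlinked from the list, not yet flushed */

/* Closing thread: call right after unlinking the FILE from the list
 * (assumed to happen while the list lock is still held). */
void close_begin(void)
{
    pthread_mutex_lock(&pending_lock);
    pending_closes++;
    pthread_mutex_unlock(&pending_lock);
}

/* Closing thread: call once the final flush of that FILE is done. */
void close_end(void)
{
    pthread_mutex_lock(&pending_lock);
    assert(pending_closes > 0);
    if (--pending_closes == 0)
        pthread_cond_broadcast(&pending_zero); /* wake all waiters */
    pthread_mutex_unlock(&pending_lock);
}

/* Flush-all path: call after walking the list; blocks until every
 * in-flight close has finished flushing. */
void wait_for_pending_closes(void)
{
    pthread_mutex_lock(&pending_lock);
    while (pending_closes != 0)
        pthread_cond_wait(&pending_zero, &pending_lock);
    pthread_mutex_unlock(&pending_lock);
}

/* Snapshot of the counter, for inspection only. */
int pending_close_count(void)
{
    pthread_mutex_lock(&pending_lock);
    int n = pending_closes;
    pthread_mutex_unlock(&pending_lock);
    return n;
}
```

Even this small version pulls a mutex, a condition variable, and three
extra entry points into the close and flush paths, which gives a sense
of the code-size cost for a minimal static program.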