From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: exit locking madness
Date: Wed, 26 Sep 2012 14:15:33 -0400
Message-ID: <20120926181533.GA26153@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

Some recent discussions regarding the way glibc and musl handle
flushing/closing files at exit suggest that both are probably buggy.
glibc basically ignores locking entirely, guaranteeing that exit will
not deadlock but possibly giving incorrect/corrupt output, and in the
worst cases even crashing with a race condition when a file is closed
(in another thread) while exiting.

musl does maximal locking: first the open file list is permanently
locked (causing any further fopen/fclose attempts to deadlock) and
then each file, one by one, is permanently locked and flushed. The
permanent locking ensures that no further io can be performed (and
thus lost) after flushing.

Unfortunately, there are some potential problems with musl's approach:

1. If any thread is holding a file lock with flockfile, or by being
   blocked in a stdio function such as fprintf, fwrite, fread, etc.
   that's waiting for input, exit is blocked, perhaps indefinitely.
   This seems bad, but it might actually be the behavior POSIX
   mandates. I've opened a request for interpretation:

   http://austingroupbugs.net/view.php?id=611

2. Even if no thread is doing abusive long-term locking, a thread may
   be holding a lock on f1 while attempting to obtain a lock on f2,
   while exit is holding a lock on f2 and attempting to obtain a lock
   on f1. Should exit attempt to work around this?

3. Likewise, a thread might be holding a short-term lock on a file f
   while attempting to open or close another file. This will deadlock
   since the open file list lock is permanently held by exit.

Here are some partial ideas for a fix:

Instead of permanently locking files, lock them temporarily, flush
them, then unbuffer them. In unbuffered mode, further io on the file
will _mostly_ avoid messing up the state exit is supposed to end with,
but it could become inconsistent (wrong final file offset) in the case
where fscanf or ungetc is used.

Switch to a special type of lock for the open file list, which has 3
modes: unlocked, locked, and perma-locked. When fopen/fclose detect
the perma-locked mode, they can take special action. fopen would
either always fail, or open an unbuffered file that's not linked in
the open file list.
fclose would do everything but unlinking the file from the open file
list and freeing it, and would set the file descriptor to -1 before
unlocking the file so that exit could not later attempt to operate on
the fd.

The only major issue I see which is not solved is #1, the long-term
locks issue. In this case, we'd like to be able to "steal" the lock
out from under a thread to flush whatever's in the buffer, then let
the thread go back to doing unbuffered io on its locked file. But I
see no clean way to do this, and depending on the interpretation of
the standard, it might not even be valid to do so (since presumably
exit must behave "as if" it called flockfile, since it references the
file).

Ideas?

Rich
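
For concreteness, the three-mode lock proposed above for the open file
list could be sketched roughly as follows. This is only an
illustration under stated assumptions, not musl code: the names
ofl_lock, ofl_lock_normal, ofl_unlock and ofl_permalock are made up
here, and a plain pthread mutex plus a flag stands in for whatever
primitive libc would really use internally.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical three-mode lock for the open file list.
 * Modes: unlocked, locked (mutex held), perma-locked (perma == true). */
struct ofl_lock {
    pthread_mutex_t mtx;
    bool perma;  /* set once by exit, never cleared */
};

#define OFL_LOCK_INIT { PTHREAD_MUTEX_INITIALIZER, false }

/* Ordinary acquisition, as fopen/fclose would use before touching the
 * list. Returns 1 with the lock held, or 0 without the lock if exit
 * has perma-locked the list, so the caller can take its special path
 * instead of blocking forever. */
static int ofl_lock_normal(struct ofl_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    if (l->perma) {
        pthread_mutex_unlock(&l->mtx);
        return 0;
    }
    return 1;
}

/* Release after a successful ofl_lock_normal. */
static void ofl_unlock(struct ofl_lock *l)
{
    assert(!l->perma);  /* perma is only set while no one holds the lock */
    pthread_mutex_unlock(&l->mtx);
}

/* Called once by exit. Waits for any current holder, then flips the
 * list into perma-locked mode. From then on other threads never modify
 * the list, so exit can walk it without holding the mutex. */
static void ofl_permalock(struct ofl_lock *l)
{
    pthread_mutex_lock(&l->mtx);
    l->perma = true;
    pthread_mutex_unlock(&l->mtx);
}
```

With something like this, fopen would call ofl_lock_normal and, on a 0
return, either fail or hand back an unbuffered file that is not linked
into the list; fclose would, on a 0 return, skip the unlink and free
steps. It does nothing for issue #1, the long-term per-file locks.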