Date: Thu, 5 Nov 2020 22:36:17 -0500
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20201106033616.GX534@brightrain.aerifal.cx>
In-Reply-To: <20201031033117.GH534@brightrain.aerifal.cx>
Subject: Re: [musl] [PATCH v2] MT fork

On Fri, Oct 30, 2020 at 11:31:17PM -0400, Rich Felker wrote:
> On Fri, Oct 30, 2020 at 03:31:54PM -0600, Ariadne Conill wrote:
> > Hello,
> > 
> > On Friday, October 30, 2020 12:57:17 PM MDT Rich Felker wrote:
> > > There was a regression in musl too, I think. With
> > > 27b2fc9d6db956359727a66c262f1e69995660aa you should be able to
> > > re-enable parallel mark. If you get a chance to test, let us know if
> > > it works for you.
> > 
> > I have pushed current musl git plus the MT fork patch to Alpine edge as Alpine
> > musl 1.2.2_pre0, and reenabling parallel mark has worked fine.
> > 
> > It would be nice to have a musl 1.2.2 release that I can use for the source
> > tarball instead of a git snapshot, but this will do for now.
> 
> Thanks for the feedback. I'll try to wrap up this release cycle pretty
> quickly now, since I know this has been a big stress for distros, but
> I want to make sure the MT-fork doesn't introduce other breakage.
> 
> One thing I know is potentially problematic is interaction with malloc
> replacement -- locking of any of the subsystems locked at fork time
> necessarily takes place after application atfork handlers, so if the
> malloc replacement registers atfork handlers (as many do), it could
> deadlock. I'm exploring whether malloc use in these systems can be
> eliminated. A few are almost-surely better just using direct mmap
> anyway, but for some it's borderline. I'll have a better idea sometime
> in the next few days.

OK, here's a summary of the affected locks (where there's a lock order
conflict between them and application-replaced malloc):

- atexit: uses calloc to allocate more handler slots if the builtin 32
  are exhausted. Could reasonably be changed to just mmap a whole page
  of slots in this case.

- dlerror: the lock is just for a queue of buffers to be freed on
  future calls, since they can't be freed at thread exit time because
  the calling context (a thread that's "already exited") is not valid
  to call application code from, and malloc might be replaced.
  One plausible solution here is getting rid of the free queue hack
  (and thus the lock) entirely, and instead calling libc's malloc/free
  via dlsym rather than using the potentially-replaced symbol. But
  this would not work for static linking (the same dlerror is used
  even though dlopen always fails; in the future it may work), so it's
  probably not a good approach. mmap is really not a good option here
  because of excessive memory usage. It's probably possible to just
  repeatedly unlock/relock around performing each free so that only
  one lock is held at once (rough sketch appended at the end of this
  mail).

- gettext: bindtextdomain calls calloc while holding the lock on the
  list of bindings. It could drop the lock, allocate, retake it,
  recheck for an existing binding, and free in that case, but this is
  undesirable because it introduces a dependency on free in
  static-linked programs. Otherwise, all memory gettext allocates is
  permanent. Because of this we could just mmap an area and bump
  allocate it, but that's wasteful because most programs will only use
  one tiny binding. We could also just leak on the rare possibility of
  concurrent binding allocations; the number of such leaks is bounded
  by nthreads*ndomains, and we could make it just nthreads by keeping
  and reusing abandoned ones.

- sem_open: a one-time calloc of the global semtab takes place with
  the lock held. On 32-bit archs this table is exactly 4k; on 64-bit
  it's 6k. So it seems very reasonable to just mmap instead of calloc
  (rough sketch appended at the end of this mail).

- timezone: the tz core allocates memory to remember the last-seen
  value of the TZ env var so it can react if it changes. Normally it's
  small, so perhaps we could just use a small (e.g. 32 byte) static
  buffer and replace it with a whole mmapped page if a value too large
  for that is seen.

Also, somehow I failed to find one of the important locks MT-fork
needs to be taking: locale_map.c has a lock for the records of mapped
locales. Allocation also takes place with it held, and for the same
reason as gettext it really shouldn't be changed to allocate
differently. It could possibly do the allocation without the lock
held, though, and leak it (or save it for reuse later if needed) when
another thread races to load the locale.

So anyway, it looks like there's some nontrivial work to be done here
in order to make the MT-fork not be a regression for replaced-malloc
usage... :(

Ideas for the above, and for keeping the solutions non-invasive, will
be very much welcome!

Rich
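
P.S. To make the dlerror idea concrete, here's a rough, untested
sketch of popping one queued buffer at a time so that the queue lock
is never held across the call to free(). The names and the use of a
pthread mutex are placeholders standing in for the real dlerror.c
internals, not the actual code:

#include <pthread.h>
#include <stdlib.h>

/* Buffers waiting to be freed, linked through their first bytes
 * (assumes each buffer is at least pointer-sized). */
static void *freebuf_queue;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

/* Defer freeing a buffer that can't be freed in the current context. */
static void queue_freebuf(void *p)
{
	pthread_mutex_lock(&queue_lock);
	*(void **)p = freebuf_queue;
	freebuf_queue = p;
	pthread_mutex_unlock(&queue_lock);
}

/* Free deferred buffers one at a time; the lock is dropped before
 * each free() so it is never held while entering a possibly-replaced
 * malloc, avoiding the lock-order conflict. */
static void drain_freebuf_queue(void)
{
	for (;;) {
		pthread_mutex_lock(&queue_lock);
		void *p = freebuf_queue;
		if (p) freebuf_queue = *(void **)p;
		pthread_mutex_unlock(&queue_lock);
		if (!p) break;
		free(p);
	}
}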
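
Similarly, a rough sketch of the mmap'd semtab idea for sem_open; the
entry layout, table size, and names below are approximations rather
than the actual sem_open.c definitions:

#include <semaphore.h>
#include <sys/mman.h>
#include <sys/types.h>

#define NSEMS 256 /* stand-in for the real named-semaphore limit */

/* Approximate shape of the global table of named semaphores. */
static struct {
	ino_t ino;
	sem_t *sem;
	int refcnt;
} *semtab;

/* One-time allocation of semtab, called with the table lock held.
 * An anonymous mmap gives zero-filled memory (like calloc) without
 * ever entering a possibly-replaced malloc, so there is no lock-order
 * issue. */
static int semtab_init(void)
{
	if (semtab) return 0;
	void *p = mmap(0, NSEMS * sizeof *semtab, PROT_READ|PROT_WRITE,
	               MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) return -1;
	semtab = p;
	return 0;
}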