Date: Fri, 31 Jul 2020 15:23:45 -0400
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] When to reclaim pages in __bin_chunk?

On Fri, Jul 31, 2020 at 07:22:16PM +0200, Markus Wichmann wrote:
> On Fri, Jul 31, 2020 at 08:17:02AM +0800, Zhao Zhengyu wrote:
> > Hello,
> >
> > When chunks are merged, we use "(curr_size + pre_size) ^ pre_size >
> > pre_size" to decide whether to reclaim. I think this may be related
> > to performance, but I can’t prove it. I want to know the reason.
> >
> > Thank you!
> >
> > Zhengyu
>
> I asked that same question a while ago. For one, this was in the old
> malloc code, which is no longer used by default. For two, the test
> checks whether adding the current size to the previous size rolls the
> sum over into a new power of two. Usually, curr_size will be small and
> pre_size will be large, so adding the two will not change much about
> the high bits of pre_size; the XOR then cancels those high bits and
> the result is smaller than pre_size. However, if the sum does roll
> over, it has one bit set that isn't in pre_size and is higher than any
> bit that is. The XOR can't cancel that bit, so the result of the XOR
> is greater than pre_size.
>
> Fundamentally, it is an optimized version of
> (a_clz(curr_size + pre_size) < a_clz(pre_size)).

Yes, it's a heuristic, at least approximately equivalent to "crossed
the next power-of-two size boundary", meant to limit the frequency of
madvise syscalls when a large free zone keeps getting expanded by
adjacent tiny frees. However, it does not work very well in practice,
and it doesn't even mitigate the possibility of continuous syscall
load when a repeated malloc/free cycle occurs right at a power-of-two
boundary.

mallocng handles this kind of thing much better by grouping same-sized
allocations and returning them as a group when all are freed, holding
back from doing so only if it has observed allocations of this size
"bouncing" (repeatedly creating and destroying the same group).

Rich
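
P.S. For anyone who wants to convince themselves of the equivalence
Markus describes, here is a minimal standalone check (not musl code;
hibit() is a portable stand-in for a clz-based highest-set-bit
comparison, with the direction of the inequality flipped accordingly):

#include <assert.h>
#include <stdio.h>

/* Index of the highest set bit of x (x must be nonzero). */
static int hibit(size_t x)
{
	int i = -1;
	while (x) { x >>= 1; i++; }
	return i;
}

int main(void)
{
	for (size_t pre = 1; pre < 4096; pre++)
		for (size_t cur = 1; cur < 4096; cur++) {
			/* the test used in the old malloc's __bin_chunk */
			int xor_trick = ((cur + pre) ^ pre) > pre;
			/* "the sum crossed a power-of-two boundary" */
			int crossed = hibit(cur + pre) > hibit(pre);
			assert(xor_trick == crossed);
		}
	puts("equivalent for all tested size pairs");
	return 0;
}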
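
P.P.S. Below is a toy model of the grouping idea only. The struct
layout, the names, and the bounce threshold are invented for
illustration; mallocng's real data structures and heuristics are
different.

#include <stdbool.h>
#include <stdlib.h>

/* A group of same-sized allocations whose backing memory is
 * returned all at once, when the last member is freed. */
struct group {
	size_t size_class;  /* size of every member */
	unsigned live;      /* members still allocated */
	unsigned bounces;   /* times a group of this size was torn
	                       down and immediately recreated */
	bool retained;      /* kept around instead of returned */
	void *mem;          /* backing storage for all members */
};

static void group_free_member(struct group *g)
{
	if (--g->live) return;      /* group still partially in use */
	if (g->bounces > 1) {       /* observed "bouncing": hold back */
		g->retained = true;
		return;
	}
	free(g->mem);               /* return the whole group at once */
	g->mem = 0;
}

int main(void)
{
	/* One group of four 64-byte slots; freeing the last member
	 * releases the group's backing memory in a single operation. */
	struct group g = { .size_class = 64, .live = 4, .mem = malloc(4*64) };
	for (int i = 0; i < 4; i++)
		group_free_member(&g);
	return 0;
}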