From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,HTML_FONT_FACE_BAD,HTML_MESSAGE, MAILING_LIST_MULTI,RCVD_IN_MSPIKE_H2,T_KAM_HTML_FONT_INVALID autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 16270 invoked from network); 20 Sep 2022 05:56:25 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 20 Sep 2022 05:56:25 -0000 Received: (qmail 30449 invoked by uid 550); 20 Sep 2022 05:56:23 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 30416 invoked from network); 20 Sep 2022 05:56:21 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:mime-version:references:subject:cc:to:from:date:from:to :cc:subject:date; bh=rt6Lo3AV/SHzcvRdrRHsvbWl+CjAz2gle9rMSmsUFrM=; b=USTrpqavIZH1WtdnyFlPGkjr9Q1ZT5hp8OHEwcKROtUaKtUjApykuPEj+Sifs6hwyJ 4mIAxG3FELbUzOSQxRXt9YOFt+d3a90Wq6GfH75Y+3+zF3sx7WmuKUJBa4WRMi0nsdsb bniCil9GKrfoEHRgwK8sHxuFRzZtZlPJ6Xyoj7GY4l/IZZk7sSSSbR5mVLZgjuDbpjWk +ioHkMWXMgq9xdBFc64T3/eY7YRCKKSGRnb2t8kTNP4vNagBJyozTVbT1yXQhrM20ARn MwzLlyTyPx3jD6O5loWNsGG2Wr645jvH2cm7x9srBGCQh+NMBlZOQ8wiEFN+wVCRYENk MPag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=message-id:mime-version:references:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date; bh=rt6Lo3AV/SHzcvRdrRHsvbWl+CjAz2gle9rMSmsUFrM=; b=pMAcdLokzLFKkXY5DXTKbqfSDWNwgAo/uyhwV5zOYwXi6hgfIWFLTtV4HosQOtoaP6 qBQ/v2EBnY4QJXmI9ysY5/ApCC/4YawZRMvda+76zV8z3cECq/rVCp9SnhnOTcdy+mfW jaMrYXMRtg7Oy8iAShRVOcwzwDViPyj8zZOMx1wjQyqWdc0hjKql3DSzoqdmBGrjwPAh oJz2QxXaBkk17I39Motkbv9oXEfEi9zFS79VvycyjCFnyf+SQUmiK/n39L34rMGgwVtg 
DjV81RihyT934wvB/sitLwV4cs1aaEVEZ9+GQndOWA1xEHSabBrjqqS22hW8dLDZrMeK Z+sw== X-Gm-Message-State: ACrzQf17HuoWDu5iMoa5T9BvVXs66Vb8N9GaWsaaIq1+UDo5AlNKi2Ga 0YBFEVfR6XGK5Nf7+SDCHOU= X-Google-Smtp-Source: AMsMyM7JIocfACe44/CcshK+BGrLrPQgSrRa6MnEpaoHVqrFioW0lBYXRnmW9KwuU0iAi5r8kbB+mQ== X-Received: by 2002:a63:1304:0:b0:439:ac9b:34af with SMTP id i4-20020a631304000000b00439ac9b34afmr18610896pgl.464.1663653369608; Mon, 19 Sep 2022 22:56:09 -0700 (PDT) Date: Tue, 20 Sep 2022 13:56:12 +0800 From: baiyang To: "Rich Felker" Cc: musl References: , <2022092008254998320584@gmail.com>, <20220920003811.GF9709@brightrain.aerifal.cx>, <2022092008470636285288@gmail.com>, <20220920010056.GG9709@brightrain.aerifal.cx>, <2022092009180277847194@gmail.com>, <20220920021511.GH9709@brightrain.aerifal.cx>, <20220920103500598557106@gmail.com>, <20220920032806.GI9709@brightrain.aerifal.cx>, <20220920115350521974120@gmail.com>, <20220920054149.GK9709@brightrain.aerifal.cx> X-Priority: 3 X-GUID: F066E303-0116-4DC5-BBDC-560AEA8FA8BF X-Has-Attach: no X-Mailer: Foxmail 7.2.23.116[cn] Mime-Version: 1.0 Message-ID: <20220920135610661572125@gmail.com> Content-Type: multipart/alternative; boundary="----=_001_NextPart026326884663_=----" Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1) This is a multi-part message in MIME format. 
------=_001_NextPart026326884663_=---- Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 PiAvLyAgICAgVGhpcyBtdWx0aS10aHJlYWRlZCBhY2Nlc3MgdG8gdGhlIHBhZ2VtYXAgaXMgc2Fm ZSBmb3IgZmFpcmx5DQo+IC8vICAgICBzdWJ0bGUgcmVhc29ucy4gIFdlIGJhc2ljYWxseSBhc3N1 bWUgdGhhdCB3aGVuIGFuIG9iamVjdCBYIGlzDQo+IC8vICAgICBhbGxvY2F0ZWQgYnkgdGhyZWFk IEEgYW5kIGRlYWxsb2NhdGVkIGJ5IHRocmVhZCBCLCB0aGVyZSBtdXN0DQo+IC8vICAgICBoYXZl IGJlZW4gYXBwcm9wcmlhdGUgc3luY2hyb25pemF0aW9uIGluIHRoZSBoYW5kb2ZmIG9mIG9iamVj dA0KPiAvLyAgICAgWCBmcm9tIHRocmVhZCBBIHRvIHRocmVhZCBCLg0KDQpUaGFua3MgZm9yIHlv dXIgaW5mb3JtYXRpb24uDQpJIGZlZWwgdGhpcyBhc3N1bXB0aW9uIGlzIHZlcnkgcmVhc29uYWJs ZTogeW91IGNhbid0IGhhdmUgb25lIHRocmVhZCBkb2luZyAiZnJlZShwKSIgd2hpbGUgYW5vdGhl ciB0aHJlYWQgaXMgYWNjZXNzaW5nIHRoZSBibG9jayBwb2ludGVkIHRvIGJ5IHAgd2l0aG91dCBh bnkgc3luY2hyb25pemF0aW9uIG1lY2hhbmlzbSBhdCB0aGUgc2FtZSB0aW1lLiANCg0KPiBidXQg ZWl0aGVyIHdheSB0aGF0J3Mgbm90IGNvbXBhdGlibGUgd2l0aCBzbWFsbC1tZW1vcnktc3BhY2Ug c3lzdGVtcyBvciB3aXRoIG5vbW11Lg0KT0ssIHRoYXQncyByZWFzb25hYmxlLiBBbmQgYWdhaW4s IHRoYW5rcyBmb3IgeW91ciBwYXRpZW5jZSBhbmQgdGltZSA6LUQNCg0KLS0NCg0KICAgQmVzdCBS ZWdhcmRzDQogIEJhaVlhbmcNCiAgYmFpeWFuZ0BnbWFpbC5jb20NCiAgaHR0cDovL2kuYmFpeS5j bg0KKioqKiA8IEVORCBPRiBFTUFJTCA+ICoqKiogDQogDQogDQpGcm9tOiBSaWNoIEZlbGtlcg0K RGF0ZTogMjAyMi0wOS0yMCAxMzo0MQ0KVG86IGJhaXlhbmcNCkNDOiBtdXNsDQpTdWJqZWN0OiBS ZTogUmU6IFttdXNsXSBUaGUgaGVhcCBtZW1vcnkgcGVyZm9ybWFuY2UgKG1hbGxvYy9mcmVlL3Jl YWxsb2MpIGlzIHNpZ25pZmljYW50bHkgZGVncmFkZWQgaW4gbXVzbCAxLjIgKGNvbXBhcmVkIHRv IDEuMSkNCk9uIFR1ZSwgU2VwIDIwLCAyMDIyIGF0IDExOjUzOjUyQU0gKzA4MDAsIGJhaXlhbmcg d3JvdGU6DQo+ID4gVGhlIG9uZXMgdGhhdCByZXR1cm4gc29tZSB2YWx1ZSBsYXJnZXIgdGhhbiB0 aGUgcmVxdWVzdGVkIHNpemUgYXJlDQo+ID4gcmV0dXJuaW5nICJ0aGUgcmVxdWVzdGVkIHNpemUs IHJvdW5kZWQgdXAgdG8gYSBtdWx0aXBsZSBvZiAxNiIgb3INCj4gPiBzaW1pbGFyLiBOb3QgInRo ZSByZXF1ZXN0ZWQgc2l6ZSBwbHVzIDE1MDAgYnl0ZXMiLiANCj4gLi4uDQo+ID4gVGhleSBkb24n dCByZXR1cm4gODEwMC4gVGhleSByZXR1cm4gc29tZXRoaW5nIGxpa2UgNjYwOCBvciA2NjI0Lg0K 
PiANCj4gTm8sIEFGQUlLLCBUaGVyZSBhcmUgbWFueSBhbGxvY2F0b3JzIHdob3NlIHJldHVybiB2 YWx1ZSBvZg0KPiBtYWxsb2NfdXNhYmxlX3NpemUgaXMgMUtCIChvciBtb3JlKSBsYXJnZXIgdGhh biB0aGUgcmVxdWVzdGVkIHZhbHVlDQo+IGF0IG1hbGxvYyB0aW1lLg0KPiBGb3IgRXhhbXBsZTog aWYgeW91IGRvICJ2b2lkKiBwID0gbWFsbG9jKDY3MDApIiBvbiB0Y21hbGxvYywgdGhlbg0KPiAi bWFsbG9jX3VzYWJsZV9zaXplKHApIiB3aWxsIHJldHVybiAqKjgxOTIqKi4gRmFyIG1vcmUgdGhh biBqdXN0DQo+ICJyb3VuZGVkIHVwIHRvIGEgbXVsdGlwbGUgb2YgMTYiLg0KIA0KT0ssIHRoYW5r cyBmb3IgY2hlY2tpbmcgYW5kIGNvcnJlY3RpbmcuDQogDQo+ID4gVGhpcyBkb2VzIG5vdCBmb2xs b3cgYXQgYWxsLiB0Y21hbGxvYyBpcyBmYXN0IGJlY2F1c2UgaXQgZG9lcyBub3QgaGF2ZQ0KPiA+ IGdsb2JhbCBjb25zaXN0ZW5jeSwgZG9lcyBub3QgaGF2ZSBhbnkgbm90YWJsZSBoYXJkZW5pbmcs IGFuZCAoc2VlIHRoZQ0KPiA+IG5hbWUpIGtlZXBzIGxhcmdlIG51bWJlcnMgb2YgZnJlZWQgc2xv dHMgKmNhY2hlZCogdG8gcmV1c2UsIHRoZXJlYnkNCj4gPiB1c2luZyBsb3RzIG9mIGV4dHJhIG1l bW9yeS4gSXRzIG1hbGxvY191c2FibGVfc2l6ZSBpcyBub3QgZmFzdCBiZWNhdXNlDQo+ID4gb2Yg cmV0dXJuaW5nIHRoZSB3cm9uZyB2YWx1ZSwgaWYgaXQgZXZlbiBkb2VzIHJldHVybiB0aGUgd3Jv bmcgdmFsdWUNCj4gPiAoSSBoYXZlIG5vIGlkZWEpLiANCj4gDQo+IFdlIGRvbid0IG5lZWQgdG8g cmVmZXIgdG8gdGhlc2UgZmVhdHVyZXMgb2YgdGNtYWxsb2MsIHdlIG9ubHkgbmVlZA0KPiB0byBy ZWZlciB0byBpdHMgbWFsbG9jX3VzYWJsZV9zaXplIGFsZ29yaXRobS4NCiANClRob3NlIChtaXMp ZmVhdHVyZXMgYXJlIHdoYXQgcHJvdmlkZSBhIGZhc3QgcGF0aCBoZXJlLCByZWdhcmRsZXNzIG9m DQp3aGV0aGVyIHlvdSBjYXJlIGFib3V0IHRoZW0uDQogDQo+ID4gSXQncyBmYXN0IGJlY2F1c2Ug dGhleSBzdG9yZSB0aGUgc2l6ZSBpbi1iYW5kIHJpZ2h0DQo+ID4gbmV4dCB0byB0aGUgYWxsb2Nh dGVkIG1lbW9yeSBhbmQgdHJ1c3QgdGhhdCBpdCdzIHZhbGlkLCByYXRoZXIgdGhhbg0KPiA+IGNv bXB1dGluZyBpdCBmcm9tIG91dC1vZi1iYW5kIG1ldGFkYXRhIHRoYXQgaXMgbm90IHN1YmplY3Qg dG8NCj4gPiBmYWxzaWZpY2F0aW9uIHVubGVzcyB0aGUgYXR0YWNrZXIgYWxyZWFkeSBoYXMgbmVh cmx5IGZ1bGwgY29udHJvbCBvZg0KPiA+IGV4ZWN1dGlvbi4NCj4gDQo+IE5vLCBpZiBJIHVuZGVy c3RhbmQgY29ycmVjdGx5LCB0Y21hbGxvY2UgZG9lc24ndCBzdG9yZSB0aGUgc2l6ZQ0KPiBpbi1i YW5kIHJpZ2h0IG5leHQgdG8gdGhlIGFsbG9jYXRlZCBtZW1vcnkuIE9uIHRoZSBjb250cmFyeSwg 
d2hlbg0KPiBleGVjdXRpbmcgbWFsbG9jX3VzYWJsZV9zaXplKHApIChhY3R1YWxseSBHZXRTaXpl KHApKSwgaXQgd2lsbCBmaXJzdA0KPiBmaW5kIHRoZSBzaXplIGNsYXNzIGNvcnJlc3BvbmRpbmcg dG8gcCB0aHJvdWdoIGEgcXVpY2sgbG9va3VwIHRhYmxlLA0KPiBhbmQgdGhlbiByZXR1cm4gdGhl IGxlbmd0aCBvZiB0aGUgc2l6ZSBjbGFzcy4gU2VlOg0KPiBodHRwczovL2dpdGh1Yi5jb20vZ29v Z2xlL3RjbWFsbG9jL2Jsb2IvOTE3OWJiODg0ODQ4YzMwNjE2NjY3YmExMjliY2Y5YWZlZTExNGMz Mi90Y21hbGxvYy90Y21hbGxvYy5jYyNMMTA5OQ0KIA0KT0ssIEkgd2FzIGNvbmZ1c2luZyB0Y21h bGxvYyB3aXRoIHRoZSBtb3JlIGNvbnZlbnRpb25hbCAidGhyZWFkLWxvY2FsDQpmcmVlbGlzdCBj YWNoaW5nIG9uIHRvcCBvZiBkbG1hbGxvYyB0eXBlIGJhc2UiIGFsbG9jYXRvciBzdHJhdGVneS4N CkluZGVlZCB0Y21hbGxvYyBob3dldmVyIGlzIG9uZSBvZiB0aGUgZ2lnYW50aWMgb25lcy4NCiAN Cj4gTXkgdW5kZXJzdGFuZGluZzogdGhlIGJpZ2dlc3QgaW1wZWRpbWVudCB0byBvdXIgaW5hYmls aXR5IHRvIGFwcGx5DQo+IHNpbWlsYXIgb3B0aW1pemF0aW9ucyBpcyB0aGF0IHdlIGhhdmUgdG8g cmV0dXJuIDY3MDAsIG5vdCA4MTkyIChvZg0KPiBjb3Vyc2UsIHlvdSd2ZSBkZW5pZWQgdGhpcyBp cyB0aGUgcmVhc29uKS4NCiANCllvdXIgdW5kZXJzdGFuZGluZyBpcyB3cm9uZy4gSSd2ZSB0b2xk IHlvdSBob3cgeW91IGNhbiBtZWFzdXJlIHRoYXQNCml0J3Mgd3JvbmcuIFlvdSBpbnNpc3Qgb24g YmVpbmcgc3R1Y2sgb24gaXQgZm9yIG5vIGdvb2QgcmVhc29uLg0KIA0KSWYgeW91IHdhbnQgdG8g dW5kZXJzdGFuZCAqd2h5KiB0Y21hbGxvYyBpcyBkaWZmZXJlbnQsIHN0YXJ0IHdpdGggdGhlDQpj b21tZW50cyBhdCB0aGUgdG9wIG9mIHRoZSBmaWxlIHlvdSBsaW5rZWQ6DQogDQo+IC8vICA0LiBU aGUgcGFnZW1hcCAod2hpY2ggbWFwcyBmcm9tIHBhZ2UtbnVtYmVyIHRvIGRlc2NyaXB0b3IpLA0K PiAvLyAgICAgY2FuIGJlIHJlYWQgd2l0aG91dCBob2xkaW5nIGFueSBsb2NrcywgYW5kIHdyaXR0 ZW4gd2hpbGUgaG9sZGluZw0KPiAvLyAgICAgdGhlICJwYWdlaGVhcF9sb2NrIi4NCj4gLy8NCj4g Ly8gICAgIFRoaXMgbXVsdGktdGhyZWFkZWQgYWNjZXNzIHRvIHRoZSBwYWdlbWFwIGlzIHNhZmUg Zm9yIGZhaXJseQ0KPiAvLyAgICAgc3VidGxlIHJlYXNvbnMuICBXZSBiYXNpY2FsbHkgYXNzdW1l IHRoYXQgd2hlbiBhbiBvYmplY3QgWCBpcw0KPiAvLyAgICAgYWxsb2NhdGVkIGJ5IHRocmVhZCBB IGFuZCBkZWFsbG9jYXRlZCBieSB0aHJlYWQgQiwgdGhlcmUgbXVzdA0KPiAvLyAgICAgaGF2ZSBi ZWVuIGFwcHJvcHJpYXRlIHN5bmNocm9uaXphdGlvbiBpbiB0aGUgaGFuZG9mZiBvZiBvYmplY3QN 
Cj4gLy8gICAgIFggZnJvbSB0aHJlYWQgQSB0byB0aHJlYWQgQi4NCiANClRoaXMgaXMgdGhlIGtp bmQgb2YgdGhpbmcgSSBtZWFuIGJ5IGxhY2sgb2YgZ2xvYmFsIGNvbnNpc3RlbmN5IChubw0Kc3lu Y2hyb25pemF0aW9uIGFyb3VuZCBhY2Nlc3MgdG8gdGhlc2UgZGF0YSBzdHJ1Y3R1cmVzKSBhbmQg bGFjayBvZg0KYW55IG1lYW5pbmdmdWwgaGFyZGVuaW5nICgqYXNzdW1pbmcqIG5vIG1lbW9yeSBs aWZldGltZSB1c2FnZSBlcnJvcnMNCmluIHRoZSBjYWxsaW5nIGFwcGxpY2F0aW9uKS4NCiANClRo ZSBHZXRTaXplIGZ1bmN0aW9uIHlvdSBjaXRlZCB1c2VzIHRoaXMgZ2xvYmFsIHBhZ2VtYXAgdG8g Z28gc3RyYWlnaHQNCmZyb20gYSBwYWdlIGFkZHJlc3MgdG8gYSBzaXplY2xhc3MsIHZpYSB3aGF0 IGFtb3VudHMgdG8gYSB0d28tbGV2ZWwgb3INCnRocmVlLWxldmVsIHRhYmxlIGluZGV4ZWQgYnkg dXBwZXIgYml0cyBvZiB0aGUgYWRkcmVzcyAoY29tbWVudCBzYXlzDQozLWxldmVsIGlzIG9ubHkg dXNlZCBpbiBzbG93ZXIgYnV0IGxvd2VyLW1lbS11c2UgY29uZmlndXJhdGlvbikuIFRoZXNlDQp0 YWJsZXMsIGF0IGxlYXN0IGluIHRoZSAyLWxldmVsIGZvcm0sIGFyZSB1dHRlcmx5ICptYXNzaXZl Ki4gSSdtIG5vdA0Kc3VyZSBpZiBpdCBjcmVhdGVzIHRoZW0gUFJPVF9OT05FIGFuZCB0aGVuIG9u bHkgaW5zdGFudGlhdGVzIHJlYWwNCm1lbW9yeSBmb3IgdGhlIChpbml0aWFsbHkgZmFpcmx5IHNw YXJzZSkgcGFydHMgdGhhdCBnZXQgdXNlZCwgb3IgaWYgaXQNCmp1c3QgYWxsb2NhdGVzIHRoZXNl IGdpYW50IHRoaW5ncyByZWx5aW5nIG9uIG92ZXJjb21taXQsIGJ1dCBlaXRoZXINCndheSB0aGF0 J3Mgbm90IGNvbXBhdGlibGUgd2l0aCBzbWFsbC1tZW1vcnktc3BhY2Ugc3lzdGVtcyBvciB3aXRo DQpub21tdS4NCiANCk9uIHRvcCBvZiB0aGF0LCB0aGlzIGFwcHJvYWNoIHJlbGllcyBvbiBsYXlp bmcgb3V0IHdob2xlIHBhZ2VzIChsaWtlbHkNCmxhcmdlIHNsYWJzIG9mIG1hbnkgcGFnZXMgYXQg YSB0aW1lKSBvZiBpZGVudGljYWwtc2l6ZWQgb2JqZWN0cyBzbw0KdGhhdCB0aGUgc2l6ZSBhbmQg b3RoZXIgcHJvcGVydGllcyBjYW4gYmUgbG9va2VkIHVwIGJ5IHBhZ2UgbnVtYmVyLiBJDQpoYXZl IG5vdCBsb29rZWQgaW50byB0aGUgZGV0YWlscyBvZiAiaG93IGJhZCIgaXQgZ2V0cywgYnV0IGl0 DQpjb21wbGV0ZWx5IHByZWNsdWRlcyBoYXZpbmcgYW55IHNtYWxsIHByb2Nlc3NlcywgYW5kIHBy ZWNsdWRlcw0KcHJvbXB0bHkgcmV0dXJuaW5nIGZyZWVkIG1lbW9yeSB0byB0aGUgc3lzdGVtLCBz aW5jZSAqY2hhbmdpbmcqIHRoZQ0KcGFnZW1hcCBpcyBnb2luZyB0byBiZSBjb3N0bHkgYW5kIHRo ZXkncmUgZ29pbmcgdG8gYXZvaWQgZG9pbmcgaXQNCihub3RlIHRoZSBhYm92ZSBjb21tZW50IG9u 
IGxvY2tpbmcpLg0KIA0KbWFsbG9jbmcgZG9lcyBub3QgaGF2ZSBhbnkgZ2xvYmFsIG1hcHBpbmcg b3B0aW1pemluZyB0cmFuc2xhdGlvbiBmcm9tDQphZGRyZXNzZXMgdG8gZ3JvdXBzL21ldGFkYXRh IG9iamVjdHMuIEJlY2F1c2Ugd2UgaW5zaXN0IG9uIGdsb2JhbA0KY29uc2lzdGVuY3kgKGEgcHJl cmVxdWlzaXRlIGZvciBiZWluZyBhYmxlIHRvIGRlZmluZSBzdHJvbmcgaGFyZGVuaW5nDQpwcm9w ZXJ0aWVzKSBhbmQgb24gYmVpbmcgYWJsZSB0byByZXR1cm4gZnJlZWQgbWVtb3J5IHByb21wdGx5 IHRvIHRoZQ0Kc3lzdGVtLCBtYWludGFpbmluZyBzdWNoIGEgZGF0YSBzdHJ1Y3R1cmUgd291bGQg Y29zdCBhIGxvdCBtb3JlIHRpbWUNCihwZXJmb3JtYW5jZSkgdGhhbiBhbnl0aGluZyBpdCBjb3Vs ZCBnaXZlLCBhbmQgaXQgd291bGQgbWFrZSBsb2NrLWZyZWUNCm9wZXJhdGlvbnMgKGxpa2UgeW91 ciBtYWxsb2NfdXNhYmxlX3NpemUsIG9yIHRyaXZpYWwgcmVhbGxvYyBjYWxscykNCnBvdGVudGlh bGx5IHJlcXVpcmUgbG9ja2luZy4NCiANCkluc3RlYWQgb2YgdXNpbmcgdGhlIG51bWVyaWMgdmFs dWUgb2YgdGhlIGFkZHJlc3MgdG8gbWFwIHRvIG1ldGFkYXRhLA0Kd2UgY2hhc2Ugb2Zmc2V0cyBm cm9tIHRoZSBvYmplY3QgYmFzZSBhZGRyZXNzIHRvIHRoZSBtZXRhZGF0YSwgdGhlbg0KdmFsaWRh dGUgdGhhdCBpdCByb3VuZC10cmlwcyBiYWNrIHRvIGNvbmNsdWRlIHRoYXQgd2UgZGlkbid0IGp1 c3QNCmZvbGxvdyByYW5kb20ganVuayBmcm9tIHRoZSBjYWxsZXIgcGFzc2luZyBhbiBpbnZhbGlk L2RhbmdsaW5nIHBvaW50ZXINCmluLCBvciBmcm9tIHRoaXMgZGF0YSBiZWluZyBvdmVyd3JpdHRl biB2aWEgaGVhcC1iYXNlZCBidWZmZXINCm92ZXJmbG93cy4NCiANCkZ1bmRhbWVudGFsbHksIHRo aXMgcG9pbnRlciBjaGFzaW5nIGlzIGdvaW5nIHRvIGJlIGEgbGl0dGxlIGJpdCBtb3JlDQpleHBl bnNpdmUgdGhhbiBqdXN0IHVzaW5nIGFkZHJlc3MgYml0cyBhcyB0YWJsZSBpbmRpY2VzLCBidXQg bm90DQpyZWFsbHkgYWxsIHRoYXQgbXVjaC4gQXQgbGVhc3QgaGFsZiBvZiB0aGUgY29zdCBkaWZm ZXJlbmNlLCBhbmQNCnByb2JhYmx5IGEgbG90IG1vcmUsIGlzIG5vdCB0aGUgcG9pbnRlci9vZmZz ZXQgY2hhc2luZyBidXQgdGhlDQp2YWxpZGF0aW9uIChoYXJkZW5pbmcpLiBJZiBoeXBvdGhldGlj YWxseSB5b3Ugd2FudGVkIHRvIHR1cm4gdGhhdCBhbGwNCm9mZiAoZS5nLiBieSBkZWZpbmluZyB0 aGUgYXNzZXJ0IG1hY3JvIHRvIGEgbm8tb3ApIHlvdSBjb3VsZCBoYXZlIGl0DQpiZSBhIGxvdCBm YXN0ZXIsIGFuZCBzdGlsbCBoYXZlIGxvdyBtZW1vcnkgdXNhZ2UgdG9vLiBJJ20gbm90IHN1cmUg YnV0DQpmb3Igc2luZ2xlLXRocmVhZGVkIGxvYWRzIEkgd291bGQgbm90IGJlIHN1cnByaXNlZCBp 
ZiBpdCB3ZXJlIGdldHRpbmcNCmNsb3NlIHRvIHRjbWFsbG9jIHNwZWVkLiBKdXN0IGNhc3VhbGx5 IGJ1aWxkaW5nIHdpdGggYXNzZXJ0KCkgZGVmaW5lZA0KdG8gbm9wIG91dCB0aGUgdGVzdHMsIEkg Z290IGRvdWJsZSBwZXJmb3JtYW5jZSBvbiB5b3VyIFRFU1QyLiBPZg0KY291cnNlLCBJIGRvbid0 IHJlY29tbWVuZCBkb2luZyB0aGlzLiBCdXQgaXQncyBhbiBpbnRlcmVzdGluZyB0ZXN0IGZvcg0K cGVyZm9ybWluZyAqbWVhc3VyZW1lbnQqICh3aGljaCB5b3Ugc28gZmFyIHJlZnVzZSB0byBkbykg b2Ygd2hhdCdzDQphY3R1YWxseSBtYWtpbmcgdGhlIHBlcmZvcm1hbmNlIGRpZmZlcmVuY2VzLg0K IA0KPiBPbiB0aGUgb3RoZXIgaGFuZCwgaWYgdGhlIGxvdyBzcGVlZCBpcyBub3QgY2F1c2VkIGJ5 IGhhdmluZyB0bw0KPiByZXR1cm4gNjcwMCwgdGhlbiB3ZSBzaG91bGQgYmUgYWJsZSB0byB1c2Ug YSBzaW1pbGFyIHF1aWNrIGxvb2t1cA0KPiB0YWJsZSBvcHRpbWl6YXRpb24gKCJ0Y19nbG9iYWxz LnBhZ2VtYXAoKS5zaXplY2xhc3MocCkiKSB0byBhY2hpZXZlDQo+IGF0IGxlYXN0IGRvemVucyBv ZiB0aW1lcyBwZXJmb3JtYW5jZSBpbXByb3ZlbWVudC4NCiANCk9uY2UgYWdhaW4sIHRoZSBiaWcg ZGlmZmVyZW5jZSBpcyBub3QgdGhlICI2NzAwIi4gVGhlDQp0Y19nbG9iYWxzLnBhZ2VtYXAoKS5z aXplY2xhc3MocCkgaW4gdGNtYWxsb2MgY29ycmVzcG9uZHMgdG8gbGluZXMNCjYtMTEgb2YgbWFs bG9jX3VzYWJsZV9zaXplIGluIG1hbGxvY25nLCBub3QgbGluZSAxMiwgYW5kIHRoZSBidWxrIG9m DQp0aGUgd29yayBoZXJlIGlzIGluIGxpbmVzIDYtMTEsIG1haW5seSBsaW5lcyA3IGFuZCAxMC4g SSBkaWQgYSBzaW1pbGFyDQpjYXN1YWwgdGVzdCByZW1vdmluZyBsaW5lIDEyIGFuZCBqdXN0IHJl dHVybmluZyBzb21ldGhpbmcgYmFzZWQgb24gdGhlDQplYXJsaWVyIGNvbXB1dGF0aW9ucywgYW5k IGl0IG1hZGUgc29tZXRoaW5nIGxpa2UgYSAzMCUgcmVkdWN0aW9uIGluDQp0ZXN0IHJ1biB0aW1l ICh3aXRoIG9yIHdpdGhvdXQgdGhlIGhhcmRlbmluZyBhc3NlcnRzIG5vcHBlZCBvdXQpLg0KIA0K UmljaA0K ------=_001_NextPart026326884663_=---- Content-Type: text/html; charset="utf-8" Content-Transfer-Encoding: quoted-printable =0A
> //     This multi-threaded access to the pagemap is safe for fairly
> //     subtle reasons.  We basically assume that when an object X is
> //     allocated by thread A and deallocated by thread B, there must
> //     have been appropriate synchronization in the handoff of object
> //     X from thread A to thread B.

Thanks for the information.
I think this assumption is very reasonable: one thread cannot call "free(p)" while another thread is still accessing the block pointed to by p unless there is some synchronization between them.

> but either way that's not compatible with small-memory-space systems or with nommu.
OK, that's reasonable. And again, thanks for your patience and time :-D

--

   Best Regards
  BaiYang
  baiyang@gmail.com
  http://i.baiy.cn
**** < END OF EMAIL > ****

From: Rich Felker
Date: 2022-09-20 13:41
To: baiyang
CC: musl
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
On Tue, Sep 20, 2022 at 11:53:52AM +0800, baiyang wrote:
> > The ones that return some value larger than the requested size are
> > returning "the requested size, rounded up to a multiple of 16" or
> > similar. Not "the requested size plus 1500 bytes".
> ...
> > They don't return 8100. They return something like 6608 or 6624.
>
> No, AFAIK, there are many allocators whose return value of
> malloc_usable_size is 1KB (or more) larger than the requested value
> at malloc time.
> For example: if you do "void* p = malloc(6700)" on tcmalloc, then
> "malloc_usable_size(p)" will return **8192**. Far more than just
> "rounded up to a multiple of 16".

OK, thanks for checking and correcting.

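The claim above is easy to check on whatever allocator a program is linked against. The following is a minimal sketch, not taken from the thread: the helper names usable_for and report are made up for illustration, and the number printed depends entirely on the allocator in use (the thread reports 8192 for a 6700-byte request on tcmalloc; other allocators will differ).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>   /* malloc_usable_size: a non-standard extension */

/* Hypothetical helper: allocate `request` bytes and return the usable
 * size the allocator reports for that block, or 0 if malloc fails. */
static size_t usable_for(size_t request)
{
    void *p = malloc(request);
    if (!p)
        return 0;
    size_t usable = malloc_usable_size(p);
    /* The reported size is expected to be at least what was asked for. */
    assert(usable >= request);
    free(p);
    return usable;
}

/* Hypothetical helper: print the request, the reported usable size,
 * and the slack between the two. */
static void report(size_t request)
{
    size_t usable = usable_for(request);
    if (usable == 0) {
        puts("allocation failed");
        return;
    }
    printf("requested %zu, usable %zu, slack %zu\n",
           request, usable, usable - request);
}
```

Calling report(6700) under different allocators shows how much slack each one reports for the same request.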
> > This does not follow at all. tcmalloc is fast because it does not have
> > global consistency, does not have any notable hardening, and (see the
> > name) keeps large numbers of freed slots *cached* to reuse, thereby
> > using lots of extra memory. Its malloc_usable_size is not fast because
> > of returning the wrong value, if it even does return the wrong value
> > (I have no idea).
>
> We don't need to refer to these features of tcmalloc, we only need
> to refer to its malloc_usable_size algorithm.

Those (mis)features are what provide a fast path here, regardless of
whether you care about them.

> > It's fast because they store the size in-band right
> > next to the allocated memory and trust that it's valid, rather than
> > computing it from out-of-band metadata that is not subject to
> > falsification unless the attacker already has nearly full control of
> > execution.
>
> No, if I understand correctly, tcmalloc doesn't store the size
> in-band right next to the allocated memory. On the contrary, when
> executing malloc_usable_size(p) (actually GetSize(p)), it will first
> find the size class corresponding to p through a quick lookup table,
> and then return the length of the size class. See:
> https://github.com/google/tcmalloc/blob/9179bb884848c30616667ba129bcf9afee114c32/tcmalloc/tcmalloc.cc#L1099

OK, I was confusing tcmalloc with the more conventional "thread-local
freelist caching on top of dlmalloc type base" allocator strategy.
Indeed, though, tcmalloc is one of the gigantic ones.

> My understanding: the biggest impediment to applying
> similar optimizations is that we have to return 6700, not 8192 (of
> course, you've denied this is the reason).

Your understanding is wrong. I've told you how you can measure that
it's wrong. You insist on being stuck on it for no good reason.

If you want to understand *why* tcmalloc is different, start with the
comments at the top of the file you linked:

> //  4. The pagemap (which maps from page-number to descriptor),
> //     can be read without holding any locks, and written while holding
> //     the "pageheap_lock".
> //
> //     This multi-threaded access to the pagemap is safe for fairly
> //     subtle reasons.  We basically assume that when an object X is
> //     allocated by thread A and deallocated by thread B, there must
> //     have been appropriate synchronization in the handoff of object
> //     X from thread A to thread B.

This is the kind of thing I mean by lack of global consistency (no
synchronization around access to these data structures) and lack of
any meaningful hardening (*assuming* no memory lifetime usage errors
in the calling application).

The GetSize function you cited uses this global pagemap to go straight
from a page address to a sizeclass, via what amounts to a two-level or
three-level table indexed by upper bits of the address (comment says
3-level is only used in slower but lower-mem-use configuration). These
tables, at least in the 2-level form, are utterly *massive*. I'm not
sure if it creates them PROT_NONE and then only instantiates real
memory for the (initially fairly sparse) parts that get used, or if it
just allocates these giant things relying on overcommit, but either
way that's not compatible with small-memory-space systems or with
nommu.
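
To make the "table indexed by upper bits of the address" idea concrete, here is a toy two-level lookup. It is a sketch under loud assumptions and is not tcmalloc's code: the 32-bit address space, the 4 KiB page size, the 10/10 bit split, the size-class table, and the names pagemap_set and pagemap_size are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12                 /* assumed 4 KiB pages */
#define LEAF_BITS  10                 /* low bits of the page number */
#define ROOT_BITS  10                 /* high bits; 10+10+12 = 32-bit toy space */
#define LEAF_LEN   (1u << LEAF_BITS)
#define ROOT_LEN   (1u << ROOT_BITS)
#define NCLASSES   5

/* Hypothetical size classes; class 0 means "page not in use". */
static const size_t class_size[NCLASSES] = { 0, 16, 32, 64, 8192 };

struct leaf { uint8_t sizeclass[LEAF_LEN]; };   /* one byte per page */
static struct leaf *root[ROOT_LEN];             /* leaves created on demand */

/* Record that the page containing `addr` holds objects of class `cls`.
 * Returns 0 on success, -1 if a leaf could not be allocated. */
static int pagemap_set(uint32_t addr, uint8_t cls)
{
    assert(cls < NCLASSES);
    uint32_t page = addr >> PAGE_SHIFT;
    uint32_t r = page >> LEAF_BITS;
    uint32_t l = page & (LEAF_LEN - 1);
    if (!root[r]) {
        root[r] = calloc(1, sizeof *root[r]);
        if (!root[r])
            return -1;
    }
    root[r]->sizeclass[l] = cls;
    return 0;
}

/* Map an address straight to an object size using only address bits:
 * two array indexings, no per-object header, no validation. */
static size_t pagemap_size(uint32_t addr)
{
    uint32_t page = addr >> PAGE_SHIFT;
    const struct leaf *lf = root[page >> LEAF_BITS];
    return lf ? class_size[lf->sizeclass[page & (LEAF_LEN - 1)]] : 0;
}
```

The lookup is cheap because every object on a page shares one size class and the table is read without locking. A real 64-bit address space needs far more index bits than this 32-bit toy, which is where the table size concern comes from.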
 
On top of that, this approach relies on laying out whole pages (likely
large slabs of many pages at a time) of identical-sized objects so
that the size and other properties can be looked up by page number. I
have not looked into the details of "how bad" it gets, but it
completely precludes having any small processes, and precludes
promptly returning freed memory to the system, since *changing* the
pagemap is going to be costly and they're going to avoid doing it
(note the above comment on locking).

mallocng does not have any global mapping optimizing translation from
addresses to groups/metadata objects. Because we insist on global
consistency (a prerequisite for being able to define strong hardening
properties) and on being able to return freed memory promptly to the
system, maintaining such a data structure would cost a lot more time
(performance) than anything it could give, and it would make lock-free
operations (like your malloc_usable_size, or trivial realloc calls)
potentially require locking.

Instead of using the numeric value of the address to map to metadata,
we chase offsets from the object base address to the metadata, then
validate that it round-trips back to conclude that we didn't just
follow random junk from the caller passing an invalid/dangling pointer
in, or from this data being overwritten via heap-based buffer
overflows.
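
The offset-chasing-plus-validation idea can be sketched with a toy layout. This is an illustration only and does not reproduce mallocng's actual structures: the struct names, the 64-byte payload, the per-slot offset field, and the functions group_init, toy_usable_size and checked_usable_size are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS   4
#define PAYLOAD 64

struct meta;

struct slot {
    uint16_t offset;                /* in-band: bytes from group start to this slot */
    unsigned char payload[PAYLOAD]; /* what the caller's pointer refers to */
};

struct group {
    struct meta *meta;              /* in-band pointer to out-of-band metadata */
    struct slot slots[SLOTS];
};

struct meta {
    struct group *mem;              /* back-pointer used for the round-trip check */
    size_t slot_size;
};

static void group_init(struct group *g, struct meta *m)
{
    m->mem = g;
    m->slot_size = PAYLOAD;
    g->meta = m;
    for (size_t i = 0; i < SLOTS; i++)
        g->slots[i].offset = (uint16_t)((char *)&g->slots[i] - (char *)g);
}

/* Chase offsets from the object to its metadata and validate each step.
 * Returns 0 when a check fails instead of trusting the in-band data. */
static size_t toy_usable_size(const void *p)
{
    const struct slot *s =
        (const struct slot *)((const char *)p - offsetof(struct slot, payload));
    size_t off = s->offset;
    size_t first = offsetof(struct group, slots);

    /* The offset must name a real slot position before we follow it. */
    if (off < first || (off - first) % sizeof(struct slot) != 0)
        return 0;
    if ((off - first) / sizeof(struct slot) >= SLOTS)
        return 0;

    const struct group *g = (const struct group *)((const char *)s - off);
    const struct meta *m = g->meta;

    /* Round trip: the metadata must point back at the group we came from. */
    if (!m || m->mem != g)
        return 0;
    return m->slot_size;
}

/* Hardened variant: treat a failed check as fatal rather than returning 0. */
static size_t checked_usable_size(const void *p)
{
    size_t n = toy_usable_size(p);
    assert(n != 0);
    return n;
}
```

Each check is a handful of loads and compares, so the lookup costs more than two table indexings, but the caller's pointer is never trusted until the chain has been verified in both directions.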
 =0A
Fundamentally, this pointer chasing is going to be a little bit more
expensive than just using address bits as table indices, but not
really all that much. At least half of the cost difference, and
probably a lot more, is not the pointer/offset chasing but the
validation (hardening). If hypothetically you wanted to turn that all
off (e.g. by defining the assert macro to a no-op) you could have it
be a lot faster, and still have low memory usage too. I'm not sure, but
for single-threaded loads I would not be surprised if it were getting
close to tcmalloc speed. Just casually building with assert() defined
to nop out the tests, I got double the performance on your TEST2. Of
course, I don't recommend doing this. But it's an interesting test for
performing *measurement* (which you so far refuse to do) of what's
actually making the performance differences.

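A measurement of this kind can be as simple as timing the same loop against two builds of the allocator, one with the hardening asserts enabled and one with them compiled out. The harness below is a rough sketch and is not the TEST2 program from the thread, whose contents are not shown here; the function name time_usable_size and the use of clock() are choices made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>
#include <malloc.h>   /* malloc_usable_size: a non-standard extension */

/* Time `iters` calls to malloc_usable_size on a single block of
 * `request` bytes. Returns CPU seconds, or -1.0 if malloc fails. */
static double time_usable_size(size_t request, long iters)
{
    assert(iters > 0);
    void *p = malloc(request);
    if (!p)
        return -1.0;

    volatile size_t sink = 0;   /* keeps the results observable */
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sink += malloc_usable_size(p);
    clock_t end = clock();

    free(p);
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running time_usable_size(6700, some large count) once per allocator build, and comparing the two results, attributes the difference to the checks rather than guessing at it.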
> On the other hand, if the low speed is not caused by having to
> return 6700, then we should be able to use a similar quick lookup
> table optimization ("tc_globals.pagemap().sizeclass(p)") to achieve
> at least dozens of times performance improvement.

Once again, the big difference is not the "6700". The
tc_globals.pagemap().sizeclass(p) in tcmalloc corresponds to lines
6-11 of malloc_usable_size in mallocng, not line 12, and the bulk of
the work here is in lines 6-11, mainly lines 7 and 10. I did a similar
casual test removing line 12 and just returning something based on the
earlier computations, and it made something like a 30% reduction in
test run time (with or without the hardening asserts nopped out).

Rich
=0A ------=_001_NextPart026326884663_=------