From: baiyang
To: "Rich Felker"
Cc: musl@lists.openwall.com
Date: Tue, 20 Sep 2022 01:32:31 +0800
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
Message-ID: <202209200132289145679@gmail.com>
References: <2022091915532777412615@gmail.com>, <20220919134319.GN9709@brightrain.aerifal.cx>

Hi Rich,

Thanks for your reply.

> Unless you have an application that's explicitly using
> malloc_usable_size all over the place, it's highly unlikely that this
> is the cause of your real-world performance problems.

1. Yes, we have a real scenario where `malloc_usable_size` is called frequently: we need to optimize realloc. Our realloc wrapper takes an extra parameter, minimalCopyBytes, which is the number of bytes that actually need to be copied if the call falls back to malloc-copy-free. We decide whether to call realloc or to do malloc-memcpy-free ourselves based on factors such as the amount of data realloc would need to copy (obtained through `malloc_usable_size`), the amount we actually need to copy when doing malloc-memcpy-free ourselves (minimalCopyBytes), and the chance that the underlying realloc can merge chunks (small blocks) or use mremap (large blocks). So this is a real scenario in which we need to call `malloc_usable_size` frequently.

2. As I mentioned before, this isn't just a problem with `malloc_usable_size`. Since `realloc` and `free` each include a full `malloc_usable_size` procedure, the problem slows down not only the `malloc_usable_size` call itself but also the `realloc` and `free` calls.
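
To make the decision described in point 1 concrete, here is a minimal sketch in C. It is an illustration under stated assumptions, not our production code: the name `realloc_min_copy` and the constant `COPY_SAVINGS_THRESHOLD` are hypothetical, and the sketch ignores the chunk-merging and mremap factor entirely. It only compares what realloc might copy (`malloc_usable_size`) with what really has to survive (`minimal_copy_bytes`).

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size (available in glibc and musl) */
#include <stdlib.h>
#include <string.h>

/* Hypothetical tuning knob: only bypass realloc when doing so avoids
 * copying at least this many bytes. The value is an assumption. */
#define COPY_SAVINGS_THRESHOLD 4096

/* Hypothetical helper, not part of any libc.
 * Resizes ptr to new_size, preserving only the first minimal_copy_bytes.
 * The caller promises minimal_copy_bytes <= new_size and that the old
 * block holds at least minimal_copy_bytes. */
void *realloc_min_copy(void *ptr, size_t new_size, size_t minimal_copy_bytes)
{
    if (ptr == NULL)
        return malloc(new_size);

    assert(minimal_copy_bytes <= new_size);

    /* Upper bound on what a fallback copy inside realloc might move. */
    size_t usable = malloc_usable_size(ptr);
    assert(minimal_copy_bytes <= usable);

    /* If the block already fits, or the possible saving is small,
     * let realloc do its normal job. */
    if (new_size <= usable ||
        usable - minimal_copy_bytes < COPY_SAVINGS_THRESHOLD)
        return realloc(ptr, new_size);

    /* Otherwise do malloc-memcpy-free ourselves and copy only the
     * bytes that actually matter. */
    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;              /* old block stays valid, as with realloc */
    memcpy(fresh, ptr, minimal_copy_bytes);
    free(ptr);
    return fresh;
}
```

Every call with a non-NULL pointer goes through `malloc_usable_size` once, which is why its cost matters for a wrapper of this kind.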

Thanks :-)

--

   Best Regards
  BaiYang
  baiyang@gmail.com
  http://i.baiy.cn
**** < END OF EMAIL > ****
 
From: Rich Felker
Date: 2022-09-19 21:43
To: baiyang
CC: musl
Subject: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
On Mon, Sep 19, 2022 at 03:53:30PM +0800, baiyang wrote:
> Hi there,
>
> As we have discussed at
> https://github.com/openwrt/openwrt/issues/10752. The
> malloc_usable_size() function in musl 1.2 (mallocng) seems to have
> some performance issues.
>
> It caused realloc and free spends too long time for get the chunk size.
>
> As we mentioned in the discussion, tcmalloc and some other
> allocators can also accurately obtain the size class corresponding
> to a memory block and its precise size, and it is also very fast at
> the same time.
>
> Can we make some improvements to the existing malloc_usable_size
> algorithm in mallocng? This should significantly improve the
> performance of existing algorithms.
 
Can you please start from a point of identifying the real-world case
in which you're hitting a performance degradation? Made-up tests are
generally not helpful and will almost always lead to focusing on the
wrong problem.

For now I'm going to focus on some things from the linked thread:

> > Considering that realloc itself contains a complete
> > malloc_usable_size (refer to here and here), So actually most
> > (66.7%) of the realloc time is spent doing malloc_usable_size.
 
In your test that increments the realloc size by one each iteration,
only one in every PAGESIZE calls has any real work to do. The rest do
nothing but set_size after obtaining the metadata on the object
they're acting on. It's completely expected that the runtime of these
will be dominated by obtaining the metadata; this isn't evidence of
anything wrong. And, moreover, it's almost surely a lot more than
66.7%. Most of the 0.8s difference is likely spent on the 2560 mmap
syscalls and page faults accessing the new pages they produce.
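
For readers without the linked thread at hand, the kind of test being discussed can be sketched as below. This is a reconstruction under assumptions, not the actual benchmark from the OpenWrt issue: the function name `time_grow_by_one` is hypothetical, and timing uses the standard `clock()` (CPU time) for portability. With 4096-byte pages, 2560 pages is 10,485,760 bytes, so the figure of 2560 above presumably corresponds to growing a buffer to roughly 10 MiB.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical reconstruction of a "grow by one byte" realloc test.
 * Grows a single buffer from 1 byte up to `limit` bytes, one byte per
 * realloc call, and returns the CPU time used in seconds, or -1.0 if
 * an allocation fails. */
double time_grow_by_one(size_t limit)
{
    assert(limit > 0);            /* a zero-length run measures nothing */

    char *buf = NULL;
    clock_t t0 = clock();

    for (size_t n = 1; n <= limit; n++) {
        char *next = realloc(buf, n);
        if (next == NULL) {
            free(buf);
            return -1.0;
        }
        buf = next;
        buf[n - 1] = (char)n;     /* touch the new byte so its page is faulted in */
    }

    clock_t t1 = clock();
    free(buf);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

In a loop like this almost every call finds that the existing block is already large enough, so the per-call cost is mostly the metadata lookup, which is the point made above.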
 
> > In implementations such as: glibc, tcmalloc, msw crt (_msize), mac
> > os x (malloc_size), and musl 1.1, even on low-end embedded
> > processors, the consumption of malloc_usable_size per 10 million
> > calls is mostly not more than a few hundred milliseconds.
 
It looks like mallocng's malloc_usable_size is taking around 150 ns
per call on your system, vs maybe 30-50 for others?
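
A per-call figure like this can be estimated with a small timing loop. The sketch below is only an illustration: the function name `usable_size_ns_per_call` is hypothetical, it measures CPU time with `clock()`, and it times repeated calls on one hot block, which is a best case for caches. For scale, 10 million calls at 150 ns each come to 1.5 s, while 30-50 ns each come to 0.3-0.5 s.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size (available in glibc and musl) */
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: average cost of malloc_usable_size in
 * nanoseconds per call, measured on a single block of alloc_size
 * bytes. Returns -1.0 if the allocation fails. */
double usable_size_ns_per_call(size_t alloc_size, unsigned long calls)
{
    assert(calls > 0);            /* avoid dividing by zero below */

    void *p = malloc(alloc_size);
    if (p == NULL)
        return -1.0;

    /* Accumulating into a volatile keeps the compiler from
     * discarding the calls as dead code. */
    volatile size_t sink = 0;

    clock_t t0 = clock();
    for (unsigned long i = 0; i < calls; i++)
        sink += malloc_usable_size(p);
    clock_t t1 = clock();

    free(p);
    (void)sink;

    double seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;
    return seconds * 1e9 / (double)calls;
}
```

Calling it with, say, a 64-byte block and 10,000,000 calls gives a number that can be compared across allocators on the same machine.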
 
> > In addition, this very slow slab size acquisition algorithm also
> > needs to be called every time free (see here). So we believe it
> > should be the main reason for malloc/free and realloc performance
> > degradation in version 1.2.
 
Unless you have an application that's explicitly using
malloc_usable_size all over the place, it's highly unlikely that this
is the cause of your real-world performance problems. The vast
majority of reported problems with malloc performance have been in
multithreaded applications, where the dominating time cost is
fundamental: synchronization cost of having global consistency. There
you'll expect to find very similar performance figures from any other
allocator with global consistency, such as hardened_malloc.

If you're really having single-threaded performance problems that
aren't just in made-up benchmarks, please see if you can narrow down
the cause empirically rather than speculatively. For example, running
the program under perf and looking at where the time is being spent.
 
> > If we can improve its speed and make it close to implementations
> > like tcmalloc (tcmalloc can also accurately return the size of the
> > size class to which the chunk belongs), it should significantly
> > improve the performance of mallocng (at least in single-threaded
> > scenarios) .
 
tcmalloc is fast by not having global consistency, not being hardened
against memory errors like double-free and use-after-free, and not
avoiding fragmentation and excessive memory usage. Likewise for most
of the others. The run time costs in mallocng for looking up the
out-of-band metadata are largely fundamental to it being out-of-band
(not subject to direct falsification via typically exploitable
application bugs), size-efficient, 32-bit-compatible,
nommu-compatible, etc. Other approaches like in hardened_malloc can be
moderately more efficient to access the metadata, at the price of not
being at all amenable to small systems, which are a core goal of musl
we can't really disregard.
 
I can't say for sure there's not any room for optimization in the
metadata fetching though. Looking at the assembly output might be
informative, to see if we're doing anything that's making the compiler
emit gratuitously inefficient code.

Rich