From mboxrd@z Thu Jan 1 00:00:00 1970 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FROM,HTML_FONT_FACE_BAD,HTML_MESSAGE, MAILING_LIST_MULTI,RCVD_IN_MSPIKE_H2,T_KAM_HTML_FONT_INVALID autolearn=ham autolearn_force=no version=3.4.4 Received: (qmail 2129 invoked from network); 19 Sep 2022 18:45:11 -0000 Received: from second.openwall.net (193.110.157.125) by inbox.vuxu.org with ESMTPUTF8; 19 Sep 2022 18:45:11 -0000 Received: (qmail 7903 invoked by uid 550); 19 Sep 2022 18:45:09 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 7855 invoked from network); 19 Sep 2022 18:45:07 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=message-id:mime-version:references:subject:cc:to:from:date:from:to :cc:subject:date; bh=oTSk6ZhUbprjgIbw82J1CTaXGBKMyK59r9G2TFu6HHU=; b=jeD6JQP0pO8ygA4HT7CyS8AFaTeM9ORLESw3Roz1Q0a6gOJgCivM1Ng9s3ON2L07FW Fx/euOXxo1jZo0YkT0swVM2S9+UiB92lz5yhITrkJRsf5DtgqbKEYHzM6xG1Qw8ngFvu wqucC/M4NJc22ZubQ3tds4nOwAsB+P1e9lEorSiO3djTj3SafDvFlpe4HSgi0okLEZLF gRKOdSH63DWMs0BmTNTP7UtaS4d0hACKiFCbOkTBXYXavTiVqa/tVRjRBdP14HO4+E3h vXIPdlJ8Grb/i9bqwMRvlZasOh4f1f6WcTaeGynlFu12Bvz/YKh4U1XRqoJ76wvywsTy JihA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=message-id:mime-version:references:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date; bh=oTSk6ZhUbprjgIbw82J1CTaXGBKMyK59r9G2TFu6HHU=; b=pUoXFFTVP6oQSLMT7917mQ30Sf+50tVIeKkBdUdZKSD1AEtuwtFarOXPNuZsyS5F+w naztre/NFV0xJnvziNkGfjh8sBJv3hqXDeQKrGdKHulWFbPp1NEg81aDQIpx/YoDOVDa DhXzc/1/RRnkINk+d7jWysO7g5hvPmdh5pVMpN+QjospabumMzLFECKSSwuSegGP6BKp eadf6KMoKXVxet9G25Y1/U57ZlNtDbtmKGlJbjcJGYQDydHsngnzCXI7yXAbrttXUbp7 
JT81Vq6J6D8NaoBXY/zeOyfkF234S13ruD247ipEhm+7vdJ9fzGSuXEzmwTuAgIO6G4C 03oQ== X-Gm-Message-State: ACrzQf2hn6rSkBYYXUBu3KMIkKJIhWyFBIRpsjPehhKpqHpfoVyZ3Z9g u8c+t0cxS0RREPQGZ6WrNCntx974f1PyqQ== X-Google-Smtp-Source: AMsMyM7QtSf6hSv6ed7WpTTwV6L1MLXxBPOQl925rIQl0LBnJSUTLMVDdKcqXjutHkkHj4ruTVEICw== X-Received: by 2002:a17:90b:4c8e:b0:202:be8c:518e with SMTP id my14-20020a17090b4c8e00b00202be8c518emr21936921pjb.26.1663613095587; Mon, 19 Sep 2022 11:44:55 -0700 (PDT) Date: Tue, 20 Sep 2022 02:44:58 +0800 From: baiyang To: "Rich Felker" Cc: musl References: <2022091915532777412615@gmail.com>, <20220919134319.GN9709@brightrain.aerifal.cx>, <202209200132289145679@gmail.com>, <20220919181556.GT9709@brightrain.aerifal.cx> X-Priority: 3 X-GUID: 23316EC1-9818-484C-B087-842F3108616D X-Has-Attach: no X-Mailer: Foxmail 7.2.23.116[cn] Mime-Version: 1.0 Message-ID: <2022092002445709017731@gmail.com> Content-Type: multipart/alternative; boundary="----=_001_NextPart414728634101_=----" Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1) This is a multi-part message in MIME format. 
------=_001_NextPart414728634101_=---- Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 PiBJcyB0aGVyZSBhIHJlYXNvbiB5b3UncmUgcmVseWluZyBvbiBhbiB1bnJlbGlhYmxlIGFuZCBu b25zdGFuZGFyZA0KPiBmdW5jdGlvbiAobWFsbG9jX3VzYWJsZV9zaXplKSB0byBkbyB0aGlzIHJh dGhlciB0aGFuIHlvdXIgcHJvZ3JhbQ0KPiBrZWVwaW5nIHRyYWNrIG9mIGl0cyBvd24ga25vd2xl ZGdlIG9mIHRoZSBhbGxvY2F0ZWQgc2l6ZT8gVGhpcyBpcyB3aGF0DQo+IHRoZSBDIGxhbmd1YWdl IGV4cGVjdHMgeW91IHRvIGRvLiBGb3IgZXhhbXBsZSBpZiB5b3UgaGF2ZSBhIHN0cnVjdHVyZQ0K PiB0aGF0IGNvbnRhaW5zIGEgcG9pbnRlciB0byBhIGR5bmFtaWNhbGx5IHNpemVkIGJ1ZmZlciwg bm9ybWFsbHkgeW91DQo+IHN0b3JlIHRoZSBzaXplIGluIGEgc2l6ZV90IG1lbWJlciByaWdodCBu ZXh0IHRvIHRoYXQgcG9pbnRlciwgYWxsb3dpbmcNCj4geW91IHRvIG1ha2UgdGhlc2Uga2luZCBv ZiBkZWNpc2lvbnMgd2l0aG91dCBoYXZpbmcgdG8gcHJvYmUgYW55dGhpbmcuDQoNClllcywgYXMg SSBoYXZlIGJlZW4gc2FpZCwgYnkgY29tcGFyaW5nIHRoZSBudW1iZXIgb2YgYnl0ZXMgdGhhdCBy ZWFsbG9jIG5lZWRzIHRvIGNvcHkgaW4gdGhlIHdvcnN0IGNhc2UgKHRoZSByZXR1cm4gdmFsdWUg b2YgbWFsbG9jX3VzYWJsZV9zaXplKSwgYW5kIHRoZSBudW1iZXIgb2YgYnl0ZXMgd2UgYWN0dWFs bHkgbmVlZCB0byBjb3B5LCB3ZSBjYW4gb3B0aW1pemUgdGhlIHBlcmZvcm1hbmNlIG9mIHJlYWxs b2MgaW4gcmVhbCBzY2VuYXJpb3MgYW5kIGF2b2lkIHVubmVjZXNzYXJ5IG1lbW9yeSBjb3BpZXMu DQoNCkluIGZhY3QsIGluIHNjZW5hcmlvcyBpbmNsdWRpbmcgZ2xpYmMsIHRjbWFsbG9jLCB3aW5k b3dzIGNydCwgbWFjIG9zIHgsIHVjbGliYyBhbmQgbXVzbCAxLjEsIHdlIGRpZCBhY2hpZXZlIGdv b2Qgb3B0aW1pemF0aW9uIHJlc3VsdHMuDQoNCk9uIHRoZSBvdGhlciBoYW5kLCBvZiBjb3Vyc2Ug d2Uga2VlcCB0aGUgbnVtYmVyIG9mIGJ5dGVzIGFjdHVhbGx5IGFsbG9jYXRlZCwgYnV0IGl0IGRv ZXNuJ3QgcmVhbGx5IHJlZmxlY3Qgb2JqZWN0aXZlbHkgdGhlIG51bWJlciBvZiBieXRlcyB0byBi ZSBjb3BpZWQgYnkgcmVhbGxvYyB3aGVuIHRoZSBtZW1jcHkgYWN0dWFsbHkgb2NjdXJzLiBBbmQg bWFsbG9jX3VzYWJsZV9zaXplKCkgbW9yZSBhY2N1cmF0ZWx5IHJlZmxlY3RzIGhvdyBtYW55IGJ5 dGVzIHJlYWxsb2MgbmVlZHMgdG8gY29weSB3aGVuIGl0IGRlZ2VuZXJhdGVzIGJhY2sgdG8gbWFs bG9jLW1lbWNweS1mcmVlIG1vZGUuDQoNClNvIG91ciBleHBlY3RhdGlvbiBpcyBhcyBtZW50aW9u ZWQgaW4gdGhlIG1hbiBwYWdlIGZvciBsaW51eCwgbWFjIG9zIG9yIHdpbmRvd3M6ICJUaGUgdmFs 
dWUgcmV0dXJuZWQgYnkgbWFsbG9jX3VzYWJsZV9zaXplKCkgbWF5IGJlICoqZ3JlYXRlciB0aGFu KiogdGhlIHJlcXVlc3RlZCBzaXplIG9mIHRoZSBhbGxvY2F0aW9uIiBvciAiVGhlIG1lbW9yeSBi bG9jayBzaXplIGlzIGFsd2F5cyBhdCBsZWFzdCBhcyBsYXJnZSBhcyB0aGUgYWxsb2NhdGlvbiBp dCBiYWNrcywgKiphbmQgbWF5IGJlIGxhcmdlcioqLiIgLSBXZSBleHBlY3QgdG8gZ2V0IGl0cyBp bnRlcm5hbCBzaXplIHRvIGV2YWx1YXRlIHRoZSBjb3N0IG9mIG1lbW9yeSBjb3B5aW5nLg0KDQpU aGFua3MgOi0pDQoNCi0tDQoNCiAgIEJlc3QgUmVnYXJkcw0KICBCYWlZYW5nDQogIGJhaXlhbmdA Z21haWwuY29tDQogIGh0dHA6Ly9pLmJhaXkuY24NCioqKiogPCBFTkQgT0YgRU1BSUwgPiAqKioq IA0KIA0KIA0KRnJvbTogUmljaCBGZWxrZXINCkRhdGU6IDIwMjItMDktMjAgMDI6MTUNClRvOiBi YWl5YW5nDQpDQzogbXVzbA0KU3ViamVjdDogUmU6IFJlOiBbbXVzbF0gVGhlIGhlYXAgbWVtb3J5 IHBlcmZvcm1hbmNlIChtYWxsb2MvZnJlZS9yZWFsbG9jKSBpcyBzaWduaWZpY2FudGx5IGRlZ3Jh ZGVkIGluIG11c2wgMS4yIChjb21wYXJlZCB0byAxLjEpDQpPbiBUdWUsIFNlcCAyMCwgMjAyMiBh dCAwMTozMjozMUFNICswODAwLCBiYWl5YW5nIHdyb3RlOg0KPiBIaSBSaWNoLA0KPiANCj4gVGhh bmtzIGZvciB5b3VyIHJlcGx5Lg0KPiANCj4gPiBVbmxlc3MgeW91IGhhdmUgYW4gYXBwbGljYXRp b24gdGhhdCdzIGV4cGxpY2l0bHkgdXNpbmcNCj4gPiBtYWxsb2NfdXNhYmxlX3NpemUgYWxsIG92 ZXIgdGhlIHBsYWNlLCBpdCdzIGhpZ2hseSB1bmxpa2VseSB0aGF0IHRoaXMNCj4gPiBpcyB0aGUg Y2F1c2Ugb2YgeW91ciByZWFsLXdvcmxkIHBlcmZvcm1hbmNlIHByb2JsZW1zLiANCj4gDQo+IDEu IFllcywgd2UgaGF2ZSBhIHJlYWwgc2NlbmFyaW8gd2hlcmUgYG1hbGxvY191c2FibGVfc2l6ZWAg aXMgY2FsbGVkDQo+IGZyZXF1ZW50bHk6IHdlIG5lZWQgdG8gb3B0aW1pemUgdGhlIHJlYWxsb2Mg ZXhwZXJpZW5jZS4gV2UgYWRkIGFuDQo+IGV4dHJhIHBhcmFtZXRlciB0byByZWFsbG9jIC0gbWlu aW1hbENvcHlCeXRlczogaXQgcmVwcmVzZW50cyB0aGUNCj4gYWN0dWFsIHNpemUgb2YgZGF0YSB0 aGF0IG5lZWRzIHRvIGJlIGNvcGllZCBhZnRlciBmYWxsYmFjayB0bw0KPiBtYWxsb2MtY29weS1m cmVlIG1vZGUuIFdlIHdpbGwganVkZ2Ugd2hldGhlciB0byBjYWxsIHJlYWxsb2Mgb3INCj4gY29t cGxldGUgbWFsbG9jLW1lbWNweS1mcmVlIGJ5IG91cnNlbGYgYmFzZWQgb24gZmFjdG9ycyBzdWNo IGFzIHRoZQ0KPiBzaXplIG9mIHRoZSBkYXRhIHRoYXQgcmVhbGxvYyBuZWVkcyB0byBjb3B5IChv YnRhaW5lZCB0aHJvdWdoDQo+IGBtYWxsb2NfdXNhYmxlX3NpemVgKSwgdGhlIHNpemUgdGhhdCB3 
ZSBhY3R1YWxseSBuZWVkIHRvIGNvcHkgd2hlbg0KPiB3ZSBkb2luZyBtYWxsb2MtbWVtY3B5LWZy ZWUgb3Vyc2VsZiAobWluaW1hbENvcHlCeXRlcykgYW5kIHRoZQ0KPiBjaGFuY2Ugb2YgbWVyZ2lu ZyBjaHVua3MgKHNtYWxsIGJsb2Nrcykgb3IgbXJlbWFwIChsYXJnZSBibG9ja3MpIGluDQo+IHRo ZSB1bmRlcmxheWVyIHJlYWxsb2MuIFNvLCB0aGlzIGlzIGEgcmVhbCBzY2VuYXJpbywgd2UgbmVl ZCB0byBjYWxsDQo+IGBtYWxsb2NfdXNhYmxlX3NpemVgIGZyZXF1ZW50bHkuDQogDQpJcyB0aGVy ZSBhIHJlYXNvbiB5b3UncmUgcmVseWluZyBvbiBhbiB1bnJlbGlhYmxlIGFuZCBub25zdGFuZGFy ZA0KZnVuY3Rpb24gKG1hbGxvY191c2FibGVfc2l6ZSkgdG8gZG8gdGhpcyByYXRoZXIgdGhhbiB5 b3VyIHByb2dyYW0NCmtlZXBpbmcgdHJhY2sgb2YgaXRzIG93biBrbm93bGVkZ2Ugb2YgdGhlIGFs bG9jYXRlZCBzaXplPyBUaGlzIGlzIHdoYXQNCnRoZSBDIGxhbmd1YWdlIGV4cGVjdHMgeW91IHRv IGRvLiBGb3IgZXhhbXBsZSBpZiB5b3UgaGF2ZSBhIHN0cnVjdHVyZQ0KdGhhdCBjb250YWlucyBh IHBvaW50ZXIgdG8gYSBkeW5hbWljYWxseSBzaXplZCBidWZmZXIsIG5vcm1hbGx5IHlvdQ0Kc3Rv cmUgdGhlIHNpemUgaW4gYSBzaXplX3QgbWVtYmVyIHJpZ2h0IG5leHQgdG8gdGhhdCBwb2ludGVy LCBhbGxvd2luZw0KeW91IHRvIG1ha2UgdGhlc2Uga2luZCBvZiBkZWNpc2lvbnMgd2l0aG91dCBo YXZpbmcgdG8gcHJvYmUgYW55dGhpbmcuDQogDQo+IDIuIEFzIEkgbWVudGlvbmVkIGJlZm9yZSwg dGhpcyBpc24ndCBqdXN0IGEgcHJvYmxlbSB3aXRoDQo+IGBtYWxsb2NfdXNhYmxlX3NpemVgLCBz aW5jZSB3ZSBhY3R1YWxseSBpbmNsdWRlIGEgZnVsbA0KPiBgbWFsbG9jX3VzYWJsZV9zaXplYCBw cm9jZWR1cmUgaW4gYm90aCBgcmVhbGxvY2AgYW5kIGBmcmVlYCwgaXQNCj4gYWN0dWFsbHkgc2xv d3MgZG93biBUaGUgc3BlZWQgb2Ygb3RoZXIgY2FsbHMgc3VjaCBhcyBgZnJlZWAgYW5kDQo+IGBy ZWFsbG9jYC4gU28gdGhpcyBwcm9ibGVtIGFjdHVhbGx5IHNsb3dzIGRvd24gbm90IG9ubHkgdGhl DQo+IGBtYWxsb2NfdXNhYmxlX3NpemVgIGNhbGwgaXRzZWxmLCBidXQgYWxzbyB0aGUgcmVhbGxv YyBhbmQgZnJlZQ0KPiBjYWxscy4NCiANCklmIHRoaXMgaXMgYWZmZWN0aW5nIHlvdSB0b28sIHRo YXQncyBhIHNlcGFyYXRlIGlzc3VlLiBCdXQgSSBjYW4ndA0KdGVsbCBmcm9tIHdoYXQgeW91J3Zl IHJlcG9ydGVkIHNvIGZhciB3aGV0aGVyIHlvdSdyZSBqdXN0IGNsYWltaW5nDQp0aGlzIG9uIGEg dGhlb3JldGljYWwgYmFzaXMgb3Igd2hldGhlciB5b3UncmUgYWN0dWFsbHkgZXhwZXJpZW5jaW5n DQp1bmFjY2VwdGFibGUgcGVyZm9ybWFuY2UuDQo= ------=_001_NextPart414728634101_=---- Content-Type: text/html; charset="utf-8" Content-Transfer-Encoding: 
quoted-printable
> Is there a reason you're relying on an unreliable and nonstandard
> function (malloc_usable_size) to do this rather than your program
> keeping track of its own knowledge of the allocated size? This is what
> the C language expects you to do. For example if you have a structure
> that contains a pointer to a dynamically sized buffer, normally you
> store the size in a size_t member right next to that pointer, allowing
> you to make these kind of decisions without having to probe anything.

Yes. As I have said, by comparing the number of bytes that realloc needs to copy in the worst case (the return value of malloc_usable_size) with the number of bytes we actually need to copy, we can optimize the performance of realloc in real scenarios and avoid unnecessary memory copies.

In fact, with glibc, tcmalloc, the Windows CRT, Mac OS X, uClibc and musl 1.1, we did achieve good optimization results.

On the other hand, of course we keep track of the number of bytes actually allocated, but that does not objectively reflect the number of bytes realloc will copy when the memcpy actually occurs. malloc_usable_size() more accurately reflects how many bytes realloc needs to copy when it degenerates to malloc-memcpy-free mode.

So our expectation is what the man pages for Linux, Mac OS and Windows describe: "The value returned by malloc_usable_size() may be **greater than** the requested size of the allocation" or "The memory block size is always at least as large as the allocation it backs, **and may be larger**." We expect to get the allocator's internal size so that we can evaluate the cost of memory copying.
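The decision described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual code from our project: the name `resize_keep`, the parameter `keep` (standing in for minimalCopyBytes) and the constant `COPY_SAVINGS_THRESHOLD` are hypothetical, and the sketch ignores the chance that realloc could grow the block in place (chunk merging or mremap), which a real heuristic would also weigh.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size: nonstandard (glibc, musl, ...) */
#include <stdlib.h>
#include <string.h>

/* Hypothetical cutoff: bypass realloc only when it might copy at least
 * this many bytes more than we need. An assumption; tune by measurement. */
#define COPY_SAVINGS_THRESHOLD 4096

/* Resize `ptr` to `new_size` (nonzero), preserving only the first `keep`
 * bytes; `keep` must not exceed the size originally requested for `ptr`.
 * Bytes past `keep` are indeterminate afterwards.
 * Returns NULL on failure, in which case `ptr` is still valid. */
void *resize_keep(void *ptr, size_t new_size, size_t keep)
{
    assert(new_size > 0 && keep <= new_size);
    if (ptr == NULL)
        return malloc(new_size);

    /* Upper bound on what realloc copies if it has to move the block. */
    size_t worst_case = malloc_usable_size(ptr);

    /* Little to gain: let realloc decide. */
    if (worst_case <= keep || worst_case - keep < COPY_SAVINGS_THRESHOLD)
        return realloc(ptr, new_size);

    /* Do malloc-memcpy-free ourselves, copying only the live bytes. */
    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;
    memcpy(fresh, ptr, keep);
    free(ptr);
    return fresh;
}
```

For example, `resize_keep(p, 200000, 16)` on a 100000-byte block takes the manual path and copies 16 bytes rather than up to the whole old block.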

Thanks :-)

--

  Best Regards
  BaiYang
  baiyang@gmail.com
  http://i.baiy.cn
**** < END OF EMAIL > ****
 
From: Rich Felker
Date: 2022-09-20 02:15
To: baiyang
CC: musl
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
On Tue, Sep 20, 2022 at 01:32:31AM +0800, baiyang wrote:
> Hi Rich,
>
> Thanks for your reply.
>
> > Unless you have an application that's explicitly using
> > malloc_usable_size all over the place, it's highly unlikely that this
> > is the cause of your real-world performance problems.
>
> 1. Yes, we have a real scenario where `malloc_usable_size` is called
> frequently: we need to optimize the realloc experience. We add an
> extra parameter to realloc - minimalCopyBytes: it represents the
> actual size of data that needs to be copied after fallback to
> malloc-copy-free mode. We will judge whether to call realloc or
> complete malloc-memcpy-free by ourself based on factors such as the
> size of the data that realloc needs to copy (obtained through
> `malloc_usable_size`), the size that we actually need to copy when
> we doing malloc-memcpy-free ourself (minimalCopyBytes) and the
> chance of merging chunks (small blocks) or mremap (large blocks) in
> the underlayer realloc. So, this is a real scenario, we need to call
> `malloc_usable_size` frequently.
 
Is there a reason you're relying on an unreliable and nonstandard
function (malloc_usable_size) to do this rather than your program
keeping track of its own knowledge of the allocated size? This is what
the C language expects you to do. For example if you have a structure
that contains a pointer to a dynamically sized buffer, normally you
store the size in a size_t member right next to that pointer, allowing
you to make these kind of decisions without having to probe anything.
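
A minimal sketch of the size-tracking pattern described above, for illustration only: the type `struct buf` and the function `buf_append` are hypothetical names, not part of musl or of the reporter's code. The program records both the bytes in use and the bytes it requested, so it can size any copy without asking the allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type: the sizes live right next to the pointer. */
struct buf {
    char  *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes requested from malloc/realloc */
};

/* Append n bytes, growing geometrically. Returns 0 on success, -1 on
 * allocation failure or size overflow (the buffer is left unchanged). */
int buf_append(struct buf *b, const void *src, size_t n)
{
    assert(b->len <= b->cap);
    if (n == 0)
        return 0;
    if (n > b->cap - b->len) {
        if (n > SIZE_MAX - b->len)
            return -1;
        size_t need = b->len + n;
        size_t new_cap = b->cap ? b->cap : 16;
        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {   /* doubling would overflow */
                new_cap = need;
                break;
            }
            new_cap *= 2;
        }
        char *p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

Starting from a zero-initialized `struct buf`, appending "hello" and then ", world" leaves `len` at 12 and `cap` at 16; `len` is exactly the number of live bytes a manual copy would need to move.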
 
> 2. As I mentioned before, this isn't just a problem with
> `malloc_usable_size`, since we actually include a full
> `malloc_usable_size` procedure in both `realloc` and `free`, it
> actually slows down The speed of other calls such as `free` and
> `realloc`. So this problem actually slows down not only the
> `malloc_usable_size` call itself, but also the realloc and free
> calls.
 
If this is affecting you too, that's a separate issue. But I can't
tell from what you've reported so far whether you're just claiming
this on a theoretical basis or whether you're actually experiencing
unacceptable performance.

------=_001_NextPart414728634101_=------