From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Reply-To: musl@lists.openwall.com
Date: Tue, 20 Sep 2022 09:18:04 +0800
From: baiyang
To: "Rich Felker"
Cc: musl
References: <2022091915532777412615@gmail.com>, <20220919110829.GA2158779@port70.net>, <874jx3h76u.fsf@oldenburg.str.redhat.com>, <20220919134659.GO9709@brightrain.aerifal.cx>, , <2022092001404698842815@gmail.com>, , <2022092008254998320584@gmail.com>, <20220920003811.GF9709@brightrain.aerifal.cx>, <2022092008470636285288@gmail.com>, <20220920010056.GG9709@brightrain.aerifal.cx>
Mime-Version: 1.0
Message-ID: <2022092009180277847194@gmail.com>
Content-Type: multipart/alternative; boundary="----=_001_NextPart728245801267_=----"
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)

This is a multi-part message in MIME format.
------=_001_NextPart728245801267_=----
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

> There is no hidden "size actually allocated internally". The size you
> get is the size you requested. Everything else is allocator data
> structures *outside of the object* that the caller has no entitlement
> to peek or poke at, and malloc_usable_size's return value reflects
> that.

If I understand correctly, according to the definition of size_classes in the mallocng code:
1. When I call `void* p = malloc(6600)`, mallocng actually allocates more than 8100 bytes of usable space, right?
2. According to your previous explanation, calling malloc_usable_size(p) at this time returns 6600, right?

My question is: if malloc_usable_size(p) could directly return 8191 (or a similar actually allocated size, as other libc implementations do) instead of 6600, would it be possible to make mallocng achieve higher performance in both time and space?
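To make the two numbered points concrete, here is a minimal sketch in C. It assumes a libc that declares the non-standard `malloc_usable_size` in `<malloc.h>` (glibc and musl both do); the helper name `usable_size_for` is hypothetical. It relies only on the documented guarantee that the reported size is at least the requested size; the exact value depends on the allocator.

```c
#include <assert.h>
#include <malloc.h>   /* malloc_usable_size: non-standard extension */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate `request` bytes and return what
 * malloc_usable_size reports for that block, or 0 if malloc fails. */
size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (!p)
        return 0;

    size_t usable = malloc_usable_size(p);

    /* The only portable expectation: at least the requested size. */
    assert(usable >= request);

    free(p);
    return usable;
}
```

Calling `usable_size_for(6600)` against each libc of interest shows what that allocator reports. If the explanation quoted above holds, mallocng would be expected to report 6600, while other allocators may report a larger value.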

--

   Best Regards
  BaiYang
  baiyang@gmail.com
  http://i.baiy.cn
**** < END OF EMAIL > ****

From: Rich Felker
Date: 2022-09-20 09:00
To: baiyang
CC: musl
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
On Tue, Sep 20, 2022 at 08:47:07AM +0800, baiyang wrote:
> > Would it be possible to limit use of the list to actually requesting
> > help or making reports, rather than inciting debates about what is UB
> > or what the consequences of UB might be?
>
> You are right.
>
> The real question is: if we only need malloc_usable_size to return
> the size actually allocated internally (not the size requested by
> the user, **just as musl version 1.1 and all other libc
> implementations do**), is it possible to improve its time and space
> efficiency?

There is no hidden "size actually allocated internally". The size you
get is the size you requested. Everything else is allocator data
structures *outside of the object* that the caller has no entitlement
to peek or poke at, and malloc_usable_size's return value reflects
that.

If you want to see what portion of the time is being spent on
different parts of processing the metadata, you could sit down and
actually run it under perf to get a profiling report/flame graph. I'm
pretty sure you'll find that the final get_nominal_size step is a
small portion of the time spent. get_meta is probably the majority of
the time, some of it fundamental, and some of it hardening. But don't
take my word for it. Measure.

One thing I can tell you definitively though: if you did what the C
language (which lacks malloc_usable_size) intended you to do, and kept
track of the size of your own buffer, and just used that, you would
spend 0% of the time you're spending on this. You would also save the
entire "several hundred ms per 10 million calls" it's costing on other
malloc implementations, by just *not doing something you don't need to
do*.
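The suggestion to keep track of the size of your own buffer can be sketched as follows. This is a minimal illustration, not code from musl or from this thread: the `tracked_buf` struct, its functions, and the initial capacity of 64 bytes are all hypothetical. The capacity is recorded when the buffer is (re)allocated, so `malloc_usable_size` is never called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer that records its own capacity. */
struct tracked_buf {
    unsigned char *data;
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes requested from realloc */
};

/* Append n bytes from src, doubling the capacity as needed.
 * Returns 0 on success, -1 on overflow or allocation failure. */
int tracked_buf_append(struct tracked_buf *b, const void *src, size_t n)
{
    assert(b->len <= b->cap);
    if (n > SIZE_MAX - b->len)
        return -1;

    size_t need = b->len + n;
    if (need > b->cap) {
        size_t new_cap = b->cap ? b->cap : 64;  /* assumed starting size */
        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {       /* doubling would overflow */
                new_cap = need;
                break;
            }
            new_cap *= 2;
        }
        unsigned char *p = realloc(b->data, new_cap);
        if (!p)
            return -1;                          /* old buffer is still valid */
        b->data = p;
        b->cap = new_cap;                       /* the size we asked for */
    }
    if (n)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
void tracked_buf_free(struct tracked_buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

A caller zero-initializes the struct (`struct tracked_buf b = {0};`), appends with `tracked_buf_append(&b, data, n)`, and reads `b.cap` whenever it needs the capacity, which costs a plain memory load rather than a call into the allocator.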
 
Rich
------=_001_NextPart728245801267_=------