mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: Gabriel Ravier <gabravier@gmail.com>
Cc: baiyang <baiyang@gmail.com>, James Y Knight <jyknight@google.com>,
	musl <musl@lists.openwall.com>,
	Florian Weimer <fweimer@redhat.com>
Subject: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
Date: Mon, 19 Sep 2022 17:47:40 -0400	[thread overview]
Message-ID: <20220919214740.GA9709@brightrain.aerifal.cx> (raw)
In-Reply-To: <83f1e705-22b2-79b1-123c-1403d3e2281c@gmail.com>

On Mon, Sep 19, 2022 at 11:02:16PM +0200, Gabriel Ravier wrote:
> 
> On 9/19/22 21:21, Rich Felker wrote:
> >On Mon, Sep 19, 2022 at 09:07:57PM +0200, Gabriel Ravier wrote:
> >>On 9/19/22 20:14, Szabolcs Nagy wrote:
> >>>* baiyang <baiyang@gmail.com> [2022-09-20 01:40:48 +0800]:
> >>>>I looked at the code of tcmalloc, but I didn't find any of the problems you mentioned in the implementation of malloc_usable_size (see: https://github.com/google/tcmalloc/blob/9179bb884848c30616667ba129bcf9afee114c32/tcmalloc/tcmalloc.cc#L1099 ).
> >>>>
> >>>>On the contrary, similar to musl, tcmalloc also directly uses the return value of malloc_usable_size in its realloc implementation to determine whether memory needs to be reallocated: https://github.com/google/tcmalloc/blob/9179bb884848c30616667ba129bcf9afee114c32/tcmalloc/tcmalloc.cc#L1499
> >>>>
> >>>>I think this is enough to show that the return value of malloc_usable_size in tcmalloc is accurate and reliable, otherwise its own realloc will cause a segment fault.
> >>>obviously internally the implementation can use the internal chunk size...
> >>>
> >>>GetSize(p) is not the exact size (that the user allocated) but an internal
> >>>size (which may be larger) and that must not be exposed *outside* of the
> >>>malloc implementation (other than for diagnostic purposes).
> >>>
> >>>you can have 2 views:
> >>>
> >>>(1) tcmalloc and jemalloc are buggy because they expose an internal
> >>>     that must not be exposed (becaues it can break user code).
> >>>
> >>>(2) user code is buggy if it uses malloc_usable_size for any purpose
> >>>     other than diagnostic/statistics (because other uses are broken
> >>>     on many implementations).
> >>>
> >>>either way the brokenness you want to support is a security hazard
> >>>and you are lucky that musl saves the day: it works hard not to
> >>>expose internal sizes so the code you seem to care about can operate
> >>>safely (which is not true on tcmalloc and jemalloc: the compiler
> >>>may break that code).
> >>While I would agree that using malloc_usable_size is generally not a
> >>great idea (it's at most acceptable as a small micro-optimization,
> >>but I would only ever expect it to be seen in very well-tested code
> >>in very hot loops, as it is indeed quite easily misused), it seems
> >>like a bit of a stretch to say that all of:
> >>
> >>- sqlite3 (https://github.com/sqlite/sqlite/blob/master/src/mem1.c)
> >>
> >>- systemd
> >>(https://github.com/systemd/systemd/blob/main/src/basic/alloc-util.h
> >>, along with all files using MALLOC_SIZEOF_SAFE, i.e.
> >>src/basic/alloc-util.c, src/basic/compress.c, src/basic/fileio.c,
> >>src/basic/memory-util.h, src/basic/recurse-dir.c,
> >>src/basic/string-util.c, src/libsystemd/sd-netlink/netlink-socket.c,
> >>src/shared/journal-importer.c, src/shared/varlink.c,
> >>src/test/test-alloc-util.c and src/test/test-compress.c)
> >>
> >>- rocksdb (https://github.com/facebook/rocksdb/blob/main/table/block_based/filter_policy.cc
> >>, along with at least 20 other uses)
> >>
> >>- folly (https://github.com/facebook/folly/blob/main/folly/small_vector.h)
> >>
> >>- lzham_codec (https://github.com/richgel999/lzham_codec/blob/master/lzhamdecomp/lzham_mem.cpp)
> >>
> >>- quickjs
> >>(https://raw.githubusercontent.com/bellard/quickjs/master/quickjs.c)
> >>
> >>- redis
> >>(https://github.com/redis/redis/blob/unstable/src/networking.c,
> >>along with a few other uses elsewhere)
> >>
> >>along with so many more well-known projects that I've given up on
> >>listing them, are all buggy because of their usage of
> >>malloc_usable_size...
> >Depending on how you interpret the contract of malloc_usable_size
> >(which was historically ambigious), either (1) or (2) above is
> >*necessarily* true. It's not a matter of opinion just logical
> >consequences of the choice you make.
> 
> Hmmmm... to me none of the two options you have given are true.
> Instead, I'd argue that, as seen by the C abstract machine,
> invocation of malloc_usable_size actually increases the size of a
> memory block, in place, as if by magic. This may seem stupid to some
> degree, but to me it is the only way one can make sense of the
> documentation for malloc_usable_size in glibc, as it has been
> present in glibc since malloc_usable_size was first added in 1996:

I don't think that's stupid; it's a very reasonable model.
Unfortunately it's one that's very hard to make the compiler aware of,
and if the compiler can't be aware of the possibility of the object
changing size, then it loses the ability to protect you from memory
errors without having this kind of "false positive".

Note that you can't just reason that "malloc_usable_size could
internally have called realloc" because the compiler can see that the
caller is still using the "old pointer", which is invalid, *even if
realloc returned a bitwise-identical pointer* (even just comparing to
see if it's equal is UB).

I don't know how to make any of this non-awful. Basically,
malloc_usable_size ties your hands with lots of unexpected
constraints. This function alone is almost solely responsible for the
musl inclusion/exclusion criteria policy for nonstandard extensions
being extremely cautious about adding dubious extensions even if they
"look harmless" at first -- we made the mistake of including it long
ago, and don't want to repeat that.

Rich

  reply	other threads:[~2022-09-19 21:47 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-19  7:53 baiyang
2022-09-19 11:08 ` Szabolcs Nagy
2022-09-19 12:36   ` Florian Weimer
2022-09-19 13:46     ` Rich Felker
2022-09-19 13:53       ` James Y Knight
2022-09-19 17:40         ` baiyang
2022-09-19 18:14           ` Szabolcs Nagy
2022-09-19 18:40             ` baiyang
2022-09-19 19:07             ` Gabriel Ravier
2022-09-19 19:21               ` Rich Felker
2022-09-19 21:02                 ` Gabriel Ravier
2022-09-19 21:47                   ` Rich Felker [this message]
2022-09-19 22:31                     ` Gabriel Ravier
2022-09-19 22:46                       ` baiyang
2022-09-19 20:46             ` Nat!
2022-09-20  8:51               ` Szabolcs Nagy
2022-09-20  0:13           ` James Y Knight
2022-09-20  0:25             ` baiyang
2022-09-20  0:38               ` Rich Felker
2022-09-20  0:47                 ` baiyang
2022-09-20  1:00                   ` Rich Felker
2022-09-20  1:18                     ` baiyang
2022-09-20  2:15                       ` Rich Felker
2022-09-20  2:35                         ` baiyang
2022-09-20  3:28                           ` Rich Felker
2022-09-20  3:53                             ` baiyang
2022-09-20  5:41                               ` Rich Felker
2022-09-20  5:56                                 ` baiyang
2022-09-20 12:16                                   ` Rich Felker
2022-09-20 17:21                                     ` baiyang
2022-09-20  8:33       ` Florian Weimer
2022-09-20 13:54         ` Siddhesh Poyarekar
2022-09-20 16:59           ` James Y Knight
2022-09-20 17:34             ` Szabolcs Nagy
2022-09-20 19:53               ` James Y Knight
2022-09-24  8:55               ` Fangrui Song
2022-09-20 17:39             ` baiyang
2022-09-20 18:12               ` Quentin Rameau
2022-09-20 18:19                 ` Rich Felker
2022-09-20 18:26                   ` Alexander Monakov
2022-09-20 18:35                     ` baiyang
2022-09-20 20:33                       ` Gabriel Ravier
2022-09-20 20:45                         ` baiyang
2022-09-21  8:42                           ` NRK
2022-09-20 18:37                     ` Quentin Rameau
2022-09-21 10:15                   ` [musl] " 王志强
2022-09-21 16:11                     ` [musl] " 王志强
2022-09-21 17:15                     ` [musl] " Rich Felker
2022-09-21 17:58                       ` Rich Felker
2022-09-22  3:34                         ` [musl] " 王志强
2022-09-22  9:10                           ` [musl] " 王志强
2022-09-22  9:39                             ` [musl] " 王志强
2022-09-20 17:28           ` baiyang
2022-09-20 17:44             ` Siddhesh Poyarekar
2022-10-10 14:13           ` Florian Weimer
2022-09-19 13:43 ` Rich Felker
2022-09-19 17:32   ` baiyang
2022-09-19 18:15     ` Rich Felker
2022-09-19 18:44       ` baiyang
2022-09-19 19:18         ` Rich Felker
2022-09-19 19:45           ` baiyang
2022-09-19 20:07             ` Rich Felker
2022-09-19 20:17               ` baiyang
2022-09-19 20:28                 ` Rich Felker
2022-09-19 20:38                   ` baiyang
2022-09-19 22:02                 ` Quentin Rameau
2022-09-19 20:17             ` Joakim Sindholt
2022-09-19 20:33               ` baiyang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220919214740.GA9709@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=baiyang@gmail.com \
    --cc=fweimer@redhat.com \
    --cc=gabravier@gmail.com \
    --cc=jyknight@google.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).