* Re: [musl] Memory leak issue in multi-threaded program
2020-01-28 13:29 ` Rich Felker
@ 2020-01-29 1:55 ` Leesoo Ahn
2020-02-05 10:17 ` Leesoo Ahn
1 sibling, 0 replies; 5+ messages in thread
From: Leesoo Ahn @ 2020-01-29 1:55 UTC (permalink / raw)
To: Rich Felker; +Cc: musl
Dear Rich,
Thank you for the quick feedback. I am currently looking at the
hotfix patch and stress testing it.
That said, I am really looking forward to the next-generation malloc implementation!
Cheers,
Leesoo
On 2020-01-28 at 10:29 PM, Rich Felker wrote:
> On Tue, Jan 28, 2020 at 02:44:07PM +0900, Leesoo Ahn wrote:
>> Dear musl developers,
>>
>> Hello!, it seems that musl currently has a memory leak issue in
>> multi-threaded program. It occurs in the below situation of latest
>> (v1.1.24) source. Also, not only in 32-bits[1], but also 64-bits[2]
>> as well.
>>
>> When a program create and run, at least, two threads or more with
>> pthread APIs, VSZ of the program by ps command keeps increasing. But
>> here is a weird thing that it is fine 'IF ONLY ONE' pthread is
>> created and run.
>>
>> To confirm the issue in your host machine, please follow the instructions,
>>
>> 0. Clone the musl git and get inside.
>> 1. Build with these options for static build, ./configure
>> --prefix=$(pwd)/_build_dir --disable-shared
>> 2. Download the test code[3], then build with the command,
>> ../_build_dir/bin/musl-gcc ./test.c
>> 3. Run this script, ./a.out &; while [ 1 ]; do { ps aux | grep
>> [a].out | grep -v grep; sleep 1; } done
>>
>> You may figure out that VSZ keeps increasing.
>>
>> BUT, when I make it to try to allocate memory all the time by kernel
>> mmap with this diff[4] as workaround, although it creates more
>> pthreads than 2, the issue never happens.
>>
>> It would be really thankful if you guys could confirm it and find
>> out the way to fix the bug.
>
> This is a known issue described in:
>
> https://www.openwall.com/lists/musl/2018/10/30/2
>
> and likely several times before that, though it was not realized that
> people were hitting it in practice (vs it just being theoretical)
> until around that time. I posted an experimental mitigation patch last
> spring:
>
> https://www.openwall.com/lists/musl/2019/04/12/4
>
> but it's not heavily tested and its impact on performance is
> significant. I think it should be ok if you need an immediate fix, but
> you should do some testing to make sure. If you go this route, reports
> of any problems (or success) would be nice to hear about.
>
> Further work in that direction was not done because it was already
> planned that musl's malloc implementation will be replaced, and that
> the replacement will solve this and other problems in much better
> ways. This is work in progress and is intended for merge in the next
> release cycle:
>
> https://www.openwall.com/lists/musl/2019/10/22/3
> https://github.com/richfelker/mallocng-draft
>
> Hope this information helps.
>
> Rich
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [musl] Memory leak issue in multi-threaded program
2020-01-28 13:29 ` Rich Felker
2020-01-29 1:55 ` Leesoo Ahn
@ 2020-02-05 10:17 ` Leesoo Ahn
2020-02-05 20:00 ` Rich Felker
1 sibling, 1 reply; 5+ messages in thread
From: Leesoo Ahn @ 2020-02-05 10:17 UTC (permalink / raw)
To: Rich Felker; +Cc: musl
Dear Rich,
My coworker and I have been trying to solve this leak in our product, an
embedded system based on OpenWRT on ARM64 that currently uses musl-1.1.16.
However, we found that backporting the musl-1.1.24 patch you referred to
below into 1.1.16 is quite difficult: for example, it raises translation
faults. The alternative we tried without the patch, serializing calls to
malloc/free with the changes in [1], runs into a double-locking issue.
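The double-locking symptom can be reproduced in isolation. The following is a minimal sketch, not the musl code: it assumes the problem arises when a wrapped entry point is reached again on a thread that already holds the global lock (for instance, if an internal allocator path ends up in the public free while malloc still holds the lock). All demo_* names are hypothetical. An error-checking mutex is used so that the second lock attempt returns EDEADLK, where a default mutex like the one in [1] may simply hang.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>

/* Hypothetical global lock, standing in for the one added in [1]. */
static pthread_mutex_t demo_lock;

/* Inner path: plays the role of a locked free() reached from inside malloc(). */
static int demo_free_locked(void)
{
    int rc = pthread_mutex_lock(&demo_lock); /* second lock by the same thread */
    if (rc == 0)
        pthread_mutex_unlock(&demo_lock);
    return rc;
}

/* Outer path: plays the role of a locked malloc() that calls the inner path. */
static int demo_malloc_locked(void)
{
    int rc = pthread_mutex_lock(&demo_lock);
    if (rc != 0)
        return rc;
    int inner = demo_free_locked();          /* re-entry while holding the lock */
    pthread_mutex_unlock(&demo_lock);
    return inner;
}

/* Returns the error code produced by the nested lock attempt. */
int demo_double_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&demo_lock, &attr);
    pthread_mutexattr_destroy(&attr);

    int rc = demo_malloc_locked();

    pthread_mutex_destroy(&demo_lock);
    return rc;
}
```

If that assumption holds, a recursive mutex (PTHREAD_MUTEX_RECURSIVE), or routing the internal calls to the unlocked __malloc/__free from [1], might avoid the self-deadlock. This is untested against either musl version.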
We see the same problems not only in 1.1.16 but also in 1.1.24, which we
tested as well. So we currently feel stranded in the middle of the sea
without any food. This is a big risk for our product.
We are considering keeping 1.1.16 as the base for our product: although
1.1.24 fixes a lot of bugs, nobody can guarantee how our product will
behave once we move it to 1.1.24.
Could you give us any ideas for fixing the issue in v1.1.16, please? Ah,
we are in so much pain...
Or what do you think about having every allocation, in every process, go
to the kernel via the mmap syscall? Would that solve the issue, even
though the performance would be bad?
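To make the question concrete, here is a rough sketch of what "always ask the kernel via mmap" could look like. It is only an illustration under assumptions, not the diff from the original report and not musl's code: mmap_alloc, mmap_release, struct map_hdr and HDR_SIZE are hypothetical names, and each mapping stores its own length in a small header so it can be unmapped later.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical header kept at the start of every mapping. */
struct map_hdr { size_t len; };

/* Header size chosen so the returned pointer stays 16-byte aligned. */
#define HDR_SIZE 16

/* Allocate n bytes with a dedicated anonymous mapping. */
void *mmap_alloc(size_t n)
{
    if (n > SIZE_MAX - HDR_SIZE)
        return NULL;                       /* size would overflow */
    size_t len = n + HDR_SIZE;
    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    ((struct map_hdr *)base)->len = len;   /* remember length for munmap */
    return (char *)base + HDR_SIZE;
}

/* Return the whole mapping to the kernel. */
void mmap_release(void *p)
{
    if (!p)
        return;
    char *base = (char *)p - HDR_SIZE;
    munmap(base, ((struct map_hdr *)base)->len);
}
```

Because every block is its own mapping, nothing is left behind in shared free bins, but each allocation costs at least one page and two syscalls over its lifetime, which is the performance cost in question.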
I hope we can solve this problem soon.
Best regards,
Leesoo
----
[1]
diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
index 9698259..f914cff 100644
--- a/src/malloc/malloc.c
+++ b/src/malloc/malloc.c
@@ -14,6 +14,10 @@
#define inline inline __attribute__((always_inline))
#endif
+#include <pthread.h>
+pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
+
static struct {
volatile uint64_t binmap;
struct bin bins[64];
@@ -281,8 +285,25 @@ static void trim(struct chunk *self, size_t n)
__bin_chunk(split);
}
+#if 1
+static void *__malloc(size_t n);
+
void *malloc(size_t n)
{
+ void *new_heap;
+
+ pthread_mutex_lock(&lock);
+ new_heap = __malloc(n);
+ pthread_mutex_unlock(&lock);
+
+ return new_heap;
+}
+
+static void *__malloc(size_t n)
+#else
+void *malloc(size_t n)
+#endif
+{
struct chunk *c;
int i, j;
@@ -516,8 +537,21 @@ static void unmap_chunk(struct chunk *self)
__munmap(base, len);
}
+#if 1
+static void __free(void *p);
+
void free(void *p)
{
+ pthread_mutex_lock(&lock);
+ __free(p);
+ pthread_mutex_unlock(&lock);
+}
+
+static void __free(void *p)
+#else
+void free(void *p)
+#endif
+{
if (!p) return;
struct chunk *self = MEM_TO_CHUNK(p);
On 2020-01-28 at 10:29 PM, Rich Felker wrote:
> On Tue, Jan 28, 2020 at 02:44:07PM +0900, Leesoo Ahn wrote:
>> Dear musl developers,
>>
>> Hello!, it seems that musl currently has a memory leak issue in
>> multi-threaded program. It occurs in the below situation of latest
>> (v1.1.24) source. Also, not only in 32-bits[1], but also 64-bits[2]
>> as well.
>>
>> When a program create and run, at least, two threads or more with
>> pthread APIs, VSZ of the program by ps command keeps increasing. But
>> here is a weird thing that it is fine 'IF ONLY ONE' pthread is
>> created and run.
>>
>> To confirm the issue in your host machine, please follow the instructions,
>>
>> 0. Clone the musl git and get inside.
>> 1. Build with these options for static build, ./configure
>> --prefix=$(pwd)/_build_dir --disable-shared
>> 2. Download the test code[3], then build with the command,
>> ../_build_dir/bin/musl-gcc ./test.c
>> 3. Run this script, ./a.out &; while [ 1 ]; do { ps aux | grep
>> [a].out | grep -v grep; sleep 1; } done
>>
>> You may figure out that VSZ keeps increasing.
>>
>> BUT, when I make it to try to allocate memory all the time by kernel
>> mmap with this diff[4] as workaround, although it creates more
>> pthreads than 2, the issue never happens.
>>
>> It would be really thankful if you guys could confirm it and find
>> out the way to fix the bug.
>
> This is a known issue described in:
>
> https://www.openwall.com/lists/musl/2018/10/30/2
>
> and likely several times before that, though it was not realized that
> people were hitting it in practice (vs it just being theoretical)
> until around that time. I posted an experimental mitigation patch last
> spring:
>
> https://www.openwall.com/lists/musl/2019/04/12/4
>
> but it's not heavily tested and its impact on performance is
> significant. I think it should be ok if you need an immediate fix, but
> you should do some testing to make sure. If you go this route, reports
> of any problems (or success) would be nice to hear about.
>
> Further work in that direction was not done because it was already
> planned that musl's malloc implementation will be replaced, and that
> the replacement will solve this and other problems in much better
> ways. This is work in progress and is intended for merge in the next
> release cycle:
>
> https://www.openwall.com/lists/musl/2019/10/22/3
> https://github.com/richfelker/mallocng-draft
>
> Hope this information helps.
>
> Rich