* Re: SIGSEGV and SIGILL at malloc/free on ARM926
From: 徐露 @ 2018-06-06 1:35 UTC
To: musl
On Tue, 5 Jun 2018 01:58:48 -0400, Rich Felker <dalias@...c.org> wrote:
>On Tue, Jun 05, 2018 at 11:24:34AM +0800, 徐露 wrote:
>>
>> Mon, 4 Jun 2018 05:41:29 -0400, Rich Felker <dalias@...c.org> wrote:
>> > Looks like classic double-free.
>> The program crashes randomly. If it were a double free, it should
>> crash at the same place or at the same time.
>
>Only if the program completely lacks any concurrency. If it's
>multithreaded, or if it's dealing with any external communications
>(pipes, network, etc.) that may be subject to timing differences,
>there is no reason to expect it to behave deterministically.
>
>> Besides, if the memory was freed before, the csize of this chunk
>> should be the same as the next chunk's psize.
>
>Not necessarily. If the freed chunk was merged with neighboring free
>space, the bytes which were headers and footers at the time of free
>need not be headers and footers now. If they were not overwritten,
>they'll still have their old values but there's no reason to assume
>the old values are consistent at this point.
>
>> And the chunk's next and
>> prev pointer should point at <mal+xx>.
>
>Not necessarily; they could point to the bin head at mal+xx or to
>other free chunks in the same bin. In the case of two frees in
>immediate succession (with no concurrency) you would expect to find
>the freed chunk at the start of its bin, but in general that need not
>be the case.
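
The point about merged chunks can be illustrated with a toy model. This is only a sketch under stated assumptions: the 32-bit word heap, the offsets, and the helper names (hdr, set_free, set_inuse, merge_forward) are hypothetical and are not musl's code. It shows that when two free chunks are merged, only the outer header and footer are rewritten, so the old header words of the second chunk keep their previous values inside the merged chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define C_INUSE 1u

/* Toy heap of 32-bit words. A chunk at byte offset off keeps
 * psize at off and csize at off + 4 (layout assumed for illustration). */
static uint32_t heap[64];

static uint32_t *hdr(uint32_t off)
{
	assert(off % 4 == 0 && off / 4 + 1 < sizeof heap / sizeof heap[0]);
	return &heap[off / 4];
}

/* Mark chunk [off, off+size) in use: flag set in its csize and in
 * the psize word of the chunk that follows it. */
static void set_inuse(uint32_t off, uint32_t size)
{
	hdr(off)[1] = size | C_INUSE;
	hdr(off + size)[0] = size | C_INUSE;
}

/* Mark chunk [off, off+size) free: flag clear in both words. */
static void set_free(uint32_t off, uint32_t size)
{
	hdr(off)[1] = size;
	hdr(off + size)[0] = size;
}

/* Merge the free chunk at off with the free chunk right after it.
 * Only the outer header and footer are rewritten; the words that
 * used to be the second chunk's header are left untouched. */
static uint32_t merge_forward(uint32_t off)
{
	uint32_t size = hdr(off)[1];
	uint32_t next_size = hdr(off + size)[1];
	set_free(off, size + next_size);
	return size + next_size;
}
```

With chunks of size 0x20, 0x30 and 0x20 laid out back to back, freeing the second and then freeing and merging the first leaves psize 0x20 and csize 0x30 still readable at offset 0x20, even though that offset is no longer a chunk boundary. A later free() of a pointer into that region would read those stale words.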
Thank you for your prompt reply. I learned a lot from it.
>> For example, the 3rd case.
>> #0 0x0045e320 in a_crash () at src/malloc/malloc.c:465
>> #1 free (p=0x7b81e0) at src/malloc/malloc.c:465
>> psize csize prev next
>> 0x7b81d8: 0x11 0x30 0x4 0x3d0504 <json_object_object_delete>
>> 0x7b81e8: 0x3d0268 <json_object_object_to_json_string> 0x0 0x0 0x0
>> 0x7b81f8: 0x7b8210 0x0 0x0 0x0
>> 0x7b8208: 0x31 0x40 0x60ed30 <mal+56> 0x60ed30 <mal+56>
>
>Of these 4 lines, only the first and last look like it's likely that
>they are or were chunk headers. Assuming the 0x30 in the first line is
>correct, the second and third lines are just space inside the freed
>chunk. But then the fourth line wrongly has the chunk marked as
>in-use, and has another free chunk (of size 0x40) adjacent, which is
>inconsistent.
>
>In case there is any actual bug on our side, rather than just memory
>corruption by the application, can you fill us in on any additional
>details, especially whether the process is multithreaded? Knowing that
>would determine what sorts of further investigation might or might not
>be useful.
ARM926 is a single-core processor, and the application is multi-threaded.
I am sorry that I sent you several mails by mistake. Perhaps the file I attached was too large.
* Re: Re: SIGSEGV and SIGILL at malloc/free on ARM926
From: Rich Felker @ 2018-06-06 2:27 UTC
To: musl
On Wed, Jun 06, 2018 at 09:35:06AM +0800, 徐露 wrote:
>
> On Tue, 5 Jun 2018 01:58:48 -0400, Rich Felker <dalias@...c.org> wrote:
> >On Tue, Jun 05, 2018 at 11:24:34AM +0800, 徐露 wrote:
> >>
> >> Mon, 4 Jun 2018 05:41:29 -0400, Rich Felker <dalias@...c.org> wrote:
> >> > Looks like classic double-free.
> >> The program crashes randomly. If it were a double free, it should
> >> crash at the same place or at the same time.
> >
> >Only if the program completely lacks any concurrency. If it's
> >multithreaded, or if it's dealing with any external communications
> >(pipes, network, etc.) that may be subject to timing differences,
> >there is no reason to expect it to behave deterministically.
> >
> >> Besides, if the memory was freed before, the csize of this chunk
> >> should be the same as the next chunk's psize.
> >
> >Not necessarily. If the freed chunk was merged with neighboring free
> >space, the bytes which were headers and footers at the time of free
> >need not be headers and footers now. If they were not overwritten,
> >they'll still have their old values but there's no reason to assume
> >the old values are consistent at this point.
> >
> >> And the chunk's next and
> >> prev pointer should point at <mal+xx>.
> >
> >Not necessarily; they could point to the bin head at mal+xx or to
> >other free chunks in the same bin. In the case of two frees in
> >immediate succession (with no concurrency) you would expect to find
> >the freed chunk at the start of its bin, but in general that need not
> >be the case.
>
> Thank you for your prompt reply. I learned a lot from it.
>
> >> For example, the 3rd case.
> >> #0 0x0045e320 in a_crash () at src/malloc/malloc.c:465
> >> #1 free (p=0x7b81e0) at src/malloc/malloc.c:465
> >> psize csize prev next
> >> 0x7b81d8: 0x11 0x30 0x4 0x3d0504 <json_object_object_delete>
> >> 0x7b81e8: 0x3d0268 <json_object_object_to_json_string> 0x0 0x0 0x0
> >> 0x7b81f8: 0x7b8210 0x0 0x0 0x0
> >> 0x7b8208: 0x31 0x40 0x60ed30 <mal+56> 0x60ed30 <mal+56>
> >
> >Of these 4 lines, only the first and last look like it's likely that
> >they are or were chunk headers. Assuming the 0x30 in the first line is
> >correct, the second and third lines are just space inside the freed
> >chunk. But then the fourth line wrongly has the chunk marked as
> >in-use, and has another free chunk (of size 0x40) adjacent, which is
> >inconsistent.
> >
> >In case there is any actual bug on our side, rather than just memory
> >corruption by the application, can you fill us in on any additional
> >details, especially whether the process is multithreaded? Knowing that
> >would determine what sorts of further investigation might or might not
> >be useful.
>
> ARM926 is a single-core processor, and the application is multi-threaded.
>
> I am sorry that I sent you several mails by mistake. Perhaps the file I attached was too large.
Yes, the list seems to have rejected the large attachment. I got the
off-list cc's. I can't follow up right now but I'll see if I can tell
later if it looks like it could be a bug on our side.
Rich
* Re: SIGSEGV and SIGILL at malloc/free on ARM926
From: Rich Felker @ 2018-06-04 9:41 UTC
To: musl
On Mon, Jun 04, 2018 at 04:45:28PM +0800, 徐露 wrote:
> Hi all,
>
> I am using the OpenWrt project, and the musl libc version is 1.1.16.
> I have been experiencing random crashes when running a customer's application.
> From the coredump files, the segfault looks like a memory corruption issue. But when I added logging to malloc and free, the issue did not occur.
> After analyzing several coredump files, I found that the lowest bit of csize in a chunk appears to have been cleared from 1 to 0.
> This is very strange and I don't have many ideas about how to go further.
> Could you please give us some pointers? Thanks! I can supply more details as needed.
Looks like classic double-free.
Rich
* SIGSEGV and SIGILL at malloc/free on ARM926
From: 徐露 @ 2018-06-04 8:45 UTC
To: musl
Hi all,
I am using the OpenWrt project, and the musl libc version is 1.1.16.
I have been experiencing random crashes when running a customer's application.
From the coredump files, the segfault looks like a memory corruption issue. But when I added logging to malloc and free, the issue did not occur.
After analyzing several coredump files, I found that the lowest bit of csize in a chunk appears to have been cleared from 1 to 0.
This is very strange and I don't have many ideas about how to go further.
Could you please give us some pointers? Thanks! I can supply more details as needed.
These are the unbin and free functions from src/malloc/malloc.c.
224 static void unbin(struct chunk *c, int i)
225 {
226     if (c->prev == c->next)
227         a_and_64(&mal.binmap, ~(1ULL<<i));
228     c->prev->next = c->next;
229     c->next->prev = c->prev;
230     c->csize |= C_INUSE;
231     NEXT_CHUNK(c)->psize |= C_INUSE;
232 }

450 void free(void *p)
451 {
452     struct chunk *self = MEM_TO_CHUNK(p);
453     struct chunk *next;
454     size_t final_size, new_size, size;
455     int reclaim=0;
456     int i;
457
458     if (!p) return;
459
460     if (IS_MMAPPED(self)) {
461         size_t extra = self->psize;
462         char *base = (char *)self - extra;
463         size_t len = CHUNK_SIZE(self) + extra;
464         /* Crash on double free */
465         if (extra & 1) a_crash();
466         __munmap(base, len);
467         return;
468     }
......
531 }
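
The list handling in unbin() can be sketched with a small stand-alone model. This is a sketch, not musl's implementation: the struct field order, the per-bin sentinel array, and the names toy_bins, toy_binmap, toy_bin_insert and toy_unbin are assumptions made for illustration, and the NEXT_CHUNK footer update is left out. It shows why the `c->prev == c->next` test means "last chunk in the bin", and why line 228 is the first place a chunk with bogus list pointers would fault.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the allocator's chunk header (field order assumed). */
struct chunk {
	size_t psize, csize;
	struct chunk *next, *prev;
};

#define C_INUSE ((size_t)1)

static uint64_t toy_binmap;        /* bit i set => bin i is non-empty */
static struct chunk toy_bins[64];  /* one list sentinel per bin */

/* Append a free chunk to bin i and mark the bin non-empty. */
static void toy_bin_insert(struct chunk *c, int i)
{
	struct chunk *head = &toy_bins[i];
	assert(i >= 0 && i < 64);
	if (!head->next) head->next = head->prev = head; /* first use: empty ring */
	c->csize &= ~C_INUSE;
	c->next = head;
	c->prev = head->prev;
	c->prev->next = c;
	c->next->prev = c;
	toy_binmap |= 1ULL << i;
}

/* Same steps as the unbin() listing, minus the footer update. */
static void toy_unbin(struct chunk *c, int i)
{
	if (c->prev == c->next)        /* both are the sentinel: c is the last chunk */
		toy_binmap &= ~(1ULL << i);
	c->prev->next = c->next;       /* dereferences c->prev: faults if it is garbage */
	c->next->prev = c->prev;
	c->csize |= C_INUSE;
}
```

In this model, removing the last chunk clears the bin's bit and leaves the sentinel pointing at itself. If the words in the prev/next slots are not chunk addresses at all, the first store in toy_unbin() goes through an invalid pointer, which matches the crash location reported at malloc.c:228.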
Here are some backtraces and memory dumps I got.
1) In this case, the coredump shows that the crash is at c->prev->next (line 228). But judging from the surrounding memory, the csize of the chunk at 0x1d12608 should be 0x31, not 0x30.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0045e20c in unbin (c=c@entry=0x1d12608, i=i@entry=2) at src/malloc/malloc.c:228
228 src/malloc/malloc.c: No such file or directory.
(gdb) bt
#0 0x0045e20c in unbin (c=c@entry=0x1d12608, i=i@entry=2) at src/malloc/malloc.c:228
#1 0x0045e34c in alloc_fwd (c=c@entry=0x1d12608) at src/malloc/malloc.c:242
#2 0x0045e48c in free (p=<optimized out>) at src/malloc/malloc.c:497
#3 0x003d77e4 in lh_table_free ()
#4 0x003d05ac in json_object_object_delete ()
#5 0x003cff00 in json_object_put ()
#6 0x003d0580 in json_object_lh_entry_free ()
#7 0x003d77b4 in lh_table_free ()
#8 0x003d05ac in json_object_object_delete ()
#9 0x003cff00 in json_object_put ()
#10 0x00248b98 in msgHandleLoop(void*) ()
#11 0x00471a40 in start (p=0xb687cd34) at src/thread/pthread_create.c:145
(gdb) x/64wa 0x1d125a8
0x1d125a8: 0x1d125c0 0x0 0x0 0x0
0x1d125b8: 0x31 0x41 0x10 0x1
0x1d125c8: 0x0 0x0 0x2 0x1
0x1d125d8: 0x0 0x0 0x1d6aaa0 0x1d6aaa0
0x1d125e8: 0x1d6aa70 0x3d0550 <json_object_lh_entry_free> 0x3d7410 <lh_char_hash> 0x3d7484 <lh_char_equal>
0x1d125f8: 0x41 0x11 0x60ed00 <mal+8> 0x60ed00 <mal+8>
0x1d12608: 0x11 0x30 0x4 0x3d058c <json_object_object_delete>
0x1d12618: 0x3d02f0 <json_object_object_to_json_string> 0x1 0x0 0x0
0x1d12628: 0x1d12c60 0x0 0x0 0x0
0x1d12638: 0x31 0x11 0x1d125f8 0x1d12610
0x1d12648: 0x11 0x21 0x646e6576 0x735f726f
0x1d12658: 0x75746174 0x3d0073 <json_object_set_serializer+22> 0x3d7410 <lh_char_hash> 0x3d7484 <lh_char_equal>
0x1d12668: 0x21 0x20 0x60ed10 <mal+24> 0x1d13318
0x1d12678: 0x736d65 0x0 0x0 0x0
0x1d12688: 0x20 0x21 0x1cc9f20 0x1d07050
0x1d12698: 0x1d12640 0x0 0x0 0x0
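
The header check applied to this dump can be written out as a small walker. It is a sketch under assumptions: 32-bit words, psize at offset 0 and csize at offset 4 of each chunk, and the low bit as the in-use flag; the names dump, word_at and find_mismatches are hypothetical. The words are copied from the rows above starting at 0x1d125b8 (symbol annotations dropped).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define C_INUSE   1u
#define DUMP_BASE 0x1d125b8u

/* Words from the case 1 dump, rows 0x1d125b8 through 0x1d12698. */
static const uint32_t dump[] = {
	0x31, 0x41, 0x10, 0x1,
	0x0, 0x0, 0x2, 0x1,
	0x0, 0x0, 0x1d6aaa0, 0x1d6aaa0,
	0x1d6aa70, 0x3d0550, 0x3d7410, 0x3d7484,
	0x41, 0x11, 0x60ed00, 0x60ed00,
	0x11, 0x30, 0x4, 0x3d058c,
	0x3d02f0, 0x1, 0x0, 0x0,
	0x1d12c60, 0x0, 0x0, 0x0,
	0x31, 0x11, 0x1d125f8, 0x1d12610,
	0x11, 0x21, 0x646e6576, 0x735f726f,
	0x75746174, 0x3d0073, 0x3d7410, 0x3d7484,
	0x21, 0x20, 0x60ed10, 0x1d13318,
	0x736d65, 0x0, 0x0, 0x0,
	0x20, 0x21, 0x1cc9f20, 0x1d07050,
	0x1d12640, 0x0, 0x0, 0x0,
};
#define DUMP_WORDS (sizeof dump / sizeof dump[0])

/* Fetch the word at a target address; returns 0 if it is outside the dump. */
static int word_at(uint32_t addr, uint32_t *out)
{
	if (addr < DUMP_BASE || (addr - DUMP_BASE) % 4) return 0;
	size_t idx = (addr - DUMP_BASE) / 4;
	if (idx >= DUMP_WORDS) return 0;
	*out = dump[idx];
	return 1;
}

/* Walk chunk headers from start and record (up to max_bad) chunks whose
 * csize differs from the psize stored in the following chunk. */
static size_t find_mismatches(uint32_t start, uint32_t *bad, size_t max_bad)
{
	size_t n = 0;
	uint32_t addr = start, csize, next_psize;
	while (word_at(addr + 4, &csize)) {
		uint32_t size = csize & ~C_INUSE;
		if (size == 0) break;                          /* would never advance */
		if (!word_at(addr + size, &next_psize)) break; /* next header not dumped */
		if (csize != next_psize && n < max_bad) bad[n++] = addr;
		addr += size;
	}
	return n;
}
```

Walking from 0x1d125b8, every header pair agrees except the chunk at 0x1d12608, whose csize is 0x30 while the following chunk at 0x1d12638 records psize 0x31. That is the single cleared bit described in this case.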
2) In this case, the coredump again shows that the crash is at c->prev->next (line 228). But judging from the surrounding memory, the csize of the chunk at 0x1127518 should be 0x21, not 0x20.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0045e20c in unbin (c=c@entry=0x1127518, i=i@entry=1) at src/malloc/malloc.c:228
228 src/malloc/malloc.c: No such file or directory.
(gdb) bt
#0 0x0045e20c in unbin (c=c@entry=0x1127518, i=i@entry=1) at src/malloc/malloc.c:228
#1 0x0045e34c in alloc_fwd (c=c@entry=0x1127518) at src/malloc/malloc.c:242
#2 0x0045e48c in free (p=<optimized out>) at src/malloc/malloc.c:497
#3 0x00051cd0 in hook_free ()
#4 0x00045388 in cJSON_Delete ()
#5 0x000453e0 in cJSON_Delete ()
#6 0x000453e0 in cJSON_Delete ()
#7 0x000453e0 in cJSON_Delete ()
#8 0x000453e0 in cJSON_Delete ()
#9 0x000453e0 in cJSON_Delete ()
#10 0x00053258 in _vendor_stub_json_handle_status ()
#11 0x0005376c in _vendor_stub_json_handle_battery_status_cloud ()
#12 0x00054c74 in _vendor_stub_json_recv_thread_parse_json ()
#13 0x00054e20 in _vendor_stub_json_recv_thread ()
#14 0x00471a40 in start (p=0xb6bdad34) at src/thread/pthread_create.c:145
(gdb) x/64wa 0x11274e8
0x11274e8: 0x11 0x21 0x756e694c 0x600078
0x11274f8: 0x10f5700 0x5 0x0 0x0
0x1127508: 0x21 0x11 0x63726570 0x746e65
0x1127518: 0x11 0x20 0x706f7270 0x79747265
0x1127528: 0x0 0x3ff00000 0x1127540 0x0
0x1127538: 0x21 0x41 0x10d3e30 0x10dce40
0x1127548: 0x0 0x3 0x0 0x5c
0x1127558: 0x0 0x40570000 0x10d3e20 0x0
0x1127568: 0x0 0x0 0x10dce00 0x0
0x1127578: 0x41 0x31 0x10f57d0 0x13c
0x1127588: 0x1 0x36d060 <av_buffer_default_free> 0x0 0x2
0x1127598: 0x0 0x36d060 <av_buffer_default_free> 0x0 0x2
0x11275a8: 0x31 0x991 0xf38ffbca 0xe404eab5
0x11275b8: 0xe087e0fe 0xe246e0df 0xedf1e714 0xfa4af4a6
0x11275c8: 0x4ffff58 0xcc40a13 0xd730d15 0xeb30ed1
0x11275d8: 0xa450c55 0x87c0975 0x24a05eb 0xfe1cffbb
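
One way to read this dump is to decode the words that sit where a free chunk's list pointers would be. The sketch below assumes a little-endian target (consistent with the "params" annotation in the case 3 dump); the helper names words_to_ascii and printable_prefix are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unpack little-endian 32-bit words into bytes and NUL-terminate.
 * out must have room for 4*n + 1 bytes. */
static void words_to_ascii(const uint32_t *w, size_t n, char *out)
{
	assert(w && out);
	for (size_t i = 0; i < n; i++)
		for (int b = 0; b < 4; b++)
			out[4 * i + b] = (char)((w[i] >> (8 * b)) & 0xff);
	out[4 * n] = '\0';
}

/* Count the leading printable ASCII bytes (0x20..0x7e) in s. */
static size_t printable_prefix(const char *s)
{
	size_t n = 0;
	while (s[n] >= 0x20 && s[n] <= 0x7e) n++;
	return n;
}
```

Applied to the two words after the header at 0x1127518 (0x706f7270 and 0x79747265), this yields the text "property", and the two words after the header at 0x1127508 decode to "percent". So the slots that unbin() treats as list pointers appear to hold string data, which suggests the chunk was still holding live data, or had been reused, when its in-use bit was found cleared.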
3) In this case, judging from the surrounding memory, the csize of the chunk at 0x7b81d8 should be 0x31, not 0x30.
Program terminated with signal SIGILL, Illegal instruction.
#0 0x0045e320 in a_crash () at src/malloc/malloc.c:465
465 src/malloc/malloc.c: No such file or directory.
(gdb) bt
#0 0x0045e320 in a_crash () at src/malloc/malloc.c:465
#1 free (p=0x7b81e0) at src/malloc/malloc.c:465
#2 0x003cfeb8 in json_object_generic_delete ()
#3 0x003d052c in json_object_object_delete ()
#4 0x003cfe78 in json_object_put ()
#5 0x003d04f8 in json_object_lh_entry_free ()
#6 0x003d772c in lh_table_free ()
#7 0x003d0524 in json_object_object_delete ()
#8 0x003cfe78 in json_object_put ()
#9 0x0024aa1c in ServerAdapter::sendTextMsg(char const*, char const*, int, json_object*) ()
#10 0x00240658 in ServerEvent::incrementSync(json_object*) ()
#11 0x002476e0 in HostAdapter::onHandleNotifyMsg(json_object*) ()
#12 0x00248c90 in msgHandleLoop(void*) ()
#13 0x004719b8 in start (p=0xb679ed34) at src/thread/pthread_create.c:145
(gdb) x/64wa 0x7b81a8
0x7b81a8: 0x3d1280 <json_object_string_to_json_string> 0x0 0x0 0x0
0x7b81b8: 0x7b97e0 0x3 0x0 0x0
0x7b81c8: 0x140 0x11 0x61726170 0x736d params
0x7b81d8: 0x11 0x30 0x4 0x3d0504 <json_object_object_delete>
0x7b81e8: 0x3d0268 <json_object_object_to_json_string> 0x0 0x0 0x0
0x7b81f8: 0x7b8210 0x0 0x0 0x0
0x7b8208: 0x31 0x40 0x60ed30 <mal+56> 0x60ed30 <mal+56>
0x7b8218: 0x0 0x0 0x1 0x1
0x7b8228: 0x0 0x0 0x7b7910 0x7b7910
0x7b8238: 0x7b78a0 0x3d04c8 <json_object_lh_entry_free> 0x3d7388 <lh_char_hash> 0x3d73fc <lh_char_equal>
0x7b8248: 0x40 0x21 0x6c796170 0x64616f <ff_cos_65536+90991>
0x7b8258: 0x736d65 0x0 0xffffffff 0x0
0x7b8268: 0x21 0x100 0x60edf0 <mal+248> 0x60edf0 <mal+248>
0x7b8278: 0x3d1574 <json_object_array_to_json_string> 0x0 0x0 0x0
0x7b8288: 0x7b82a0 0x0 0x0 0x0
0x7b8298: 0x31 0xd1 0x60edc0 <mal+200> 0x60edc0 <mal+200>
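
The path from this header to a_crash() at line 465 can be traced with a short classifier. It is a sketch under assumptions that are not shown in the listing: that IS_MMAPPED(c) is true when the in-use bit of csize is clear, and that the header is two 32-bit words (so the chunk starts 8 bytes before p, which matches p=0x7b81e0 and the header row at 0x7b81d8). The names classify_free and mem_to_chunk are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define C_INUSE  1u
#define OVERHEAD 8u   /* two 32-bit header words (assumed) */

enum free_path { PATH_NORMAL, PATH_MUNMAP, PATH_CRASH };

/* Address of the chunk header for a user pointer. */
static uint32_t mem_to_chunk(uint32_t p)
{
	assert(p >= OVERHEAD);
	return p - OVERHEAD;
}

/* Which branch of the listed free() would a header with these words take?
 * Assumes "mmapped" is encoded as csize with the in-use bit clear. */
static enum free_path classify_free(uint32_t psize, uint32_t csize)
{
	if (csize & C_INUSE) return PATH_NORMAL; /* ordinary heap chunk */
	if (psize & 1u) return PATH_CRASH;       /* line 465: a_crash() */
	return PATH_MUNMAP;                      /* line 466: __munmap() */
}
```

For the header at 0x7b81d8 (psize 0x11, csize 0x30) the sketch lands on the crash branch, which would explain the SIGILL; with the expected csize of 0x31 the same call would take the ordinary heap path.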
Best regards!
—————————————————————
徐 露
Allwinner Technology, Business Unit 1
MOBI:+86 13425063650
ADDR: No. 9, Keji 2nd Road, Tangjiawan Town, High-tech Zone, Zhuhai, Guangdong Province
MAIL:xulu@allwinnertech.com
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/