mailing list of musl libc
* Re: SIGSEGV and SIGILL at malloc/free on ARM926
@ 2018-06-06  1:35 徐露
  2018-06-06  2:27 ` Rich Felker
  0 siblings, 1 reply; 4+ messages in thread
From: 徐露 @ 2018-06-06  1:35 UTC
  To: musl


On Tue, 5 Jun 2018 01:58:48 -0400, Rich Felker <dalias@...c.org> wrote:
>On Tue, Jun 05, 2018 at 11:24:34AM +0800, 徐露 wrote:
>> 
>> Mon, 4 Jun 2018 05:41:29 -0400, Rich Felker <dalias@...c.org> wrote:
>> > Looks like classic double-free.
>> The program crashes randomly. If it were a double free, it would
>> crash at the same place or at the same time.
>
>Only if the program completely lacks any concurrency. If it's
>multithreaded, or if it's dealing with any external communications
>(pipes, network, etc.) that may be subject to timing differences,
>there is no reason to expect it to behave deterministically.
>
>> Besides, if the memory had been freed before, the csize of this chunk
>> should be the same as the next chunk's psize.
>
>Not necessarily. If the freed chunk was merged with neighboring free
>space, the bytes which were headers and footers at the time of free
>need not be headers and footers now. If they were not overwritten,
>they'll still have their old values but there's no reason to assume
>the old values are consistent at this point.
>
>> And the chunk's next and
>> prev pointers should point at <mal+xx>.
>
>Not necessarily; they could point to the bin head at mal+xx or to
>other free chunks in the same bin. In the case of two frees in
>immediate succession (with no concurrency) you would expect to find
>the freed chunk at the start of its bin, but in general that need not
>be the case.

Thank you for your prompt reply. I learned a lot from it.

>> For example, the 3rd case. 
>>  #0  0x0045e320 in a_crash () at src/malloc/malloc.c:465
>>  #1  free (p=0x7b81e0) at src/malloc/malloc.c:465
>> psize   csize    prev    next
>>  0x7b81d8:       0x11    0x30 0x4  0x3d0504 <json_object_object_delete>
>>  0x7b81e8:       0x3d0268 <json_object_object_to_json_string>    0x0     0x0     0x0
>>  0x7b81f8:       0x7b8210        0x0     0x0     0x0
>>  0x7b8208:       0x31    0x40    0x60ed30 <mal+56>       0x60ed30 <mal+56>
>
>Of these 4 lines, only the first and last look like they are, or
>were, chunk headers. Assuming the 0x30 in the first line is
>correct, the second and third lines are just space inside the freed
>chunk. But then the fourth line wrongly has the chunk marked as
>in-use, and has another free chunk (of size 0x40) adjacent, which is
>inconsistent.
>
>In case there is any actual bug on our side, rather than just memory
>corruption by the application, can you fill us in on any additional
>details, especially whether the process is multithreaded? Knowing that
>would determine what sorts of further investigation might or might not
>be useful.

ARM926 is a single-core processor, and the application is multi-threaded.

I am sorry for sending you several faulty mails. Perhaps the file I attached was too large.


* Re: Re: SIGSEGV and SIGILL at malloc/free on ARM926
  2018-06-06  1:35 SIGSEGV and SIGILL at malloc/free on ARM926 徐露
@ 2018-06-06  2:27 ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2018-06-06  2:27 UTC
  To: musl

On Wed, Jun 06, 2018 at 09:35:06AM +0800, 徐露 wrote:
> ARM926 is a single-core processor, and the application is multi-threaded.
> 
> I am sorry for sending you several faulty mails. Perhaps the file I attached was too large.

Yes, the list seems to have rejected the large attachment. I got the
off-list cc's. I can't follow up right now but I'll see if I can tell
later if it looks like it could be a bug on our side.

Rich



* Re: SIGSEGV and SIGILL at malloc/free on ARM926
  2018-06-04  8:45 徐露
@ 2018-06-04  9:41 ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2018-06-04  9:41 UTC
  To: musl

On Mon, Jun 04, 2018 at 04:45:28PM +0800, 徐露 wrote:
> Hi all,
> 
>  I am using the OpenWrt project, and the version of musl libc is 1.1.16.
>  I have been experiencing random crashes when running a customer's application.
>  From the coredump files, the segfault looks like a memory corruption issue. But when I added logging around malloc and free, the issue no longer occurred.
>  After analyzing several coredump files, I found that the lowest bit of csize in the chunk seems to have been changed from 1 to 0.
>  This is very strange, and I don't have many ideas on how to proceed.
>  Could you please give us some pointers? Thanks! I can supply more details as needed.

Looks like classic double-free.

Rich



* SIGSEGV and SIGILL at malloc/free on ARM926
@ 2018-06-04  8:45 徐露
  2018-06-04  9:41 ` Rich Felker
  0 siblings, 1 reply; 4+ messages in thread
From: 徐露 @ 2018-06-04  8:45 UTC
  To: musl


Hi all,

 I am using the OpenWrt project, and the version of musl libc is 1.1.16.
 I have been experiencing random crashes when running a customer's application.
 From the coredump files, the segfault looks like a memory corruption issue. But when I added logging around malloc and free, the issue no longer occurred.
 After analyzing several coredump files, I found that the lowest bit of csize in the chunk seems to have been changed from 1 to 0.
 This is very strange, and I don't have many ideas on how to proceed.
 Could you please give us some pointers? Thanks! I can supply more details as needed.

 These are the unbin and free functions from src/malloc/malloc.c.
 224 static void unbin(struct chunk *c, int i)
 225 {
 226         if (c->prev == c->next)
 227                 a_and_64(&mal.binmap, ~(1ULL<<i));
 228         c->prev->next = c->next;
 229         c->next->prev = c->prev;
 230         c->csize |= C_INUSE;
 231         NEXT_CHUNK(c)->psize |= C_INUSE;
 232 }

 450 void free(void *p)
 451 {
 452         struct chunk *self = MEM_TO_CHUNK(p);
 453         struct chunk *next;
 454         size_t final_size, new_size, size;
 455         int reclaim=0;
 456         int i;
 457 
 458         if (!p) return;
 459 
 460         if (IS_MMAPPED(self)) {
 461                 size_t extra = self->psize;
 462                 char *base = (char *)self - extra;
 463                 size_t len = CHUNK_SIZE(self) + extra;
 464                 /* Crash on double free */
 465                 if (extra & 1) a_crash();
 466                 __munmap(base, len);
 467                 return;
 468         }
  ......
 531 }
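
 To make the role of the low bit of csize concrete, here is a minimal sketch of which branch of the free() excerpt above a given header would take. It is not musl's code: it assumes that IS_MMAPPED() simply tests for a clear C_INUSE bit in csize (the macro definition is not part of the excerpt), and struct hdr, enum free_path and classify_free() are hypothetical names. uint32_t stands in for the 32-bit size_t of the ARM926 target.

```c
#include <stdint.h>

#define C_INUSE 1u  /* low bit of psize/csize, as used by unbin() above */

/* Hypothetical 32-bit view of the two size words in front of p. */
struct hdr {
    uint32_t psize;
    uint32_t csize;
};

enum free_path {
    PATH_HEAP,    /* C_INUSE set in csize: ordinary heap chunk, continues past line 468 */
    PATH_MUNMAP,  /* treated as mmapped, psize even: __munmap() at line 466 */
    PATH_CRASH    /* treated as mmapped, psize odd: a_crash() at line 465 */
};

/* Assumption: IS_MMAPPED(c) is equivalent to !(c->csize & C_INUSE). */
static enum free_path classify_free(struct hdr h)
{
    if (h.csize & C_INUSE)
        return PATH_HEAP;
    if (h.psize & 1)
        return PATH_CRASH;
    return PATH_MUNMAP;
}
```

 With psize = 0x11 and csize = 0x30 (the values seen in case 3 below), classify_free() returns PATH_CRASH, whereas csize = 0x31 would give PATH_HEAP.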

 Here are some backtraces and memory dumps I got.

 1) In this case, the coredump shows the crash at the c->prev->next access (malloc.c:228). But judging from the surrounding memory, the csize of chunk 0x1d12608 should be 0x31, not 0x30.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x0045e20c in unbin (c=c@entry=0x1d12608, i=i@entry=2) at src/malloc/malloc.c:228
 228 src/malloc/malloc.c: No such file or directory.
 (gdb) bt
 #0  0x0045e20c in unbin (c=c@entry=0x1d12608, i=i@entry=2) at src/malloc/malloc.c:228
 #1  0x0045e34c in alloc_fwd (c=c@entry=0x1d12608) at src/malloc/malloc.c:242
 #2  0x0045e48c in free (p=<optimized out>) at src/malloc/malloc.c:497
 #3  0x003d77e4 in lh_table_free ()
 #4  0x003d05ac in json_object_object_delete ()
 #5  0x003cff00 in json_object_put ()
 #6  0x003d0580 in json_object_lh_entry_free ()
 #7  0x003d77b4 in lh_table_free ()
 #8  0x003d05ac in json_object_object_delete ()
 #9  0x003cff00 in json_object_put ()
 #10 0x00248b98 in msgHandleLoop(void*) ()
 #11 0x00471a40 in start (p=0xb687cd34) at src/thread/pthread_create.c:145

 (gdb) x/64wa 0x1d125a8
 0x1d125a8: 0x1d125c0 0x0 0x0 0x0
 0x1d125b8: 0x31 0x41 0x10 0x1
 0x1d125c8: 0x0 0x0 0x2 0x1
 0x1d125d8: 0x0 0x0 0x1d6aaa0 0x1d6aaa0
 0x1d125e8: 0x1d6aa70 0x3d0550 <json_object_lh_entry_free> 0x3d7410 <lh_char_hash> 0x3d7484 <lh_char_equal>
 0x1d125f8: 0x41 0x11 0x60ed00 <mal+8> 0x60ed00 <mal+8>
 0x1d12608: 0x11 0x30 0x4 0x3d058c <json_object_object_delete>
 0x1d12618: 0x3d02f0 <json_object_object_to_json_string> 0x1 0x0 0x0
 0x1d12628: 0x1d12c60 0x0 0x0 0x0
 0x1d12638: 0x31 0x11 0x1d125f8 0x1d12610
 0x1d12648: 0x11 0x21 0x646e6576 0x735f726f
 0x1d12658: 0x75746174 0x3d0073 <json_object_set_serializer+22> 0x3d7410 <lh_char_hash> 0x3d7484 <lh_char_equal>
 0x1d12668: 0x21 0x20 0x60ed10 <mal+24> 0x1d13318
 0x1d12678: 0x736d65 0x0 0x0 0x0
 0x1d12688: 0x20 0x21 0x1cc9f20 0x1d07050
 0x1d12698: 0x1d12640 0x0 0x0 0x0
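
 One possible reading of this dump, written as a small sketch rather than a confirmed diagnosis: if alloc_fwd() merges forward whenever the following chunk's csize has C_INUSE clear, then the words at 0x1d12608 are taken for a free-list node although they appear to hold live json-c data. The field order (next before prev) is an assumption about struct chunk, and chunk32, looks_free() and unbin_first_store() are hypothetical helpers. Addresses are kept as integers so nothing is dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

#define C_INUSE 1u

/* Hypothetical 32-bit image of a chunk header, one x/wa row per chunk.
 * Assumed field order: psize, csize, next, prev. */
struct chunk32 {
    uint32_t psize, csize;
    uint32_t next, prev;
};

/* Would a forward merge consider this chunk free? */
static int looks_free(struct chunk32 c)
{
    return !(c.csize & C_INUSE);
}

/* Address written by "c->prev->next = c->next" (malloc.c:228):
 * the next field inside whatever c->prev points at. */
static uint32_t unbin_first_store(struct chunk32 c)
{
    return c.prev + (uint32_t)offsetof(struct chunk32, next);
}
```

 For the row at 0x1d12608 (0x11 0x30 0x4 0x3d058c), looks_free() is true and the store would go to 0x3d0594, which lies inside json_object_object_delete. That would be consistent with the SIGSEGV at line 228. With the opposite field order the store would target a near-null address and would presumably fault as well.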

 2) In this case, the coredump again shows the crash at the c->prev->next access (malloc.c:228). But judging from the surrounding memory, the csize of chunk 0x1127518 should be 0x21, not 0x20.
 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x0045e20c in unbin (c=c@entry=0x1127518, i=i@entry=1) at src/malloc/malloc.c:228
 228 src/malloc/malloc.c: No such file or directory.
 (gdb) bt
 #0  0x0045e20c in unbin (c=c@entry=0x1127518, i=i@entry=1) at src/malloc/malloc.c:228
 #1  0x0045e34c in alloc_fwd (c=c@entry=0x1127518) at src/malloc/malloc.c:242
 #2  0x0045e48c in free (p=<optimized out>) at src/malloc/malloc.c:497
 #3  0x00051cd0 in hook_free ()
 #4  0x00045388 in cJSON_Delete ()
 #5  0x000453e0 in cJSON_Delete ()
 #6  0x000453e0 in cJSON_Delete ()
 #7  0x000453e0 in cJSON_Delete ()
 #8  0x000453e0 in cJSON_Delete ()
 #9  0x000453e0 in cJSON_Delete ()
 #10 0x00053258 in _vendor_stub_json_handle_status ()
 #11 0x0005376c in _vendor_stub_json_handle_battery_status_cloud ()
 #12 0x00054c74 in _vendor_stub_json_recv_thread_parse_json ()
 #13 0x00054e20 in _vendor_stub_json_recv_thread ()
 #14 0x00471a40 in start (p=0xb6bdad34) at src/thread/pthread_create.c:145

 (gdb) x/64wa 0x11274e8
 0x11274e8: 0x11 0x21 0x756e694c 0x600078
 0x11274f8: 0x10f5700 0x5 0x0 0x0
 0x1127508: 0x21 0x11 0x63726570 0x746e65
 0x1127518: 0x11 0x20 0x706f7270 0x79747265
 0x1127528: 0x0 0x3ff00000 0x1127540 0x0
 0x1127538: 0x21 0x41 0x10d3e30 0x10dce40
 0x1127548: 0x0 0x3 0x0 0x5c
 0x1127558: 0x0 0x40570000 0x10d3e20 0x0
 0x1127568: 0x0 0x0 0x10dce00 0x0
 0x1127578: 0x41 0x31 0x10f57d0 0x13c
 0x1127588: 0x1 0x36d060 <av_buffer_default_free> 0x0 0x2
 0x1127598: 0x0 0x36d060 <av_buffer_default_free> 0x0 0x2
 0x11275a8: 0x31 0x991 0xf38ffbca 0xe404eab5
 0x11275b8: 0xe087e0fe 0xe246e0df 0xedf1e714 0xfa4af4a6
 0x11275c8: 0x4ffff58 0xcc40a13 0xd730d15 0xeb30ed1
 0x11275d8: 0xa450c55 0x87c0975 0x24a05eb 0xfe1cffbb


 3) In this case, judging from the surrounding memory, the csize of chunk 0x7b81d8 should be 0x31, not 0x30.
 Program terminated with signal SIGILL, Illegal instruction.
 #0  0x0045e320 in a_crash () at src/malloc/malloc.c:465
 465 src/malloc/malloc.c: No such file or directory.
 (gdb) bt
 #0  0x0045e320 in a_crash () at src/malloc/malloc.c:465
 #1  free (p=0x7b81e0) at src/malloc/malloc.c:465
 #2  0x003cfeb8 in json_object_generic_delete ()
 #3  0x003d052c in json_object_object_delete ()
 #4  0x003cfe78 in json_object_put ()
 #5  0x003d04f8 in json_object_lh_entry_free ()
 #6  0x003d772c in lh_table_free ()
 #7  0x003d0524 in json_object_object_delete ()
 #8  0x003cfe78 in json_object_put ()
 #9  0x0024aa1c in ServerAdapter::sendTextMsg(char const*, char const*, int, json_object*) ()
 #10 0x00240658 in ServerEvent::incrementSync(json_object*) ()
 #11 0x002476e0 in HostAdapter::onHandleNotifyMsg(json_object*) ()
 #12 0x00248c90 in msgHandleLoop(void*) ()
 #13 0x004719b8 in start (p=0xb679ed34) at src/thread/pthread_create.c:145

 (gdb) x/64wa 0x7b81a8
 0x7b81a8:       0x3d1280 <json_object_string_to_json_string>    0x0     0x0     0x0
 0x7b81b8:       0x7b97e0        0x3     0x0     0x0
 0x7b81c8:       0x140   0x11    0x61726170      0x736d     params
 0x7b81d8:       0x11    0x30    0x4     0x3d0504 <json_object_object_delete>
 0x7b81e8:       0x3d0268 <json_object_object_to_json_string>    0x0     0x0     0x0
 0x7b81f8:       0x7b8210        0x0     0x0     0x0
 0x7b8208:       0x31    0x40    0x60ed30 <mal+56>       0x60ed30 <mal+56>
 0x7b8218:       0x0     0x0     0x1     0x1
 0x7b8228:       0x0     0x0     0x7b7910        0x7b7910
 0x7b8238:       0x7b78a0        0x3d04c8 <json_object_lh_entry_free>    0x3d7388 <lh_char_hash> 0x3d73fc <lh_char_equal>
 0x7b8248:       0x40    0x21    0x6c796170      0x64616f <ff_cos_65536+90991>
 0x7b8258:       0x736d65        0x0     0xffffffff      0x0
 0x7b8268:       0x21    0x100   0x60edf0 <mal+248>      0x60edf0 <mal+248>
 0x7b8278:       0x3d1574 <json_object_array_to_json_string>     0x0     0x0     0x0
 0x7b8288:       0x7b82a0        0x0     0x0     0x0
 0x7b8298:       0x31    0xd1    0x60edc0 <mal+200>      0x60edc0 <mal+200>


Best regards!
————————————————————— 
徐 露
Allwinner Technology, Business Unit 1
 MOBI:+86 13425063650
 ADDR: No. 9, Keji 2nd Road, Tangjiawan Town, High-tech Zone, Zhuhai, Guangdong Province
 MAIL:xulu@allwinnertech.com


