Re: [PATCH] Add comments to i386 assembly source

From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH] Add comments to i386 assembly source
Date: Tue, 2 Jan 2018 14:49:09 -0500
Message-ID: <20180102194909.GM1627@brightrain.aerifal.cx>
In-Reply-To: <64245dca-3c6e-3918-701c-dcf3f8e00783@bitwagon.com>

On Mon, Jan 01, 2018 at 07:15:50PM -0800, John Reiser wrote:
> On 01/01/2018 13:49 UTC, Rich Felker wrote:
> >On Mon, Jan 01, 2018 at 02:57:02PM -0800, John Reiser wrote:
> >>There's a bug.  clone() is a user-level function that can be used
> >>independently of the musl internal implementation of threads.
> >>Thus when clone() in musl/src/linux/clone.c calls
> >>         return __syscall_ret(__clone(func, stack, flags, arg, ptid, tls, ctid));
> >>then the i386 implementation of __clone has no guarantee about
> >>the value in %gs, and it is a bug to assume that (%gs >> 3)
> >>fits in 8 bits.
> >
> >The ABI is that at function call or any time a signal could be
> >received, %gs must always be a valid segment register value reflecting
> >the current thread's thread pointer. If this is violated, the program
> >has undefined behavior.
> 
> More than one segment descriptor can designate the same subset
> of the linear address space.  Duplicate the segment descriptor
> to a target selector that is >= 256, and load %gs with the
> duplicate selector before calling clone().

It's not clear to me that such a substitution is valid; as far as I can
tell no explicit effort to ensure that it works is made, and it would
not happen without writing asm to do specifically that.
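
For reference, the (%gs >> 3) expression under discussion extracts the
descriptor-table index from an x86 segment selector. The following is a
minimal sketch in C of that decoding, not code from musl; the names
decode_selector and index_fits_in_8_bits are hypothetical, and the sample
selector values in the usage note are illustrative assumptions.

```c
#include <stdint.h>

/* Fields of a 16-bit x86 segment selector. */
struct selector_fields {
	unsigned index; /* bits 3..15: index into the descriptor table */
	unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
	unsigned rpl;   /* bits 0..1: requested privilege level */
};

/* Split a selector value into its three fields (hypothetical helper). */
static struct selector_fields decode_selector(uint16_t sel)
{
	struct selector_fields f;
	f.index = sel >> 3;
	f.ti = (sel >> 2) & 1;
	f.rpl = sel & 3;
	return f;
}

/* Nonzero if the descriptor index fits in 8 bits, which is the
 * property the quoted bug report says __clone relies on. */
static int index_fits_in_8_bits(uint16_t sel)
{
	return (sel >> 3) < 256;
}
```

With these helpers, a selector such as 0x33 decodes to index 6, GDT,
RPL 3, and the index fits in 8 bits. A selector such as 0x807 decodes to
index 256, LDT, RPL 3, and the index does not fit.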

> >>The code in musl/src/thread/i386/clone.s wastes up to 12 bytes
> >>when aligning the new stack, by aligning before [pre-]allocating
> >>space for the one argument to the thread function.
> >
> >I suspect the initial value happens to be aligned anyway in which case
> >reserving 16 bytes and aligning to 16 is the same as reserving 4 and
> >aligning to 16. If you think it's not, I don't mind changing if you
> >can do careful testing to make sure it doesn't introduce any bugs.
> 
> This is another bug!  Consider the valid code:
> 	void **lo_stack = malloc(5 * sizeof(void *));
> 	/* malloc() guarantees 16-byte alignment of lo_stack */
> 	clone(func, &lo_stack[5], ...);

You can't run code on a 20-byte stack. This is not a surprise. In
theory it might be possible if the callee is only asm, but you can't
make C function calls since each call frame will consume at least 16
bytes (return address and alignment). I also disagree with considering
it valid to assume clone invokes the provided callback function
directly with no intervening functions; this is incorrect on SH right
now since we use a C function to smooth over the difference between
plain and fdpic calling conventions. ARM will probably do the same
once fdpic for cortex-M is added.

> then __clone() does:
> 	and $-16,%ecx  /* &lo_stack[4] */
> 	sub $ 16,%ecx  /* &lo_stack[0] */
> 	  ...
> 	mov %ecx,%esp  /* new thread: implicit action of __NR_clone system call */
> 	call *%eax  /* OUT-OF-BOUNDS:  lo_stack[-1] = return address */
> 
> Thus, starting the thread function has scribbled outside the allocated area,
> even though the lo_stack[] array can accommodate the call by the code I showed:
> 	lea -NBPW(arg2),%ecx  /* &lo_stack[4] */
> 	and $-16,%ecx  /* still &lo_stack[4] */
> 	  ...
> 	mov %ecx,%esp  /* new thread: implicit action of __NR_clone system call */
> 	call *%eax  /* lo_stack[3] = return address */
> 
> The danger is not "new bugs", but rather revealing latent bugs that were
> obscured by the less-strict old code.  For instance, if the thread
> function actually has two formal parameters, or if it uses va_arg()
> to reference beyond the first actual argument, then running the optimal
> code is more likely to notice.

I agree with your analysis of what happens but I don't think it's
particularly interesting or a bug.
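
The arithmetic of the two quoted instruction sequences can be modeled in
C to make the comparison concrete. This is only a sketch under stated
assumptions: i386 addresses are modeled as uint32_t, the function names
are hypothetical, and the base address used in the usage note is made up
for illustration.

```c
#include <stdint.h>

#define NBPW 4u /* bytes per word on i386 */

/* Existing sequence as quoted: and $-16, then sub $16. */
static uint32_t sp_align_then_reserve(uint32_t top)
{
	return (top & ~15u) - 16u;
}

/* Proposed sequence as quoted: lea -NBPW, then and $-16. */
static uint32_t sp_reserve_then_align(uint32_t top)
{
	return (top - NBPW) & ~15u;
}

/* `call` pushes the return address into the word just below sp. */
static uint32_t return_addr_slot(uint32_t sp)
{
	return sp - NBPW;
}
```

Taking a 16-byte-aligned 20-byte block at the assumed address 0x1000,
the stack top passed to clone is 0x1014. The existing sequence yields
sp = 0x1000, so the return address lands at 0x0ffc, below the block. The
proposed sequence yields sp = 0x1010, so the return address lands at
0x100c, inside it. For a stack top that is already 16-byte aligned, such
as 0x2000, both sequences yield the same sp, 0x1ff0.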

Rich
