mailing list of musl libc
 help / color / mirror / Atom feed
* [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
@ 2020-08-11 18:11 Alexander Monakov
  2020-08-11 18:45 ` Rich Felker
  2020-08-11 23:10 ` [musl] [PATCH] setjmp: optimize x86-64 longjmp prologues Alexander Monakov
  0 siblings, 2 replies; 9+ messages in thread
From: Alexander Monakov @ 2020-08-11 18:11 UTC (permalink / raw)
  To: musl

longjmp 'val' argument is an int, but the assembly is referencing 64-bit
registers as if the argument was a long, or the caller was responsible
for extending the argument. Though the psABI is not clear on this, the
interpretation in GCC is that high bits may be arbitrary and the callee
is responsible for sign/zero-extending the value as needed (likewise for
return values: callers must anticipate that high bits may be garbage).

Therefore testing %rax is a functional bug: setjmp would wrongly return
zero if longjmp was called with val==0, but high bits of %rsi happened
to be non-zero.

Rewrite the prologue to refer to 32-bit registers. In passing, change
'test' to use %rsi, as there's no advantage to using %rax and the new
form is cheaper on processors that do not perform move elimination.
---
 src/setjmp/x32/longjmp.s    | 6 +++---
 src/setjmp/x86_64/longjmp.s | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/setjmp/x32/longjmp.s b/src/setjmp/x32/longjmp.s
index e175a4b9..e709acad 100644
--- a/src/setjmp/x32/longjmp.s
+++ b/src/setjmp/x32/longjmp.s
@@ -5,10 +5,10 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %rsi,%rax           /* val will be longjmp return */
-	test %rax,%rax
+	mov %esi,%eax           /* val will be longjmp return */
+	test %esi,%esi
 	jnz 1f
-	inc %rax                /* if val==0, val=1 per longjmp semantics */
+	inc %eax                /* if val==0, val=1 per longjmp semantics */
 1:
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
diff --git a/src/setjmp/x86_64/longjmp.s b/src/setjmp/x86_64/longjmp.s
index e175a4b9..e709acad 100644
--- a/src/setjmp/x86_64/longjmp.s
+++ b/src/setjmp/x86_64/longjmp.s
@@ -5,10 +5,10 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %rsi,%rax           /* val will be longjmp return */
-	test %rax,%rax
+	mov %esi,%eax           /* val will be longjmp return */
+	test %esi,%esi
 	jnz 1f
-	inc %rax                /* if val==0, val=1 per longjmp semantics */
+	inc %eax                /* if val==0, val=1 per longjmp semantics */
 1:
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
-- 
2.11.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-11 18:11 [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment Alexander Monakov
@ 2020-08-11 18:45 ` Rich Felker
  2020-08-12 13:29   ` Alexander Monakov
  2020-08-11 23:10 ` [musl] [PATCH] setjmp: optimize x86-64 longjmp prologues Alexander Monakov
  1 sibling, 1 reply; 9+ messages in thread
From: Rich Felker @ 2020-08-11 18:45 UTC (permalink / raw)
  To: musl

On Tue, Aug 11, 2020 at 09:11:14PM +0300, Alexander Monakov wrote:
> longjmp 'val' argument is an int, but the assembly is referencing 64-bit
> registers as if the argument was a long, or the caller was responsible
> for extending the argument. Though the psABI is not clear on this, the
> interpretation in GCC is that high bits may be arbitrary and the callee
> is responsible for sign/zero-extending the value as needed (likewise for
> return values: callers must anticipate that high bits may be garbage).
> 
> Therefore testing %rax is a functional bug: setjmp would wrongly return
> zero if longjmp was called with val==0, but high bits of %rsi happened
> to be non-zero.
> 
> Rewrite the prologue to refer to 32-bit registers. In passing, change
> 'test' to use %rsi, as there's no advantage to using %rax and the new
> form is cheaper on processors that do not perform move elimination.
> ---
>  src/setjmp/x32/longjmp.s    | 6 +++---
>  src/setjmp/x86_64/longjmp.s | 6 +++---
>  2 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/src/setjmp/x32/longjmp.s b/src/setjmp/x32/longjmp.s
> index e175a4b9..e709acad 100644
> --- a/src/setjmp/x32/longjmp.s
> +++ b/src/setjmp/x32/longjmp.s
> @@ -5,10 +5,10 @@
>  .type longjmp,@function
>  _longjmp:
>  longjmp:
> -	mov %rsi,%rax           /* val will be longjmp return */
> -	test %rax,%rax
> +	mov %esi,%eax           /* val will be longjmp return */
> +	test %esi,%esi
>  	jnz 1f
> -	inc %rax                /* if val==0, val=1 per longjmp semantics */
> +	inc %eax                /* if val==0, val=1 per longjmp semantics */
>  1:
>  	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
>  	mov 8(%rdi),%rbp
> diff --git a/src/setjmp/x86_64/longjmp.s b/src/setjmp/x86_64/longjmp.s
> index e175a4b9..e709acad 100644
> --- a/src/setjmp/x86_64/longjmp.s
> +++ b/src/setjmp/x86_64/longjmp.s
> @@ -5,10 +5,10 @@
>  .type longjmp,@function
>  _longjmp:
>  longjmp:
> -	mov %rsi,%rax           /* val will be longjmp return */
> -	test %rax,%rax
> +	mov %esi,%eax           /* val will be longjmp return */
> +	test %esi,%esi
>  	jnz 1f
> -	inc %rax                /* if val==0, val=1 per longjmp semantics */
> +	inc %eax                /* if val==0, val=1 per longjmp semantics */
>  1:
>  	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
>  	mov 8(%rdi),%rbp
> -- 
> 2.11.0

Thanks! This whole series looks good and I'm applying it now.

Re: optimizing the tails, I wonder why they were ever written the way
they were to begin with. At least the i386 one was written by me so
there's nobody else to ask, but do you have any insight into what I
might have been thinking?

Rich

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [musl] [PATCH] setjmp: optimize x86-64 longjmp prologues
  2020-08-11 18:11 [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment Alexander Monakov
  2020-08-11 18:45 ` Rich Felker
@ 2020-08-11 23:10 ` Alexander Monakov
  2020-08-12 11:34   ` [musl] [PATCH 4/3] setjmp: optimize " Alexander Monakov
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Monakov @ 2020-08-11 23:10 UTC (permalink / raw)
  To: musl

Use a branchless sequence that is one byte shorter as well.
---
 src/setjmp/x32/longjmp.s    | 9 ++++-----
 src/setjmp/x86_64/longjmp.s | 9 ++++-----
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/setjmp/x32/longjmp.s b/src/setjmp/x32/longjmp.s
index bb88afa1..c8ad86fa 100644
--- a/src/setjmp/x32/longjmp.s
+++ b/src/setjmp/x32/longjmp.s
@@ -5,11 +5,10 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %esi,%eax           /* val will be longjmp return */
-	test %esi,%esi
-	jnz 1f
-	inc %eax                /* if val==0, val=1 per longjmp semantics */
-1:
+	xor %eax,%eax
+	cmp %esi,%eax           /* CF==0 iff val==0 */
+	cmc                     /* CF = !val        */
+	adc %esi,%eax           /* eax = val + !val */
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
 	mov 16(%rdi),%r12
diff --git a/src/setjmp/x86_64/longjmp.s b/src/setjmp/x86_64/longjmp.s
index bb88afa1..c8ad86fa 100644
--- a/src/setjmp/x86_64/longjmp.s
+++ b/src/setjmp/x86_64/longjmp.s
@@ -5,11 +5,10 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %esi,%eax           /* val will be longjmp return */
-	test %esi,%esi
-	jnz 1f
-	inc %eax                /* if val==0, val=1 per longjmp semantics */
-1:
+	xor %eax,%eax
+	cmp %esi,%eax           /* CF==0 iff val==0 */
+	cmc                     /* CF = !val        */
+	adc %esi,%eax           /* eax = val + !val */
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
 	mov 16(%rdi),%r12
-- 
2.11.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [musl] [PATCH 4/3] setjmp: optimize longjmp prologues
  2020-08-11 23:10 ` [musl] [PATCH] setjmp: optimize x86-64 longjmp prologues Alexander Monakov
@ 2020-08-12 11:34   ` Alexander Monakov
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Monakov @ 2020-08-12 11:34 UTC (permalink / raw)
  To: musl

Use a branchless sequence that is one byte shorter on 64-bit, same size
on 32-bit. Thanks to Pete Cawley for suggesting this variant.
---

I'm sending a revised variant after Pete Cawley (@corsix) suggested a
preferable variant on Twitter. A similar cmp-adc combo can be used to
replace the branchy sequence in i386 longjmp code.

 src/setjmp/i386/longjmp.s   | 6 ++----
 src/setjmp/x32/longjmp.s    | 8 +++-----
 src/setjmp/x86_64/longjmp.s | 8 +++-----
 3 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/src/setjmp/i386/longjmp.s b/src/setjmp/i386/longjmp.s
index b429f135..8188f06b 100644
--- a/src/setjmp/i386/longjmp.s
+++ b/src/setjmp/i386/longjmp.s
@@ -6,10 +6,8 @@ _longjmp:
 longjmp:
 	mov  4(%esp),%edx
 	mov  8(%esp),%eax
-	test    %eax,%eax
-	jnz 1f
-	inc     %eax
-1:
+	cmp       $1,%eax
+	adc       $0, %al
 	mov   (%edx),%ebx
 	mov  4(%edx),%esi
 	mov  8(%edx),%edi
diff --git a/src/setjmp/x32/longjmp.s b/src/setjmp/x32/longjmp.s
index bb88afa1..1b2661c3 100644
--- a/src/setjmp/x32/longjmp.s
+++ b/src/setjmp/x32/longjmp.s
@@ -5,11 +5,9 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %esi,%eax           /* val will be longjmp return */
-	test %esi,%esi
-	jnz 1f
-	inc %eax                /* if val==0, val=1 per longjmp semantics */
-1:
+	xor %eax,%eax
+	cmp $1,%esi             /* CF = val ? 0 : 1 */
+	adc %esi,%eax           /* eax = val + !val */
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
 	mov 16(%rdi),%r12
diff --git a/src/setjmp/x86_64/longjmp.s b/src/setjmp/x86_64/longjmp.s
index bb88afa1..1b2661c3 100644
--- a/src/setjmp/x86_64/longjmp.s
+++ b/src/setjmp/x86_64/longjmp.s
@@ -5,11 +5,9 @@
 .type longjmp,@function
 _longjmp:
 longjmp:
-	mov %esi,%eax           /* val will be longjmp return */
-	test %esi,%esi
-	jnz 1f
-	inc %eax                /* if val==0, val=1 per longjmp semantics */
-1:
+	xor %eax,%eax
+	cmp $1,%esi             /* CF = val ? 0 : 1 */
+	adc %esi,%eax           /* eax = val + !val */
 	mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
 	mov 8(%rdi),%rbp
 	mov 16(%rdi),%r12
-- 
2.11.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-11 18:45 ` Rich Felker
@ 2020-08-12 13:29   ` Alexander Monakov
  2020-08-12 21:44     ` Szabolcs Nagy
  2020-08-13  8:01     ` Jens Gustedt
  0 siblings, 2 replies; 9+ messages in thread
From: Alexander Monakov @ 2020-08-12 13:29 UTC (permalink / raw)
  To: musl

On Tue, 11 Aug 2020, Rich Felker wrote:

> Thanks! This whole series looks good and I'm applying it now.
> 
> Re: optimizing the tails, I wonder why they were ever written the way
> they were to begin with. At least the i386 one was written by me so
> there's nobody else to ask, but do you have any insight into what I
> might have been thinking?

No idea.

By the way, you should check if other 64-bit ports in musl exhibit
the same issue. The AArch64 port definitely does, for example.

Here's a standalone testcase, needs -O2 so test_lj passes its second
argument unchanged:

#include <setjmp.h>

static void test_lj(jmp_buf jb, long lv)
{
        longjmp(jb, lv);
}

int main()
{
        void (*volatile ptest)(jmp_buf, long) = 0;
        jmp_buf jb;

        int v = setjmp(jb);
        ptest = ptest ? 0 : test_lj;
        if (ptest) ptest(jb, 1l<<32);
        return !v;
}


Alexander

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-12 13:29   ` Alexander Monakov
@ 2020-08-12 21:44     ` Szabolcs Nagy
  2020-08-12 22:13       ` Szabolcs Nagy
  2020-08-13  8:01     ` Jens Gustedt
  1 sibling, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2020-08-12 21:44 UTC (permalink / raw)
  To: Alexander Monakov; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 734 bytes --]

* Alexander Monakov <amonakov@ispras.ru> [2020-08-12 16:29:59 +0300]:
> By the way, you should check if other 64-bit ports in musl exhibit
> the same issue. The AArch64 port definitely does, for example.
> 
> Here's a standalone testcase, needs -O2 so test_lj passes its second
> argument unchanged:

yes it seems wrong, see attached patch (with that the test passes).
thanks.

> 
> #include <setjmp.h>
> 
> static void test_lj(jmp_buf jb, long lv)
> {
>         longjmp(jb, lv);
> }
> 
> int main()
> {
>         void (*volatile ptest)(jmp_buf, long) = 0;
>         jmp_buf jb;
> 
>         int v = setjmp(jb);
>         ptest = ptest ? 0 : test_lj;
>         if (ptest) ptest(jb, 1l<<32);
>         return !v;
> }
> 
> 
> Alexander

[-- Attachment #2: 0001-aarch64-fix-setjmp-return-value.patch --]
[-- Type: text/x-diff, Size: 821 bytes --]

From c1b4a33e7cb1f4df0cc5f8237930ff81786cfbde Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <nsz@port70.net>
Date: Wed, 12 Aug 2020 21:00:26 +0000
Subject: [PATCH] aarch64: fix setjmp return value

longjmp should set the return value of setjmp, but 64bit
registers were used for the 0 check while the type is int.

use the code that gcc generates for return val ? val : 1;
---
 src/setjmp/aarch64/longjmp.s | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/setjmp/aarch64/longjmp.s b/src/setjmp/aarch64/longjmp.s
index 7c4655fa..b22042a2 100644
--- a/src/setjmp/aarch64/longjmp.s
+++ b/src/setjmp/aarch64/longjmp.s
@@ -18,7 +18,6 @@ longjmp:
 	ldp d12, d13, [x0,#144]
 	ldp d14, d15, [x0,#160]
 
-	mov x0, x1
-	cbnz x1, 1f
-	mov x0, #1
+	cmp w1, 0
+	csinc w0, w1, wzr, ne
 1:	br x30
-- 
2.28.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-12 21:44     ` Szabolcs Nagy
@ 2020-08-12 22:13       ` Szabolcs Nagy
  0 siblings, 0 replies; 9+ messages in thread
From: Szabolcs Nagy @ 2020-08-12 22:13 UTC (permalink / raw)
  To: Alexander Monakov, musl

[-- Attachment #1: Type: text/plain, Size: 477 bytes --]

* Szabolcs Nagy <nsz@port70.net> [2020-08-12 23:44:05 +0200]:
> diff --git a/src/setjmp/aarch64/longjmp.s b/src/setjmp/aarch64/longjmp.s
> index 7c4655fa..b22042a2 100644
> --- a/src/setjmp/aarch64/longjmp.s
> +++ b/src/setjmp/aarch64/longjmp.s
> @@ -18,7 +18,6 @@ longjmp:
>  	ldp d12, d13, [x0,#144]
>  	ldp d14, d15, [x0,#160]
>  
> -	mov x0, x1
> -	cbnz x1, 1f
> -	mov x0, #1
> +	cmp w1, 0
> +	csinc w0, w1, wzr, ne
>  1:	br x30

v2 because the 1: label is no longer used


[-- Attachment #2: 0001-aarch64-fix-setjmp-return-value.patch --]
[-- Type: text/x-diff, Size: 835 bytes --]

From 5a842cde8974b6927fd7199e713f89852fc467cd Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy <nsz@port70.net>
Date: Wed, 12 Aug 2020 21:00:26 +0000
Subject: [PATCH v2] aarch64: fix setjmp return value

longjmp should set the return value of setjmp, but 64bit
registers were used for the 0 check while the type is int.

use the code that gcc generates for return val ? val : 1;
---
 src/setjmp/aarch64/longjmp.s | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/setjmp/aarch64/longjmp.s b/src/setjmp/aarch64/longjmp.s
index 7c4655fa..0af9c50e 100644
--- a/src/setjmp/aarch64/longjmp.s
+++ b/src/setjmp/aarch64/longjmp.s
@@ -18,7 +18,6 @@ longjmp:
 	ldp d12, d13, [x0,#144]
 	ldp d14, d15, [x0,#160]
 
-	mov x0, x1
-	cbnz x1, 1f
-	mov x0, #1
-1:	br x30
+	cmp w1, 0
+	csinc w0, w1, wzr, ne
+	br x30
-- 
2.28.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-12 13:29   ` Alexander Monakov
  2020-08-12 21:44     ` Szabolcs Nagy
@ 2020-08-13  8:01     ` Jens Gustedt
  2020-08-13  8:53       ` Alexander Monakov
  1 sibling, 1 reply; 9+ messages in thread
From: Jens Gustedt @ 2020-08-13  8:01 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 663 bytes --]

Alexander,
probably this has nothing to do with the issue at hand but 

on Wed, 12 Aug 2020 16:29:59 +0300 (MSK) you (Alexander Monakov
<amonakov@ispras.ru>) wrote:

>         int v = setjmp(jb);

is a use of `setjmp` that is not conforming. `setjmp` is only allowed
either standalone or as controlling expression either directly or
negated.

Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment
  2020-08-13  8:01     ` Jens Gustedt
@ 2020-08-13  8:53       ` Alexander Monakov
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Monakov @ 2020-08-13  8:53 UTC (permalink / raw)
  To: musl



On Thu, 13 Aug 2020, Jens Gustedt wrote:

> Alexander,
> probably this has nothing to do with the issue at hand but 
> 
> on Wed, 12 Aug 2020 16:29:59 +0300 (MSK) you (Alexander Monakov
> <amonakov@ispras.ru>) wrote:
> 
> >         int v = setjmp(jb);
> 
> is a use of `setjmp` that is not conforming. `setjmp` is only allowed
> either standalone or as controlling expression either directly or
> negated.

Thanks. Yeah, this is a moot point as the test needs GCC with -O anyway, and
GCC allows setjmp in any context. But nevertheless it's easy to adjust it
by changing the offending line to 'if (setjmp(jb)) return 0;' and changing
the final 'return !v'  to 'return 1'.

Alexander

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-08-13  8:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-11 18:11 [musl] [PATCH 1/3] setjmp: fix x86-64 longjmp argument adjustment Alexander Monakov
2020-08-11 18:45 ` Rich Felker
2020-08-12 13:29   ` Alexander Monakov
2020-08-12 21:44     ` Szabolcs Nagy
2020-08-12 22:13       ` Szabolcs Nagy
2020-08-13  8:01     ` Jens Gustedt
2020-08-13  8:53       ` Alexander Monakov
2020-08-11 23:10 ` [musl] [PATCH] setjmp: optimize x86-64 longjmp prologues Alexander Monakov
2020-08-12 11:34   ` [musl] [PATCH 4/3] setjmp: optimize " Alexander Monakov

mailing list of musl libc

This inbox may be cloned and mirrored by anyone:

	git clone --mirror http://inbox.vuxu.org/musl

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 musl musl/ http://inbox.vuxu.org/musl \
		musl@inbox.vuxu.org
	public-inbox-index musl

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.vuxu.org/vuxu.archive.musl


code repositories for the project(s) associated with this inbox:

	https://git.vuxu.org/mirror/musl/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git