mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH] Wasm support patch 2 (static syscalls)
@ 2017-11-28 12:31 Nicholas Wilson
  2017-11-28 12:59 ` Szabolcs Nagy
  0 siblings, 1 reply; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 12:31 UTC (permalink / raw)
  To: musl

The next patch is more complicated.

For WebAssembly, I think we'll want to be able to "statically link" in syscalls.  Under the hood, Musl is basically a layer that translates calls to the libc API into calls to an underlying syscall API - so for Wasm we want to exploit that with a twist, by allowing the interpreter environment to provide the syscall definitions.

In order for linkage to work nicely, the syscalls need to be split out into their own functions. I'll give a worked example.

Imagine an application that calls "getpid()".  This will cause "getpid.o" to be linked in, to provide the C API:

pid_t getpid(void)
{
	return __syscall(SYS_getpid);
}

On all of Musl's existing archs, the syscalls are implemented via a seven generic "__syscallN" functions. The assumption is that the kernel provides all syscalls.

For Wasm, what I've done is made it so that the interpreter environment instead provides *named* syscall functions, in this case, a "__syscall_getpid" function. Then, at link-time, when the linker links against libc.a it's able to link in to the application only the syscalls that are actually used.

This keeps the changes to Musl down to an absolute minimum. We can compile all of Musl for the Wasm target, without exclusions, including pthreads, signals, I/O. All that's needed to make it link is for the application developer to provide an embedding environment that implements the named syscalls that the application uses.

It might be clearer when I post Patch #3 shortly, which contains my current stab at a minimal Wasm arch. This patch just contains the code changes needed to accommodate static syscall linkage.

There is one wrinkle I've had to work around - by linking in *named* syscalls, it means that syscalls can't be called by number. There are only three places where this is done in Musl currently (ie only three places where the argument to "__syscall" is not one of the named "SYS_XXX" constants).
1. The "syscall" function itself. We can simply ifdef this out, that's easy (src/misc/syscall.c)
2. The pthread_cancel() support does some fiddly stuff. For Wasm, I've just added an ifdef again so that this isn't provided. I think that's reasonably given how messy a feature it is generally, and application developers really can't expect Wasm to provide it, surely!
3. The generic "__setxid" function that wraps the "set(r)(e)(u|g)id" functions.  I've worked around this by tweaking it to use a simple table - a bit clumsy, but I hope no-one minds too much?

Feedback welcome, this is just really to start a dialogue about how the "static syscall" model could be used in Musl.

All the best,
Nick


diff --git a/src/internal/libc.h b/src/internal/libc.h
index 5e145183..6265062d 100644
--- a/src/internal/libc.h
+++ b/src/internal/libc.h
@@ -55,6 +55,10 @@ void __unlockfile(FILE *) ATTR_LIBC_VISIBILITY;
 #define UNLOCK(x) __unlock(x)
 
 void __synccall(void (*)(void *), void *);
+
+typedef enum __xid {
+	xid_resuid, xid_resgid, xid_reuid, xid_regid, xid_uid, xid_gid
+} xid_t;
 int __setxid(int, int, int, int);
 
 extern char **__environ;
diff --git a/src/internal/syscall.h b/src/internal/syscall.h
index 6d378a81..92650c45 100644
--- a/src/internal/syscall.h
+++ b/src/internal/syscall.h
@@ -26,7 +26,19 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 	__syscall_cp(syscall_arg_t, syscall_arg_t, syscall_arg_t, syscall_arg_t,
 	             syscall_arg_t, syscall_arg_t, syscall_arg_t);
 
-#ifdef SYSCALL_NO_INLINE
+#define __SYSCALL_CONCAT_X(a,b) a##b
+#define __SYSCALL_CONCAT(a,b) __SYSCALL_CONCAT_X(a,b)
+
+#ifdef SYSCALL_STATIC
+#define __syscall0(n) __SYSCALL_CONCAT(__,n)(0)
+#define __syscall1(n,a) __SYSCALL_CONCAT(__,n)(__scc(a))
+#define __syscall2(n,a,b) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b))
+#define __syscall3(n,a,b,c) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b),__scc(c))
+#define __syscall4(n,a,b,c,d) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b),__scc(c),__scc(d))
+#define __syscall5(n,a,b,c,d,e) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b),__scc(c),__scc(d),__scc(e))
+#define __syscall6(n,a,b,c,d,e,f) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#define __syscall7(n,a,b,c,d,e,f,g) __SYSCALL_CONCAT(__,n)(__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
+#elif defined SYSCALL_NO_INLINE
 #define __syscall0(n) (__syscall)(n)
 #define __syscall1(n,a) (__syscall)(n,__scc(a))
 #define __syscall2(n,a,b) (__syscall)(n,__scc(a),__scc(b))
@@ -34,20 +46,20 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define __syscall4(n,a,b,c,d) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d))
 #define __syscall5(n,a,b,c,d,e) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e))
 #define __syscall6(n,a,b,c,d,e,f) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#define __syscall7(n,a,b,c,d,e,f,g) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
 #else
+#define __syscall0(n) __syscall0(n)
 #define __syscall1(n,a) __syscall1(n,__scc(a))
 #define __syscall2(n,a,b) __syscall2(n,__scc(a),__scc(b))
 #define __syscall3(n,a,b,c) __syscall3(n,__scc(a),__scc(b),__scc(c))
 #define __syscall4(n,a,b,c,d) __syscall4(n,__scc(a),__scc(b),__scc(c),__scc(d))
 #define __syscall5(n,a,b,c,d,e) __syscall5(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e))
 #define __syscall6(n,a,b,c,d,e,f) __syscall6(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#define __syscall7(n,a,b,c,d,e,f,g) __syscall7(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
 #endif
-#define __syscall7(n,a,b,c,d,e,f,g) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
 
 #define __SYSCALL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
 #define __SYSCALL_NARGS(...) __SYSCALL_NARGS_X(__VA_ARGS__,7,6,5,4,3,2,1,0,)
-#define __SYSCALL_CONCAT_X(a,b) a##b
-#define __SYSCALL_CONCAT(a,b) __SYSCALL_CONCAT_X(a,b)
 #define __SYSCALL_DISP(b,...) __SYSCALL_CONCAT(b,__SYSCALL_NARGS(__VA_ARGS__))(__VA_ARGS__)
 
 #define __syscall(...) __SYSCALL_DISP(__syscall,__VA_ARGS__)
@@ -56,6 +68,18 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define socketcall __socketcall
 #define socketcall_cp __socketcall_cp
 
+#ifdef SYSCALL_STATIC
+// For archs that define SYSCALL_STATIC (wasm), we basically just don't allow
+// for pthread_cancel().  I don't expect wasm will ever allow for cancellable
+// waits so that's OK.
+#define __syscall_cp0(n) __syscall0(n)
+#define __syscall_cp1(n,a) __syscall1(n,a)
+#define __syscall_cp2(n,a,b) __syscall2(n,a,b)
+#define __syscall_cp3(n,a,b,c) __syscall3(n,a,b,c)
+#define __syscall_cp4(n,a,b,c,d) __syscall4(n,a,b,c,d)
+#define __syscall_cp5(n,a,b,c,d,e) __syscall5(n,a,b,c,d,e)
+#define __syscall_cp6(n,a,b,c,d,e,f) __syscall6(n,a,b,c,d,e,f)
+#else
 #define __syscall_cp0(n) (__syscall_cp)(n,0,0,0,0,0,0)
 #define __syscall_cp1(n,a) (__syscall_cp)(n,__scc(a),0,0,0,0,0)
 #define __syscall_cp2(n,a,b) (__syscall_cp)(n,__scc(a),__scc(b),0,0,0,0)
@@ -63,6 +87,7 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define __syscall_cp4(n,a,b,c,d) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),0,0)
 #define __syscall_cp5(n,a,b,c,d,e) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),0)
 #define __syscall_cp6(n,a,b,c,d,e,f) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#endif
 
 #define __syscall_cp(...) __SYSCALL_DISP(__syscall_cp,__VA_ARGS__)
 #define syscall_cp(...) __syscall_ret(__syscall_cp(__VA_ARGS__))
diff --git a/src/misc/syscall.c b/src/misc/syscall.c
index 9d435a97..013787a4 100644
--- a/src/misc/syscall.c
+++ b/src/misc/syscall.c
@@ -3,6 +3,7 @@
 
 #undef syscall
 
+#ifndef SYSCALL_STATIC
 long syscall(long n, ...)
 {
 	va_list ap;
@@ -17,3 +18,4 @@ long syscall(long n, ...)
 	va_end(ap);
 	return __syscall_ret(__syscall(n,a,b,c,d,e,f));
 }
+#endif
diff --git a/src/thread/pthread_cancel.c b/src/thread/pthread_cancel.c
index 3d229223..d14d96ed 100644
--- a/src/thread/pthread_cancel.c
+++ b/src/thread/pthread_cancel.c
@@ -20,6 +20,7 @@ long __syscall_cp_asm(volatile void *, syscall_arg_t,
                       syscall_arg_t, syscall_arg_t, syscall_arg_t,
                       syscall_arg_t, syscall_arg_t, syscall_arg_t);
 
+#ifndef SYSCALL_STATIC
 long __syscall_cp_c(syscall_arg_t nr,
                     syscall_arg_t u, syscall_arg_t v, syscall_arg_t w,
                     syscall_arg_t x, syscall_arg_t y, syscall_arg_t z)
@@ -38,6 +39,7 @@ long __syscall_cp_c(syscall_arg_t nr,
 		r = __cancel();
 	return r;
 }
+#endif
 
 static void _sigaddset(sigset_t *set, int sig)
 {
diff --git a/src/unistd/setegid.c b/src/unistd/setegid.c
index e6da2573..dc2702b6 100644
--- a/src/unistd/setegid.c
+++ b/src/unistd/setegid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
 #include "libc.h"
-#include "syscall.h"
 
 int setegid(gid_t egid)
 {
-	return __setxid(SYS_setresgid, -1, egid, -1);
+	return __setxid(xid_resgid, -1, egid, -1);
 }
diff --git a/src/unistd/seteuid.c b/src/unistd/seteuid.c
index ef8b9df4..d84d6186 100644
--- a/src/unistd/seteuid.c
+++ b/src/unistd/seteuid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int seteuid(uid_t euid)
 {
-	return __setxid(SYS_setresuid, -1, euid, -1);
+	return __setxid(xid_resuid, -1, euid, -1);
 }
diff --git a/src/unistd/setgid.c b/src/unistd/setgid.c
index bae4616a..88197ec1 100644
--- a/src/unistd/setgid.c
+++ b/src/unistd/setgid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setgid(gid_t gid)
 {
-	return __setxid(SYS_setgid, gid, 0, 0);
+	return __setxid(xid_gid, gid, 0, 0);
 }
diff --git a/src/unistd/setregid.c b/src/unistd/setregid.c
index f5a8972a..4729c31b 100644
--- a/src/unistd/setregid.c
+++ b/src/unistd/setregid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setregid(gid_t rgid, gid_t egid)
 {
-	return __setxid(SYS_setregid, rgid, egid, 0);
+	return __setxid(xid_regid, rgid, egid, 0);
 }
diff --git a/src/unistd/setresgid.c b/src/unistd/setresgid.c
index b9af540a..929bb72b 100644
--- a/src/unistd/setresgid.c
+++ b/src/unistd/setresgid.c
@@ -1,9 +1,8 @@
 #define _GNU_SOURCE
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setresgid(gid_t rgid, gid_t egid, gid_t sgid)
 {
-	return __setxid(SYS_setresgid, rgid, egid, sgid);
+	return __setxid(xid_resgid, rgid, egid, sgid);
 }
diff --git a/src/unistd/setresuid.c b/src/unistd/setresuid.c
index 83692b4c..74b6ac33 100644
--- a/src/unistd/setresuid.c
+++ b/src/unistd/setresuid.c
@@ -1,9 +1,8 @@
 #define _GNU_SOURCE
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setresuid(uid_t ruid, uid_t euid, uid_t suid)
 {
-	return __setxid(SYS_setresuid, ruid, euid, suid);
+	return __setxid(xid_resuid, ruid, euid, suid);
 }
diff --git a/src/unistd/setreuid.c b/src/unistd/setreuid.c
index 3fcc59e2..9b3efd6a 100644
--- a/src/unistd/setreuid.c
+++ b/src/unistd/setreuid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setreuid(uid_t ruid, uid_t euid)
 {
-	return __setxid(SYS_setreuid, ruid, euid, 0);
+	return __setxid(xid_reuid, ruid, euid, 0);
 }
diff --git a/src/unistd/setuid.c b/src/unistd/setuid.c
index 602ecbbf..de91d0c1 100644
--- a/src/unistd/setuid.c
+++ b/src/unistd/setuid.c
@@ -1,8 +1,7 @@
 #include <unistd.h>
-#include "syscall.h"
 #include "libc.h"
 
 int setuid(uid_t uid)
 {
-	return __setxid(SYS_setuid, uid, 0, 0);
+	return __setxid(xid_uid, uid, 0, 0);
 }
diff --git a/src/unistd/setxid.c b/src/unistd/setxid.c
index 0239f8af..6e11f507 100644
--- a/src/unistd/setxid.c
+++ b/src/unistd/setxid.c
@@ -6,14 +6,25 @@
 
 struct ctx {
 	int id, eid, sid;
-	int nr, err;
+	int xid, err;
 };
 
+static int setxid_syscall(int xid, int id, int eid, int sid)
+{
+	if (xid == xid_resuid) return __syscall(SYS_setresuid, id, eid, sid);
+	if (xid == xid_resgid) return __syscall(SYS_setresgid, id, eid, sid);
+	if (xid == xid_reuid) return __syscall(SYS_setreuid, id, eid);
+	if (xid == xid_regid) return __syscall(SYS_setregid, id, eid);
+	if (xid == xid_uid) return __syscall(SYS_setuid, id);
+	if (xid == xid_gid) return __syscall(SYS_setgid, id);
+	abort();
+}
+
 static void do_setxid(void *p)
 {
 	struct ctx *c = p;
 	if (c->err>0) return;
-	int ret = -__syscall(c->nr, c->id, c->eid, c->sid);
+	int ret = -setxid_syscall(c->xid, c->id, c->eid, c->sid);
 	if (ret && !c->err) {
 		/* If one thread fails to set ids after another has already
 		 * succeeded, forcibly killing the process is the only safe
@@ -25,11 +36,11 @@ static void do_setxid(void *p)
 	c->err = ret;
 }
 
-int __setxid(int nr, int id, int eid, int sid)
+int __setxid(int xid, int id, int eid, int sid)
 {
 	/* err is initially nonzero so that failure of the first thread does not
 	 * trigger the safety kill above. */
-	struct ctx c = { .nr = nr, .id = id, .eid = eid, .sid = sid, .err = -1 };
+	struct ctx c = { .xid = xid, .id = id, .eid = eid, .sid = sid, .err = -1 };
 	__synccall(do_setxid, &c);
 	if (c.err) {
 		if (c.err>0) errno = c.err;


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 12:31 [PATCH] Wasm support patch 2 (static syscalls) Nicholas Wilson
@ 2017-11-28 12:59 ` Szabolcs Nagy
  2017-11-28 13:23   ` Nicholas Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2017-11-28 12:59 UTC (permalink / raw)
  To: musl

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 12:31:18 +0000]:
> Imagine an application that calls "getpid()".  This will cause "getpid.o" to be linked in, to provide the C API:
> 
> pid_t getpid(void)
> {
> 	return __syscall(SYS_getpid);
> }
> 
> On all of Musl's existing archs, the syscalls are implemented via a seven generic "__syscallN" functions. The assumption is that the kernel provides all syscalls.
> 
> For Wasm, what I've done is made it so that the interpreter environment instead provides *named* syscall functions, in this case, a "__syscall_getpid" function. Then, at link-time, when the linker links against libc.a it's able to link in to the application only the syscalls that are actually used.

we may change the syscall abstraction at some point, but
to do what you want, there is no need for intrusive changes.
just have

static inline long __syscall0(long n)
{
	switch (n) {
	...
	case SYS_getpid: return __syscall_getpid();
	...
	}
}

in arc/wasm/syscall_arch.h and the compiler will do the
right thing without any change to generic musl code.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 12:59 ` Szabolcs Nagy
@ 2017-11-28 13:23   ` Nicholas Wilson
  2017-11-28 14:05     ` Szabolcs Nagy
  0 siblings, 1 reply; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 13:23 UTC (permalink / raw)
  To: musl

That suggestion is exactly what I'm trying to avoid!

With a jump table like that, you can't get static linkage. The compiler will link in *every* syscall. I don't think it's reasonable for the Wasm hosting environment to provide emulation for every syscall.

So then comes the impossible task of pleasing everyone: someone will say, "my application wants emulation for all 100 syscalls", and someone else will say, "I don't want to link in emulation for 100 syscalls when my application only uses one". There's just nowhere to draw the line - endless arguments over *which* syscalls are worthy in inclusion in your big jump table.

The patch I've proposed is really as minimal as things can get. Apart from the change to "__setxid", the rest of Musl is completely undisturbed.

Nick

________________________________________
From: Szabolcs Nagy <nsz@port70.net>
Sent: 28 November 2017 12:59:48
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 12:31:18 +0000]:
> Imagine an application that calls "getpid()".  This will cause "getpid.o" to be linked in, to provide the C API:
>
> pid_t getpid(void)
> {
>       return __syscall(SYS_getpid);
> }
>
> On all of Musl's existing archs, the syscalls are implemented via a seven generic "__syscallN" functions. The assumption is that the kernel provides all syscalls.
>
> For Wasm, what I've done is made it so that the interpreter environment instead provides *named* syscall functions, in this case, a "__syscall_getpid" function. Then, at link-time, when the linker links against libc.a it's able to link in to the application only the syscalls that are actually used.

we may change the syscall abstraction at some point, but
to do what you want, there is no need for intrusive changes.
just have

static inline long __syscall0(long n)
{
        switch (n) {
        ...
        case SYS_getpid: return __syscall_getpid();
        ...
        }
}

in arc/wasm/syscall_arch.h and the compiler will do the
right thing without any change to generic musl code.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 13:23   ` Nicholas Wilson
@ 2017-11-28 14:05     ` Szabolcs Nagy
  2017-11-28 14:34       ` Nicholas Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2017-11-28 14:05 UTC (permalink / raw)
  To: musl

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 13:23:15 +0000]:
> With a jump table like that, you can't get static linkage. The compiler will link in *every* syscall.

use a better compiler with dce then.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 14:05     ` Szabolcs Nagy
@ 2017-11-28 14:34       ` Nicholas Wilson
  2017-11-28 14:35         ` Szabolcs Nagy
  2017-11-28 16:51         ` John Starks
  0 siblings, 2 replies; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 14:34 UTC (permalink / raw)
  To: musl

That's some sophisticated link-time optimisation you're expecting there! DCE on a single source file wouldn't do it.

When you have callers like __setxid, it's not until the final link that that kind of elimination could be run; just by examining setxid.c there's no way the compiler could eliminate anything. And, it would rely on several stages of inlining taking place, which you really can't guarantee. (Not to mention that the Clang LLD linker doesn't do LTO.) The traditional unix linking model isn't really geared up for this level of link-time inlining and optimising.

The current Emscripten fork of Musl implements the "static syscalls" a little differently. They have retained the numeric constants for the syscalls, but the calls are redirected to functions with names like "__syscall_42" instead of "__syscall_getpid". At the end of the day, there isn't a big difference between my approach here and the current Emscripten one - I've just reduced the amount of Musl code that needs to be touched to support the static syscalls, and I can't see how to reduce the changes further...

Nick

________________________________________
From: Szabolcs Nagy <nsz@port70.net>
Sent: 28 November 2017 14:05:31
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 13:23:15 +0000]:
> With a jump table like that, you can't get static linkage. The compiler will link in *every* syscall.

use a better compiler with dce then.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 14:34       ` Nicholas Wilson
@ 2017-11-28 14:35         ` Szabolcs Nagy
  2017-11-28 14:53           ` Nicholas Wilson
  2017-11-28 16:51         ` John Starks
  1 sibling, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2017-11-28 14:35 UTC (permalink / raw)
  To: musl

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 14:34:41 +0000]:
> That's some sophisticated link-time optimisation you're expecting there! DCE on a single source file wouldn't do it.

it's a static inline function, no lto involved.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 14:35         ` Szabolcs Nagy
@ 2017-11-28 14:53           ` Nicholas Wilson
  2017-11-28 15:08             ` Szabolcs Nagy
  0 siblings, 1 reply; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 14:53 UTC (permalink / raw)
  To: musl

I think LTO would be required - but I might be missing something in your scheme? I'm considering the cases like __setxid and pthread_cancel, where the compiler can't work out which syscall numbers are possible (because the numbers come from another translation unit). In these cases, there's no way to eliminate any of the branches in your switch statement.

In particular, getuid() is such a common library call that I think we do want to be able to support it on Wasm, without having Musl link in everything. I mean, it's not an obscure piece of functionality, so it's worth a small refactor to make it usable with Wasm.

It would be a bit awkward but acceptable I suppose to not support Wasm at -O0 (ie link in hundreds of syscalls when the optimiser is off).

Nick

________________________________________
From: Szabolcs Nagy <nsz@port70.net>
Sent: 28 November 2017 14:35:50
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 14:34:41 +0000]:
> That's some sophisticated link-time optimisation you're expecting there! DCE on a single source file wouldn't do it.

it's a static inline function, no lto involved.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 14:53           ` Nicholas Wilson
@ 2017-11-28 15:08             ` Szabolcs Nagy
  2017-11-28 15:50               ` Rich Felker
  0 siblings, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2017-11-28 15:08 UTC (permalink / raw)
  To: musl

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 14:53:58 +0000]:
> I think LTO would be required - but I might be missing something in your scheme? I'm considering the cases like __setxid and pthread_cancel, where the compiler can't work out which syscall numbers are possible (because the numbers come from another translation unit). In these cases, there's no way to eliminate any of the branches in your switch statement.
> 
> In particular, getuid() is such a common library call that I think we do want to be able to support it on Wasm, without having Musl link in everything. I mean, it's not an obscure piece of functionality, so it's worth a small refactor to make it usable with Wasm.

getuid is not affected, only __setxid and cancellable syscalls are.

__setxid is not that important and can be worked around in generic code,
but i don't see an easy solution for syscall_cp except that wasm
probably don't need cancellation at all so it can be just defined
to syscall.



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 15:08             ` Szabolcs Nagy
@ 2017-11-28 15:50               ` Rich Felker
  2017-11-28 16:20                 ` Nicholas Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Rich Felker @ 2017-11-28 15:50 UTC (permalink / raw)
  To: musl

On Tue, Nov 28, 2017 at 04:08:06PM +0100, Szabolcs Nagy wrote:
> * Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 14:53:58 +0000]:
> > I think LTO would be required - but I might be missing something
> > in your scheme? I'm considering the cases like __setxid and
> > pthread_cancel, where the compiler can't work out which syscall
> > numbers are possible (because the numbers come from another
> > translation unit). In these cases, there's no way to eliminate any
> > of the branches in your switch statement.
> > 
> > In particular, getuid() is such a common library call that I think
> > we do want to be able to support it on Wasm, without having Musl
> > link in everything. I mean, it's not an obscure piece of
> > functionality, so it's worth a small refactor to make it usable
> > with Wasm.
> 
> getuid is not affected, only __setxid and cancellable syscalls are.
> 
> __setxid is not that important and can be worked around in generic code,
> but i don't see an easy solution for syscall_cp except that wasm
> probably don't need cancellation at all so it can be just defined
> to syscall.

I don't think calling __setxid even makes sense in wasm code. Do you
actually have a multiuser model with a privileged user who can change
uid to other users?

Cancellation is actually hard to do via an approach like this, since
you need some mechanism for determining the point at which the syscall
has atomically succeeded versus being blocked with no side effects
yet. Implementing it would probably require doing something similar to
what midipix does, but for now it can probably just be omitted.

Rich


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 15:50               ` Rich Felker
@ 2017-11-28 16:20                 ` Nicholas Wilson
  0 siblings, 0 replies; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 16:20 UTC (permalink / raw)
  To: musl

Oh dear, confusion between setuid and getuid is embarrassing!

You're quite right, we don't really need to make setuid work well, any applications intend to use it for something useful won't be portable to Wasm without some modification.

The (much slimmer) patch that just turns off __syscall_cp for Wasm is attached below. Thanks for the feedback.

Nick


diff --git a/src/internal/syscall.h b/src/internal/syscall.h
index 6d378a81..bef9fae9 100644
--- a/src/internal/syscall.h
+++ b/src/internal/syscall.h
@@ -34,15 +34,17 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define __syscall4(n,a,b,c,d) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d))
 #define __syscall5(n,a,b,c,d,e) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e))
 #define __syscall6(n,a,b,c,d,e,f) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#define __syscall7(n,a,b,c,d,e,f,g) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
 #else
+#define __syscall0(n) __syscall0(n)
 #define __syscall1(n,a) __syscall1(n,__scc(a))
 #define __syscall2(n,a,b) __syscall2(n,__scc(a),__scc(b))
 #define __syscall3(n,a,b,c) __syscall3(n,__scc(a),__scc(b),__scc(c))
 #define __syscall4(n,a,b,c,d) __syscall4(n,__scc(a),__scc(b),__scc(c),__scc(d))
 #define __syscall5(n,a,b,c,d,e) __syscall5(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e))
 #define __syscall6(n,a,b,c,d,e,f) __syscall6(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#define __syscall7(n,a,b,c,d,e,f,g) __syscall7(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))
 #endif
-#define __syscall7(n,a,b,c,d,e,f,g) (__syscall)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f),__scc(g))

 #define __SYSCALL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
 #define __SYSCALL_NARGS(...) __SYSCALL_NARGS_X(__VA_ARGS__,7,6,5,4,3,2,1,0,)
@@ -56,6 +58,18 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define socketcall __socketcall
 #define socketcall_cp __socketcall_cp

+#ifdef SYSCALL_STATIC
+// For archs that define SYSCALL_STATIC (wasm), we basically just don't allow
+// for pthread_cancel().  I don't expect wasm will ever allow for cancellable
+// waits so that's OK.
+#define __syscall_cp0(n) __syscall0(n)
+#define __syscall_cp1(n,a) __syscall1(n,a)
+#define __syscall_cp2(n,a,b) __syscall2(n,a,b)
+#define __syscall_cp3(n,a,b,c) __syscall3(n,a,b,c)
+#define __syscall_cp4(n,a,b,c,d) __syscall4(n,a,b,c,d)
+#define __syscall_cp5(n,a,b,c,d,e) __syscall5(n,a,b,c,d,e)
+#define __syscall_cp6(n,a,b,c,d,e,f) __syscall6(n,a,b,c,d,e,f)
+#else
 #define __syscall_cp0(n) (__syscall_cp)(n,0,0,0,0,0,0)
 #define __syscall_cp1(n,a) (__syscall_cp)(n,__scc(a),0,0,0,0,0)
 #define __syscall_cp2(n,a,b) (__syscall_cp)(n,__scc(a),__scc(b),0,0,0,0)
@@ -63,6 +77,7 @@ long __syscall_ret(unsigned long), __syscall(syscall_arg_t, ...),
 #define __syscall_cp4(n,a,b,c,d) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),0,0)
 #define __syscall_cp5(n,a,b,c,d,e) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),0)
 #define __syscall_cp6(n,a,b,c,d,e,f) (__syscall_cp)(n,__scc(a),__scc(b),__scc(c),__scc(d),__scc(e),__scc(f))
+#endif

 #define __syscall_cp(...) __SYSCALL_DISP(__syscall_cp,__VA_ARGS__)
 #define syscall_cp(...) __syscall_ret(__syscall_cp(__VA_ARGS__))
diff --git a/src/thread/pthread_cancel.c b/src/thread/pthread_cancel.c
index 3d229223..d14d96ed 100644
--- a/src/thread/pthread_cancel.c
+++ b/src/thread/pthread_cancel.c
@@ -20,6 +20,7 @@ long __syscall_cp_asm(volatile void *, syscall_arg_t,
                       syscall_arg_t, syscall_arg_t, syscall_arg_t,
                       syscall_arg_t, syscall_arg_t, syscall_arg_t);

+#ifndef SYSCALL_STATIC
 long __syscall_cp_c(syscall_arg_t nr,
                     syscall_arg_t u, syscall_arg_t v, syscall_arg_t w,
                     syscall_arg_t x, syscall_arg_t y, syscall_arg_t z)
@@ -38,6 +39,7 @@ long __syscall_cp_c(syscall_arg_t nr,
                r = __cancel();
        return r;
 }
+#endif

 static void _sigaddset(sigset_t *set, int sig)
 {

________________________________________
From: Rich Felker <dalias@aerifal.cx> on behalf of Rich Felker <dalias@libc.org>
Sent: 28 November 2017 15:50:14
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

On Tue, Nov 28, 2017 at 04:08:06PM +0100, Szabolcs Nagy wrote:
> * Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 14:53:58 +0000]:
> > I think LTO would be required - but I might be missing something
> > in your scheme? I'm considering the cases like __setxid and
> > pthread_cancel, where the compiler can't work out which syscall
> > numbers are possible (because the numbers come from another
> > translation unit). In these cases, there's no way to eliminate any
> > of the branches in your switch statement.
> >
> > In particular, getuid() is such a common library call that I think
> > we do want to be able to support it on Wasm, without having Musl
> > link in everything. I mean, it's not an obscure piece of
> > functionality, so it's worth a small refactor to make it usable
> > with Wasm.
>
> getuid is not affected, only __setxid and cancellable syscalls are.
>
> __setxid is not that important and can be worked around in generic code,
> but i don't see an easy solution for syscall_cp except that wasm
> probably don't need cancellation at all so it can be just defined
> to syscall.

I don't think calling __setxid even makes sense in wasm code. Do you
actually have a multiuser model with a privileged user who can change
uid to other users?

Cancellation is actually hard to do via an approach like this, since
you need some mechanism for determining the point at which the syscall
has atomically succeeded versus being blocked with no side effects
yet. Implementing it would probably require doing something similar to
what midipix does, but for now it can probably just be omitted.

Rich


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 14:34       ` Nicholas Wilson
  2017-11-28 14:35         ` Szabolcs Nagy
@ 2017-11-28 16:51         ` John Starks
  2017-11-28 17:52           ` Szabolcs Nagy
  2017-11-28 17:53           ` Nicholas Wilson
  1 sibling, 2 replies; 14+ messages in thread
From: John Starks @ 2017-11-28 16:51 UTC (permalink / raw)
  To: musl

What if you redefine the syscall numbers in wasm to be function pointers to the actual syscalls, e.g. #define SYS_unlink ((long)&__syscall_unlink). Then __syscall and friends can just apply the arguments to the function pointer. This should play nice with the linker and probably optimizes well.

-----Original Message-----
From: Nicholas Wilson [mailto:nicholas.wilson@realvnc.com] 
Sent: Tuesday, November 28, 2017 6:35 AM
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

That's some sophisticated link-time optimisation you're expecting there! DCE on a single source file wouldn't do it.

When you have callers like __setxid, it's not until the final link that that kind of elimination could be run; just by examining setxid.c there's no way the compiler could eliminate anything. And, it would rely on several stages of inlining taking place, which you really can't guarantee. (Not to mention that the Clang LLD linker doesn't do LTO.) The traditional unix linking model isn't really geared up for this level of link-time inlining and optimising.

The current Emscripten fork of Musl implements the "static syscalls" a little differently. They have retained the numeric constants for the syscalls, but the calls are redirected to functions with names like "__syscall_42" instead of "__syscall_getpid". At the end of the day, there isn't a big difference between my approach here and the current Emscripten one - I've just reduced the amount of Musl code that needs to be touched to support the static syscalls, and I can't see how to reduce the changes further...

Nick

________________________________________
From: Szabolcs Nagy <nsz@port70.net>
Sent: 28 November 2017 14:05:31
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 13:23:15 +0000]:
> With a jump table like that, you can't get static linkage. The compiler will link in *every* syscall.

use a better compiler with dce then.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 16:51         ` John Starks
@ 2017-11-28 17:52           ` Szabolcs Nagy
  2017-11-28 18:55             ` Rich Felker
  2017-11-28 17:53           ` Nicholas Wilson
  1 sibling, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2017-11-28 17:52 UTC (permalink / raw)
  To: musl

* John Starks <John.Starks@microsoft.com> [2017-11-28 16:51:19 +0000]:
> What if you redefine the syscall numbers in wasm to be function pointers to the actual syscalls, e.g. #define SYS_unlink ((long)&__syscall_unlink). Then __syscall and friends can just apply the arguments to the function pointer. This should play nice with the linker and probably optimizes well.

you have to be able to cast it to the right type of
function pointer then and pass the right amount of
arguments.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 16:51         ` John Starks
  2017-11-28 17:52           ` Szabolcs Nagy
@ 2017-11-28 17:53           ` Nicholas Wilson
  1 sibling, 0 replies; 14+ messages in thread
From: Nicholas Wilson @ 2017-11-28 17:53 UTC (permalink / raw)
  To: musl

Thanks John! In fact we'd have to cast them through an int, since Musl sometimes stores syscall numbers in an int, but that cast is safe on Wasm since function pointers are just an index into a global array of functions, so are guaranteed to be "small".

It's cunning, and I like it!

OK, so this patch can also be scrapped, I think it's not necessary with John's idea.

Nick

________________________________________
From: John Starks <John.Starks@microsoft.com>
Sent: 28 November 2017 16:51:19
To: musl@lists.openwall.com
Subject: RE: [musl] [PATCH] Wasm support patch 2 (static syscalls)

What if you redefine the syscall numbers in wasm to be function pointers to the actual syscalls, e.g. #define SYS_unlink ((long)&__syscall_unlink). Then __syscall and friends can just apply the arguments to the function pointer. This should play nice with the linker and probably optimizes well.

-----Original Message-----
From: Nicholas Wilson [mailto:nicholas.wilson@realvnc.com]
Sent: Tuesday, November 28, 2017 6:35 AM
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

That's some sophisticated link-time optimisation you're expecting there! DCE on a single source file wouldn't do it.

When you have callers like __setxid, it's not until the final link that that kind of elimination could be run; just by examining setxid.c there's no way the compiler could eliminate anything. And, it would rely on several stages of inlining taking place, which you really can't guarantee. (Not to mention that the Clang LLD linker doesn't do LTO.) The traditional unix linking model isn't really geared up for this level of link-time inlining and optimising.

The current Emscripten fork of Musl implements the "static syscalls" a little differently. They have retained the numeric constants for the syscalls, but the calls are redirected to functions with names like "__syscall_42" instead of "__syscall_getpid". At the end of the day, there isn't a big difference between my approach here and the current Emscripten one - I've just reduced the amount of Musl code that needs to be touched to support the static syscalls, and I can't see how to reduce the changes further...

Nick

________________________________________
From: Szabolcs Nagy <nsz@port70.net>
Sent: 28 November 2017 14:05:31
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] Wasm support patch 2 (static syscalls)

* Nicholas Wilson <nicholas.wilson@realvnc.com> [2017-11-28 13:23:15 +0000]:
> With a jump table like that, you can't get static linkage. The compiler will link in *every* syscall.

use a better compiler with dce then.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Wasm support patch 2 (static syscalls)
  2017-11-28 17:52           ` Szabolcs Nagy
@ 2017-11-28 18:55             ` Rich Felker
  0 siblings, 0 replies; 14+ messages in thread
From: Rich Felker @ 2017-11-28 18:55 UTC (permalink / raw)
  To: musl

On Tue, Nov 28, 2017 at 06:52:22PM +0100, Szabolcs Nagy wrote:
> * John Starks <John.Starks@microsoft.com> [2017-11-28 16:51:19 +0000]:
> > What if you redefine the syscall numbers in wasm to be function
> > pointers to the actual syscalls, e.g. #define SYS_unlink
> > ((long)&__syscall_unlink). Then __syscall and friends can just
> > apply the arguments to the function pointer. This should play nice
> > with the linker and probably optimizes well.
> 
> you have to be able to cast it to the right type of
> function pointer then and pass the right amount of
> arguments.

To make this formally correct, I think all the __syscall_x functions
should just take 6 (or 7?) arguments and always be called with a dummy
arg of 0 in the additional slots. This can be done in the wasm
syscall_arch.h with no changes to musl code.

Rich


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-11-28 18:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-28 12:31 [PATCH] Wasm support patch 2 (static syscalls) Nicholas Wilson
2017-11-28 12:59 ` Szabolcs Nagy
2017-11-28 13:23   ` Nicholas Wilson
2017-11-28 14:05     ` Szabolcs Nagy
2017-11-28 14:34       ` Nicholas Wilson
2017-11-28 14:35         ` Szabolcs Nagy
2017-11-28 14:53           ` Nicholas Wilson
2017-11-28 15:08             ` Szabolcs Nagy
2017-11-28 15:50               ` Rich Felker
2017-11-28 16:20                 ` Nicholas Wilson
2017-11-28 16:51         ` John Starks
2017-11-28 17:52           ` Szabolcs Nagy
2017-11-28 18:55             ` Rich Felker
2017-11-28 17:53           ` Nicholas Wilson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).