* [musl] [Patch Request] Name-bound syscalls within musl
@ 2025-11-25 17:29 Arjun Ramesh
2025-11-25 17:41 ` Daniel Gutson
2025-11-26 1:44 ` Rich Felker
0 siblings, 2 replies; 6+ messages in thread
From: Arjun Ramesh @ 2025-11-25 17:29 UTC (permalink / raw)
To: musl
Hi everyone,
I am currently working on a research project that uses musl with
name-bound syscalls. Skimming the codebase, nearly all syscall
invocations within musl use the SYS_* defines. These numbers differ
across ISAs, and the generic call path provides virtually no
type-checking of the arguments, which leaves it open to type-safety
bugs. Binding syscalls statically by name would make the codebase
cleaner and less prone to mistakes, allow type-checked syscall
arguments, and cut down on the amount of ISA-specific code. Below is an
example of the patch I'm suggesting for a single syscall:
```
diff --git a/src/fcntl/open.c b/src/fcntl/open.c
index 4c3c8275..ff5f7973 100644
--- a/src/fcntl/open.c
+++ b/src/fcntl/open.c
@@ -15,7 +15,7 @@ int open(const char *filename, int flags, ...)
 
     int fd = __sys_open_cp(filename, flags, mode);
     if (fd>=0 && (flags & O_CLOEXEC))
-        __syscall(SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
+        __syscall_SYS_fcntl(fd, F_SETFD, FD_CLOEXEC);
 
     return __syscall_ret(fd);
 }
```
"__syscall_SYS_fcntl" can be defined with a unified static type signature
across ISAs. Given the highly structured nature of this patch, it could
mostly be accomplished with a simple `sed` command across the entire
project, with no impact on functionality. Would the community be open to a
patch of this nature?
Thanks
Best,
Arjun
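
A minimal sketch of what such a typed wrapper might look like. Everything
here is an assumption made for illustration: `raw_syscall3` is a mock
stand-in for the untyped entry point, `DEMO_NR_fcntl` is a placeholder
number, and `demo_syscall_fcntl` plays the role of the proposed
`__syscall_SYS_fcntl`. None of these names exist in musl.

```c
#include <fcntl.h>

/* Mock of an untyped raw entry point (hypothetical, not musl code).
 * It records its arguments so the example runs without a kernel call. */
static long raw_seen[4];

static long raw_syscall3(long nr, long a, long b, long c)
{
    raw_seen[0] = nr;
    raw_seen[1] = a;
    raw_seen[2] = b;
    raw_seen[3] = c;
    return 0;
}

enum { DEMO_NR_fcntl = 1 };  /* placeholder; real numbers differ per ISA */

/* One signature shared by every ISA; only the number behind it varies. */
static inline long demo_syscall_fcntl(int fd, int cmd, long arg)
{
    return raw_syscall3(DEMO_NR_fcntl, fd, cmd, arg);
}

/* Call site in the style of the open.c change shown above. */
static long demo_set_cloexec(int fd)
{
    return demo_syscall_fcntl(fd, F_SETFD, FD_CLOEXEC);
}
```

With a prototype like this, passing a pointer where the descriptor is
expected draws a compiler diagnostic, which the generic variadic-style
macro cannot provide.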
* Re: [musl] [Patch Request] Name-bound syscalls within musl
2025-11-25 17:29 [musl] [Patch Request] Name-bound syscalls within musl Arjun Ramesh
@ 2025-11-25 17:41 ` Daniel Gutson
2025-11-25 18:04 ` Arjun Ramesh
2025-11-26 1:44 ` Rich Felker
1 sibling, 1 reply; 6+ messages in thread
From: Daniel Gutson @ 2025-11-25 17:41 UTC (permalink / raw)
To: musl
On Tue, Nov 25, 2025, 14:29, Arjun Ramesh <arjunr2@andrew.cmu.edu>
wrote:
> Hi everyone,
>
> I am currently working on a research project using musl that uses
> name-bound syscalls. Skimming the codebase, nearly all references to
> "syscall" invocations within musl use the SYS_* defines. These numbers
> differ across ISAs and are susceptible to type-safety
> bugs, providing virtually no type-checking on their arguments. Syscalls
> bound statically by name will make the codebase much less prone to mistakes
> and cleaner, allowing type-checked syscall arguments and also cutting down
> on the amount of ISA-specific code surface. Below is an example of the
> patch I'm suggesting for a single syscall:
>
> ```
> diff --git a/src/fcntl/open.c b/src/fcntl/open.c
> index 4c3c8275..ff5f7973 100644
> --- a/src/fcntl/open.c
> +++ b/src/fcntl/open.c
> @@ -15,7 +15,7 @@ int open(const char *filename, int flags, ...)
>
> int fd = __sys_open_cp(filename, flags, mode);
> if (fd>=0 && (flags & O_CLOEXEC))
> - __syscall(SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
> + __syscall_SYS_fcntl(fd, F_SETFD, FD_CLOEXEC);
>
> return __syscall_ret(fd);
> }
> ```
>
> "__syscall_SYS_fcntl" can be defined with a unified static type signature
> across ISAs. Given the highly structured nature of this patch, it could
> mostly be accomplished with a simple `sed` command across the entire
> project, with no impact on functionality. Would the community be open to a
> patch of this nature?
>
What about defining the functions in an x-macro table, in order to avoid
the repeated boilerplate that this change would bring?
> Thanks
> Best,
> Arjun
>
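
For readers unfamiliar with the technique, here is a rough sketch of an
x-macro syscall table. It assumes, purely for illustration, that every
listed syscall takes three arguments; the table name, the `nb_` prefix,
the mock `raw_syscall3` backend, and the syscall numbers are all
hypothetical.

```c
/* Mock backend standing in for the raw, untyped syscall entry point. */
static long last_nr, last_args[3];

static long raw_syscall3(long nr, long a, long b, long c)
{
    last_nr = nr;
    last_args[0] = a;
    last_args[1] = b;
    last_args[2] = c;
    return 0;
}

/* The single table: X(name, number, type1, type2, type3).
 * The numbers are placeholders, not real syscall numbers. */
#define NB_SYSCALL_TABLE(X) \
    X(fcntl, 1, int, int, long) \
    X(lseek, 2, int, long, int)

/* First expansion: one typed wrapper per table row. */
#define NB_DEFINE_WRAPPER(name, nr, T1, T2, T3) \
    static inline long nb_##name(T1 a, T2 b, T3 c) \
    { return raw_syscall3((nr), (long)a, (long)b, (long)c); }
NB_SYSCALL_TABLE(NB_DEFINE_WRAPPER)

/* Second expansion of the same table: a number-to-name lookup array. */
#define NB_NAME_ENTRY(name, nr, T1, T2, T3) [nr] = #name,
static const char *const nb_names[] = { NB_SYSCALL_TABLE(NB_NAME_ENTRY) };
```

Adding a syscall then means adding one row; the wrapper and the name
entry both follow from it.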
* Re: [musl] [Patch Request] Name-bound syscalls within musl
2025-11-25 17:41 ` Daniel Gutson
@ 2025-11-25 18:04 ` Arjun Ramesh
0 siblings, 0 replies; 6+ messages in thread
From: Arjun Ramesh @ 2025-11-25 18:04 UTC (permalink / raw)
To: musl
>
> On Tue, Nov 25, 2025, 14:29, Arjun Ramesh <arjunr2@andrew.cmu.edu>
> wrote:
>
>> Hi everyone,
>>
>> I am currently working on a research project using musl that uses
>> name-bound syscalls. Skimming the codebase, nearly all references to
>> "syscall" invocations within musl use the SYS_* defines. These numbers
>> differ across ISAs and are susceptible to type-safety
>> bugs, providing virtually no type-checking on their arguments. Syscalls
>> bound statically by name will make the codebase much less prone to mistakes
>> and cleaner, allowing type-checked syscall arguments and also cutting down
>> on the amount of ISA-specific code surface. Below is an example of the
>> patch I'm suggesting for a single syscall:
>>
>> ```
>> diff --git a/src/fcntl/open.c b/src/fcntl/open.c
>> index 4c3c8275..ff5f7973 100644
>> --- a/src/fcntl/open.c
>> +++ b/src/fcntl/open.c
>> @@ -15,7 +15,7 @@ int open(const char *filename, int flags, ...)
>>
>> int fd = __sys_open_cp(filename, flags, mode);
>> if (fd>=0 && (flags & O_CLOEXEC))
>> - __syscall(SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
>> + __syscall_SYS_fcntl(fd, F_SETFD, FD_CLOEXEC);
>>
>> return __syscall_ret(fd);
>> }
>> ```
>>
>> "__syscall_SYS_fcntl" can be defined with a unified static type signature
>> across ISAs. Given the highly structured nature of this patch, it could
>> mostly be accomplished with a simple `sed` command across the entire
>> project, with no impact on functionality. Would the community be open to a
>> patch of this nature?
>>
>
> What about defining the functions in an x-macro table, in order to avoid
> the repeated boilerplate that this change would bring?
>
>
>> Thanks
>> Best,
>> Arjun
>>
>
That's a good idea, an x-macro syscall table might actually be a cleaner
direction without scattered changes.
Arjun
* Re: [musl] [Patch Request] Name-bound syscalls within musl
2025-11-25 17:29 [musl] [Patch Request] Name-bound syscalls within musl Arjun Ramesh
2025-11-25 17:41 ` Daniel Gutson
@ 2025-11-26 1:44 ` Rich Felker
2025-11-26 3:17 ` Arjun Ramesh
1 sibling, 1 reply; 6+ messages in thread
From: Rich Felker @ 2025-11-26 1:44 UTC (permalink / raw)
To: Arjun Ramesh; +Cc: musl
On Tue, Nov 25, 2025 at 12:29:08PM -0500, Arjun Ramesh wrote:
> Hi everyone,
>
> I am currently working on a research project using musl that uses
> name-bound syscalls. Skimming the codebase, nearly all references to
> "syscall" invocations within musl use the SYS_* defines. These numbers
> differ across ISAs and are susceptible to type-safety
> bugs, providing virtually no type-checking on their arguments. Syscalls
> bound statically by name will make the codebase much less prone to mistakes
> and cleaner, allowing type-checked syscall arguments and also cutting down
> on the amount of ISA-specific code surface. Below is an example of the
> patch I'm suggesting for a single syscall:
>
> ```
> diff --git a/src/fcntl/open.c b/src/fcntl/open.c
> index 4c3c8275..ff5f7973 100644
> --- a/src/fcntl/open.c
> +++ b/src/fcntl/open.c
> @@ -15,7 +15,7 @@ int open(const char *filename, int flags, ...)
>
> int fd = __sys_open_cp(filename, flags, mode);
> if (fd>=0 && (flags & O_CLOEXEC))
> - __syscall(SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
> + __syscall_SYS_fcntl(fd, F_SETFD, FD_CLOEXEC);
>
> return __syscall_ret(fd);
> }
> ```
>
> "__syscall_SYS_fcntl" can be defined with a unified static type signature
> across ISAs. Given the highly structured nature of this patch, it could
> mostly be accomplished with a simple `sed` command across the entire
> project, with no impact on functionality. Would the community be open to a
> patch of this nature?
This kind of invasive change will not be accepted upstream. It just
moves the logic for the type signatures to a different location, and
has no concrete benefit. Unnecessary churn like this has a high cost,
as it prevents users from backporting security and other bugfix
patches to older or forked versions of the codebase they may be using,
and requires users who read the commit log as a basis for trusting
changes to wade through the churn to determine that all the changes it
made were as-described and correct.
Where there have historically been issues with syscall signatures, it
has almost entirely been a consequence of weird things the kernel did
that we didn't know about, and encoding our assumption about the
signatures at a different abstraction layer would not have made the
assumptions that mismatched what the kernel was doing any more
correct. If someone does want to check signatures, this could probably
be done mechanically on the codebase as-is, extracting information
from the kernel and checking callpoints against it.
For your research project, I think it's entirely possible to do the
name-binding just by the choice of how you define the __syscallN
macros in syscall_arch.h and the SYS_* macros, so that they expand to
"name bindings". If you're targeting a system where these name-bound
syscalls are not actual syscalls but callable functions, I think you
could even do some magic in the macros to make the type checking
happen like this.
Rich
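
One possible reading of this suggestion, as a hedged sketch: call sites
keep their existing shape, and only the macro definitions change. Here
`DEMO_SYS_fcntl` and `demo_syscall3` stand in for `SYS_fcntl` and
`__syscall3`, and `target_fcntl` is a hypothetical typed function that a
name-binding target might export. This is not how musl defines these
macros.

```c
#include <fcntl.h>

/* Hypothetical typed entry point provided by the target. It records
 * its arguments so the example can be checked without a real call. */
static long seen_fd, seen_cmd, seen_arg;

static long target_fcntl(int fd, int cmd, long arg)
{
    seen_fd = fd;
    seen_cmd = cmd;
    seen_arg = arg;
    return 0;
}

/* The "number" macro expands to a name instead of an integer... */
#define DEMO_SYS_fcntl target_fcntl

/* ...and the arity macro turns that name into an ordinary call, so the
 * compiler checks the arguments against target_fcntl's prototype. */
#define demo_syscall3(n, a, b, c) n((a), (b), (c))

/* Unchanged call-site shape, as in the existing open.c code. */
static long demo_set_cloexec(int fd)
{
    return demo_syscall3(DEMO_SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
}
```

Because the first macro argument is not an operand of `##`, it is fully
expanded before substitution, so the call becomes
`target_fcntl((fd), (F_SETFD), (FD_CLOEXEC))`.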
* Re: [musl] [Patch Request] Name-bound syscalls within musl
2025-11-26 1:44 ` Rich Felker
@ 2025-11-26 3:17 ` Arjun Ramesh
2025-11-26 22:31 ` Szabolcs Nagy
0 siblings, 1 reply; 6+ messages in thread
From: Arjun Ramesh @ 2025-11-26 3:17 UTC (permalink / raw)
To: Rich Felker; +Cc: musl
On Tue, Nov 25, 2025 at 8:44 PM Rich Felker <dalias@libc.org> wrote:
> On Tue, Nov 25, 2025 at 12:29:08PM -0500, Arjun Ramesh wrote:
> > Hi everyone,
> >
> > I am currently working on a research project using musl that uses
> > name-bound syscalls. Skimming the codebase, nearly all references to
> > "syscall" invocations within musl use the SYS_* defines. These numbers
> > differ across ISAs and are susceptible to type-safety
> > bugs, providing virtually no type-checking on their arguments. Syscalls
> > bound statically by name will make the codebase much less prone to
> mistakes
> > and cleaner, allowing type-checked syscall arguments and also cutting
> down
> > on the amount of ISA-specific code surface. Below is an example of the
> > patch I'm suggesting for a single syscall:
> >
> > ```
> > diff --git a/src/fcntl/open.c b/src/fcntl/open.c
> > index 4c3c8275..ff5f7973 100644
> > --- a/src/fcntl/open.c
> > +++ b/src/fcntl/open.c
> > @@ -15,7 +15,7 @@ int open(const char *filename, int flags, ...)
> >
> > int fd = __sys_open_cp(filename, flags, mode);
> > if (fd>=0 && (flags & O_CLOEXEC))
> > - __syscall(SYS_fcntl, fd, F_SETFD, FD_CLOEXEC);
> > + __syscall_SYS_fcntl(fd, F_SETFD, FD_CLOEXEC);
> >
> > return __syscall_ret(fd);
> > }
> > ```
> >
> > "__syscall_SYS_fcntl" can be defined with a unified static type signature
> > across ISAs. Given the highly structured nature of this patch, it could
> > mostly be accomplished with a simple `sed` command across the entire
> > project, with no impact on functionality. Would the community be open to
> a
> > patch of this nature?
>
> This kind of invasive change will not be accepted upstream. It just
> moves the logic for the type signatures to a different location, and
> has no concrete benefit. Unnecessary churn like this has a high cost,
> as it prevents users from backporting security and other bugfix
> patches to older or forked versions of the codebase they may be using,
> and requires users who read the commit log as a basis for trusting
> changes to wade through the churn to determine that all the changes it
> made were as-described and correct.
>
> Where there have historically been issues with syscall signatures, it
> has almost entirely been a consequence of weird things the kernel did
> that we didn't know about, and encoding our assumption about the
> signatures at a different abstraction layer would not have made the
> assumptions that mismatched what the kernel was doing any more
> correct. If someone does want to check signatures, this could probably
> be done mechanically on the codebase as-is, extracting information
> from the kernel and checking callpoints against it.
>
> For your research project, I think it's entirely possible to do the
> name-binding just by the choice of how you define the __syscallN
> macros in syscall_arch.h and the SYS_* macros, so that they expand to
> "name bindings". If you're targeting a system where these name-bound
> syscalls are not actual syscalls but callable functions, I think you
> could even do some magic in the macros to make the type checking
> happen like this.
>
> Rich
>
Thanks for the comments.
This makes sense, and macro magic within syscall_arch.h can certainly work.
Looking through the codebase, most call sites luckily use SYS_* macros
exclusively, which allows this sort of magic to work. However, there are
still a couple of spots where a variable is used for the syscall number,
and these might need some patching. They will likely have to expand to a
different macro, one with a giant switch over all possible syscalls to
name-bind them. At the moment, I have identified only two places where
this happens (both appear to be generic syscall-by-number invocations):
* src/misc/syscall.c
* src/thread/__syscall_cp.c
Given this, would you be open to minimal patches that route these
variable-numbered calls to a different macro? Perhaps something like
this in those spots:
```
diff --git a/src/misc/syscall.c b/src/misc/syscall.c
index 6f3ef656..72356346 100644
--- a/src/misc/syscall.c
+++ b/src/misc/syscall.c
@@ -17,5 +17,5 @@ long syscall(long n, ...)
     e=va_arg(ap, syscall_arg_t);
     f=va_arg(ap, syscall_arg_t);
     va_end(ap);
-    return __syscall_ret(__syscall(n,a,b,c,d,e,f));
+    return __syscall_ret(__syscall_var(n,a,b,c,d,e,f));
 }
```
`__syscall_var` can default to `__syscall` on all existing platforms,
while giving name-binding targets a hook for these calls.
Arjun
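
A rough sketch of how such a hook could behave on a name-binding target.
All names are illustrative assumptions: `demo_syscall_var` stands in for
the proposed `__syscall_var`, the `target_*` functions are hypothetical
typed entry points, and the numbers are placeholders. Only two syscalls
are listed; a real dispatcher would need a case for each one.

```c
#include <errno.h>

enum { DEMO_SYS_getpid = 1, DEMO_SYS_close = 2 };  /* placeholder numbers */

/* Hypothetical typed entry points exported by the target. */
static long target_getpid(void)
{
    return 42;
}

static long target_close(int fd)
{
    return fd >= 0 ? 0 : -EBADF;
}

/* Switch-based dispatcher: maps a runtime number to a name-bound call. */
static long demo_dispatch(long n, long a, long b, long c,
                          long d, long e, long f)
{
    (void)b; (void)c; (void)d; (void)e; (void)f;
    switch (n) {
    case DEMO_SYS_getpid: return target_getpid();
    case DEMO_SYS_close:  return target_close((int)a);
    default:              return -ENOSYS;
    }
}

/* On this target the hook routes to the dispatcher. An existing
 * platform would instead map the hook to its generic syscall macro. */
#define demo_syscall_var demo_dispatch
```

Unknown numbers return `-ENOSYS` here, matching the negative-errno
convention that `__syscall_ret` expects from its argument.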
* Re: [musl] [Patch Request] Name-bound syscalls within musl
2025-11-26 3:17 ` Arjun Ramesh
@ 2025-11-26 22:31 ` Szabolcs Nagy
0 siblings, 0 replies; 6+ messages in thread
From: Szabolcs Nagy @ 2025-11-26 22:31 UTC (permalink / raw)
To: Arjun Ramesh; +Cc: Rich Felker, musl
* Arjun Ramesh <arjunr2@andrew.cmu.edu> [2025-11-25 22:17:12 -0500]:
> On Tue, Nov 25, 2025 at 8:44 PM Rich Felker <dalias@libc.org> wrote:
> This makes sense, and macro magic within syscall_arch.h can certainly work.
> Looking through the codebase, luckily most call-sites use exclusively SYS_*
> macros, allowing this sort of magic to work. However, there are still a
> couple of spots that might need some patching where a variable is used for
> syscall numbers. These will likely have to expand out to a different macro
> expansion -- one which has a giant switch case over all possible syscalls
> to name-bind them. At the moment, I identify very few places where this
> happens, which is a good thing (seems like both are just for generic
> syscall-by-number invocations):
> * src/misc/syscall.c
> * src/thread/__syscall_cp.c
and
src/thread/pthread_cancel.c
src/unistd/setxid.c
>
> Given this, would you then be open to minimal patches that would route
> these "variable" numbered to a different macro? Perhaps something of the
> nature of this in those spots:
> ```
> diff --git a/src/misc/syscall.c b/src/misc/syscall.c
> index 6f3ef656..72356346 100644
> --- a/src/misc/syscall.c
> +++ b/src/misc/syscall.c
> @@ -17,5 +17,5 @@ long syscall(long n, ...)
> e=va_arg(ap, syscall_arg_t);
> f=va_arg(ap, syscall_arg_t);
> va_end(ap);
> - return __syscall_ret(__syscall(n,a,b,c,d,e,f));
> + return __syscall_ret(__syscall_var(n,a,b,c,d,e,f));
> }
> ```
>
> The `__syscall_var` can be defaulted to `__syscall` on all existing
> platforms, but will provide the flexibility for allowing a hook for
> name-binding these calls.
note that with
#define __syscall1(n,a) __my_##n(a)
#define __syscall_cp1(n,a) __my_cp_##n(a)
...
the call above would expand to
__my_n(a,b,c,d,e,f)
which your target can override. the only issue
is setxid:
int ret = __syscall(c->nr, c->id, c->eid, c->sid);
you can work around this e.g. by 'int n = c->nr;'
but for better typesafety you can just do the
switch(c->nr) dispatch there.
likely you need to add ifdefs in the internal
syscall.h to be able to override __syscall*
macros, so maintaining a minimal setxid patch
should be acceptable as well.
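
A small sketch of the token-pasting behavior described above, under
stated assumptions: `demo_syscall3` and the `my_` prefix stand in for
`__syscall3` and `__my_`, the struct is a simplified imitation of the
setxid context, and the numbers and `my_*` functions are hypothetical.
It shows why a call whose number is a variable named `n` expands to
`my_n(...)`, and why `c->nr` first has to be copied into a plain
identifier.

```c
#include <errno.h>

enum { DEMO_SYS_setuid = 1, DEMO_SYS_setgid = 2 };  /* placeholder numbers */

static long current_uid = -1, current_gid = -1;

/* Name-bound targets: the pasted token selects one of these directly. */
static long my_DEMO_SYS_setuid(long id, long eid, long sid)
{
    (void)eid; (void)sid;
    current_uid = id;
    return 0;
}

static long my_DEMO_SYS_setgid(long id, long eid, long sid)
{
    (void)eid; (void)sid;
    current_gid = id;
    return 0;
}

/* Fallback for call sites whose number is a variable named n. */
static long dispatch_by_number(long n, long a, long b, long c)
{
    switch (n) {
    case DEMO_SYS_setuid: return my_DEMO_SYS_setuid(a, b, c);
    case DEMO_SYS_setgid: return my_DEMO_SYS_setgid(a, b, c);
    default:              return -ENOSYS;
    }
}

/* my_n is what "my_" pasted with a variable called n produces; it
 * picks up that variable from the enclosing scope. */
#define my_n(a, b, c) dispatch_by_number(n, (a), (b), (c))

/* Operands of ## are not macro-expanded, so the literal token is pasted. */
#define demo_syscall3(n, a, b, c) my_##n((a), (b), (c))

struct setxid_ctx { long id, eid, sid; int nr; };

static long demo_setxid(const struct setxid_ctx *c)
{
    int n = c->nr;  /* "c->nr" is not a single token and cannot be pasted */
    return demo_syscall3(n, c->id, c->eid, c->sid);
}

static long demo_setuid(long uid)
{
    return demo_syscall3(DEMO_SYS_setuid, uid, 0, 0);
}
```

Here `demo_setuid` binds straight to `my_DEMO_SYS_setuid`, while
`demo_setxid` goes through the switch, which is the trade-off between
the 'int n' workaround and an explicit dispatch at that call site.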