mailing list of musl libc
* Using unistd functions vs calling syscall straight in the code
@ 2012-08-10 12:47 Murali Vijayaraghavan
  2012-08-10 14:16 ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Murali Vijayaraghavan @ 2012-08-10 12:47 UTC (permalink / raw)
  To: musl

Hi

I am trying to run C programs on a barebones (MIPS-like) processor
simulator without any OS. The simulator mainly implements the userspace
ISA, with no syscall instruction support in hardware. I was hoping to
instead support some of the system calls (like open, read, write, etc,
mainly for debugging purposes) by using custom instructions, one for each
(or a few similar) system call(s). For that, the implementation of
functions like read and write should be calling these custom instructions
in assembly, in other words, I have to port the system call layer to my
simulator. I looked at musl among other libc implementations, and this was
the only one whose structure I could understand well, making it easy to
port. I did port it successfully and easily for my purposes, which brings
me to my question/comment.

You guys do have a unistd implementation which supposedly implements each
of the system calls. But you are not consistent with the use of these
functions to perform the unistd-implemented tasks. Wouldn't it be a lot
cleaner to call these functions instead of calling syscall / syscall_cp
directly from the other (top-level) functions? Was there some rationale or
is it just code evolution?

Thanks
Murali

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 12:47 Using unistd functions vs calling syscall straight in the code Murali Vijayaraghavan
@ 2012-08-10 14:16 ` Szabolcs Nagy
  2012-08-10 14:32   ` Murali Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Szabolcs Nagy @ 2012-08-10 14:16 UTC (permalink / raw)
  To: musl

* Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 21:47:59 +0900]:
> You guys do have a unistd implementation which supposedly implements each
> of the system calls. But you are not consistent with the use of these
> functions to perform the unistd-implemented tasks. Wouldn't it be a lot
> cleaner to call these functions instead of calling syscall / syscall_cp
> directly from the other (top-level) functions? Was there some rationale or
> is it just code evolution?
> 

i don't understand the question

can you show with an example what do you mean?

calling a libc function is not the same as using a linux
syscall, and there is usually a reason why one is used
instead of the other.

(the first has posix semantics, the second has whatever
semantics linux has; even if these happen to be compatible,
the first one creates an extra call and an extra
internal dependency when static linking is used)
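
One concrete difference in semantics is error reporting: a raw Linux
syscall reports failure as a small negative return value, while the
POSIX-level function returns -1 and sets errno. A minimal sketch of that
conversion follows. The name raw_to_posix is hypothetical and is not
taken from musl's source; the sketch assumes the usual Linux convention
that raw values in the range -4095..-1 are negated error codes.

```c
#include <errno.h>

/* Hypothetical helper (not a musl function): convert a Linux-style raw
 * syscall return value into the POSIX convention. Raw values in the
 * range [-4095, -1] are assumed to be negated errno codes. */
long raw_to_posix(unsigned long r)
{
	if (r > -4096UL) {
		errno = (int)-r;	/* e.g. raw -2 becomes errno 2 */
		return -1;
	}
	return (long)r;		/* success values pass through unchanged */
}
```

With this convention, raw_to_posix(42UL) yields 42, while a raw return
of -ENOENT yields -1 with errno set to ENOENT.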



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 14:16 ` Szabolcs Nagy
@ 2012-08-10 14:32   ` Murali Vijayaraghavan
  2012-08-10 14:59     ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Murali Vijayaraghavan @ 2012-08-10 14:32 UTC (permalink / raw)
  To: musl

On Fri, Aug 10, 2012 at 11:16 PM, Szabolcs Nagy <nsz@port70.net> wrote:

> * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 21:47:59 +0900]:
> > You guys do have a unistd implementation which supposedly implements each
> > of the system calls. But you are not consistent with the use of these
> > functions to perform the unistd-implemented tasks. Wouldn't it be a lot
> > cleaner to call these functions instead of calling syscall / syscall_cp
> > directly from the other (top-level) functions? Was there some rationale or
> > is it just code evolution?
> >
>
> i don't understand the question
>
> can you show with an example what do you mean?
>
> calling a libc function is not the same as using a linux
> syscall, and there is usually a reason why one is used
> instead of the other.
>
> (the first has posix semantics, the second has whatever
> semantics linux has; even if these happen to be compatible,
> the first one creates an extra call and an extra
> internal dependency when static linking is used)
>


For example, I could have implemented src/stdio/__stdio_read.c using
src/unistd/readv.c's readv function instead of calling
syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
C standard library, namely stdio functions like scanf eventually use the
unistd functions instead of using the syscall directly?

This would have made my job easier because I could have just modified this
POSIX compatibility layer instead of scanning through the C standard library
functions and changing them one by one. Remember I have multiple special
instructions to perform each IO task instead of a single system call
instruction, since it's easier to implement a hardware simulator that way - I
can get the function type simply by decoding the instruction rather than
reading some register.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 14:32   ` Murali Vijayaraghavan
@ 2012-08-10 14:59     ` Szabolcs Nagy
  2012-08-10 15:40       ` Murali Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Szabolcs Nagy @ 2012-08-10 14:59 UTC (permalink / raw)
  To: musl

* Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> For example, I could have implemented src/stdio/__stdio_read.c using
> src/unistd/readv.c's readv function instead of calling
> syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
> the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
> C standard library, namely stdio functions like scanf eventually use the
> unistd functions instead of using the syscall directly?
> 

that's not how it works,

unistd is no more posix than stdio
they are all part of the posix api

stdio functions are also defined by the
c standard so in this sense it's good
that the stdio implementation does not
depend on the larger posix api
(it only depends on the syscall api)

but yes otherwise stdio could use unistd
functions and then it would be a bit
slower (+1 call) and +1 symbol resolution
during linking i guess

> This would have made my job easier because I could have just modified this
> POSIX compatibility layer instead of scanning through the C standard library
> functions and changing them one by one. Remember I have multiple special

you are not supposed to change the functions

you only need to implement the syscalls
and dummy out the ones you don't use
(i.e. have a large switch, with a default: return -ENOSYS;)

if you modify the .c source files you are
doing it wrong
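
A minimal sketch of such a switch, under stated assumptions: the
DEMO_SYS_* numbers, demo_syscall, and the sim_* functions are all
hypothetical stand-ins (a real port would use the target's SYS_* numbers
and issue the simulator's custom instructions where the sim_* stubs are).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical syscall numbers for this sketch only. */
enum { DEMO_SYS_read = 1, DEMO_SYS_write = 2, DEMO_SYS_close = 3 };

/* Stand-ins for the simulator's custom instructions. */
static long sim_write(long fd, const void *buf, size_t len)
{
	(void)fd; (void)buf;
	return (long)len;	/* pretend every byte was written */
}

static long sim_close(long fd)
{
	(void)fd;
	return 0;
}

/* Single entry point: decode the number, forward the arguments, and
 * reject anything unimplemented with a Linux-style negative error. */
long demo_syscall(long n, long a, long b, long c)
{
	switch (n) {
	case DEMO_SYS_write:
		return sim_write(a, (const void *)b, (size_t)c);
	case DEMO_SYS_close:
		return sim_close(a);
	default:
		return -ENOSYS;	/* includes DEMO_SYS_read, not wired up here */
	}
}
```

The libc sources stay untouched in this arrangement; only the one entry
point knows about the simulator.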

> instructions to perform each IO task instead of a single system call
> instruction, since it's easier to implement a hardware simulator that way - I
> can get the function type simply by decoding the instruction rather than
> reading some register.

even if you have special instructions
in your emulator i don't see why you
cannot implement the syscall api
(actually that seems simpler and more
correct to me than putting random special
instructions all over the place)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 14:59     ` Szabolcs Nagy
@ 2012-08-10 15:40       ` Murali Vijayaraghavan
  2012-08-10 17:59         ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Murali Vijayaraghavan @ 2012-08-10 15:40 UTC (permalink / raw)
  To: musl

On Fri, Aug 10, 2012 at 11:59 PM, Szabolcs Nagy <nsz@port70.net> wrote:

> * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> > For example, I could have implemented src/stdio/__stdio_read.c using
> > src/unistd/readv.c's readv function instead of calling
> > syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
> > the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
> > C standard library, namely stdio functions like scanf eventually use the
> > unistd functions instead of using the syscall directly?
>
> that's not how it works,
>
> unistd is no more posix than stdio
> they are all part of the posix api
>
> stdio functions are also defined by the
> c standard so in this sense it's good
> that the stdio implementation does not
> depend on the larger posix api
> (it only depends on the syscall api)
>
> but yes otherwise stdio could use unistd
> functions and then it would be a bit
> slower (+1 call) and +1 symbol resolution
> during linking i guess
>

Oh k. I thought one was on top of the other. If they are all supposed to be
part of POSIX, I guess it makes more sense to avoid an extra call.

>
> > This would have made my job easier because I could have just modified this
> > POSIX compatibility layer instead of scanning through the C standard library
> > functions and changing them one by one. Remember I have multiple special
>
> you are not supposed to change the functions
>
> you only need to implement the syscalls
> and dummy out the ones you don't use
> (i.e. have a large switch, with a default: return -ENOSYS;)
>
> if you modify the .c source files you are
> doing it wrong
>
> > instructions to perform each IO task instead of a single system call
> > instruction, since it's easier to implement a hardware simulator that way - I
> > can get the function type simply by decoding the instruction rather than
> > reading some register.
>
> even if you have special instructions
> in your emulator i don't see why you
> cannot implement the syscall api
> (actually that seems simpler and more
> correct to me than putting random special
> instructions all over the place)
>

I suppose I can do this. I was just more familiar with unistd functions'
semantics than the syscall API's. But moving forward, this is more
maintainable. Thanks.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 15:40       ` Murali Vijayaraghavan
@ 2012-08-10 17:59         ` Rich Felker
  2012-08-10 18:40           ` Murali Vijayaraghavan
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2012-08-10 17:59 UTC (permalink / raw)
  To: musl

On Sat, Aug 11, 2012 at 12:40:25AM +0900, Murali Vijayaraghavan wrote:
> On Fri, Aug 10, 2012 at 11:59 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> 
> > * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> > > For example, I could have implemented src/stdio/__stdio_read.c using
> > > src/unistd/readv.c's readv function instead of calling
> > > syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
> > > the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
> > > C standard library, namely stdio functions like scanf eventually use the
> > > unistd functions instead of using the syscall directly?
> > >
> >
> > that's not how it works,
> >
> > unistd is no more posix than stdio
> > they are all part of the posix api
> >
> > stdio functions are also defined by the
> > c standard so in this sense it's good
> > that the stdio implementation does not
> > depend on the larger posix api
> > (it only depends on the syscall api)
> >
> > but yes otherwise stdio could use unistd
> > functions and then it would be a bit
> > slower (+1 call) and +1 symbol resolution
> > during linking i guess
> >
> 
> Oh k. I thought one was on top of the other. If they are all supposed to be
> part of POSIX, I guess it makes more sense to avoid an extra call.

It's tricky because from a _functionality_ standpoint, stdio is built
on primitives that correspond to the low-level POSIX IO functions in
unistd.h, but from a _standards_ standpoint, POSIX is built on top of
plain ISO C and not the other way around.

To understand why stdio functions cannot call read() or write() (or
readv or writev), consider the following conforming C program:

#include <stdio.h>
#include <stdlib.h>	/* for exit() */
int read(void)
{
	int c = getchar();
	if (c==EOF) exit(0);
	return c;
}
int main(void)
{
	for (;;) printf("got '%c'\n", read());
}

If getchar internally called read, you'd have infinite mutual
recursion; even if this weren't a problem, the _semantics_ of the
application-provided function named "read" do not match the POSIX
semantics, so it would break.

Even if there weren't this namespace problem with using the unistd
functions, there are also semantic issues. Many of the syscalls made
from stdio (open, close, ...) are cancellation points per POSIX, and
often the cancellation behavior is undesirable in stdio. Just not
invoking the cancellable version is cheaper than wrapping the call
with code to change the cancellability status before and after the
call.
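
For comparison, this is roughly what the more expensive wrapping
approach looks like when written against the public POSIX API. It is
only a sketch of the alternative being avoided, not code from musl, and
the name close_nocancel is hypothetical.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical wrapper: call close(), which POSIX lists as a
 * cancellation point, with cancellation disabled, then restore
 * whatever state the caller had before. */
int close_nocancel(int fd)
{
	int old_state, ignored, ret;

	pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
	ret = close(fd);
	pthread_setcancelstate(old_state, &ignored);
	return ret;
}
```

Every call pays for two extra state changes, which is the overhead that
invoking a non-cancellable syscall directly avoids.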

> > > This would have made my job easier because I could have just modified this
> > > POSIX compatibility layer instead of scanning through the C standard library
> > > functions and changing them one by one. Remember I have multiple special
> >
> > you are not supposed to change the functions
> >
> > you only need to implement the syscalls
> > and dummy out the ones you don't use
> > > (i.e. have a large switch, with a default: return -ENOSYS;)

I would do it this way:

#define __syscall0(n) __syscall_##n()
#define __syscall1(n,a) __syscall_##n(a)
...

Then __syscall(SYS_exit, val) expands to __syscall_SYS_exit(val), and
as long as you implement a function __syscall_SYS_exit with the proper
semantics, everything will work as expected.
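
A minimal sketch of this token-pasting scheme, using hypothetical demo_*
names instead of the reserved __ prefix. The write and exit bodies are
stand-ins; a real port would issue the simulator's custom instructions
there. Because the macro argument is an operand of ##, it is not
macro-expanded first, so SYS_write is pasted as a name rather than as a
number.

```c
#include <stddef.h>

/* Token pasting glues the syscall name onto a fixed prefix. */
#define demo_syscall1(n, a)       demo_syscall_##n(a)
#define demo_syscall3(n, a, b, c) demo_syscall_##n(a, b, c)

static long demo_last_exit = -1;

/* One ordinary function per supported syscall (hypothetical bodies). */
static long demo_syscall_SYS_write(long fd, const void *buf, size_t len)
{
	(void)fd; (void)buf;
	return (long)len;	/* pretend every byte was written */
}

static long demo_syscall_SYS_exit(long code)
{
	demo_last_exit = code;	/* a real port would halt the simulator */
	return 0;
}

long demo_run(void)
{
	long n = demo_syscall3(SYS_write, 1, "hi\n", 3);

	demo_syscall1(SYS_exit, 7);
	/* demo_syscall1(SYS_fork, 0) would fail to build, because no
	 * function named demo_syscall_SYS_fork exists. */
	return n;
}
```

An unimplemented syscall therefore shows up at build time rather than as
a runtime -ENOSYS.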

Of course another possible design for musl would have been to do this
all the other way around: for each syscall foo, making a function
__syscall_foo and using that for all the internals rather than using
syscall(SYS_foo, ...). I chose the latter, however, because it's closer
to the (de facto) standard way you'd use syscalls from an application,
and because it better facilitates expanding the syscall inline (which
usually reduces code size quite a bit; it's irrelevant to performance,
of course, since syscall time is dominated by the overhead of entering
and exiting kernelspace or by the actual work done in kernelspace).

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Using unistd functions vs calling syscall straight in the code
  2012-08-10 17:59         ` Rich Felker
@ 2012-08-10 18:40           ` Murali Vijayaraghavan
  0 siblings, 0 replies; 7+ messages in thread
From: Murali Vijayaraghavan @ 2012-08-10 18:40 UTC (permalink / raw)
  To: musl

On Sat, Aug 11, 2012 at 2:59 AM, Rich Felker <dalias@aerifal.cx> wrote:

> On Sat, Aug 11, 2012 at 12:40:25AM +0900, Murali Vijayaraghavan wrote:
> > On Fri, Aug 10, 2012 at 11:59 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> >
> > > * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> > > > For example, I could have implemented src/stdio/__stdio_read.c using
> > > > src/unistd/readv.c's readv function instead of calling
> > > > syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
> > > > the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
> > > > C standard library, namely stdio functions like scanf eventually use the
> > > > unistd functions instead of using the syscall directly?
> > > >
> > >
> > > that's not how it works,
> > >
> > > unistd is no more posix than stdio
> > > they are all part of the posix api
> > >
> > > stdio functions are also defined by the
> > > c standard so in this sense it's good
> > > that the stdio implementation does not
> > > depend on the larger posix api
> > > (it only depends on the syscall api)
> > >
> > > but yes otherwise stdio could use unistd
> > > functions and then it would be a bit
> > > slower (+1 call) and +1 symbol resolution
> > > during linking i guess
> > >
> >
> > Oh k. I thought one was on top of the other. If they are all supposed to be
> > part of POSIX, I guess it makes more sense to avoid an extra call.
>
> It's tricky because from a _functionality_ standpoint, stdio is built
> on primitives that correspond to the low-level POSIX IO functions in
> unistd.h, but from a _standards_ standpoint, POSIX is built on top of
> plain ISO C and not the other way around.
>
> To understand why stdio functions cannot call read() or write() (or
> readv or writev), consider the following conforming C program:
>
> #include <stdio.h>
> #include <stdlib.h>	/* for exit() */
> int read(void)
> {
>         int c = getchar();
>         if (c==EOF) exit(0);
>         return c;
> }
> int main(void)
> {
>         for (;;) printf("got '%c'\n", read());
> }
>
> If getchar internally called read, you'd have infinite mutual
> recursion; even if this weren't a problem, the _semantics_ of the
> application-provided function named "read" do not match the POSIX
> semantics, so it would break.
>

Hmm, now I understand why I could never get glibc/newlib to use my custom
unistd library. I guess they are also implemented in the musl style, using a
__syscall or some such function. Their code is hard to read, so I couldn't
figure out why my read function wasn't getting called by stdio.


>
> Even if there weren't this namespace problem with using the unistd
> functions, there are also semantic issues. Many of the syscalls made
> from stdio (open, close, ...) are cancellation points per POSIX, and
> often the cancellation behavior is undesirable in stdio. Just not
> invoking the cancellable version is cheaper than wrapping the call
> with code to change the cancellability status before and after the
> call.
>
> > > > This would have made my job easier because I could have just modified this
> > > > POSIX compatibility layer instead of scanning through the C standard library
> > > > functions and changing them one by one. Remember I have multiple special
> > > you are not supposed to change the functions
> > >
> > > you only need to implement the syscalls
> > > and dummy out the ones you don't use
> > > > (i.e. have a large switch, with a default: return -ENOSYS;)
>
> I would do it this way:
>
> #define __syscall0(n) __syscall_##n()
> #define __syscall1(n,a) __syscall_##n(a)
> ...
>
> Then __syscall(SYS_exit, val) expands to __syscall_SYS_exit(val), and
> as long as you implement a function __syscall_SYS_exit with the proper
> semantics, everything will work as expected.
>
> Of course another possible design for musl would have been to do this
> all the other way around: for each syscall foo, making a function
> __syscall_foo and using that for all the internals rather than using
> syscall(SYS_foo, ...). I chose the latter, however, because it's closer
> to the (de facto) standard way you'd use syscalls from an application,
> and because it better facilitates expanding the syscall inline (which
> usually reduces code size quite a bit; it's irrelevant to performance,
> of course, since syscall time is dominated by the overhead of entering
> and exiting kernelspace or by the actual work done in kernelspace).
>

This seems to be a nice way to do things. It also gives a compile error on
unimplemented syscalls, which is desirable for me. Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread
