On Sat, Aug 11, 2012 at 2:59 AM, Rich Felker <dalias@aerifal.cx> wrote:

> On Sat, Aug 11, 2012 at 12:40:25AM +0900, Murali Vijayaraghavan wrote:
> > On Fri, Aug 10, 2012 at 11:59 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> >
> > > * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> > > > For example, I could have implemented src/stdio/__stdio_read.c using
> > > > src/unistd/readv.c's readv function instead of calling
> > > > syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe
> > > > unistd is the POSIX compatibility layer (correct me if I am wrong).
> > > > So shouldn't the C standard library, namely stdio functions like
> > > > scanf, eventually use the unistd functions instead of using the
> > > > syscall directly?
> > > >
> > >
> > > that's not how it works,
> > >
> > > unistd is no more posix than stdio
> > > they are all part of the posix api
> > >
> > > stdio functions are also defined by the
> > > c standard so in this sense it's good
> > > that the stdio implementation does not
> > > depend on the larger posix api
> > > (it only depends on the syscall api)
> > >
> > > but yes otherwise stdio could use unistd
> > > functions and then it would be a bit
> > > slower (+1 call) and +1 symbol resolution
> > > during linking i guess
> > >
> >
> > Oh k. I thought one was on top of the other. If they are all supposed
> > to be part of POSIX, I guess it makes more sense to avoid an extra call.
>
> It's tricky because from a _functionality_ standpoint, stdio is built
> on primitives that correspond to the low-level POSIX IO functions in
> unistd.h, but from a _standards_ standpoint, POSIX is built on top of
> plain ISO C and not the other way around.
>
> To understand why stdio functions cannot call read() or write() (or
> readv or writev), consider the following conforming C program:
>
> #include <stdio.h>
> #include <stdlib.h>
> int read()
> {
>         int c = getchar();
>         if (c==EOF) exit(0);
>         return c;
> }
> int main()
> {
>         for (;;) printf("got '%c'\n", read());
> }
>
> If getchar internally called read, you'd have infinite mutual
> recursion; even if this weren't a problem, the _semantics_ of the
> application-provided function named "read" do not match the POSIX
> semantics, so it would break.
>

Hmm, now I understand why I could never get glibc/newlib to use my custom
unistd library. I guess they are also implemented musl-style, using a
__syscall or some such function. Their code is hard to read, so I couldn't
figure out why my read function wasn't getting called by stdio.


>
> Even if there weren't this namespace problem with using the unistd
> functions, there are also semantic issues. Many of the syscalls made
> from stdio (open, close, ...) are cancellation points per POSIX, and
> often the cancellation behavior is undesirable in stdio. Just not
> invoking the cancellable version is cheaper than wrapping the call
> with code to change the cancellability status before and after the
> call.
>
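To make the cost comparison concrete, here is a minimal sketch of the
wrapping approach described above, i.e. the more expensive alternative, not
what musl does. The helper name read_cancel_disabled is hypothetical;
pthread_setcancelstate and read are the standard POSIX calls.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper: call the cancellable read(), but disable
 * cancellation around it so the call cannot act as a cancellation point. */
ssize_t read_cancel_disabled(int fd, void *buf, size_t len)
{
    int old_state, ignored;
    ssize_t ret;

    /* Extra work before the call: save the state, disable cancellation. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    ret = read(fd, buf, len);
    /* Extra work after the call: restore the caller's previous state. */
    pthread_setcancelstate(old_state, &ignored);

    return ret;
}
```

Invoking a non-cancellable syscall directly avoids both of the extra
pthread_setcancelstate calls.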
> > > > This would have made my job easier because I could have just modified
> > > > this POSIX compatibility layer instead of scanning through the C
> > > > standard library functions and changing them one by one. Remember I
> > > > have multiple special
> > >
> > > you are not supposed to change the functions
> > >
> > > you only need to implement the syscalls
> > > and dummy out the ones you don't use
> > > (i.e. have a large switch with a default: return -ENOSYS;)
>
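A minimal sketch of the "large switch" idea, assuming a made-up target. The
syscall numbers (DEMO_SYS_*), the dispatcher name (demo_syscall) and the two
stub implementations are all hypothetical and exist only for illustration.

```c
#include <errno.h>

/* Hypothetical syscall numbers for an imaginary target. */
enum { DEMO_SYS_write = 1, DEMO_SYS_getpid = 2, DEMO_SYS_fork = 3 };

/* Stub implementations of the syscalls this target supports. */
static long demo_write(long fd, long buf, long len)
{
    (void)fd;
    (void)buf;
    return len;            /* pretend every byte was written */
}

static long demo_getpid(void)
{
    return 1;              /* single-process system */
}

/* Single entry point: dispatch on the syscall number and
 * report every unimplemented syscall as -ENOSYS. */
long demo_syscall(long n, long a, long b, long c)
{
    switch (n) {
    case DEMO_SYS_write:  return demo_write(a, b, c);
    case DEMO_SYS_getpid: return demo_getpid();
    default:              return -ENOSYS;
    }
}
```

With this shape, an unsupported call such as DEMO_SYS_fork fails at run time
with -ENOSYS rather than at build time.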
> I would do it this way:
>
> #define __syscall0(n) __syscall_##n()
> #define __syscall1(n,a) __syscall_##n(a)
> ...
>
> Then __syscall(SYS_exit, val) expands to __syscall_SYS_exit(val), and
> as long as you implement a function __syscall_SYS_exit with the proper
> semantics, everything will work as expected.
>
> Of course another possible design for musl would have been to do this
> all the other way around: for each syscall foo, making a function
> __syscall_foo and using that for all the internals rather than using
> syscall(SYS_foo, ...). I chose the latter, however, because it's closer
> to the (de facto) standard way you'd use syscalls from an application,
> and because it better facilitates expanding the syscall inline (which
> usually reduces code size quite a bit; it's irrelevant to performance,
> of course, since syscall time is dominated by the overhead of
> entering/exiting kernelspace or doing the actual work in kernelspace).
>

This seems to be a nice way to do things. It also gives a build-time error
on unimplemented syscalls, which is desirable for me. Thanks!
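
A minimal sketch of the token-pasting dispatch, to check my understanding.
The paste_syscall* names and the SYS_foo/SYS_bar numbers are hypothetical; a
real port would use the libc's own macro names and the target's syscall
numbers.

```c
/* Hypothetical syscall numbers. They are macros on purpose: an operand
 * of ## is not macro-expanded, so the name is pasted, not the number. */
#define SYS_foo 1
#define SYS_bar 2

/* Paste the syscall name onto a prefix to select a per-syscall function. */
#define paste_syscall0(n)    paste_syscall_##n()
#define paste_syscall1(n, a) paste_syscall_##n(a)

/* One function per implemented syscall. */
static long paste_syscall_SYS_foo(void)
{
    return 42;
}

static long paste_syscall_SYS_bar(long x)
{
    return x + 1;
}

/* paste_syscall0(SYS_baz) would not build, because no function named
 * paste_syscall_SYS_baz exists. */
```

One caveat: if the call first passes through an intermediate macro that does
not use ##, the argument is expanded to its number before it reaches the
pasting macro, so the pasting has to happen at the outermost macro.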