From mboxrd@z Thu Jan 1 00:00:00 1970
From: Murali Vijayaraghavan
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Using unistd functions vs calling syscall straight in the code
Date: Sat, 11 Aug 2012 03:40:45 +0900
References: <20120810141613.GA20243@port70.net> <20120810145923.GB20243@port70.net> <20120810175910.GZ27715@brightrain.aerifal.cx>
In-Reply-To: <20120810175910.GZ27715@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Xref: news.gmane.org gmane.linux.lib.musl.general:1510
Mime-Version: 1.0
Content-Type: multipart/alternative; boundary=f46d041825bc986f7604c6edaefd

--f46d041825bc986f7604c6edaefd
Content-Type: text/plain; charset=ISO-8859-1

On Sat, Aug 11, 2012 at 2:59 AM, Rich Felker <dalias@aerifal.cx> wrote:
On Sat, Aug 11, 2012 at 12:40:25AM +0900, Murali Vijayaraghavan wrote:
> On Fri, Aug 10, 2012 at 11:59 PM, Szabolcs Nagy <nsz@port70.net> wrote:
>
> > * Murali Vijayaraghavan <vmurali@csail.mit.edu> [2012-08-10 23:32:11 +0900]:
> > > For example, I could have implemented src/stdio/__stdio_read.c using
> > > src/unistd/readv.c's readv function instead of calling
> > > syscall/syscall_cp(SYS_readv, ...) in lines 20 and 24. I believe unistd is
> > > the POSIX compatibility layer (correct me if I am wrong). So shouldn't the
> > > C standard library, namely stdio functions like scanf eventually use the
> > > unistd functions instead of using the syscall directly?
> > >
> >
> > that's not how it works,
> >
> > unistd is no more posix than stdio
> > they are all part of the posix api
> >
> > stdio functions are also defined by the
> > c standard so in this sense it's good
> > that the stdio implementation does not
> > depend on the larger posix api
> > (it only depends on the syscall api)
> >
> > but yes otherwise stdio could use unistd
> > functions and then it would be a bit
> > slower (+1 call) and +1 symbol resolution
> > during linking i guess
> >
>
> Oh k. I thought one was on top of the other. If they are all supposed to be
> part of POSIX, I guess it makes more sense to avoid an extra call.

It's tricky because from a _functionality_ standpoint, stdio is built
on primitives that correspond to the low-level POSIX IO functions in
unistd.h, but from a _standards_ standpoint, POSIX is built on top of
plain ISO C and not the other way around.

To understand why stdio functions cannot call read() or write() (or
readv or writev), consider the following conforming C program:

#include <stdio.h>
#include <stdlib.h>  /* for exit() */
int read(void)
{
        int c = getchar();
        if (c==EOF) exit(0);
        return c;
}
int main(void)
{
        for (;;) printf("got '%c'\n", read());
}

If getchar internally called read, you'd have infinite mutual
recursion; even if this weren't a problem, the _semantics_ of the
application-provided function named "read" do not match the POSIX
semantics, so it would break.
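
A minimal sketch of the alternative, assuming Linux and the syscall() wrapper declared in <unistd.h>: the refill step goes straight to SYS_read, so a function named read defined by the application is never involved. The names mini_file and mini_getc are made up for illustration and are not musl's actual internals.

```c
#define _GNU_SOURCE       /* makes glibc declare syscall() */
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_read */
#include <unistd.h>       /* syscall() */

/* Hypothetical, heavily simplified stand-in for a FILE object. */
struct mini_file {
    int fd;
    unsigned char buf[64];
    size_t pos, len;
};

/* Return the next byte, or -1 on end of file or error. */
int mini_getc(struct mini_file *f)
{
    if (f->pos == f->len) {
        /* Refill via the raw syscall; the public read() is never called. */
        long n = syscall(SYS_read, f->fd, f->buf, sizeof f->buf);
        if (n <= 0) return -1;
        f->pos = 0;
        f->len = (size_t)n;
    }
    return f->buf[f->pos++];
}
```

Given the read end of a pipe that holds "ab", two calls to mini_getc return 'a' and 'b', and the next call returns -1 once the write end has been closed.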

Hmm, now I understand why I could never get glibc/newlib to use my custom
unistd library. I guess they are also implemented a la musl, using a
__syscall or some such function. Their code is hard to read, so I couldn't
figure out why my read function wasn't getting called by stdio.

Even if there weren't this namespace problem with using the unistd
functions, there are also semantic issues. Many of the syscalls made
from stdio (open, close, ...) are cancellation points per POSIX, and
often the cancellation behavior is undesirable in stdio. Just not
invoking the cancellable version is cheaper than wrapping the call
with code to change the cancellability status before and after the
call.
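
For comparison, the wrapping approach described above might look like the following sketch, assuming a POSIX threads implementation. read_uncancellable is a hypothetical name, not a musl function; the point is only that two extra calls surround every read.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper: read() with cancellation masked around the call. */
ssize_t read_uncancellable(int fd, void *buf, size_t len)
{
    int old, ignored;

    /* Disable cancellation and remember the caller's previous state. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old);
    ssize_t n = read(fd, buf, len);  /* cannot be cancelled while disabled */
    /* Restore whatever state the caller had before. */
    pthread_setcancelstate(old, &ignored);
    return n;
}
```

Calling the non-cancellable syscall directly avoids both pthread_setcancelstate calls, which is the cheaper option described above.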

> > > This would have made my job easier because I could have just modified this
> > > POSIX compatibility layer instead of scanning through the C standard library
> > > functions and changing them one by one. Remember I have multiple special
> >
> > you are not supposed to change the functions
> >
> > you only need to implement the syscalls
> > and dummy out the ones you don't use
> > (i.e. have a large switch, with a default: return -ENOSYS;)
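
A rough sketch of the switch-based dispatcher suggested in the quoted text, under the assumption of a custom target with its own syscall numbers. MY_SYS_write, MY_SYS_getpid, my_write and my_syscall are all hypothetical names, and the in-memory "console" only stands in for a real backend.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical syscall numbers for a custom target. */
enum { MY_SYS_write = 1, MY_SYS_getpid = 2 };

/* Hypothetical backend: "console" output collected in memory. */
static char console[256];
static size_t console_len;

static long my_write(const void *buf, size_t len)
{
    if (len > sizeof console - console_len) return -ENOSPC;
    memcpy(console + console_len, buf, len);
    console_len += len;
    return (long)len;
}

/* Single entry point: implemented calls are routed, the rest are dummied out. */
long my_syscall(long n, long a, long b, long c)
{
    switch (n) {
    case MY_SYS_write:  (void)a; return my_write((const void *)b, (size_t)c);
    case MY_SYS_getpid: return 1;
    default:            return -ENOSYS;
    }
}
```

With this layout an unsupported syscall is only noticed at run time, when the caller sees -ENOSYS.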

I would do it this way:

#define __syscall0(n) __syscall_##n()
#define __syscall1(n,a) __syscall_##n(a)
...

Then __syscall(SYS_exit, val) expands to __syscall_SYS_exit(val), and
as long as you implement a function __syscall_SYS_exit with the proper
semantics, everything will work as expected.
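
A small self-contained sketch of the token-pasting idea, using a my_ prefix instead of musl's reserved names. Every identifier here (my_syscall0, my_syscall1, my_syscall_SYS_getpid, my_syscall_SYS_close, close_fd, current_pid) is made up for illustration. Because n is a direct operand of ##, it is pasted as written and not macro-expanded first, even if SYS_close happens to be defined as a number.

```c
#include <errno.h>

/* Hypothetical dispatch macros: paste the syscall name onto a prefix. */
#define my_syscall0(n)    my_syscall_##n()
#define my_syscall1(n, a) my_syscall_##n(a)

/* Only the syscalls that the target supports get a definition. */
long my_syscall_SYS_getpid(void) { return 1; }
long my_syscall_SYS_close(long fd) { return fd >= 0 ? 0 : -EBADF; }

/* Library-side code is written against the macros. */
long close_fd(long fd) { return my_syscall1(SYS_close, fd); }
long current_pid(void) { return my_syscall0(SYS_getpid); }
```

In this sketch a use such as my_syscall0(SYS_fork) names a function that was never defined, so the build fails instead of the call returning -ENOSYS at run time.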

Of course another possible design for musl would have been to do this
all the other way around: for each syscall foo, making a function
__syscall_foo and using that for all the internals rather than using
syscall(SYS_foo, ...). I chose the latter however because it's closer
to the (de facto) standard way you'd use syscalls from an application,
and because it better facilitates expanding the syscall inline (which
usually reduces code size quite a bit; it's irrelevant to performance
of course since syscall time is dominated by overhead entering/exiting
kernelspace or doing the actual work in kernelspace).

This seems to be a nice way to do things. It also gives a compile error on
unimplemented syscalls, which is desirable for me. Thanks!
--f46d041825bc986f7604c6edaefd--