mailing list of musl libc
 help / color / mirror / code / Atom feed
* Deduplicating __NR_* and SYS_* syscall defines
@ 2016-05-10  0:26 Bobby Bingham
  2016-05-10  7:34 ` Alexander Monakov
  2016-05-11 21:53 ` Rich Felker
  0 siblings, 2 replies; 9+ messages in thread
From: Bobby Bingham @ 2016-05-10  0:26 UTC (permalink / raw)
  To: musl

During the powerpc64 review, Rich mentioned wanting to replace the
arch/*/bits/syscall.h files with .in files that would be preprocessed with
something like:

    sed -e p -e s/__NR_/SYS_/ < $< > $@

This would eliminate a lot of intra-file duplication here.

I took a look, and this won't quite work as-is, because the following
lines in the arm version would end up outputting duplicate definitions:

    #define __ARM_NR_breakpoint	0x0f0001
    #define __ARM_NR_cacheflush	0x0f0002
    #define __ARM_NR_usr26		0x0f0003
    #define __ARM_NR_usr32		0x0f0004
    #define __ARM_NR_set_tls	0x0f0005

Same thing for this line in x32:

    #define __X32_SYSCALL_BIT        0x40000000

I'm thinking something like the following awk script would work:

    {
        print
    }

    $1 ~ /^#(define|undef)$/ && $2 ~ /^__NR_/ {
        sub(/__NR_/, "SYS_", $2)
        print
    }

The handling for #undef is for the x32 file.  It looks like only the
`#undef __NR_getdents' in that file is actually necessary, and even that
could be avoided by just omitting the earlier line:

    #define __NR_getdents (__X32_SYSCALL_BIT + 78)

So maybe we can get rid of the #undefs there, and simplify the awk script
accordingly.

Thoughts on this approach?  If this sounds ok, I'll submit a patch.

--
Bobby


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-10  0:26 Deduplicating __NR_* and SYS_* syscall defines Bobby Bingham
@ 2016-05-10  7:34 ` Alexander Monakov
  2016-05-10  8:47   ` Alexander Monakov
  2016-05-11 21:53 ` Rich Felker
  1 sibling, 1 reply; 9+ messages in thread
From: Alexander Monakov @ 2016-05-10  7:34 UTC (permalink / raw)
  To: musl

On Mon, 9 May 2016, Bobby Bingham wrote:
> The handling for #undef is for the x32 file.  It looks like only the
> `#undef __NR_getdents' in that file is actually necessary, and even that
> could be avoided by just omitting the earlier line:
> 
>     #define __NR_getdents (__X32_SYSCALL_BIT + 78)
> 
> So maybe we can get rid of the #undefs there, and simplify the awk script
> accordingly.

+1, seems pointless to define something only to undefine a few lines later.
And the inconsistency between undefs (the other three undef something that
wasn't even defined earlier) is puzzling.  Finally, __NR_fadvise is not
undef'ed before definition like the other four.

> Thoughts on this approach?  If this sounds ok, I'll submit a patch.

It's possible to avoid interleaving __NR_ and SYS_ definitions and keep
current structure mostly intact:

    cp $< $@
    sed -e /^#define/s/__NR_/SYS_/p < $< >> $@

(bits/syscall.h is not included on its own, so it lacks include guards)

Thanks for looking into this!

Alexander


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-10  7:34 ` Alexander Monakov
@ 2016-05-10  8:47   ` Alexander Monakov
  0 siblings, 0 replies; 9+ messages in thread
From: Alexander Monakov @ 2016-05-10  8:47 UTC (permalink / raw)
  To: musl

On Tue, 10 May 2016, Alexander Monakov wrote:
> It's possible to avoid interleaving __NR_ and SYS_ definitions and keep
> current structure mostly intact:
> 
>     cp $< $@
>     sed -e /^#define/s/__NR_/SYS_/p < $< >> $@

    sed -n -e ...

Alexander


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-10  0:26 Deduplicating __NR_* and SYS_* syscall defines Bobby Bingham
  2016-05-10  7:34 ` Alexander Monakov
@ 2016-05-11 21:53 ` Rich Felker
  2016-05-12  0:22   ` Bobby Bingham
  1 sibling, 1 reply; 9+ messages in thread
From: Rich Felker @ 2016-05-11 21:53 UTC (permalink / raw)
  To: musl

On Mon, May 09, 2016 at 07:26:37PM -0500, Bobby Bingham wrote:
> During the powerpc64 review, Rich mentioned wanting to replace the
> arch/*/bits/syscall.h files with .in files that would be preprocessed with
> something like:
> 
>     sed -e p -e s/__NR_/SYS_/ < $< > $@
> 
> This would eliminate a lot of intra-file duplication here.
> 
> I took a look, and this won't quite work as-is, because the following
> lines in the arm version would end up outputting duplicate definitions:
> 
>     #define __ARM_NR_breakpoint	0x0f0001
>     #define __ARM_NR_cacheflush	0x0f0002
>     #define __ARM_NR_usr26		0x0f0003
>     #define __ARM_NR_usr32		0x0f0004
>     #define __ARM_NR_set_tls	0x0f0005

This is easily fixed by something like:

sed -e /__NR_/p -e s/__NR_/SYS_/ < $< > $@

> Same thing for this line in x32:
> 
>     #define __X32_SYSCALL_BIT        0x40000000

In general we've tried to eliminate this sort of macro and direct-code
the values. I would be in favor of doing the same for x32 I think. But
with my fixed sed command (above) I think that change is unnecssary
and orthogonal to the deduplication.

> I'm thinking something like the following awk script would work:
> 
>     {
>         print
>     }
> 
>     $1 ~ /^#(define|undef)$/ && $2 ~ /^__NR_/ {
>         sub(/__NR_/, "SYS_", $2)
>         print
>     }
> 
> The handling for #undef is for the x32 file.  It looks like only the
> `#undef __NR_getdents' in that file is actually necessary, and even that
> could be avoided by just omitting the earlier line:
> 
>     #define __NR_getdents (__X32_SYSCALL_BIT + 78)

I don't see why any #undef is needed here; this looks like leftover
cruft that was not properly cleaned up. All the logic for replacing
syscall numbers belongs in src/internal/syscall.h or
arch/$ARCH/syscall_arch.h, I think.

> So maybe we can get rid of the #undefs there, and simplify the awk script
> accordingly.
> 
> Thoughts on this approach?  If this sounds ok, I'll submit a patch.

I'd rather use sed than awk if possible since it's more universally
available and understood.

Rich


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-11 21:53 ` Rich Felker
@ 2016-05-12  0:22   ` Bobby Bingham
  2016-05-12  0:57     ` Rich Felker
  0 siblings, 1 reply; 9+ messages in thread
From: Bobby Bingham @ 2016-05-12  0:22 UTC (permalink / raw)
  To: musl

On Wed, May 11, 2016 at 05:53:12PM -0400, Rich Felker wrote:
> On Mon, May 09, 2016 at 07:26:37PM -0500, Bobby Bingham wrote:
> > During the powerpc64 review, Rich mentioned wanting to replace the
> > arch/*/bits/syscall.h files with .in files that would be preprocessed with
> > something like:
> > 
> >     sed -e p -e s/__NR_/SYS_/ < $< > $@
> > 
> > This would eliminate a lot of intra-file duplication here.
> > 
> > I took a look, and this won't quite work as-is, because the following
> > lines in the arm version would end up outputting duplicate definitions:
> > 
> >     #define __ARM_NR_breakpoint	0x0f0001
> >     #define __ARM_NR_cacheflush	0x0f0002
> >     #define __ARM_NR_usr26		0x0f0003
> >     #define __ARM_NR_usr32		0x0f0004
> >     #define __ARM_NR_set_tls	0x0f0005
> 
> This is easily fixed by something like:
> 
> sed -e /__NR_/p -e s/__NR_/SYS_/ < $< > $@

Neat.  I didn't know about /p.

Any objection to using Alexander's approach to avoid interleaving the
__NR_* and SYS_* lines?

> 
> > Same thing for this line in x32:
> > 
> >     #define __X32_SYSCALL_BIT        0x40000000
> 
> In general we've tried to eliminate this sort of macro and direct-code
> the values. I would be in favor of doing the same for x32 I think. But
> with my fixed sed command (above) I think that change is unnecssary
> and orthogonal to the deduplication.

I'll submit a separate patch to clean this up.

> 
> > I'm thinking something like the following awk script would work:
> > 
> >     {
> >         print
> >     }
> > 
> >     $1 ~ /^#(define|undef)$/ && $2 ~ /^__NR_/ {
> >         sub(/__NR_/, "SYS_", $2)
> >         print
> >     }
> > 
> > The handling for #undef is for the x32 file.  It looks like only the
> > `#undef __NR_getdents' in that file is actually necessary, and even that
> > could be avoided by just omitting the earlier line:
> > 
> >     #define __NR_getdents (__X32_SYSCALL_BIT + 78)
> 
> I don't see why any #undef is needed here; this looks like leftover
> cruft that was not properly cleaned up. All the logic for replacing
> syscall numbers belongs in src/internal/syscall.h or
> arch/$ARCH/syscall_arch.h, I think.

Ok.

> 
> > So maybe we can get rid of the #undefs there, and simplify the awk script
> > accordingly.
> > 
> > Thoughts on this approach?  If this sounds ok, I'll submit a patch.
> 
> I'd rather use sed than awk if possible since it's more universally
> available and understood.

I must be the exception then :)

Bobby


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-12  0:22   ` Bobby Bingham
@ 2016-05-12  0:57     ` Rich Felker
  2016-05-12  3:56       ` Bobby Bingham
  2016-05-13 10:15       ` Szabolcs Nagy
  0 siblings, 2 replies; 9+ messages in thread
From: Rich Felker @ 2016-05-12  0:57 UTC (permalink / raw)
  To: musl

On Wed, May 11, 2016 at 07:22:30PM -0500, Bobby Bingham wrote:
> On Wed, May 11, 2016 at 05:53:12PM -0400, Rich Felker wrote:
> > On Mon, May 09, 2016 at 07:26:37PM -0500, Bobby Bingham wrote:
> > > During the powerpc64 review, Rich mentioned wanting to replace the
> > > arch/*/bits/syscall.h files with .in files that would be preprocessed with
> > > something like:
> > > 
> > >     sed -e p -e s/__NR_/SYS_/ < $< > $@
> > > 
> > > This would eliminate a lot of intra-file duplication here.
> > > 
> > > I took a look, and this won't quite work as-is, because the following
> > > lines in the arm version would end up outputting duplicate definitions:
> > > 
> > >     #define __ARM_NR_breakpoint	0x0f0001
> > >     #define __ARM_NR_cacheflush	0x0f0002
> > >     #define __ARM_NR_usr26		0x0f0003
> > >     #define __ARM_NR_usr32		0x0f0004
> > >     #define __ARM_NR_set_tls	0x0f0005
> > 
> > This is easily fixed by something like:
> > 
> > sed -e /__NR_/p -e s/__NR_/SYS_/ < $< > $@
> 
> Neat.  I didn't know about /p.

The p command just prints the pattern space. The trick is that we
print it an extra time before the s command, but only if the line
matches something the s command is going to change.

> Any objection to using Alexander's approach to avoid interleaving the
> __NR_* and SYS_* lines?

I'm indifferent to the interleaving, but if we take that approach, we
should make sure that the rules are written such that interrupting the
make process between the commands doesn't leave a partial file that
subsequent runs of make think is complete. It might be ok as-is if
make automatically deletes the target on error producing it; otherwise
we might need a temp file that's moved into place at the end. I always
forge how this aspect of make works...

> > > Same thing for this line in x32:
> > > 
> > >     #define __X32_SYSCALL_BIT        0x40000000
> > 
> > In general we've tried to eliminate this sort of macro and direct-code
> > the values. I would be in favor of doing the same for x32 I think. But
> > with my fixed sed command (above) I think that change is unnecssary
> > and orthogonal to the deduplication.
> 
> I'll submit a separate patch to clean this up.

Any thoughts on how it should be done? If it were a clean decimal
constant like on mips I'd just write each as a single integer literal
(e.g. 6001, etc.) but since the syscall numbers are normally thought
of as decimal whereas the x32 offset is hex/bit value, it seems + or |
is still needed.

> > > I'm thinking something like the following awk script would work:
> > > 
> > >     {
> > >         print
> > >     }
> > > 
> > >     $1 ~ /^#(define|undef)$/ && $2 ~ /^__NR_/ {
> > >         sub(/__NR_/, "SYS_", $2)
> > >         print
> > >     }
> > > 
> > > The handling for #undef is for the x32 file.  It looks like only the
> > > `#undef __NR_getdents' in that file is actually necessary, and even that
> > > could be avoided by just omitting the earlier line:
> > > 
> > >     #define __NR_getdents (__X32_SYSCALL_BIT + 78)
> > 
> > I don't see why any #undef is needed here; this looks like leftover
> > cruft that was not properly cleaned up. All the logic for replacing
> > syscall numbers belongs in src/internal/syscall.h or
> > arch/$ARCH/syscall_arch.h, I think.
> 
> Ok.

Can you verify that removing these #undef-and-redefine lines from the
bits header doesn't change the result (i.e. it's redundant with other
files) for building libc?

> > > So maybe we can get rid of the #undefs there, and simplify the awk script
> > > accordingly.
> > > 
> > > Thoughts on this approach?  If this sounds ok, I'll submit a patch.
> > 
> > I'd rather use sed than awk if possible since it's more universally
> > available and understood.
> 
> I must be the exception then :)

:-)

Rich


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-12  0:57     ` Rich Felker
@ 2016-05-12  3:56       ` Bobby Bingham
  2016-05-13 10:15       ` Szabolcs Nagy
  1 sibling, 0 replies; 9+ messages in thread
From: Bobby Bingham @ 2016-05-12  3:56 UTC (permalink / raw)
  To: musl

On Wed, May 11, 2016 at 08:57:12PM -0400, Rich Felker wrote:
> The p command just prints the pattern space. The trick is that we
> print it an extra time before the s command, but only if the line
> matches something the s command is going to change.
> 
> > Any objection to using Alexander's approach to avoid interleaving the
> > __NR_* and SYS_* lines?
> 
> I'm indifferent to the interleaving, but if we take that approach, we
> should make sure that the rules are written such that interrupting the
> make process between the commands doesn't leave a partial file that
> subsequent runs of make think is complete. It might be ok as-is if
> make automatically deletes the target on error producing it; otherwise
> we might need a temp file that's moved into place at the end. I always
> forge how this aspect of make works...

Make does what you want by default.

https://www.gnu.org/software/make/manual/html_node/Interrupts.html

> 
> > > > Same thing for this line in x32:
> > > > 
> > > >     #define __X32_SYSCALL_BIT        0x40000000
> > > 
> > > In general we've tried to eliminate this sort of macro and direct-code
> > > the values. I would be in favor of doing the same for x32 I think. But
> > > with my fixed sed command (above) I think that change is unnecssary
> > > and orthogonal to the deduplication.
> > 
> > I'll submit a separate patch to clean this up.
> 
> Any thoughts on how it should be done? If it were a clean decimal
> constant like on mips I'd just write each as a single integer literal
> (e.g. 6001, etc.) but since the syscall numbers are normally thought
> of as decimal whereas the x32 offset is hex/bit value, it seems + or |
> is still needed.

I agree.  I was planning to use +.

> 
> > > > I'm thinking something like the following awk script would work:
> > > > 
> > > >     {
> > > >         print
> > > >     }
> > > > 
> > > >     $1 ~ /^#(define|undef)$/ && $2 ~ /^__NR_/ {
> > > >         sub(/__NR_/, "SYS_", $2)
> > > >         print
> > > >     }
> > > > 
> > > > The handling for #undef is for the x32 file.  It looks like only the
> > > > `#undef __NR_getdents' in that file is actually necessary, and even that
> > > > could be avoided by just omitting the earlier line:
> > > > 
> > > >     #define __NR_getdents (__X32_SYSCALL_BIT + 78)
> > > 
> > > I don't see why any #undef is needed here; this looks like leftover
> > > cruft that was not properly cleaned up. All the logic for replacing
> > > syscall numbers belongs in src/internal/syscall.h or
> > > arch/$ARCH/syscall_arch.h, I think.
> > 
> > Ok.
> 
> Can you verify that removing these #undef-and-redefine lines from the
> bits header doesn't change the result (i.e. it's redundant with other
> files) for building libc?

I just verified, the generated code for libc is the same either way.

> 
> > > > So maybe we can get rid of the #undefs there, and simplify the awk script
> > > > accordingly.
> > > > 
> > > > Thoughts on this approach?  If this sounds ok, I'll submit a patch.
> > > 
> > > I'd rather use sed than awk if possible since it's more universally
> > > available and understood.
> > 
> > I must be the exception then :)
> 
> :-)
> 
> Rich


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-12  0:57     ` Rich Felker
  2016-05-12  3:56       ` Bobby Bingham
@ 2016-05-13 10:15       ` Szabolcs Nagy
  2016-05-13 16:02         ` Rich Felker
  1 sibling, 1 reply; 9+ messages in thread
From: Szabolcs Nagy @ 2016-05-13 10:15 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2016-05-11 20:57:12 -0400]:
> On Wed, May 11, 2016 at 07:22:30PM -0500, Bobby Bingham wrote:
> > On Wed, May 11, 2016 at 05:53:12PM -0400, Rich Felker wrote:
> > > On Mon, May 09, 2016 at 07:26:37PM -0500, Bobby Bingham wrote:
> > > > During the powerpc64 review, Rich mentioned wanting to replace the
> > > > arch/*/bits/syscall.h files with .in files that would be preprocessed with
> > > > something like:
> > > > 
> > > >     sed -e p -e s/__NR_/SYS_/ < $< > $@
> > > > 
> > > > This would eliminate a lot of intra-file duplication here.

i don't know how much is it worth to keep the syntax consistent
with kernel headers, i'd just clean these up with special syntax
like TYPEDEF in alltypes e.g.

SYS write 1

with consistent whitespace, removed comments etc.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Deduplicating __NR_* and SYS_* syscall defines
  2016-05-13 10:15       ` Szabolcs Nagy
@ 2016-05-13 16:02         ` Rich Felker
  0 siblings, 0 replies; 9+ messages in thread
From: Rich Felker @ 2016-05-13 16:02 UTC (permalink / raw)
  To: musl

On Fri, May 13, 2016 at 12:15:48PM +0200, Szabolcs Nagy wrote:
> * Rich Felker <dalias@libc.org> [2016-05-11 20:57:12 -0400]:
> > On Wed, May 11, 2016 at 07:22:30PM -0500, Bobby Bingham wrote:
> > > On Wed, May 11, 2016 at 05:53:12PM -0400, Rich Felker wrote:
> > > > On Mon, May 09, 2016 at 07:26:37PM -0500, Bobby Bingham wrote:
> > > > > During the powerpc64 review, Rich mentioned wanting to replace the
> > > > > arch/*/bits/syscall.h files with .in files that would be preprocessed with
> > > > > something like:
> > > > > 
> > > > >     sed -e p -e s/__NR_/SYS_/ < $< > $@
> > > > > 
> > > > > This would eliminate a lot of intra-file duplication here.
> 
> i don't know how much is it worth to keep the syntax consistent
> with kernel headers, i'd just clean these up with special syntax
> like TYPEDEF in alltypes e.g.
> 
> SYS write 1

This does save some space (maybe 25-30% size reduction?) but I think
it also comes at the sense of some flexibility and readability. For
example if there are ever archs where the definitions need to depend
on some compile-time macros (e.g. if we had an arch where it makes
sense to put a couple ABI variants with different syscalls together as
subarchs rather than independent archs) there becomes a question of
how to format those conditions and whether their directives would be
misinterpreted. Having #define on every line so that the file as-is is
valid preprocessor input avoids any such issues.

Of course these things are also a reason to possibly prefer the form
with SYS_ and __NR_ interleaved (my original sed command). With them
separate like the current proposed patch, anything inside a
conditional block or dependent on interleaved redefinition of other
macros would break.

> with consistent whitespace, removed comments etc.

Agree on consistent whitespace (and we should do that regardless of
the rest of this proposal), but note that it can't necessarily be just
3 space-delimited tokens. Some archs have arithmetic expressions for
the integer field, and requiring that not to contain whitespace would
be rather arbitrary and confusing, so the last field would probably
need to be the whole remainder of the line.

Rich


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-05-13 16:02 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-05-10  0:26 Deduplicating __NR_* and SYS_* syscall defines Bobby Bingham
2016-05-10  7:34 ` Alexander Monakov
2016-05-10  8:47   ` Alexander Monakov
2016-05-11 21:53 ` Rich Felker
2016-05-12  0:22   ` Bobby Bingham
2016-05-12  0:57     ` Rich Felker
2016-05-12  3:56       ` Bobby Bingham
2016-05-13 10:15       ` Szabolcs Nagy
2016-05-13 16:02         ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).