mailing list of musl libc
* [PATCH v8] Build process uses script to add CFI directives to x86 asm
@ 2015-06-05  8:39 Alex Dowad
  2015-06-14  4:37 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Alex Dowad @ 2015-06-05  8:39 UTC (permalink / raw)
  To: musl

Some functions implemented in asm need to use EBP for purposes other than acting
as a frame pointer. (Notably, it is used for the 6th argument to syscalls with
6 arguments.) Without frame pointers, GDB can only show backtraces if it gets
CFI information from a .debug_frame or .eh_frame ELF section.

Rather than littering our asm with ugly .cfi directives, use an awk script to
insert them in the right places during the build process, so GDB can keep track of
where the current stack frame is relative to the stack pointer. This means GDB can
produce beautiful stack traces at any given point when single-stepping through asm
functions.

Additionally, when registers are saved on the stack and later overwritten, emit
.cfi directives so GDB will know where they were saved relative to the stack
pointer. This way, when you look back up the stack from within an asm function,
you can still reliably print the values of local variables in the caller.

If this awk script were to understand every possible wild and crazy contortion that
an asm programmer can do with the stack and registers, and always emit the exact
.cfi directives needed for GDB to know what the register values were in the
preceding stack frame, it would necessarily be as complex as a full x86 emulator.
That way lies madness.

Hence, we assume that the stack pointer will _only_ ever be adjusted using push/pop
or else add/sub with a constant. We do not attempt to detect every possible way that
a register value could be saved for later use, just the simple and common ways.
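
To make the restricted model above concrete, here is a minimal sketch (not the
script from this patch) that tracks how far %esp has moved since function
entry using only push/pop and add/sub with a decimal constant. The function
name track_sp is hypothetical, and hex constants, 16-bit pushes, labels and
register saves are deliberately left out.

```shell
# Simplified illustration only: print the running stack-pointer offset
# (in bytes, relative to function entry) in front of each input line.
track_sp() {
  awk '
    /^[ \t]*pushl? /             { off += 4 }
    /^[ \t]*popl? /              { off -= 4 }
    /^[ \t]*subl? \$[0-9]+,%esp/ { n = $2; sub(/^\$/, "", n); sub(/,.*/, "", n); off += n }
    /^[ \t]*addl? \$[0-9]+,%esp/ { n = $2; sub(/^\$/, "", n); sub(/,.*/, "", n); off -= n }
    { printf "%d\t%s\n", off, $0 }
  '
}

# Offsets printed for the four instructions: 4, 12, 4, 0
printf 'push %%ebp\nsub $8,%%esp\nadd $8,%%esp\npop %%ebp\n' | track_sp
```

Each change in the offset corresponds to a point where the real script would
emit a .cfi_adjust_cfa_offset directive.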

Thanks to Szabolcs Nagy for suggesting numerous improvements to this code.
---

Dear musl devs,

Fixed one bug. Otherwise everything looks good in testing.

Thanks, AD

 Makefile               |  12 ++-
 configure              |  20 +++++
 tools/add-cfi.i386.awk | 227 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 257 insertions(+), 2 deletions(-)
 create mode 100644 tools/add-cfi.i386.awk

diff --git a/Makefile b/Makefile
index 2eb7b30..9b55fd8 100644
--- a/Makefile
+++ b/Makefile
@@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
 
 %.o: $(ARCH)/%.s
-	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
+ifeq ($(ADD_CFI),yes)
+	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
+else
+	$(CC) $(ASFLAGS) -c -o $@ $<
+endif
 
 %.o: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
@@ -129,7 +133,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $(dir $<)$(shell cat $<)
 
 %.lo: $(ARCH)/%.s
-	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
+ifeq ($(ADD_CFI),yes)
+	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
+else
+	$(CC) $(ASFLAGS) -c -o $@ $<
+endif
 
 %.lo: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
diff --git a/configure b/configure
index 7b29ae4..5d15a2a 100755
--- a/configure
+++ b/configure
@@ -116,6 +116,7 @@ CFLAGS_NOSSP=
 CFLAGS_TRY=
 LDFLAGS_AUTO=
 LDFLAGS_TRY=
+ASFLAGS=
 OPTIMIZE_GLOBS=
 prefix=/usr/local/musl
 exec_prefix='$(prefix)'
@@ -327,6 +328,23 @@ tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
 test "$debug" = yes && CFLAGS_AUTO=-g
 
 #
+# Preprocess asm files to add extra debugging information if debug is
+# enabled, our assembler supports the needed directives, and the
+# preprocessing script has been written for our architecture.
+#
+printf "checking whether we should preprocess assembly to add debugging information... "
+if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
+   test -f "tools/add-cfi.$ARCH.awk" &&
+   echo ".cfi_startproc
+.cfi_endproc" | $CC -x assembler -c -o /dev/null -
+then
+  ADD_CFI=yes
+else
+  ADD_CFI=no
+fi
+printf "%s\n" "$ADD_CFI"
+
+#
 # Possibly add a -O option to CFLAGS and select modules to optimize with
 # -O3 based on the status of --enable-optimize and provided CFLAGS.
 #
@@ -577,9 +595,11 @@ CFLAGS_MEMOPS = $CFLAGS_MEMOPS
 CFLAGS_NOSSP = $CFLAGS_NOSSP
 CPPFLAGS = $CPPFLAGS
 LDFLAGS = $LDFLAGS_AUTO $LDFLAGS
+ASFLAGS = $ASFLAGS
 CROSS_COMPILE = $CROSS_COMPILE
 LIBCC = $LIBCC
 OPTIMIZE_GLOBS = $OPTIMIZE_GLOBS
+ADD_CFI = $ADD_CFI
 EOF
 test "x$static" = xno && echo "STATIC_LIBS ="
 test "x$shared" = xno && echo "SHARED_LIBS ="
diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
new file mode 100644
index 0000000..02686b9
--- /dev/null
+++ b/tools/add-cfi.i386.awk
@@ -0,0 +1,227 @@
+# Insert GAS CFI directives ("call frame information") into x86-32 asm input
+#
+# CFI directives tell the assembler how to generate "stack frame" debug info
+# This information can tell a debugger (like gdb) how to find the current stack
+#   frame at any point in the program code, and how to find the values which
+#   various registers had at higher points in the call stack
+# With this information, the debugger can show a backtrace, and you can move up
+#   and down the call stack and examine the values of local variables
+
+BEGIN {
+  # don't put CFI data in the .eh_frame ELF section (which we don't keep)
+  print ".cfi_sections .debug_frame"
+
+  # only emit CFI directives inside a function
+  in_function = 0
+
+  # emit .loc directives with line numbers from original source
+  printf ".file 1 \"%s\"\n", ARGV[1]
+  line_number = 0
+
+  # used to detect "call label; label:" trick
+  called = ""
+}
+
+function hex2int(str,   i) {
+  str = tolower(str)
+
+  for (i = 1; i <= 16; i++) {
+    char = substr("0123456789abcdef", i, 1)
+    lookup[char] = i-1
+  }
+
+  result = 0
+  for (i = 1; i <= length(str); i++) {
+    result = result * 16
+    char   = substr(str, i, 1)
+    result = result + lookup[char]
+  }
+  return result
+}
+
+function parse_const(str) {
+  sign = sub(/^-/, "", str)
+  hex  = sub(/^0x/, "", str)
+  if (hex)
+    n = hex2int(str)
+  else
+    n = str+0
+  return sign ? -n : n
+}
+
+function get_const1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is constant)
+  match($0, /-?(0x[0-9a-fA-F]+|[0-9]+),/)
+  return parse_const(substr($0, RSTART, RLENGTH-1))
+}
+function get_reg() {
+  # only use if you already know there is 1 and only 1 register
+  match($0, /%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is register)
+  match($0, /%e(ax|bx|cx|dx|si|di|bp),/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg2() {
+  # for instructions with 2 operands, get 2nd operand (assuming it is register)
+  match($0, /,%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+RLENGTH-3, 3)
+}
+
+function adjust_sp_offset(delta) {
+  if (in_function)
+    printf ".cfi_adjust_cfa_offset %d\n", delta
+}
+
+{
+  line_number = line_number + 1
+
+  # clean the input up before doing anything else
+  # delete comments
+  gsub(/(#|\/\/).*/, "")
+
+  # canonicalize whitespace
+  gsub(/[ \t]+/, " ") # mawk doesn't understand \s
+  gsub(/ *, */, ",")
+  gsub(/ *: */, ": ")
+  gsub(/ $/, "")
+  gsub(/^ /, "")
+}
+
+# check for assembler directives which we care about
+/^\.(section|data|text)/ {
+  # a .cfi_startproc/.cfi_endproc pair should be within the same section
+  # otherwise, clang will choke when generating ELF output
+  if (in_function) {
+    print ".cfi_endproc"
+    in_function = 0
+  }
+}
+/^\.globa?l +[a-zA-Z0-9_]+/ {
+  globals[$2] = 1
+}
+# not interested in assembler directives beyond this, just pass them through
+/^\./ {
+  print
+  next
+}
+
+/^[a-zA-Z0-9_]+:/ {
+  label = substr($1, 1, length($1)-1) # drop trailing :
+
+  if (called == label) {
+    # note adjustment of stack pointer from "call label; label:"
+    adjust_sp_offset(4)
+  }
+
+  if (globals[label]) {
+    if (in_function)
+      print ".cfi_endproc"
+
+    in_function = 1
+    print ".cfi_startproc"
+
+    for (register in saved)
+      delete saved[register]
+    for (register in dirty)
+      delete dirty[register]
+  }
+
+  # an instruction may follow on the same line, so continue processing
+}
+
+/^$/ { next }
+
+{
+  called = ""
+  printf ".loc 1 %d\n", line_number
+  print
+}
+
+# KEEPING UP WITH THE STACK POINTER
+# We do NOT attempt to understand foolish and ridiculous tricks like stashing
+#   the stack pointer and then using %esp as a scratch register, or bitshifting
+#   it or taking its square root or anything stupid like that.
+# %esp should only be adjusted by pushing/popping or adding/subtracting constants
+#
+/pushl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(2)
+  else
+    adjust_sp_offset(4)
+}
+/popl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(-2)
+  else
+    adjust_sp_offset(-4)
+}
+/addl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(-get_const1()) }
+/subl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(get_const1()) }
+
+/call/ {
+  if (match($0, /call [0-9]+f/)) # "forward" label
+    called = substr($0, RSTART+5, RLENGTH-6)
+  else if (match($0, /call [0-9a-zA-Z_]+/))
+    called = substr($0, RSTART+5, RLENGTH-5)
+}
+
+# TRACKING REGISTER VALUES FROM THE PREVIOUS STACK FRAME
+#
+/pushl? %e(ax|bx|cx|dx|si|di|bp)/ { # don't match "push (%reg)"
+  # if a register is being pushed, and its value has not changed since the
+  #   beginning of this function, the pushed value can be used when printing
+  #   local variables at the next level up the stack
+  # emit '.cfi_rel_offset' for that
+
+  if (in_function) {
+    register = get_reg()
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,0\n", register
+      saved[register] = 1
+    }
+  }
+}
+
+/movl? %e(ax|bx|cx|dx|si|di|bp),-?(0x[0-9a-fA-F]+|[0-9]+)?\(%esp\)/ {
+  if (in_function) {
+    register = get_reg()
+    if (match($0, /-?(0x[0-9a-fA-F]+|[0-9]+)\(%esp\)/)) {
+      offset = parse_const(substr($0, RSTART, RLENGTH-6))
+    } else {
+      offset = 0
+    }
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,%d\n", register, offset
+      saved[register] = 1
+    }
+  }
+}
+
+# IF REGISTER VALUES ARE UNCEREMONIOUSLY TRASHED
+# ...then we want to know about it.
+#
+function trashed(register) {
+  if (in_function && !saved[register] && !dirty[register]) {
+    printf ".cfi_undefined %s\n", register
+  }
+  dirty[register] = 1
+}
+# this does NOT exhaustively check for all possible instructions which could
+# overwrite a register value inherited from the caller (just the common ones)
+/mov.*,%e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg2()) }
+/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr) %e(ax|bx|cx|dx|si|di|bp),%e(ax|bx|cx|dx|si|di|bp)$/ {
+  trashed(get_reg2()) # AT&T operand order: the 2nd operand is the destination
+}
+/i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
+/i?mul %e(ax|bx|cx|dx|si|di|bp),%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) }
+/i?div/                           { trashed("eax"); trashed("edx") }
+/(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
+/cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
+
+END {
+  if (in_function)
+    print ".cfi_endproc"
+}
-- 
2.0.0.GIT




* Re: [PATCH v8] Build process uses script to add CFI directives to x86 asm
  2015-06-05  8:39 [PATCH v8] Build process uses script to add CFI directives to x86 asm Alex Dowad
@ 2015-06-14  4:37 ` Rich Felker
  2015-06-14 19:06   ` Alex
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2015-06-14  4:37 UTC (permalink / raw)
  To: musl

On Fri, Jun 05, 2015 at 10:39:18AM +0200, Alex Dowad wrote:
> Dear musl devs,
> 
> Fixed one bug. Otherwise everything looks good in testing.
> 
> Thanks, AD

Sorry it's taken me a while to get back to this. I'm working on the
nommu/sh2 stuff, byte-based C locale, and several other things that
have come up, but I definitely want to get to the CFI patch in this
release cycle. A few comments:

>  Makefile               |  12 ++-
>  configure              |  20 +++++
>  tools/add-cfi.i386.awk | 227 +++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 257 insertions(+), 2 deletions(-)
>  create mode 100644 tools/add-cfi.i386.awk
> 
> diff --git a/Makefile b/Makefile
> index 2eb7b30..9b55fd8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
>  	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
>  
>  %.o: $(ARCH)/%.s
> -	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> +ifeq ($(ADD_CFI),yes)
> +	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
> +else
> +	$(CC) $(ASFLAGS) -c -o $@ $<
> +endif

Removing $(CFLAGS_ALL_STATIC) here is a regression. -Wa,--noexecstack
is necessary to prevent the kernel from giving us an executable stack
when asm files are linked. We could move it to a separate ASFLAGS, but
the patch doesn't do this, and unless there's a real need to avoid
passing CFLAGS, I'd rather not add more vars. (In this case, needing
the new var would be a silent security regression for anyone building
without re-running configure.)
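
One way to check whether a linked binary ended up with an executable stack is
to look at the flags of its GNU_STACK program header. The following is a
minimal sketch, assuming binutils' readelf is available and its wide (-lW)
output format is used; the helper name stack_state is hypothetical.

```shell
# Hypothetical helper: reads `readelf -lW <file>` output on stdin and prints
# "exec" if the GNU_STACK segment has the E flag, "noexec" if it does not,
# or "missing" if there is no GNU_STACK program header at all.
stack_state() {
  awk '$1 == "GNU_STACK" { state = ($(NF-1) ~ /E/) ? "exec" : "noexec" }
       END { print (state == "" ? "missing" : state) }'
}

# Intended usage (the library path is only an example):
#   readelf -lW lib/libc.so | stack_state
```

A result other than "noexec" for a library built from these asm files would
suggest that the --noexecstack marking was lost somewhere in the build.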

As for the naming (tools/add-cfi.$(ARCH).awk), I'm not opposed to this
and the configure test for it is nice, but I wonder if there will be
significant code duplication between versions of this script for
different archs that would make it preferable to take the arch as an
argument. What do you think? Or does awk have an easy #include-like
mechanism?

>  #
> +# Preprocess asm files to add extra debugging information if debug is
> +# enabled, our assembler supports the needed directives, and the
> +# preprocessing script has been written for our architecture.
> +#
> +printf "checking whether we should preprocess assembly to add debugging information... "
> +if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
> +   test -f "tools/add-cfi.$ARCH.awk" &&
> +   echo ".cfi_startproc
> +.cfi_endproc" | $CC -x assembler -c -o /dev/null -
> +then
> +  ADD_CFI=yes
> +else
> +  ADD_CFI=no
> +fi
> +printf "%s\n" "$ADD_CFI"
> +
> +#

This test looks nice and robust. I'd mildly prefer:

  printf '.cfi_startproc\n.cfi_endproc\n'

to avoid the multi-line string with echo, but that's a tiny detail.

Rich



* Re: [PATCH v8] Build process uses script to add CFI directives to x86 asm
  2015-06-14  4:37 ` Rich Felker
@ 2015-06-14 19:06   ` Alex
  2015-06-15  3:26     ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Alex @ 2015-06-14 19:06 UTC (permalink / raw)
  To: musl


Thanks for the reply! Comments below:

On Sun, Jun 14, 2015 at 6:37 AM, Rich Felker <dalias@libc.org> wrote:

> On Fri, Jun 05, 2015 at 10:39:18AM +0200, Alex Dowad wrote:
> > diff --git a/Makefile b/Makefile
> > index 2eb7b30..9b55fd8 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
> >       $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
> >
> >  %.o: $(ARCH)/%.s
> > -     $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> > +ifeq ($(ADD_CFI),yes)
> > +     LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
> > +else
> > +     $(CC) $(ASFLAGS) -c -o $@ $<
> > +endif
>
> Removing $(CFLAGS_ALL_STATIC) here is a regression. -Wa,--noexecstack
> is necessary to prevent the kernel from giving us an executable stack
> when asm files are linked. We could move it to a separate ASFLAGS, but
> the patch doesn't do this, and unless there's a real need to avoid
> passing CFLAGS, I'd rather not add more vars. (In this case, needing
> the new var would be a silent security regression for anyone building
> without re-running configure.)
>

The reason for not passing CFLAGS is because clang chokes on "-g" when
assembling code with CFI directives. I also thought that ASFLAGS might be a
useful customization point for people who want to edit config.mak to create
a custom build. But you are the judge of that.

Since it seems that CFLAGS is needed, would it be acceptable to bypass the
issue by saying that clang users simply won't be able to do debug builds of
musl until their compiler is fixed? The current state of LLVM's CFI
generation is so bad that debug builds probably won't be useful anyways.

If that is a sticking point, I might put together a patch for LLVM and see
if they want it. Unfortunately, I have already discovered a bunch of other
problems with LLVM which would be nice to fix, but time for developing and
polishing patches is limited...

As an aside, I admire the fact that you picked up on that subtle
regression. The standard of code quality and attention to detail on this
project is very high, as compared to other open-source projects I have
worked on. Kudos to all the contributors!

> As for the naming (tools/add-cfi.$(ARCH).awk), I'm not opposed to this
> and the configure test for it is nice, but I wonder if there will be
> significant code duplication between versions of this script for
> different archs that would make it preferable to take the arch as an
> argument. What do you think? Or does awk have an easy #include-like
> mechanism?
>

I'm not an AWKer, but from what I have read, apparently "awk -f script1.awk
-f script2.awk" is the equivalent of concatenating "script1.awk" and
"script2.awk", so shared functions can easily be put in a common file.

It seems that the amount of shared code will be small, however. Actually,
the entire script for x86-32 is already fairly small. I feel that anything
more sophisticated than picking a script based on arch would just be
complicating matters for little benefit.

If it turns out that I am wrong, the commonalities can be abstracted out
later. At that time, with several such preprocessing scripts available to
look at, it will be clearer what and how to abstract.
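
The "-f script1.awk -f script2.awk" behaviour mentioned above can be checked
with a small throwaway experiment. This is only a sketch: the file names
common.awk and main.awk and the function twice are made up for illustration
and are not part of the patch.

```shell
# Put a shared function in one file and the rules that use it in another,
# then pass both files to awk with separate -f options.
tmp=$(mktemp -d)
cat > "$tmp/common.awk" <<'EOF'
function twice(x) { return 2 * x }
EOF
cat > "$tmp/main.awk" <<'EOF'
{ print twice($1) }
EOF
result=$(echo 21 | awk -f "$tmp/common.awk" -f "$tmp/main.awk")
echo "$result"   # prints 42
rm -rf "$tmp"
```

If per-arch scripts ever do share code, a common file could be passed this
way without changing how the arch-specific script is selected.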


>
> >  #
> > +# Preprocess asm files to add extra debugging information if debug is
> > +# enabled, our assembler supports the needed directives, and the
> > +# preprocessing script has been written for our architecture.
> > +#
> > +printf "checking whether we should preprocess assembly to add debugging information... "
> > +if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
> > +   test -f "tools/add-cfi.$ARCH.awk" &&
> > +   echo ".cfi_startproc
> > +.cfi_endproc" | $CC -x assembler -c -o /dev/null -
> > +then
> > +  ADD_CFI=yes
> > +else
> > +  ADD_CFI=no
> > +fi
> > +printf "%s\n" "$ADD_CFI"
> > +
> > +#
>
> This test looks nice and robust. I'd mildly prefer:
>
>   printf '.cfi_startproc\n.cfi_endproc\n'
>
> to avoid the multi-line string with echo, but that's a tiny detail.
>

OK. It was written like this because "echo '.cfi_startproc\n.cfi_endproc'"
didn't work on BusyBox ash. But it seems that printf is fine. Will revise.

Thanks, AD



* Re: [PATCH v8] Build process uses script to add CFI directives to x86 asm
  2015-06-14 19:06   ` Alex
@ 2015-06-15  3:26     ` Rich Felker
  2015-06-15  6:42       ` Alex
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2015-06-15  3:26 UTC (permalink / raw)
  To: musl

On Sun, Jun 14, 2015 at 09:06:16PM +0200, Alex wrote:
> Thanks for the reply! Comments below:
> 
> On Sun, Jun 14, 2015 at 6:37 AM, Rich Felker <dalias@libc.org> wrote:
> 
> > On Fri, Jun 05, 2015 at 10:39:18AM +0200, Alex Dowad wrote:
> > > diff --git a/Makefile b/Makefile
> > > index 2eb7b30..9b55fd8 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
> > >       $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
> > >
> > >  %.o: $(ARCH)/%.s
> > > -     $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> > > +ifeq ($(ADD_CFI),yes)
> > > +     LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
> > > +else
> > > +     $(CC) $(ASFLAGS) -c -o $@ $<
> > > +endif
> >
> > Removing $(CFLAGS_ALL_STATIC) here is a regression. -Wa,--noexecstack
> > is necessary to prevent the kernel from giving us an executable stack
> > when asm files are linked. We could move it to a separate ASFLAGS, but
> > the patch doesn't do this, and unless there's a real need to avoid
> > passing CFLAGS, I'd rather not add more vars. (In this case, needing
> > the new var would be a silent security regression for anyone building
> > without re-running configure.)
> >
> 
> The reason for not passing CFLAGS is because clang chokes on "-g" when
> assembling code with CFI directives. I also thought that ASFLAGS might be a
> useful customization point for people who want to edit config.mak to create
> a custom build. But you are the judge of that.
> 
> Since it seems that CFLAGS is needed, would it be acceptable to bypass the
> issue by saying that clang users simply won't be able to do debug builds of
> musl until their compiler is fixed? The current state of LLVM's CFI
> generation is so bad that debug builds probably won't be useful anyways.

Could you elaborate on what happens? I'm not opposed to this approach
as long as either (1) the configure test successfully determines that
CFI gen doesn't work on clang, or (2) the 'choking' just produces bad
CFI, but doesn't break the build.

> If that is a sticking point, I might put together a patch for LLVM and see
> if they want it. Unfortunately, I have already discovered a bunch of other
> problems with LLVM which would be nice to fix, but time for developing and
> polishing patches is limited...

Why is -g even being processed for asm? Are they trying to
auto-generate CFI when it's not present? I think this really needs to
be fixed in any case since there are plenty of .s files that _do_ have
CFI and build systems that use -g. All this points to clang's internal
assembler being not-widely-tested and not ready for serious use... :(

But another option would be just having the Makefile remove "-g*" for
asm. Obviously this is hard to make robust because technically "-g"
could be an argument to "-o" or something stupid, but our Makefile
doesn't need to be robust against arbitrary ridiculous filenames and
such... It's not like spaces work in pathnames in Makefiles anyway...
;-)
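
As a rough illustration of the "remove -g*" idea, here is a minimal shell
sketch that drops every word starting with -g from a flag list. The function
name strip_g_flags is hypothetical; in a GNU Makefile the corresponding
built-in would be $(filter-out -g%,...). It has exactly the weakness
described above: it cannot tell a debug flag from an argument that merely
happens to start with -g.

```shell
# Hypothetical helper: print its arguments with all -g* words removed.
strip_g_flags() {
  out=
  for flag in "$@"; do
    case $flag in
      -g*) ;;                          # drop -g, -g3, -ggdb3, ...
      *) out="${out:+$out }$flag" ;;   # keep everything else, space-separated
    esac
  done
  printf '%s\n' "$out"
}

strip_g_flags -O2 -g -fPIC -ggdb3 -Wa,--noexecstack
# prints: -O2 -fPIC -Wa,--noexecstack
```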

> As an aside, I admire the fact that you picked up on that subtle
> regression. The standard of code quality and attention to detail on this
> project is very high, as compared to other open-source projects I have
> worked on. Kudos to all the contributors!

Thanks!

> > As for the naming (tools/add-cfi.$(ARCH).awk), I'm not opposed to this
> > and the configure test for it is nice, but I wonder if there will be
> > significant code duplication between versions of this script for
> > different archs that would make it preferable to take the arch as an
> > argument. What do you think? Or does awk have an easy #include-like
> > mechanism?
> >
> 
> I'm not an AWKer, but from what I have read, apparently "awk -f script1.awk
> -f script2.awk" is the equivalent of concatenating "script1.awk" and
> "script2.awk", so shared functions can easily be put in a common file.
> 
> It seems that the amount of shared code will be small, however. Actually,
> the entire script for x86-32 is already fairly small. I feel that anything
> more sophisticated than picking a script based on arch would just be
> complicating matters for little benefit.
> 
> If it turns out that I am wrong, the commonalities can be abstracted out
> later. At that time, with several such preprocessing scripts available to
> look at, it will be clearer what and how to abstract.

OK, this sounds fine. I just wanted to hear your opinion on it.
Apologies if you already stated it earlier and I missed it; I was
rather focused on other things at the time most of the discussion and
review happened.

> > > +# Preprocess asm files to add extra debugging information if debug is
> > > +# enabled, our assembler supports the needed directives, and the
> > > +# preprocessing script has been written for our architecture.
> > > +#
> > > +printf "checking whether we should preprocess assembly to add debugging information... "
> > > +if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
> > > +   test -f "tools/add-cfi.$ARCH.awk" &&
> > > +   echo ".cfi_startproc
> > > +.cfi_endproc" | $CC -x assembler -c -o /dev/null -
> > > +then
> > > +  ADD_CFI=yes
> > > +else
> > > +  ADD_CFI=no
> > > +fi
> > > +printf "%s\n" "$ADD_CFI"
> > > +
> > > +#
> >
> > This test looks nice and robust. I'd mildly prefer:
> >
> >   printf '.cfi_startproc\n.cfi_endproc\n'
> >
> > to avoid the multi-line string with echo, but that's a tiny detail.
> >
> 
> OK. It was written like this because "echo '.cfi_startproc\n.cfi_endproc'"
> didn't work on BusyBox ash. But it seems that printf is fine. Will revise.

Yes, also musl's configure redefines echo as a shell function in terms
of printf, since echo varies widely in behavior and the standard's
text on echo is contrary to most real-world implementations... So
using printf is preferred in general anyway for non-trivial usage.
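
A minimal sketch of the printf form of the probe, for comparison with the
multi-line echo string: the function name cfi_probe is made up for this
example, and the assembler invocation is shown only as a comment because it
depends on $CC.

```shell
# printf interprets \n in its format string the same way across POSIX
# shells, so the two directives reliably end up on separate lines.
cfi_probe() {
  printf '.cfi_startproc\n.cfi_endproc\n'
}

cfi_probe | awk 'END { print NR }'   # prints 2

# In configure, the probe would be piped to the assembler, e.g.:
#   cfi_probe | $CC -x assembler -c -o /dev/null -
```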

Rich



* Re: [PATCH v8] Build process uses script to add CFI directives to x86 asm
  2015-06-15  3:26     ` Rich Felker
@ 2015-06-15  6:42       ` Alex
  0 siblings, 0 replies; 5+ messages in thread
From: Alex @ 2015-06-15  6:42 UTC (permalink / raw)
  To: musl


On Mon, Jun 15, 2015 at 5:26 AM, Rich Felker <dalias@libc.org> wrote:

> On Sun, Jun 14, 2015 at 09:06:16PM +0200, Alex wrote:
> > Thanks for the reply! Comments below:
> >
> > On Sun, Jun 14, 2015 at 6:37 AM, Rich Felker <dalias@libc.org> wrote:
> >
> > > On Fri, Jun 05, 2015 at 10:39:18AM +0200, Alex Dowad wrote:
> > > > diff --git a/Makefile b/Makefile
> > > > index 2eb7b30..9b55fd8 100644
> > > > --- a/Makefile
> > > > +++ b/Makefile
> > > > @@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
> > > >       $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
> > > >
> > > >  %.o: $(ARCH)/%.s
> > > > -     $(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> > > > +ifeq ($(ADD_CFI),yes)
> > > > +     LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(ASFLAGS) -x assembler -c -o $@ -
> > > > +else
> > > > +     $(CC) $(ASFLAGS) -c -o $@ $<
> > > > +endif
> > >
> > > Removing $(CFLAGS_ALL_STATIC) here is a regression. -Wa,--noexecstack
> > > is necessary to prevent the kernel from giving us an executable stack
> > > when asm files are linked. We could move it to a separate ASFLAGS, but
> > > the patch doesn't do this, and unless there's a real need to avoid
> > > passing CFLAGS, I'd rather not add more vars. (In this case, needing
> > > the new var would be a silent security regression for anyone building
> > > without re-running configure.)
> > >
> >
> > The reason for not passing CFLAGS is because clang chokes on "-g" when
> > assembling code with CFI directives. I also thought that ASFLAGS might be a
> > useful customization point for people who want to edit config.mak to create
> > a custom build. But you are the judge of that.
> >
> > Since it seems that CFLAGS is needed, would it be acceptable to bypass the
> > issue by saying that clang users simply won't be able to do debug builds of
> > musl until their compiler is fixed? The current state of LLVM's CFI
> > generation is so bad that debug builds probably won't be useful anyways.
>
> Could you elaborate on what happens? I'm not opposed to this approach
> as long as either (1) the configure test successfully determines that
> CFI gen doesn't work on clang, or (2) the 'choking' just produces bad
> CFI, but doesn't break the build.
>

The assembler errors out and doesn't produce any output. I have made the
test in ./configure more robust now, to work around this problem. Insertion
of .cfi directives will not occur when building with clang, until it is
fixed.


> > If that is a sticking point, I might put together a patch for LLVM and see
> > if they want it. Unfortunately, I have already discovered a bunch of other
> > problems with LLVM which would be nice to fix, but time for developing and
> > polishing patches is limited...
>
> Why is -g even being processed for asm? Are they trying to
> auto-generate CFI when it's not present? I think this really needs to
> be fixed in any case since there are plenty of .s files that _do_ have
> CFI and build systems that use -g. All this points to clang's internal
> assembler being not-widely-tested and not ready for serious use... :(
>

GAS silently disables auto-generation of debug info as soon as it sees an
explicit debug directive. Clang gets ornery, digs its heels in, and says:
"forget it, you aren't getting nothing from me if you tell me to generate
debug info but then provide your own".

Posting v9 patch now.



end of thread, other threads:[~2015-06-15  6:42 UTC | newest]

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
