mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH v9] Build process uses script to add CFI directives to x86 asm
@ 2015-06-15  6:45 Alex Dowad
  2015-06-20 21:11 ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Dowad @ 2015-06-15  6:45 UTC (permalink / raw)
  To: musl

Some functions implemented in asm need to use EBP for purposes other than acting
as a frame pointer. (Notably, it is used for the 6th argument to syscalls with
6 arguments.) Without frame pointers, GDB can only show backtraces if it gets
CFI information from a .debug_frame or .eh_frame ELF section.

Rather than littering our asm with ugly .cfi directives, use an awk script to
insert them in the right places during the build process, so GDB can keep track of
where the current stack frame is relative to the stack pointer. This means GDB can
produce beautiful stack traces at any given point when single-stepping through asm
functions.

Additionally, when registers are saved on the stack and later overwritten, emit
.cfi directives so GDB will know where they were saved relative to the stack
pointer. This way, when you look back up the stack from within an asm function,
you can still reliably print the values of local variables in the caller.

If this awk script were to understand every possible wild and crazy contortion that
an asm programmer can do with the stack and registers, and always emit the exact
.cfi directives needed for GDB to know what the register values were in the
preceding stack frame, it would necessarily be as complex as a full x86 emulator.
That way lies madness.

Hence, we assume that the stack pointer will _only_ ever be adjusted using push/pop
or else add/sub with a constant. We do not attempt to detect every possible way that
a register value could be saved for later use, just the simple and common ways.

Thanks to Szabolcs Nagy for suggesting numerous improvements to this code.
---
Dear musl devs,

In response to feedback from R. Felker, the test in ./configure has been made more
robust. Now it will not enable the insertion of .cfi directives when using a compiler
which errors out when -g is combined with explicit debug directives.

Since that problem has been resolved, CFLAGS is again used when assembling.

Thank you, AD

 Makefile               |   8 ++
 configure              |  17 ++++
 tools/add-cfi.i386.awk | 227 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 252 insertions(+)
 create mode 100644 tools/add-cfi.i386.awk

diff --git a/Makefile b/Makefile
index 2eb7b30..b4927f4 100644
--- a/Makefile
+++ b/Makefile
@@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
 
 %.o: $(ARCH)/%.s
+ifeq ($(ADD_CFI),yes)
+	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(CFLAGS_ALL_STATIC) -x assembler -c -o $@ -
+else
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
+endif
 
 %.o: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
@@ -129,7 +133,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $(dir $<)$(shell cat $<)
 
 %.lo: $(ARCH)/%.s
+ifeq ($(ADD_CFI),yes)
+	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(CFLAGS_ALL_SHARED) -x assembler -c -o $@ -
+else
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
+endif
 
 %.lo: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
diff --git a/configure b/configure
index 7b29ae4..a55acd6 100755
--- a/configure
+++ b/configure
@@ -327,6 +327,22 @@ tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
 test "$debug" = yes && CFLAGS_AUTO=-g
 
 #
+# Preprocess asm files to add extra debugging information if debug is
+# enabled, our assembler supports the needed directives, and the
+# preprocessing script has been written for our architecture.
+#
+printf "checking whether we should preprocess assembly to add debugging information... "
+if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
+   test -f "tools/add-cfi.$ARCH.awk" &&
+   printf ".file 1 \"srcfile.s\"\n.line 1\n.cfi_startproc\n.cfi_endproc" | $CC -g -x assembler -c -o /dev/null 2>/dev/null -
+then
+  ADD_CFI=yes
+else
+  ADD_CFI=no
+fi
+printf "%s\n" "$ADD_CFI"
+
+#
 # Possibly add a -O option to CFLAGS and select modules to optimize with
 # -O3 based on the status of --enable-optimize and provided CFLAGS.
 #
@@ -580,6 +596,7 @@ LDFLAGS = $LDFLAGS_AUTO $LDFLAGS
 CROSS_COMPILE = $CROSS_COMPILE
 LIBCC = $LIBCC
 OPTIMIZE_GLOBS = $OPTIMIZE_GLOBS
+ADD_CFI = $ADD_CFI
 EOF
 test "x$static" = xno && echo "STATIC_LIBS ="
 test "x$shared" = xno && echo "SHARED_LIBS ="
diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
new file mode 100644
index 0000000..02686b9
--- /dev/null
+++ b/tools/add-cfi.i386.awk
@@ -0,0 +1,227 @@
+# Insert GAS CFI directives ("control frame information") into x86-32 asm input
+#
+# CFI directives tell the assembler how to generate "stack frame" debug info
+# This information can tell a debugger (like gdb) how to find the current stack
+#   frame at any point in the program code, and how to find the values which
+#   various registers had at higher points in the call stack
+# With this information, the debugger can show a backtrace, and you can move up
+#   and down the call stack and examine the values of local variables
+
+BEGIN {
+  # don't put CFI data in the .eh_frame ELF section (which we don't keep)
+  print ".cfi_sections .debug_frame"
+
+  # only emit CFI directives inside a function
+  in_function = 0
+
+  # emit .loc directives with line numbers from original source
+  printf ".file 1 \"%s\"\n", ARGV[1]
+  line_number = 0
+
+  # used to detect "call label; label:" trick
+  called = ""
+}
+
+function hex2int(str,   i) {
+  str = tolower(str)
+
+  for (i = 1; i <= 16; i++) {
+    char = substr("0123456789abcdef", i, 1)
+    lookup[char] = i-1
+  }
+
+  result = 0
+  for (i = 1; i <= length(str); i++) {
+    result = result * 16
+    char   = substr(str, i, 1)
+    result = result + lookup[char]
+  }
+  return result
+}
+
+function parse_const(str) {
+  sign = sub(/^-/, "", str)
+  hex  = sub(/^0x/, "", str)
+  if (hex)
+    n = hex2int(str)
+  else
+    n = str+0
+  return sign ? -n : n
+}
+
+function get_const1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is constant)
+  match($0, /-?(0x[0-9a-fA-F]+|[0-9]+),/)
+  return parse_const(substr($0, RSTART, RLENGTH-1))
+}
+function get_reg() {
+  # only use if you already know there is 1 and only 1 register
+  match($0, /%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is register)
+  match($0, /%e(ax|bx|cx|dx|si|di|bp),/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg2() {
+  # for instructions with 2 operands, get 2nd operand (assuming it is register)
+  match($0, /,%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+RLENGTH-3, 3)
+}
+
+function adjust_sp_offset(delta) {
+  if (in_function)
+    printf ".cfi_adjust_cfa_offset %d\n", delta
+}
+
+{
+  line_number = line_number + 1
+
+  # clean the input up before doing anything else
+  # delete comments
+  gsub(/(#|\/\/).*/, "")
+
+  # canonicalize whitespace
+  gsub(/[ \t]+/, " ") # mawk doesn't understand \s
+  gsub(/ *, */, ",")
+  gsub(/ *: */, ": ")
+  gsub(/ $/, "")
+  gsub(/^ /, "")
+}
+
+# check for assembler directives which we care about
+/^\.(section|data|text)/ {
+  # a .cfi_startproc/.cfi_endproc pair should be within the same section
+  # otherwise, clang will choke when generating ELF output
+  if (in_function) {
+    print ".cfi_endproc"
+    in_function = 0
+  }
+}
+/^\.globa?l +[a-zA-Z0-9_]+/ {
+  globals[$2] = 1
+}
+# not interested in assembler directives beyond this, just pass them through
+/^\./ {
+  print
+  next
+}
+
+/^[a-zA-Z0-9_]+:/ {
+  label = substr($1, 1, length($1)-1) # drop trailing :
+
+  if (called == label) {
+    # note adjustment of stack pointer from "call label; label:"
+    adjust_sp_offset(4)
+  }
+
+  if (globals[label]) {
+    if (in_function)
+      print ".cfi_endproc"
+
+    in_function = 1
+    print ".cfi_startproc"
+
+    for (register in saved)
+      delete saved[register]
+    for (register in dirty)
+      delete dirty[register]
+  }
+
+  # an instruction may follow on the same line, so continue processing
+}
+
+/^$/ { next }
+
+{
+  called = ""
+  printf ".loc 1 %d\n", line_number
+  print
+}
+
+# KEEPING UP WITH THE STACK POINTER
+# We do NOT attempt to understand foolish and ridiculous tricks like stashing
+#   the stack pointer and then using %esp as a scratch register, or bitshifting
+#   it or taking its square root or anything stupid like that.
+# %esp should only be adjusted by pushing/popping or adding/subtracting constants
+#
+/pushl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(2)
+  else
+    adjust_sp_offset(4)
+}
+/popl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(-2)
+  else
+    adjust_sp_offset(-4)
+}
+/addl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(-get_const1()) }
+/subl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(get_const1()) }
+
+/call/ {
+  if (match($0, /call [0-9]+f/)) # "forward" label
+    called = substr($0, RSTART+5, RLENGTH-6)
+  else if (match($0, /call [0-9a-zA-Z_]+/))
+    called = substr($0, RSTART+5, RLENGTH-5)
+}
+
+# TRACKING REGISTER VALUES FROM THE PREVIOUS STACK FRAME
+#
+/pushl? %e(ax|bx|cx|dx|si|di|bp)/ { # don't match "push (%reg)"
+  # if a register is being pushed, and its value has not changed since the
+  #   beginning of this function, the pushed value can be used when printing
+  #   local variables at the next level up the stack
+  # emit '.cfi_rel_offset' for that
+
+  if (in_function) {
+    register = get_reg()
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,0\n", register
+      saved[register] = 1
+    }
+  }
+}
+
+/movl? %e(ax|bx|cx|dx|si|di|bp),-?(0x[0-9a-fA-F]+|[0-9]+)?\(%esp\)/ {
+  if (in_function) {
+    register = get_reg()
+    if (match($0, /-?(0x[0-9a-fA-F]+|[0-9]+)\(%esp\)/)) {
+      offset = parse_const(substr($0, RSTART, RLENGTH-6))
+    } else {
+      offset = 0
+    }
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,%d\n", register, offset
+      saved[register] = 1
+    }
+  }
+}
+
+# IF REGISTER VALUES ARE UNCEREMONIOUSLY TRASHED
+# ...then we want to know about it.
+#
+function trashed(register) {
+  if (in_function && !saved[register] && !dirty[register]) {
+    printf ".cfi_undefined %s\n", register
+  }
+  dirty[register] = 1
+}
+# this does NOT exhaustively check for all possible instructions which could
+# overwrite a register value inherited from the caller (just the common ones)
+/mov.*,%e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg2()) }
+/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr) %e(ax|bx|cx|dx|si|di|bp),/ {
+  trashed(get_reg1())
+}
+/i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
+/i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
+/i?div/                           { trashed("eax"); trashed("edx") }
+/(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
+/cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
+
+END {
+  if (in_function)
+    print ".cfi_endproc"
+}
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-15  6:45 [PATCH v9] Build process uses script to add CFI directives to x86 asm Alex Dowad
@ 2015-06-20 21:11 ` Rich Felker
  2015-06-21  5:00   ` Alex
  0 siblings, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-06-20 21:11 UTC (permalink / raw)
  To: musl

On Mon, Jun 15, 2015 at 08:45:00AM +0200, Alex Dowad wrote:
> diff --git a/Makefile b/Makefile
> index 2eb7b30..b4927f4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -120,7 +120,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
>  	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
>  
>  %.o: $(ARCH)/%.s
> +ifeq ($(ADD_CFI),yes)
> +	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(CFLAGS_ALL_STATIC) -x assembler -c -o $@ -
> +else
>  	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> +endif
>  
>  %.o: %.c $(GENH) $(IMPH)
>  	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
> @@ -129,7 +133,11 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
>  	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $(dir $<)$(shell cat $<)
>  
>  %.lo: $(ARCH)/%.s
> +ifeq ($(ADD_CFI),yes)
> +	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) $(CFLAGS_ALL_SHARED) -x assembler -c -o $@ -
> +else
>  	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
> +endif
>  
>  %.lo: %.c $(GENH) $(IMPH)
>  	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<

I think someone mentioned this before -- this still fails to apply CFI
to the asm files obtained via the .sub/SUBARCH system (see the rules
just above the ones you changed). Of course I don't really like
duplicating complex rules, and i386 has no SUBARCHs anyway, so perhaps
we could put off changing this if someone has an idea for how to
eliminate the extra rules for SUBARCHs when we overhaul the build
system...

Alternatively we could perhaps just put the command for asm files in a
makefile variable and set it conditionally depending on ADD_CFI. Then
it could be reused in each place.

> diff --git a/configure b/configure
> index 7b29ae4..a55acd6 100755
> --- a/configure
> +++ b/configure
> @@ -327,6 +327,22 @@ tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
>  test "$debug" = yes && CFLAGS_AUTO=-g
>  
>  #
> +# Preprocess asm files to add extra debugging information if debug is
> +# enabled, our assembler supports the needed directives, and the
> +# preprocessing script has been written for our architecture.
> +#
> +printf "checking whether we should preprocess assembly to add debugging information... "
> +if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO" &&
> +   test -f "tools/add-cfi.$ARCH.awk" &&
> +   printf ".file 1 \"srcfile.s\"\n.line 1\n.cfi_startproc\n.cfi_endproc" | $CC -g -x assembler -c -o /dev/null 2>/dev/null -
> +then
> +  ADD_CFI=yes
> +else
> +  ADD_CFI=no
> +fi
> +printf "%s\n" "$ADD_CFI"
> +
> +#

I'd still like to check $CFLAGS in addition to $CFLAGS_AUTO, since
CFLAGS=-g is the usual way I ask for debug info. (I'm sorry I ever
added the --enable-debug option to configure. :)

I haven't reviewed the awk script much at all, but as long as it seems
to work in practice I'm okay with committing it. One thing I would
like to ask though -- is it pretty robust against bad things happening
if it sees new asm constructs it's not expecting? I'd just like to
avoid regressions if we add new asm later, where it becomes necessary
to delay commits to asm in order to fix bugs in the CFI generation.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-20 21:11 ` Rich Felker
@ 2015-06-21  5:00   ` Alex
  2015-06-21  5:05     ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-06-21  5:00 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1695 bytes --]

>
>
> I think someone mentioned this before -- this still fails to apply CFI
> to the asm files obtained via the .sub/SUBARCH system (see the rules
> just above the ones you changed). Of course I don't really like
> duplicating complex rules, and i386 has no SUBARCHs anyway, so perhaps
> we could put off changing this if someone has an idea for how to
> eliminate the extra rules for SUBARCHs when we overhaul the build
> system...
>

This was intentional, but since it has come up so many times, I will add
the (extraneous) rules for SUBARCHs.


> I'd still like to check $CFLAGS in addition to $CFLAGS_AUTO, since
> CFLAGS=-g is the usual way I ask for debug info. (I'm sorry I ever
> added the --enable-debug option to configure. :)
>

OK, will do.


> I haven't reviewed the awk script much at all, but as long as it seems
> to work in practice I'm okay with committing it. One thing I would
> like to ask though -- is it pretty robust against bad things happening
> if it sees new asm constructs it's not expecting? I'd just like to
> avoid regressions if we add new asm later, where it becomes necessary
> to delay commits to asm in order to fix bugs in the CFI generation.


It passes all your asm through and simply adds debugging directives when it
recognizes certain constructs. So in no case will it actually break the
asm. The worst which will happen is that you will adjust the stack pointer
in some obscure way that it doesn't recognize, no CFI directive will be
added, and then a debugger will be unable to produce a stack trace at that
point in the code. But this is no worse than what we have right now. Right
now we don't do anything to help the debugger at all.

Thanks, AD

[-- Attachment #2: Type: text/html, Size: 2364 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-21  5:00   ` Alex
@ 2015-06-21  5:05     ` Rich Felker
  0 siblings, 0 replies; 21+ messages in thread
From: Rich Felker @ 2015-06-21  5:05 UTC (permalink / raw)
  To: musl

On Sun, Jun 21, 2015 at 07:00:57AM +0200, Alex wrote:
> >
> >
> > I think someone mentioned this before -- this still fails to apply CFI
> > to the asm files obtained via the .sub/SUBARCH system (see the rules
> > just above the ones you changed). Of course I don't really like
> > duplicating complex rules, and i386 has no SUBARCHs anyway, so perhaps
> > we could put off changing this if someone has an idea for how to
> > eliminate the extra rules for SUBARCHs when we overhaul the build
> > system...
> >
> 
> This was intentional, but since it has come up so many times, I will add
> the (extraneous) rules for SUBARCHs.

Would doing it with a new AS_CMD variable work and make sense? That
would also eliminate the duplication in logic for .o vs .lo too, I
think...

> > I'd still like to check $CFLAGS in addition to $CFLAGS_AUTO, since
> > CFLAGS=-g is the usual way I ask for debug info. (I'm sorry I ever
> > added the --enable-debug option to configure. :)
> 
> OK, will do.

Thanks!

> > I haven't reviewed the awk script much at all, but as long as it seems
> > to work in practice I'm okay with committing it. One thing I would
> > like to ask though -- is it pretty robust against bad things happening
> > if it sees new asm constructs it's not expecting? I'd just like to
> > avoid regressions if we add new asm later, where it becomes necessary
> > to delay commits to asm in order to fix bugs in the CFI generation.
> 
> 
> It passes all your asm through and simply adds debugging directives when it
> recognizes certain constructs. So in no case will it actually break the
> asm. The worst which will happen is that you will adjust the stack pointer
> in some obscure way that it doesn't recognize, no CFI directive will be
> added, and then a debugger will be unable to produce a stack trace at that
> point in the code. But this is no worse than what we have right now. Right
> now we don't do anything to help the debugger at all.

Alright, sounds good. If all the build system integration issues are
worked out then I think this will be ready to commit.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-08 14:25       ` Alex
@ 2015-07-09  5:00         ` Rich Felker
  0 siblings, 0 replies; 21+ messages in thread
From: Rich Felker @ 2015-07-09  5:00 UTC (permalink / raw)
  To: musl

On Wed, Jul 08, 2015 at 04:25:47PM +0200, Alex wrote:
> On Wed, Jul 8, 2015 at 4:16 PM, Rich Felker <dalias@libc.org> wrote:
> > A nasty complication: some of the .sub files sub-in a .c file instead
> > of a .s file. This is horribly nasty and probably not something I
> > should have done. x86 is not affected, but it probably means we should
> > not try applying the CFI stuff in the makefile to .sub rules yet.
> 
> You are absolutely right about this. I will remove those Makefile rules. -- AD

In that case I think we can just use a variable for the command.
Hopefully by the time we need to support archs with sub file
controlled asm, changes to the build system will have eliminated the
whole sub file mess and we'll just multiple implicit rules in proper
precedence order to get the right files used. This should considerably
improve 'make' start latency too, which suffers a lot from the
function for sub file dependencies.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-08 14:16     ` Rich Felker
@ 2015-07-08 14:25       ` Alex
  2015-07-09  5:00         ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-08 14:25 UTC (permalink / raw)
  To: musl

On Wed, Jul 8, 2015 at 4:16 PM, Rich Felker <dalias@libc.org> wrote:
> A nasty complication: some of the .sub files sub-in a .c file instead
> of a .s file. This is horribly nasty and probably not something I
> should have done. x86 is not affected, but it probably means we should
> not try applying the CFI stuff in the makefile to .sub rules yet.

You are absolutely right about this. I will remove those Makefile rules. -- AD


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-08 14:11                   ` Rich Felker
@ 2015-07-08 14:24                     ` Alex
  0 siblings, 0 replies; 21+ messages in thread
From: Alex @ 2015-07-08 14:24 UTC (permalink / raw)
  To: musl

>> Then there are some labels which are declared as ".global" and
>> ".hidden", like __memcpy_fwd, __exp2l, and ___setjmp. It looks like
>> when any of those 3 are reached, there should be a return address on
>> top of the stack. What would you think of adding a ".type @function"
>> declaration for those 3, so that the CFI generation script will treat
>> them as "functions"?
>
> Do these matter? The exact same address is already an entry point
> via the other public name(s) you're supposed to use for it.

You're right, I just looked at the source and they don't matter.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-24  5:33   ` Alex
  2015-06-28  2:58     ` Rich Felker
@ 2015-07-08 14:16     ` Rich Felker
  2015-07-08 14:25       ` Alex
  1 sibling, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-07-08 14:16 UTC (permalink / raw)
  To: musl

On Wed, Jun 24, 2015 at 07:33:54AM +0200, Alex wrote:
> On Tue, Jun 23, 2015 at 11:52 PM, Rich Felker <dalias@libc.org> wrote:
> 
> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> > > ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if
> > generation of debug
> > > info was requested.
> > >
> > > The idea of removing duplication from the Makefile using a new AS_CMD
> > variable has
> > > come up several times, but this will not work because the arguments need
> > to be
> > > inserted in different places. I have tried using a Makefile 'define'
> > instead.
> > > Please see whether you like the way this code reads or not.
> > >
> > > Subarch asm files are treated the same as other asm files. Please note
> > that
> > > since I don't own the hardware, I can't test whether these Makefile
> > rules work or not.
> >
> > I should have mentioned -- the CFLAGS_ALL_STATIC/_SHARED variants are
> > meaningless to asm, so it would work fine to just use $(CFLAGS_ALL);
> > then there's no need for fancy tricks, and a simple make variable
> > should work, I think.
> 
> A simple make variable works for the non-subarch rules. But what to do for
> the subarch rules, which use $(dir $<)$(shell cat $<)?

A nasty complication: some of the .sub files sub-in a .c file instead
of a .s file. This is horribly nasty and probably not something I
should have done. x86 is not affected, but it probably means we should
not try applying the CFI stuff in the makefile to .sub rules yet.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-08  7:54                 ` Alex
@ 2015-07-08 14:11                   ` Rich Felker
  2015-07-08 14:24                     ` Alex
  0 siblings, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-07-08 14:11 UTC (permalink / raw)
  To: musl

On Wed, Jul 08, 2015 at 09:54:55AM +0200, Alex wrote:
> On Wed, Jul 8, 2015 at 9:13 AM, Alex <alexinbeijing@gmail.com> wrote:
> > On Wed, Jul 8, 2015 at 4:36 AM, Rich Felker <dalias@libc.org> wrote:
> >> On Tue, Jul 07, 2015 at 08:27:46PM +0200, Alex wrote:
> >>> > 2. I suspect a dynamic-linked binary without the corresponding libc.so
> >>> > is not useful. Do you want the static-linked binary with a reference
> >>> > to pthread_cancel added?
> >>>
> >>> I want to see the DWARF data, and in this case it should be in
> >>> libc.so, not in canbt. But let me test under conditions more similar
> >>> to yours first.
> >>
> >> I'm attaching the static a.out which exhibits the problem, and sending
> >> this off-list since it's (mildly) big. I can send libc.so if you
> >> prefer but I think the static version is sufficient to see the
> >> problem.
> >
> > Thanks. The problem was nothing to do with differences in our build
> > environments. The problem was that in src/thread/i386/syscall_cp.s,
> > _syscall_cp_asm "falls through" into __cp_begin.
> >
> > The CFI generation process assumes that each "global" label is a call
> > target, and hence when a global label is reached, the stack frame
> > offset should be 4. It assumes that we must have reached the current
> > point from a CALL, and that the return address should be on the top of
> > the stack.
> >
> > In this case, __cp_begin is NOT a call target. (It is used in
> > pthread_cancel.c:cancel_handler, which checks whether a saved
> > instruction pointer value is between __cp_begin and __cp_end.)
> >
> > I will amend the script so that it must see a ".global" directive
> > *and* a ".type @function" directive before it assumes that a label is
> > an entry point for C function calls.

I'm a little bit skeptical of this approach, but it probably does work
in practice. Stack pointer should be non-adjusted at any callable
external symbol. The exceptions are the __cp_* labels and they (at
present) are not function-type, but may need to be function-type on
arm once we support thumb2 asm because I think the thumb bit gets lost
without function-type (but maybe it doesn't matter). In any case that
doesn't affect x86.

> A couple questions:
> 
> drem(), dremf(), and vfork() are declared as ".type @function", but
> not ".global". Rather, they use ".weak" directives. Are such labels to
> be considered entry points?

Yes, they are entry points, and I think having vfork be weak is wrong
anyway (it was from back when we had other code in musl calling
__vfork, which was a bad idea).

> Then there are some labels which are declared as ".global" and
> ".hidden", like __memcpy_fwd, __exp2l, and ___setjmp. It looks like
> when any of those 3 are reached, there should be a return address on
> top of the stack. What would you think of adding a ".type @function"
> declaration for those 3, so that the CFI generation script will treat
> them as "functions"?

Do these matter? The exact same address is already an entry point
via the other public name(s) you're supposed to use for it.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-08  7:13               ` Alex
@ 2015-07-08  7:54                 ` Alex
  2015-07-08 14:11                   ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-08  7:54 UTC (permalink / raw)
  To: Rich Felker, musl

On Wed, Jul 8, 2015 at 9:13 AM, Alex <alexinbeijing@gmail.com> wrote:
> On Wed, Jul 8, 2015 at 4:36 AM, Rich Felker <dalias@libc.org> wrote:
>> On Tue, Jul 07, 2015 at 08:27:46PM +0200, Alex wrote:
>>> > 2. I suspect a dynamic-linked binary without the corresponding libc.so
>>> > is not useful. Do you want the static-linked binary with a reference
>>> > to pthread_cancel added?
>>>
>>> I want to see the DWARF data, and in this case it should be in
>>> libc.so, not in canbt. But let me test under conditions more similar
>>> to yours first.
>>
>> I'm attaching the static a.out which exhibits the problem, and sending
>> this off-list since it's (mildly) big. I can send libc.so if you
>> prefer but I think the static version is sufficient to see the
>> problem.
>
> Thanks. The problem was nothing to do with differences in our build
> environments. The problem was that in src/thread/i386/syscall_cp.s,
> _syscall_cp_asm "falls through" into __cp_begin.
>
> The CFI generation process assumes that each "global" label is a call
> target, and hence when a global label is reached, the stack frame
> offset should be 4. It assumes that we must have reached the current
> point from a CALL, and that the return address should be on the top of
> the stack.
>
> In this case, __cp_begin is NOT a call target. (It is used in
> pthread_cancel.c:cancel_handler, which checks whether a saved
> instruction pointer value is between __cp_begin and __cp_end.)
>
> I will amend the script so that it must see a ".global" directive
> *and* a ".type @function" directive before it assumes that a label is
> an entry point for C function calls.

A couple questions:

drem(), dremf(), and vfork() are declared as ".type @function", but
not ".global". Rather, they use ".weak" directives. Are such labels to
be considered entry points?

Then there are some labels which are declared as ".global" and
".hidden", like __memcpy_fwd, __exp2l, and ___setjmp. It looks like
when any of those 3 are reached, there should be a return address on
top of the stack. What would you think of adding a ".type @function"
declaration for those 3, so that the CFI generation script will treat
them as "functions"?

Thanks, AD


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
       [not found]             ` <20150708023638.GJ1173@brightrain.aerifal.cx>
@ 2015-07-08  7:13               ` Alex
  2015-07-08  7:54                 ` Alex
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-08  7:13 UTC (permalink / raw)
  To: Rich Felker, musl

On Wed, Jul 8, 2015 at 4:36 AM, Rich Felker <dalias@libc.org> wrote:
> On Tue, Jul 07, 2015 at 08:27:46PM +0200, Alex wrote:
>> > 2. I suspect a dynamic-linked binary without the corresponding libc.so
>> > is not useful. Do you want the static-linked binary with a reference
>> > to pthread_cancel added?
>>
>> I want to see the DWARF data, and in this case it should be in
>> libc.so, not in canbt. But let me test under conditions more similar
>> to yours first.
>
> I'm attaching the static a.out which exhibits the problem, and sending
> this off-list since it's (mildly) big. I can send libc.so if you
> prefer but I think the static version is sufficient to see the
> problem.

Thanks. The problem was nothing to do with differences in our build
environments. The problem was that in src/thread/i386/syscall_cp.s,
_syscall_cp_asm "falls through" into __cp_begin.

The CFI generation process assumes that each "global" label is a call
target, and hence when a global label is reached, the stack frame
offset should be 4. It assumes that we must have reached the current
point from a CALL, and that the return address should be on the top of
the stack.

In this case, __cp_begin is NOT a call target. (It is used in
pthread_cancel.c:cancel_handler, which checks whether a saved
instruction pointer value is between __cp_begin and __cp_end.)

I will amend the script so that it must see a ".global" directive
*and* a ".type @function" directive before it assumes that a label is
an entry point for C function calls.

Thanks,
Alex


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-07 18:15         ` Rich Felker
@ 2015-07-07 18:27           ` Alex
       [not found]             ` <20150708023638.GJ1173@brightrain.aerifal.cx>
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-07 18:27 UTC (permalink / raw)
  To: musl

On Tue, Jul 7, 2015 at 8:15 PM, Rich Felker <dalias@libc.org> wrote:
> On Tue, Jul 07, 2015 at 06:40:18PM +0200, Alex wrote:
>> On Tue, Jul 7, 2015 at 3:21 PM, Rich Felker <dalias@libc.org> wrote:
>> > On Tue, Jul 07, 2015 at 07:45:34AM +0200, Alex wrote:
>> >> On Tue, Jul 7, 2015 at 5:39 AM, Rich Felker <dalias@libc.org> wrote:
>> >> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
>> >> >> Dear musl devs,
>> >> >>
>> >> >> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
>> >> >> info was requested.
>> >> >>
>> >> >> The idea of removing duplication from the Makefile using a new AS_CMD variable has
>> >> >> come up several times, but this will not work because the arguments need to be
>> >> >> inserted in different places. I have tried using a Makefile 'define' instead.
>> >> >> Please see whether you like the way this code reads or not.
>> >> >>
>> >> >> Subarch asm files are treated the same as other asm files. Please note that
>> >> >> since I don't own the hardware, I can't test whether these Makefile rules work or not.
>> >> >
>> >> > I'm trying to test this and get it merged, but I'm not having any luck
>> >> > getting gdb to backtrace when I interrupt a syscall with ^C. I've
>> >> > tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
>> >> > and the patch doesn't seem to help at all for the former, and only
>> >> > helps mildly for the latter. Have you had better results than I did?
>> >>
>> >> I tested a different set of library calls, with my own test programs.
>> >> Can you please send a copy of the test program you are using, and I
>> >> will debug? Thanks, AD
>> >
>> > Here it is. Compile with -g -O0 to make a more interesting backtrace
>> > (if it works).
>>
>> Thanks for the test program. I compiled it with:
>>
>> ../tools/musl-gcc -Xlinker -melf_i386 -m32 -o canbt canbt.c
>>
>> (That -Xlinker junk is because I am compiling to 32-bit code on a
>> 64-bit machine.)
>>
>> And then:
>>
>> gdb ./canbt
>>
>> Set a breakpoint on pause(), single-stepped all the way into the
>> innards of musl, and I was able to get a nice stack trace at each
>> stop. Sample of what it looks like at one randomly chosen point:
>> https://dl.dropboxusercontent.com/u/106217735/musl-debugging-pause.png
>>
>> Could you please 1) tell me the command-line invocation you are using
>> to compile the test program and 2) send me a binary, so I can extract
>> the DWARF data from it and see what it looks like?
>
> 1. Command line is "gcc -g canbt.c". With -static, syscall_cp gets
> optimized out to a normal syscall unless there's a reference to
> pthread_cancel, and the backtrace fully works. It did not work prior
> to your patch. But with dynamic linking, or with pthread_cancel.o
> pulled in when static linking, the backtrace produces (dynamic):

OK, I presume you are testing on a 32-bit OS. Is it Alpine Linux?

> 2. I suspect a dynamic-linked binary without the corresponding libc.so
> is not useful. Do you want the static-linked binary with a reference
> to pthread_cancel added?

I want to see the DWARF data, and in this case it should be in
libc.so, not in canbt. But let me test under conditions more similar
to yours first.

Thanks, AD


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-07 16:40       ` Alex
@ 2015-07-07 18:15         ` Rich Felker
  2015-07-07 18:27           ` Alex
  0 siblings, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-07-07 18:15 UTC (permalink / raw)
  To: musl

On Tue, Jul 07, 2015 at 06:40:18PM +0200, Alex wrote:
> On Tue, Jul 7, 2015 at 3:21 PM, Rich Felker <dalias@libc.org> wrote:
> > On Tue, Jul 07, 2015 at 07:45:34AM +0200, Alex wrote:
> >> On Tue, Jul 7, 2015 at 5:39 AM, Rich Felker <dalias@libc.org> wrote:
> >> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> >> >> Dear musl devs,
> >> >>
> >> >> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
> >> >> info was requested.
> >> >>
> >> >> The idea of removing duplication from the Makefile using a new AS_CMD variable has
> >> >> come up several times, but this will not work because the arguments need to be
> >> >> inserted in different places. I have tried using a Makefile 'define' instead.
> >> >> Please see whether you like the way this code reads or not.
> >> >>
> >> >> Subarch asm files are treated the same as other asm files. Please note that
> >> >> since I don't own the hardware, I can't test whether these Makefile rules work or not.
> >> >
> >> > I'm trying to test this and get it merged, but I'm not having any luck
> >> > getting gdb to backtrace when I interrupt a syscall with ^C. I've
> >> > tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
> >> > and the patch doesn't seem to help at all for the former, and only
> >> > helps mildly for the latter. Have you had better results than I did?
> >>
> >> I tested a different set of library calls, with my own test programs.
> >> Can you please send a copy of the test program you are using, and I
> >> will debug? Thanks, AD
> >
> > Here it is. Compile with -g -O0 to make a more interesting backtrace
> > (if it works).
> 
> Thanks for the test program. I compiled it with:
> 
> ../tools/musl-gcc -Xlinker -melf_i386 -m32 -o canbt canbt.c
> 
> (That -Xlinker junk is because I am compiling to 32-bit code on a
> 64-bit machine.)
> 
> And then:
> 
> gdb ./canbt
> 
> Set a breakpoint on pause(), single-stepped all the way into the
> innards of musl, and I was able to get a nice stack trace at each
> stop. Sample of what it looks like at one randomly chosen point:
> https://dl.dropboxusercontent.com/u/106217735/musl-debugging-pause.png
> 
> Could you please 1) tell me the command-line invocation you are using
> to compile the test program and 2) send me a binary, so I can extract
> the DWARF data from it and see what it looks like?

1. Command line is "gcc -g canbt.c". With -static, syscall_cp gets
optimized out to a normal syscall unless there's a reference to
pthread_cancel, and the backtrace fully works. It did not work prior
to your patch. But with dynamic linking, or with pthread_cancel.o
pulled in when static linking, the backtrace produces (dynamic):

#0  __cp_end () at src/thread/i386/syscall_cp.s:31
#1  0xb7ffdb70 in builtin_tls () from /lib/ld-musl-i386.so.1

or similar nonsense (static):

#0  __cp_end () at src/thread/i386/syscall_cp.s:31
#1  0x0804a728 in environ ()

My method of testing is running the program under gdb, hitting ^C once
the pause is entered, then attempting bt.

2. I suspect a dynamic-linked binary without the corresponding libc.so
is not useful. Do you want the static-linked binary with a reference
to pthread_cancel added?

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-07 13:21     ` Rich Felker
@ 2015-07-07 16:40       ` Alex
  2015-07-07 18:15         ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-07 16:40 UTC (permalink / raw)
  To: musl

On Tue, Jul 7, 2015 at 3:21 PM, Rich Felker <dalias@libc.org> wrote:
> On Tue, Jul 07, 2015 at 07:45:34AM +0200, Alex wrote:
>> On Tue, Jul 7, 2015 at 5:39 AM, Rich Felker <dalias@libc.org> wrote:
>> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
>> >> Dear musl devs,
>> >>
>> >> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
>> >> info was requested.
>> >>
>> >> The idea of removing duplication from the Makefile using a new AS_CMD variable has
>> >> come up several times, but this will not work because the arguments need to be
>> >> inserted in different places. I have tried using a Makefile 'define' instead.
>> >> Please see whether you like the way this code reads or not.
>> >>
>> >> Subarch asm files are treated the same as other asm files. Please note that
>> >> since I don't own the hardware, I can't test whether these Makefile rules work or not.
>> >
>> > I'm trying to test this and get it merged, but I'm not having any luck
>> > getting gdb to backtrace when I interrupt a syscall with ^C. I've
>> > tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
>> > and the patch doesn't seem to help at all for the former, and only
>> > helps mildly for the latter. Have you had better results than I did?
>>
>> I tested a different set of library calls, with my own test programs.
>> Can you please send a copy of the test program you are using, and I
>> will debug? Thanks, AD
>
> Here it is. Compile with -g -O0 to make a more interesting backtrace
> (if it works).

Thanks for the test program. I compiled it with:

./tools/musl-gcc -Xlinker -melf_i386 -m32 -o canbt canbt.c

(That -Xlinker junk is because I am compiling to 32-bit code on a
64-bit machine.)

And then:

gdb ./canbt

Set a breakpoint on pause(), single-stepped all the way into the
innards of musl, and I was able to get a nice stack trace at each
stop. Sample of what it looks like at one randomly chosen point:
https://dl.dropboxusercontent.com/u/106217735/musl-debugging-pause.png

Could you please 1) tell me the command-line invocation you are using
to compile the test program and 2) send me a binary, so I can extract
the DWARF data from it and see what it looks like?

Thanks, AD


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-07  5:45   ` Alex
@ 2015-07-07 13:21     ` Rich Felker
  2015-07-07 16:40       ` Alex
  0 siblings, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-07-07 13:21 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1461 bytes --]

On Tue, Jul 07, 2015 at 07:45:34AM +0200, Alex wrote:
> On Tue, Jul 7, 2015 at 5:39 AM, Rich Felker <dalias@libc.org> wrote:
> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> >> Dear musl devs,
> >>
> >> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
> >> info was requested.
> >>
> >> The idea of removing duplication from the Makefile using a new AS_CMD variable has
> >> come up several times, but this will not work because the arguments need to be
> >> inserted in different places. I have tried using a Makefile 'define' instead.
> >> Please see whether you like the way this code reads or not.
> >>
> >> Subarch asm files are treated the same as other asm files. Please note that
> >> since I don't own the hardware, I can't test whether these Makefile rules work or not.
> >
> > I'm trying to test this and get it merged, but I'm not having any luck
> > getting gdb to backtrace when I interrupt a syscall with ^C. I've
> > tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
> > and the patch doesn't seem to help at all for the former, and only
> > helps mildly for the latter. Have you had better results than I did?
> 
> I tested a different set of library calls, with my own test programs.
> Can you please send a copy of the test program you are using, and I
> will debug? Thanks, AD

Here it is. Compile with -g -O0 to make a more interesting backtrace
(if it works).

Rich

[-- Attachment #2: canbt.c --]
[-- Type: text/plain, Size: 175 bytes --]

#include <unistd.h>
#include <sys/syscall.h>

void a(), b(), c();

void a()
{
	b();
}

void b()
{
	c();
}

void c()
{
	pause();
	//syscall(SYS_pause);
}

int main()
{
	a();
}

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-07-07  3:39 ` Rich Felker
@ 2015-07-07  5:45   ` Alex
  2015-07-07 13:21     ` Rich Felker
  0 siblings, 1 reply; 21+ messages in thread
From: Alex @ 2015-07-07  5:45 UTC (permalink / raw)
  To: musl

On Tue, Jul 7, 2015 at 5:39 AM, Rich Felker <dalias@libc.org> wrote:
> On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
>> Dear musl devs,
>>
>> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
>> info was requested.
>>
>> The idea of removing duplication from the Makefile using a new AS_CMD variable has
>> come up several times, but this will not work because the arguments need to be
>> inserted in different places. I have tried using a Makefile 'define' instead.
>> Please see whether you like the way this code reads or not.
>>
>> Subarch asm files are treated the same as other asm files. Please note that
>> since I don't own the hardware, I can't test whether these Makefile rules work or not.
>
> I'm trying to test this and get it merged, but I'm not having any luck
> getting gdb to backtrace when I interrupt a syscall with ^C. I've
> tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
> and the patch doesn't seem to help at all for the former, and only
> helps mildly for the latter. Have you had better results than I did?

I tested a different set of library calls, with my own test programs.
Can you please send a copy of the test program you are using, and I
will debug? Thanks, AD


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-23 19:18 Alex Dowad
  2015-06-23 21:52 ` Rich Felker
@ 2015-07-07  3:39 ` Rich Felker
  2015-07-07  5:45   ` Alex
  1 sibling, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-07-07  3:39 UTC (permalink / raw)
  To: musl

On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> Dear musl devs,
> 
> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
> info was requested.
> 
> The idea of removing duplication from the Makefile using a new AS_CMD variable has
> come up several times, but this will not work because the arguments need to be
> inserted in different places. I have tried using a Makefile 'define' instead.
> Please see whether you like the way this code reads or not.
> 
> Subarch asm files are treated the same as other asm files. Please note that
> since I don't own the hardware, I can't test whether these Makefile rules work or not.

I'm trying to test this and get it merged, but I'm not having any luck
getting gdb to backtrace when I interrupt a syscall with ^C. I've
tried pause() and syscall(SYS_pause) (cancellable and non-cancellable)
and the patch doesn't seem to help at all for the former, and only
helps mildly for the latter. Have you had better results than I did?

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-24  5:33   ` Alex
@ 2015-06-28  2:58     ` Rich Felker
  2015-07-08 14:16     ` Rich Felker
  1 sibling, 0 replies; 21+ messages in thread
From: Rich Felker @ 2015-06-28  2:58 UTC (permalink / raw)
  To: musl

On Wed, Jun 24, 2015 at 07:33:54AM +0200, Alex wrote:
> On Tue, Jun 23, 2015 at 11:52 PM, Rich Felker <dalias@libc.org> wrote:
> 
> > On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> > > ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if
> > generation of debug
> > > info was requested.
> > >
> > > The idea of removing duplication from the Makefile using a new AS_CMD
> > variable has
> > > come up several times, but this will not work because the arguments need
> > to be
> > > inserted in different places. I have tried using a Makefile 'define'
> > instead.
> > > Please see whether you like the way this code reads or not.
> > >
> > > Subarch asm files are treated the same as other asm files. Please note
> > that
> > > since I don't own the hardware, I can't test whether these Makefile
> > rules work or not.
> >
> > I should have mentioned -- the CFLAGS_ALL_STATIC/_SHARED variants are
> > meaningless to asm, so it would work fine to just use $(CFLAGS_ALL);
> > then there's no need for fancy tricks, and a simple make variable
> > should work, I think.
> 
> A simple make variable works for the non-subarch rules. But what to do for
> the subarch rules, which use $(dir $<)$(shell cat $<)?

Thanks for clarifying. This made sense as soon as I read your email,
but I didn't reply right away because I didn't have any better ideas,
and right now I still don't. So perhaps we should just go with this
for now. At some point I'd like to remove the existing use of "make
functions" like this, and this new one if we've added it -- they're
obstacles to being able to use a new make utility such as a
hypothetical toybox one -- but for now it's no worse than the
dependency we already have.

Anyway, I think this is still on-track for making it into the release.
I want to do some brief testing but if nothing else comes up I think
it's ready to commit.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-23 21:52 ` Rich Felker
@ 2015-06-24  5:33   ` Alex
  2015-06-28  2:58     ` Rich Felker
  2015-07-08 14:16     ` Rich Felker
  0 siblings, 2 replies; 21+ messages in thread
From: Alex @ 2015-06-24  5:33 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1122 bytes --]

On Tue, Jun 23, 2015 at 11:52 PM, Rich Felker <dalias@libc.org> wrote:

> On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> > ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if
> generation of debug
> > info was requested.
> >
> > The idea of removing duplication from the Makefile using a new AS_CMD
> variable has
> > come up several times, but this will not work because the arguments need
> to be
> > inserted in different places. I have tried using a Makefile 'define'
> instead.
> > Please see whether you like the way this code reads or not.
> >
> > Subarch asm files are treated the same as other asm files. Please note
> that
> > since I don't own the hardware, I can't test whether these Makefile
> rules work or not.
>
> I should have mentioned -- the CFLAGS_ALL_STATIC/_SHARED variants are
> meaningless to asm, so it would work fine to just use $(CFLAGS_ALL);
> then there's no need for fancy tricks, and a simple make variable
> should work, I think.


A simple make variable works for the non-subarch rules. But what to do for
the subarch rules, which use $(dir $<)$(shell cat $<)?

[-- Attachment #2: Type: text/html, Size: 1542 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v9] Build process uses script to add CFI directives to x86 asm
  2015-06-23 19:18 Alex Dowad
@ 2015-06-23 21:52 ` Rich Felker
  2015-06-24  5:33   ` Alex
  2015-07-07  3:39 ` Rich Felker
  1 sibling, 1 reply; 21+ messages in thread
From: Rich Felker @ 2015-06-23 21:52 UTC (permalink / raw)
  To: musl

On Tue, Jun 23, 2015 at 09:18:08PM +0200, Alex Dowad wrote:
> ../configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
> info was requested.
> 
> The idea of removing duplication from the Makefile using a new AS_CMD variable has
> come up several times, but this will not work because the arguments need to be
> inserted in different places. I have tried using a Makefile 'define' instead.
> Please see whether you like the way this code reads or not.
> 
> Subarch asm files are treated the same as other asm files. Please note that
> since I don't own the hardware, I can't test whether these Makefile rules work or not.

I should have mentioned -- the CFLAGS_ALL_STATIC/_SHARED variants are
meaningless to asm, so it would work fine to just use $(CFLAGS_ALL);
then there's no need for fancy tricks, and a simple make variable
should work, I think.

Rich


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v9] Build process uses script to add CFI directives to x86 asm
@ 2015-06-23 19:18 Alex Dowad
  2015-06-23 21:52 ` Rich Felker
  2015-07-07  3:39 ` Rich Felker
  0 siblings, 2 replies; 21+ messages in thread
From: Alex Dowad @ 2015-06-23 19:18 UTC (permalink / raw)
  To: musl

Some functions implemented in asm need to use EBP for purposes other than acting
as a frame pointer. (Notably, it is used for the 6th argument to syscalls with
6 arguments.) Without frame pointers, GDB can only show backtraces if it gets
CFI information from a .debug_frame or .eh_frame ELF section.

Rather than littering our asm with ugly .cfi directives, use an awk script to
insert them in the right places during the build process, so GDB can keep track of
where the current stack frame is relative to the stack pointer. This means GDB can
produce beautiful stack traces at any given point when single-stepping through asm
functions.

Additionally, when registers are saved on the stack and later overwritten, emit
.cfi directives so GDB will know where they were saved relative to the stack
pointer. This way, when you look back up the stack from within an asm function,
you can still reliably print the values of local variables in the caller.

If this awk script were to understand every possible wild and crazy contortion that
an asm programmer can do with the stack and registers, and always emit the exact
.cfi directives needed for GDB to know what the register values were in the
preceding stack frame, it would necessarily be as complex as a full x86 emulator.
That way lies madness.

Hence, we assume that the stack pointer will _only_ ever be adjusted using push/pop
or else add/sub with a constant. We do not attempt to detect every possible way that
a register value could be saved for later use, just the simple and common ways.

Thanks to Szabolcs Nagy for suggesting numerous improvements to this code.
---
Dear musl devs,

./configure now checks CFLAGS as well as CFLAGS_AUTO to see if generation of debug
info was requested.

The idea of removing duplication from the Makefile using a new AS_CMD variable has
come up several times, but this will not work because the arguments need to be
inserted in different places. I have tried using a Makefile 'define' instead.
Please see whether you like the way this code reads or not.

Subarch asm files are treated the same as other asm files. Please note that
since I don't own the hardware, I can't test whether these Makefile rules work or not.

Thanks, AD

 Makefile               |  20 ++++-
 configure              |  17 ++++
 tools/add-cfi.i386.awk | 227 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 260 insertions(+), 4 deletions(-)
 create mode 100644 tools/add-cfi.i386.awk

diff --git a/Makefile b/Makefile
index 2eb7b30..7d738f6 100644
--- a/Makefile
+++ b/Makefile
@@ -116,20 +116,32 @@ $(dir $(patsubst %/,%,$(dir $(1))))$(notdir $(1:.s=.o)): $(1)
 endef
 $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 
+# Choose invocation of assembler to be used
+# $(1) is input file, $(2) is output file, $(3) is assembler flags
+ifeq ($(ADD_CFI),yes)
+define as_cmd
+	LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $(1) | $(CC) $(3) -x assembler -c -o $(2) -
+endef
+else
+define as_cmd
+	$(CC) $(3) -c -o $(2) $(1)
+endef
+endif
+
 %.o: $(ARCH)$(ASMSUBARCH)/%.sub
-	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $(dir $<)$(shell cat $<)
+	$(call as_cmd,$(dir $<)$(shell cat $<),$@,$(CFLAGS_ALL_STATIC))
 
 %.o: $(ARCH)/%.s
-	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
+	$(call as_cmd,$<,$@,$(CFLAGS_ALL_STATIC))
 
 %.o: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_STATIC) -c -o $@ $<
 
 %.lo: $(ARCH)$(ASMSUBARCH)/%.sub
-	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $(dir $<)$(shell cat $<)
+	$(call as_cmd,$(dir $<)$(shell cat $<),$@,$(CFLAGS_ALL_SHARED))
 
 %.lo: $(ARCH)/%.s
-	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
+	$(call as_cmd,$<,$@,$(CFLAGS_ALL_SHARED))
 
 %.lo: %.c $(GENH) $(IMPH)
 	$(CC) $(CFLAGS_ALL_SHARED) -c -o $@ $<
diff --git a/configure b/configure
index 7b29ae4..a38fb21 100755
--- a/configure
+++ b/configure
@@ -327,6 +327,22 @@ tryflag CFLAGS_MEMOPS -fno-tree-loop-distribute-patterns
 test "$debug" = yes && CFLAGS_AUTO=-g
 
 #
+# Preprocess asm files to add extra debugging information if debug is
+# enabled, our assembler supports the needed directives, and the
+# preprocessing script has been written for our architecture.
+#
+printf "checking whether we should preprocess assembly to add debugging information... "
+if fnmatch '-g*|*\ -g*' "$CFLAGS_AUTO $CFLAGS" &&
+   test -f "tools/add-cfi.$ARCH.awk" &&
+   printf ".file 1 \"srcfile.s\"\n.line 1\n.cfi_startproc\n.cfi_endproc" | $CC -g -x assembler -c -o /dev/null 2>/dev/null -
+then
+  ADD_CFI=yes
+else
+  ADD_CFI=no
+fi
+printf "%s\n" "$ADD_CFI"
+
+#
 # Possibly add a -O option to CFLAGS and select modules to optimize with
 # -O3 based on the status of --enable-optimize and provided CFLAGS.
 #
@@ -580,6 +596,7 @@ LDFLAGS = $LDFLAGS_AUTO $LDFLAGS
 CROSS_COMPILE = $CROSS_COMPILE
 LIBCC = $LIBCC
 OPTIMIZE_GLOBS = $OPTIMIZE_GLOBS
+ADD_CFI = $ADD_CFI
 EOF
 test "x$static" = xno && echo "STATIC_LIBS ="
 test "x$shared" = xno && echo "SHARED_LIBS ="
diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
new file mode 100644
index 0000000..02686b9
--- /dev/null
+++ b/tools/add-cfi.i386.awk
@@ -0,0 +1,227 @@
+# Insert GAS CFI directives ("control frame information") into x86-32 asm input
+#
+# CFI directives tell the assembler how to generate "stack frame" debug info
+# This information can tell a debugger (like gdb) how to find the current stack
+#   frame at any point in the program code, and how to find the values which
+#   various registers had at higher points in the call stack
+# With this information, the debugger can show a backtrace, and you can move up
+#   and down the call stack and examine the values of local variables
+
+BEGIN {
+  # don't put CFI data in the .eh_frame ELF section (which we don't keep)
+  print ".cfi_sections .debug_frame"
+
+  # only emit CFI directives inside a function
+  in_function = 0
+
+  # emit .loc directives with line numbers from original source
+  printf ".file 1 \"%s\"\n", ARGV[1]
+  line_number = 0
+
+  # used to detect "call label; label:" trick
+  called = ""
+}
+
+function hex2int(str,   i) {
+  str = tolower(str)
+
+  for (i = 1; i <= 16; i++) {
+    char = substr("0123456789abcdef", i, 1)
+    lookup[char] = i-1
+  }
+
+  result = 0
+  for (i = 1; i <= length(str); i++) {
+    result = result * 16
+    char   = substr(str, i, 1)
+    result = result + lookup[char]
+  }
+  return result
+}
+
+function parse_const(str) {
+  sign = sub(/^-/, "", str)
+  hex  = sub(/^0x/, "", str)
+  if (hex)
+    n = hex2int(str)
+  else
+    n = str+0
+  return sign ? -n : n
+}
+
+function get_const1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is constant)
+  match($0, /-?(0x[0-9a-fA-F]+|[0-9]+),/)
+  return parse_const(substr($0, RSTART, RLENGTH-1))
+}
+function get_reg() {
+  # only use if you already know there is 1 and only 1 register
+  match($0, /%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is register)
+  match($0, /%e(ax|bx|cx|dx|si|di|bp),/)
+  return substr($0, RSTART+1, 3)
+}
+function get_reg2() {
+  # for instructions with 2 operands, get 2nd operand (assuming it is register)
+  match($0, /,%e(ax|bx|cx|dx|si|di|bp)/)
+  return substr($0, RSTART+RLENGTH-3, 3)
+}
+
+function adjust_sp_offset(delta) {
+  if (in_function)
+    printf ".cfi_adjust_cfa_offset %d\n", delta
+}
+
+{
+  line_number = line_number + 1
+
+  # clean the input up before doing anything else
+  # delete comments
+  gsub(/(#|\/\/).*/, "")
+
+  # canonicalize whitespace
+  gsub(/[ \t]+/, " ") # mawk doesn't understand \s
+  gsub(/ *, */, ",")
+  gsub(/ *: */, ": ")
+  gsub(/ $/, "")
+  gsub(/^ /, "")
+}
+
+# check for assembler directives which we care about
+/^\.(section|data|text)/ {
+  # a .cfi_startproc/.cfi_endproc pair should be within the same section
+  # otherwise, clang will choke when generating ELF output
+  if (in_function) {
+    print ".cfi_endproc"
+    in_function = 0
+  }
+}
+/^\.globa?l +[a-zA-Z0-9_]+/ {
+  globals[$2] = 1
+}
+# not interested in assembler directives beyond this, just pass them through
+/^\./ {
+  print
+  next
+}
+
+/^[a-zA-Z0-9_]+:/ {
+  label = substr($1, 1, length($1)-1) # drop trailing :
+
+  if (called == label) {
+    # note adjustment of stack pointer from "call label; label:"
+    adjust_sp_offset(4)
+  }
+
+  if (globals[label]) {
+    if (in_function)
+      print ".cfi_endproc"
+
+    in_function = 1
+    print ".cfi_startproc"
+
+    for (register in saved)
+      delete saved[register]
+    for (register in dirty)
+      delete dirty[register]
+  }
+
+  # an instruction may follow on the same line, so continue processing
+}
+
+/^$/ { next }
+
+{
+  called = ""
+  printf ".loc 1 %d\n", line_number
+  print
+}
+
+# KEEPING UP WITH THE STACK POINTER
+# We do NOT attempt to understand foolish and ridiculous tricks like stashing
+#   the stack pointer and then using %esp as a scratch register, or bitshifting
+#   it or taking its square root or anything stupid like that.
+# %esp should only be adjusted by pushing/popping or adding/subtracting constants
+#
+/pushl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(2)
+  else
+    adjust_sp_offset(4)
+}
+/popl?/ {
+  if (match($0, / %(ax|bx|cx|dx|di|si|bp|sp)/))
+    adjust_sp_offset(-2)
+  else
+    adjust_sp_offset(-4)
+}
+/addl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(-get_const1()) }
+/subl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%esp/ { adjust_sp_offset(get_const1()) }
+
+/call/ {
+  if (match($0, /call [0-9]+f/)) # "forward" label
+    called = substr($0, RSTART+5, RLENGTH-6)
+  else if (match($0, /call [0-9a-zA-Z_]+/))
+    called = substr($0, RSTART+5, RLENGTH-5)
+}
+
+# TRACKING REGISTER VALUES FROM THE PREVIOUS STACK FRAME
+#
+/pushl? %e(ax|bx|cx|dx|si|di|bp)/ { # don't match "push (%reg)"
+  # if a register is being pushed, and its value has not changed since the
+  #   beginning of this function, the pushed value can be used when printing
+  #   local variables at the next level up the stack
+  # emit '.cfi_rel_offset' for that
+
+  if (in_function) {
+    register = get_reg()
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,0\n", register
+      saved[register] = 1
+    }
+  }
+}
+
+/movl? %e(ax|bx|cx|dx|si|di|bp),-?(0x[0-9a-fA-F]+|[0-9]+)?\(%esp\)/ {
+  if (in_function) {
+    register = get_reg()
+    if (match($0, /-?(0x[0-9a-fA-F]+|[0-9]+)\(%esp\)/)) {
+      offset = parse_const(substr($0, RSTART, RLENGTH-6))
+    } else {
+      offset = 0
+    }
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,%d\n", register, offset
+      saved[register] = 1
+    }
+  }
+}
+
+# IF REGISTER VALUES ARE UNCEREMONIOUSLY TRASHED
+# ...then we want to know about it.
+#
+function trashed(register) {
+  if (in_function && !saved[register] && !dirty[register]) {
+    printf ".cfi_undefined %s\n", register
+  }
+  dirty[register] = 1
+}
+# this does NOT exhaustively check for all possible instructions which could
+# overwrite a register value inherited from the caller (just the common ones)
+/mov.*,%e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg2()) }
+/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr) %e(ax|bx|cx|dx|si|di|bp),/ {
+  trashed(get_reg1())
+}
+/i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
+/i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
+/i?div/                           { trashed("eax"); trashed("edx") }
+/(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
+/cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
+
+END {
+  if (in_function)
+    print ".cfi_endproc"
+}
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-07-09  5:00 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-15  6:45 [PATCH v9] Build process uses script to add CFI directives to x86 asm Alex Dowad
2015-06-20 21:11 ` Rich Felker
2015-06-21  5:00   ` Alex
2015-06-21  5:05     ` Rich Felker
2015-06-23 19:18 Alex Dowad
2015-06-23 21:52 ` Rich Felker
2015-06-24  5:33   ` Alex
2015-06-28  2:58     ` Rich Felker
2015-07-08 14:16     ` Rich Felker
2015-07-08 14:25       ` Alex
2015-07-09  5:00         ` Rich Felker
2015-07-07  3:39 ` Rich Felker
2015-07-07  5:45   ` Alex
2015-07-07 13:21     ` Rich Felker
2015-07-07 16:40       ` Alex
2015-07-07 18:15         ` Rich Felker
2015-07-07 18:27           ` Alex
     [not found]             ` <20150708023638.GJ1173@brightrain.aerifal.cx>
2015-07-08  7:13               ` Alex
2015-07-08  7:54                 ` Alex
2015-07-08 14:11                   ` Rich Felker
2015-07-08 14:24                     ` Alex

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).