mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file
@ 2015-10-02 11:32 Alex Dowad
  2015-10-02 11:32 ` [PATCH 2/5] When generating CFI for i386 asm, don't mistake an FDIV instruction for DIV Alex Dowad
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Alex Dowad @ 2015-10-02 11:32 UTC (permalink / raw)
  To: musl

There is a lot which could be common between i386 and x86_64, but none of it
will be useful for any other arch. These should be useful for all archs,
however.
---

Thanks to Rich Felker for feedback. Some improvements have been made...

 Makefile                 |  2 +-
 tools/add-cfi.common.awk | 26 ++++++++++++++++++++++++++
 tools/add-cfi.i386.awk   | 27 ---------------------------
 3 files changed, 27 insertions(+), 28 deletions(-)
 create mode 100644 tools/add-cfi.common.awk

diff --git a/Makefile b/Makefile
index 5a6a43b..844a017 100644
--- a/Makefile
+++ b/Makefile
@@ -122,7 +122,7 @@ $(foreach s,$(wildcard src/*/$(ARCH)*/*.s),$(eval $(call mkasmdep,$(s))))
 # Choose invocation of assembler to be used
 # $(1) is input file, $(2) is output file, $(3) is assembler flags
 ifeq ($(ADD_CFI),yes)
-	AS_CMD = LC_ALL=C awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) -x assembler -c -o $@ -
+	AS_CMD = LC_ALL=C awk -f tools/add-cfi.common.awk -f tools/add-cfi.$(ARCH).awk $< | $(CC) -x assembler -c -o $@ -
 else
 	AS_CMD = $(CC) -c -o $@ $<
 endif
diff --git a/tools/add-cfi.common.awk b/tools/add-cfi.common.awk
new file mode 100644
index 0000000..04482d4
--- /dev/null
+++ b/tools/add-cfi.common.awk
@@ -0,0 +1,26 @@
+function hex2int(str,   i) {
+  str = tolower(str)
+
+  for (i = 1; i <= 16; i++) {
+    char = substr("0123456789abcdef", i, 1)
+    lookup[char] = i-1
+  }
+
+  result = 0
+  for (i = 1; i <= length(str); i++) {
+    result = result * 16
+    char   = substr(str, i, 1)
+    result = result + lookup[char]
+  }
+  return result
+}
+
+function parse_const(str) {
+  sign = sub(/^-/, "", str)
+  hex  = sub(/^0x/, "", str)
+  if (hex)
+    n = hex2int(str)
+  else
+    n = str+0
+  return sign ? -n : n
+}
diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
index 4a4a3b6..b8bdd7f 100644
--- a/tools/add-cfi.i386.awk
+++ b/tools/add-cfi.i386.awk
@@ -22,33 +22,6 @@ BEGIN {
   called = ""
 }
 
-function hex2int(str,   i) {
-  str = tolower(str)
-
-  for (i = 1; i <= 16; i++) {
-    char = substr("0123456789abcdef", i, 1)
-    lookup[char] = i-1
-  }
-
-  result = 0
-  for (i = 1; i <= length(str); i++) {
-    result = result * 16
-    char   = substr(str, i, 1)
-    result = result + lookup[char]
-  }
-  return result
-}
-
-function parse_const(str) {
-  sign = sub(/^-/, "", str)
-  hex  = sub(/^0x/, "", str)
-  if (hex)
-    n = hex2int(str)
-  else
-    n = str+0
-  return sign ? -n : n
-}
-
 function get_const1() {
   # for instructions with 2 operands, get 1st operand (assuming it is constant)
   match($0, /-?(0x[0-9a-fA-F]+|[0-9]+),/)
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 2/5] When generating CFI for i386 asm, don't mistake an FDIV instruction for DIV
  2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
@ 2015-10-02 11:32 ` Alex Dowad
  2015-10-02 11:32 ` [PATCH 3/5] When generating CFI for i386 asm, don't mistake an FMUL instruction for MUL Alex Dowad
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Alex Dowad @ 2015-10-02 11:32 UTC (permalink / raw)
  To: musl

...that would cause the script to erroneously conclude that the values of EAX
and EDX have been overwritten.

Probably won't make a difference, but it's just as well to get it right...
---
 tools/add-cfi.i386.awk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
index b8bdd7f..2da1899 100644
--- a/tools/add-cfi.i386.awk
+++ b/tools/add-cfi.i386.awk
@@ -190,7 +190,7 @@ function trashed(register) {
 }
 /i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
 /i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
-/i?div/                           { trashed("eax"); trashed("edx") }
+/^i?div/                          { trashed("eax"); trashed("edx") }
 /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
 /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
 
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 3/5] When generating CFI for i386 asm, don't mistake an FMUL instruction for MUL
  2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
  2015-10-02 11:32 ` [PATCH 2/5] When generating CFI for i386 asm, don't mistake an FDIV instruction for DIV Alex Dowad
@ 2015-10-02 11:32 ` Alex Dowad
  2015-10-02 11:32 ` [PATCH 4/5] In i386 CFI script, binary ops like ADD, AND, etc. modify the 2nd operand, not 1st Alex Dowad
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Alex Dowad @ 2015-10-02 11:32 UTC (permalink / raw)
  To: musl

---
 tools/add-cfi.i386.awk | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
index 2da1899..5dc8794 100644
--- a/tools/add-cfi.i386.awk
+++ b/tools/add-cfi.i386.awk
@@ -188,9 +188,9 @@ function trashed(register) {
 /(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr) %e(ax|bx|cx|dx|si|di|bp),/ {
   trashed(get_reg1())
 }
-/i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
-/i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
-/^i?div/                          { trashed("eax"); trashed("edx") }
+/^i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
+/^i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
+/^i?div/                           { trashed("eax"); trashed("edx") }
 /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
 /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
 
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 4/5] In i386 CFI script, binary ops like ADD, AND, etc. modify the 2nd operand, not 1st
  2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
  2015-10-02 11:32 ` [PATCH 2/5] When generating CFI for i386 asm, don't mistake an FDIV instruction for DIV Alex Dowad
  2015-10-02 11:32 ` [PATCH 3/5] When generating CFI for i386 asm, don't mistake an FMUL instruction for MUL Alex Dowad
@ 2015-10-02 11:32 ` Alex Dowad
  2015-10-02 11:32 ` [PATCH 5/5] Add script to add CFI directives to asm files in debug builds of x86_64 Alex Dowad
  2015-10-08 21:37 ` [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Rich Felker
  4 siblings, 0 replies; 6+ messages in thread
From: Alex Dowad @ 2015-10-02 11:32 UTC (permalink / raw)
  To: musl

Thanks to Rich Felker for noticing my mistake here!
---
 tools/add-cfi.i386.awk | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
index 5dc8794..fc0d8cf 100644
--- a/tools/add-cfi.i386.awk
+++ b/tools/add-cfi.i386.awk
@@ -185,12 +185,12 @@ function trashed(register) {
 # this does NOT exhaustively check for all possible instructions which could
 # overwrite a register value inherited from the caller (just the common ones)
 /mov.*,%e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg2()) }
-/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr) %e(ax|bx|cx|dx|si|di|bp),/ {
-  trashed(get_reg1())
+/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)/ {
+  trashed(get_reg2())
 }
-/^i?mul [^,]*$/                    { trashed("eax"); trashed("edx") }
-/^i?mul %e(ax|bx|cx|dx|si|di|bp),/ { trashed(get_reg1()) }
-/^i?div/                           { trashed("eax"); trashed("edx") }
+/^i?mul [^,]*$/                     { trashed("eax"); trashed("edx") }
+/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) }
+/^i?div/                            { trashed("eax"); trashed("edx") }
 /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
 /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }
 
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 5/5] Add script to add CFI directives to asm files in debug builds of x86_64
  2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
                   ` (2 preceding siblings ...)
  2015-10-02 11:32 ` [PATCH 4/5] In i386 CFI script, binary ops like ADD, AND, etc. modify the 2nd operand, not 1st Alex Dowad
@ 2015-10-02 11:32 ` Alex Dowad
  2015-10-08 21:37 ` [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Rich Felker
  4 siblings, 0 replies; 6+ messages in thread
From: Alex Dowad @ 2015-10-02 11:32 UTC (permalink / raw)
  To: musl

This makes debugging more pleasant for the odd times when you have to dig into
files implemented in x86_64 assembly.

It turns out that the parts we care about are almost exactly like i386, except
for the register names!
---
 tools/add-cfi.x86_64.awk | 185 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 tools/add-cfi.x86_64.awk

diff --git a/tools/add-cfi.x86_64.awk b/tools/add-cfi.x86_64.awk
new file mode 100644
index 0000000..8c82ea3
--- /dev/null
+++ b/tools/add-cfi.x86_64.awk
@@ -0,0 +1,185 @@
+# Insert GAS CFI directives ("control frame information") into x86-64 asm input
+
+BEGIN {
+  # don't put CFI data in the .eh_frame ELF section (which we don't keep)
+  print ".cfi_sections .debug_frame"
+
+  # only emit CFI directives inside a function
+  in_function = 0
+
+  # emit .loc directives with line numbers from original source
+  printf ".file 1 \"%s\"\n", ARGV[1]
+  line_number = 0
+
+  # used to detect "call label; label:" trick
+  called = ""
+}
+
+function get_const1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is constant)
+  match($0, /-?(0x[0-9a-fA-F]+|[0-9]+),/)
+  return parse_const(substr($0, RSTART, RLENGTH-1))
+}
+function get_reg() {
+  # only use if you already know there is 1 and only 1 register
+  match($0, /%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/)
+  return substr($0, RSTART+1, RLENGTH-1)
+}
+function get_reg1() {
+  # for instructions with 2 operands, get 1st operand (assuming it is register)
+  match($0, /%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15),/)
+  return substr($0, RSTART+1, RLENGTH-2)
+}
+function get_reg2() {
+  # for instructions with 2 operands, get 2nd operand (assuming it is register)
+  match($0, /,%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/)
+  return substr($0, RSTART+2, RLENGTH-2)
+}
+
+function adjust_sp_offset(delta) {
+  if (in_function)
+    printf ".cfi_adjust_cfa_offset %d\n", delta
+}
+
+{
+  line_number = line_number + 1
+
+  # clean the input up before doing anything else
+  # delete comments
+  gsub(/(#|\/\/).*/, "")
+
+  # canonicalize whitespace
+  gsub(/[ \t]+/, " ") # mawk doesn't understand \s
+  gsub(/ *, */, ",")
+  gsub(/ *: */, ": ")
+  gsub(/ $/, "")
+  gsub(/^ /, "")
+}
+
+# check for assembler directives which we care about
+/^\.(section|data|text)/ {
+  # a .cfi_startproc/.cfi_endproc pair should be within the same section
+  # otherwise, clang will choke when generating ELF output
+  if (in_function) {
+    print ".cfi_endproc"
+    in_function = 0
+  }
+}
+/^\.type [a-zA-Z0-9_]+,\@function/ {
+  functions[substr($2, 1, length($2)-10)] = 1
+}
+# not interested in assembler directives beyond this, just pass them through
+/^\./ {
+  print
+  next
+}
+
+/^[a-zA-Z0-9_]+:/ {
+  label = substr($1, 1, length($1)-1) # drop trailing :
+
+  if (called == label) {
+    # note adjustment of stack pointer from "call label; label:"
+    adjust_sp_offset(8)
+  }
+
+  if (functions[label]) {
+    if (in_function)
+      print ".cfi_endproc"
+
+    in_function = 1
+    print ".cfi_startproc"
+
+    for (register in saved)
+      delete saved[register]
+    for (register in dirty)
+      delete dirty[register]
+  }
+
+  # an instruction may follow on the same line, so continue processing
+}
+
+/^$/ { next }
+
+{
+  called = ""
+  printf ".loc 1 %d\n", line_number
+  print
+}
+
+# KEEPING UP WITH THE STACK POINTER
+# %rsp should only be adjusted by pushing/popping or adding/subtracting constants
+#
+/pushl?/ {
+  adjust_sp_offset(8)
+}
+/popl?/ {
+  adjust_sp_offset(-8)
+}
+/addl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%rsp/ { adjust_sp_offset(-get_const1()) }
+/subl? \$-?(0x[0-9a-fA-F]+|[0-9]+),%rsp/ { adjust_sp_offset(get_const1()) }
+
+/call/ {
+  if (match($0, /call [0-9]+f/)) # "forward" label
+    called = substr($0, RSTART+5, RLENGTH-6)
+  else if (match($0, /call [0-9a-zA-Z_]+/))
+    called = substr($0, RSTART+5, RLENGTH-5)
+}
+
+# TRACKING REGISTER VALUES FROM THE PREVIOUS STACK FRAME
+#
+/pushl? %r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/ { # don't match "push (%reg)"
+  # if a register is being pushed, and its value has not changed since the
+  #   beginning of this function, the pushed value can be used when printing
+  #   local variables at the next level up the stack
+  # emit '.cfi_rel_offset' for that
+
+  if (in_function) {
+    register = get_reg()
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,0\n", register
+      saved[register] = 1
+    }
+  }
+}
+
+/movl? %r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15),-?(0x[0-9a-fA-F]+|[0-9]+)?\(%rsp\)/ {
+  if (in_function) {
+    register = get_reg()
+    if (match($0, /-?(0x[0-9a-fA-F]+|[0-9]+)\(%rsp\)/)) {
+      offset = parse_const(substr($0, RSTART, RLENGTH-6))
+    } else {
+      offset = 0
+    }
+    if (!saved[register] && !dirty[register]) {
+      printf ".cfi_rel_offset %s,%d\n", register, offset
+      saved[register] = 1
+    }
+  }
+}
+
+# IF REGISTER VALUES ARE UNCEREMONIOUSLY TRASHED
+# ...then we want to know about it.
+#
+function trashed(register) {
+  if (in_function && !saved[register] && !dirty[register]) {
+    printf ".cfi_undefined %s\n", register
+  }
+  dirty[register] = 1
+}
+# this does NOT exhaustively check for all possible instructions which could
+# overwrite a register value inherited from the caller (just the common ones)
+/mov.*,%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/  { trashed(get_reg2()) }
+/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/ {
+  trashed(get_reg2())
+}
+/^i?mul [^,]*$/                    { trashed("rax"); trashed("rdx") }
+/^i?mul.*,%r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/ { trashed(get_reg2()) }
+/^i?div/                          { trashed("rax"); trashed("rdx") }
+
+/(dec|inc|not|neg|pop) %r(ax|bx|cx|dx|si|di|bp|8|9|10|11|12|13|14|15)/  { trashed(get_reg()) }
+/cpuid/ { trashed("rax"); trashed("rbx"); trashed("rcx"); trashed("rdx") }
+
+END {
+  if (in_function)
+    print ".cfi_endproc"
+}
-- 
2.0.0.GIT



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file
  2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
                   ` (3 preceding siblings ...)
  2015-10-02 11:32 ` [PATCH 5/5] Add script to add CFI directives to asm files in debug builds of x86_64 Alex Dowad
@ 2015-10-08 21:37 ` Rich Felker
  4 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2015-10-08 21:37 UTC (permalink / raw)
  To: musl

On Fri, Oct 02, 2015 at 01:32:32PM +0200, Alex Dowad wrote:
> There is a lot which could be common between i386 and x86_64, but none of it
> will be useful for any other arch. These should be useful for all archs,
> however.
> ---

Thanks! I'm commenting on these as a set rather than breaking this
down into separate replies.

I think you still have errors in your instruction parsing which will
lead to incorrect clobbers, just not as bad as before. For example,
movl $0,(%esp,%esi) gets marked as clobbering %esi. Using .* in the
patterns is probably a bad idea. I think the x86_64 patch is also
affected, but I tested it and otherwise it basically works.

In any case I'm going ahead and applying patches 1-3, with 2 and 3
merged into one commit. In general, try to merge changes that are
conceptually related and where putting them together in one patch
doesn't obscure what was changed; for example, merging 4 with 2-3
would obscure changes IMO since there would be pattern changes and
operand order changes in the same lines. I've also edited commit
messages to match capitalization used in shortlog and to make git log
fit in 80 columns (generally messages should be wrapped at 70-72
chars, like email).

Could you follow up with fixes for patches 4 and 5? I'd like to try to
get these into the next release (soon).

Rich


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2015-10-08 21:37 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-02 11:32 [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Alex Dowad
2015-10-02 11:32 ` [PATCH 2/5] When generating CFI for i386 asm, don't mistake an FDIV instruction for DIV Alex Dowad
2015-10-02 11:32 ` [PATCH 3/5] When generating CFI for i386 asm, don't mistake an FMUL instruction for MUL Alex Dowad
2015-10-02 11:32 ` [PATCH 4/5] In i386 CFI script, binary ops like ADD, AND, etc. modify the 2nd operand, not 1st Alex Dowad
2015-10-02 11:32 ` [PATCH 5/5] Add script to add CFI directives to asm files in debug builds of x86_64 Alex Dowad
2015-10-08 21:37 ` [PATCH 1/5] Pull a couple common AWK functions for CFI scripts into separate file Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).