From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8649
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH 2/3] fix matching errors related to i386 addressing modes in CFI generation script
Date: Mon, 12 Oct 2015 11:12:03 -0400
Message-ID: <20151012151203.GO8645@brightrain.aerifal.cx>
References: <1444658340-10065-1-git-send-email-alexinbeijing@gmail.com> <1444658340-10065-2-git-send-email-alexinbeijing@gmail.com>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1444662742 17070 80.91.229.3 (12 Oct 2015 15:12:22 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Mon, 12 Oct 2015 15:12:22 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-8661-gllmg-musl=m.gmane.org@lists.openwall.com Mon Oct 12 17:12:21 2015
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1Zlelv-0005W1-Kj for gllmg-musl@m.gmane.org; Mon, 12 Oct 2015 17:12:19 +0200
Original-Received: (qmail 32140 invoked by uid 550); 12 Oct 2015 15:12:17 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Original-Received: (qmail 32122 invoked from network); 12 Oct 2015 15:12:16 -0000
Content-Disposition: inline
In-Reply-To: <1444658340-10065-2-git-send-email-alexinbeijing@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:8649

On Mon, Oct 12, 2015 at 03:58:59PM +0200, Alex Dowad wrote:
> the regexps previously used to identify registers clobbered by MOVs, ADDs,
> and various other operations would erroneously match index registers.
> In other words, the following asm:
>
>     mov $0, (%eax,%ebx,4)
>
> ....would cause EBX to be considered as overwritten, which might prevent a
> debugger from displaying a variable's value in a higher stack frame.
>
> thanks to Rich Felker for noticing this problem.
> ---
>  tools/add-cfi.i386.awk | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
> index fc0d8cf..bd7932f 100644
> --- a/tools/add-cfi.i386.awk
> +++ b/tools/add-cfi.i386.awk
> @@ -184,13 +184,13 @@ function trashed(register) {
>  }
>  # this does NOT exhaustively check for all possible instructions which could
>  # overwrite a register value inherited from the caller (just the common ones)
> -/mov.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) }
> -/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)/ {
> +/mov.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) }
> +/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)$/ {
>    trashed(get_reg2())
>  }
> -/^i?mul [^,]*$/ { trashed("eax"); trashed("edx") }
> -/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) }
> -/^i?div/ { trashed("eax"); trashed("edx") }
> +/^i?mul [^,]*$/ { trashed("eax"); trashed("edx") }
> +/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) }
> +/^i?div/ { trashed("eax"); trashed("edx") }
>  /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg()) }
>  /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }

Clever. At first I didn't see how this was fixing anything, with the
.* still there, but given that you strip comments and extra
whitespace, anchoring to the end with $ seems to work.

While seeing them separately was useful for seeing how you fixed the
bug, patches 1 and 2 should be merged for commit. All patch 2 is doing
is fixing a bug that patch 1 introduces; together they just form a
non-buggy version of "fix operand order".
I can take care of the merging though.

One other thing I noticed for future improvement: your patterns don't
seem to catch instructions that modify just the low byte or half of a
register. These are fairly uncommon in musl's i386 asm, but for
x86_64, I would estimate a good 50% of register usage uses the 32-bit
half (%e..) of a register rather than the full %r.., and your current
script fails to mark these clobbers at all. Probably the regex should
be something like %[er]?([abcd][xlh]|si|di|bp|...) - I don't recall
the right form for the numbered x86_64 registers' low parts right off,
though.

Rich
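
For illustration only, here is a minimal Python sketch (not part of the
patch or of musl) of the two regex points discussed above. The patterns
are transliterated from the quoted awk diff. It assumes, as the reply
notes, that comments and extra whitespace have already been stripped
from each line, so that a destination register really is the last thing
on the line. SUBREG is a hypothetical widening built from the suggested
%[er]?([abcd][xlh]|si|di|bp|...) form with the elided alternatives left
out, so it does not cover the numbered x86_64 registers. The names
MOV_UNANCHORED, MOV_ANCHORED, SUBREG and matches are invented for this
example.

```python
import re

REGS = r"(ax|bx|cx|dx|si|di|bp)"

# Patterns transliterated from the quoted diff (awk ERE -> Python re).
MOV_UNANCHORED = re.compile(r"mov.*,%e" + REGS)       # before the patch
MOV_ANCHORED = re.compile(r"mov.*,%e" + REGS + r"$")  # after the patch

# Hypothetical widening based on the form suggested in the reply.
# The alternatives elided there ("...") are simply omitted here, so
# r8..r15 and their low parts are NOT matched.
SUBREG = re.compile(r",%[er]?([abcd][xlh]|si|di|bp)$")


def matches(pattern, line):
    """Return True if pattern matches anywhere in line, as awk's /re/ does."""
    return pattern.search(line) is not None


# A store through an index register: %ebx is read, not overwritten.
store = "mov $0,(%eax,%ebx,4)"
# A load whose destination really is %eax.
load = "mov (%esi),%eax"

for line in (store, load):
    print(line,
          "unanchored:", matches(MOV_UNANCHORED, line),
          "anchored:", matches(MOV_ANCHORED, line))

# Destinations of various widths, plus one numbered register.
for line in ("mov $1,%al", "xor %eax,%eax", "mov %rdi,%rax",
             "mov %eax,%r8d", store):
    print(line, "subreg:", matches(SUBREG, line))
```

The store line matches the unanchored pattern (through ",%ebx") but not
the anchored one, while the load matches both. SUBREG accepts %al, %eax
and %rax as destinations but not %r8d, which is the gap the reply
leaves open.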