From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8650 Path: news.gmane.org!not-for-mail From: Alex Newsgroups: gmane.linux.lib.musl.general Subject: Re: [PATCH 2/3] fix matching errors related to i386 addressing modes in CFI generation script Date: Mon, 12 Oct 2015 20:30:00 +0200 Message-ID: References: <1444658340-10065-1-git-send-email-alexinbeijing@gmail.com> <1444658340-10065-2-git-send-email-alexinbeijing@gmail.com> <20151012151203.GO8645@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/alternative; boundary=089e013a28225b8aed0521ec8497 X-Trace: ger.gmane.org 1444674617 22066 80.91.229.3 (12 Oct 2015 18:30:17 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Mon, 12 Oct 2015 18:30:17 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-8662-gllmg-musl=m.gmane.org@lists.openwall.com Mon Oct 12 20:30:17 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1ZlhrU-0006EK-N7 for gllmg-musl@m.gmane.org; Mon, 12 Oct 2015 20:30:16 +0200 Original-Received: (qmail 7335 invoked by uid 550); 12 Oct 2015 18:30:13 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 7313 invoked from network); 12 Oct 2015 18:30:12 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=/H23OdTKre4DIXd2e7y+wuP2rk8cypq0Pe+3MWwHUBg=; b=eN7U/q5scWa6uQieASpujWK4HpLds61G9Vsx6ysPNHos4d03Fv9OVQmnTbMK5qiGnL BPUKM4ZHhEaB9Xj2IKFum54vFYv9X/ZPRQadTqp0UhvgZvCP9hEPpdESNXOdYnzlW3Xp +Vr8LUoBs2akMY+6WJZiaZmjrObYj67NSlDAnTwpSGZjv6QZZTsJHIik1It/q1j6OmIM cTImHOKpMWbqxdKUo/eKdlzL/hpcGJzXrJRvGwfkfo/mGHirQ3KSoPgkxFtrnWu+opy+ mu39HTkbMLBzXosz2MiSwCIC7/iW56whsEvAzibWeCKd6z30nRX7kB4wjiaVRc/vuHAS rQng== X-Received: by 10.50.73.138 with SMTP id l10mr15123293igv.53.1444674600140; Mon, 12 Oct 2015 11:30:00 -0700 (PDT) In-Reply-To: <20151012151203.GO8645@brightrain.aerifal.cx> Xref: news.gmane.org gmane.linux.lib.musl.general:8650 Archived-At: --089e013a28225b8aed0521ec8497 Content-Type: text/plain; charset=UTF-8 On Mon, Oct 12, 2015 at 5:12 PM, Rich Felker wrote: > On Mon, Oct 12, 2015 at 03:58:59PM +0200, Alex Dowad wrote: > > the regexps previously used to identify registers clobbered by MOVs, > ADDs, > > and various other operations would erroneously match index registers. In > other > > words, the following asm: > > > > mov $0, (%eax,%ebx,4) > > > > ....would cause EBX to be considered as overwritten, which might prevent > a > > debugger from displaying a variable's value in a higher stack frame. > > > > thanks to Rich Felker for noticing this problem. > > --- > > tools/add-cfi.i386.awk | 10 +++++----- > > 1 file changed, 5 insertions(+), 5 deletions(-) > > > > diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk > > index fc0d8cf..bd7932f 100644 > > --- a/tools/add-cfi.i386.awk > > +++ b/tools/add-cfi.i386.awk > > @@ -184,13 +184,13 @@ function trashed(register) { > > } > > # this does NOT exhaustively check for all possible instructions which > could > > # overwrite a register value inherited from the caller (just the common > ones) > > -/mov.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) } > > > -/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)/ > { > > +/mov.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) } > > > +/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)$/ > { > > trashed(get_reg2()) > > } > > -/^i?mul [^,]*$/ { trashed("eax"); trashed("edx") } > > -/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) } > > -/^i?div/ { trashed("eax"); trashed("edx") } > > +/^i?mul [^,]*$/ { trashed("eax"); trashed("edx") } > > +/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) } > > +/^i?div/ { trashed("eax"); trashed("edx") } > > /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg()) } > > /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); > trashed("edx") } > > Clever. At first I didn't see how this was fixing anything, with the > .* still there, but given that you strip comments and extra > whitespace, anchoring to the end with $ seems to work. > > While seeing them separately was useful for seeing how you fixed the > bug, patches 1 and 2 should be merged for commit. All patch 2 is doing > is fixing a bug that patch 1 introduces; together they just form a > non-buggy version of "fix operand order". I can take care of the > merging though. > Let me do it. I'll merge them and send another patch series. One other thing I noticed for future improvement: your patterns don't > seem to catch instructions that modify just the low byte or half of a > register. These are fairly uncommon in musl's i386 asm, but for > x86_64, I would estimate a good 50% of register usage uses the 32-bit > half (%e..) of a register rather than the full %r.., and your current > script fails to mark these clobbers at all. Probably the regex should > be something like %[er]?([abcd][xlh]|si|di|bp|...) - I don't recall > the right form for the numbered x86_64 registers' low parts right off, > though. Thanks for bringing this out. I'll fix this too and send again. Alex --089e013a28225b8aed0521ec8497 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
On M= on, Oct 12, 2015 at 5:12 PM, Rich Felker <dalias@libc.org> wro= te:
On Mon, Oct 12, 2015= at 03:58:59PM +0200, Alex Dowad wrote:
> the regexps previously used to identify registers clobbered by MOVs, A= DDs,
> and various other operations would erroneously match index registers. = In other
> words, the following asm:
>
>=C2=A0 =C2=A0 =C2=A0mov $0, (%eax,%ebx,4)
>
> ....would cause EBX to be considered as overwritten, which migh= t prevent a
> debugger from displaying a variable's value= in a higher stack frame.
>
> thanks to Rich Felker for noticing this problem.
> ---
>=C2=A0 tools/add-cfi.i386.awk | 10 +++++-----
>=C2=A0 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
> index fc0d8cf..bd7932f 100644
> --- a/tools/add-cfi.i386.awk
> +++ b/tools/add-cfi.i386.awk
> @@ -184,13 +184,13 @@ function trashed(register) {
>=C2=A0 }
>=C2=A0 # this does NOT exhaustively check for all possible instructions= which could
>=C2=A0 # overwrite a register value inherited from the caller (just the= common ones)
> -/mov.*,%e(ax|bx|cx|dx|si|di|bp)/=C2=A0 { trashed(get_reg2()) }
> -/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|= si|di|bp)/ {
> +/mov.*,%e(ax|bx|cx|dx|si|di|bp)$/=C2=A0 { trashed(get_reg2()) }
> +/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|= si|di|bp)$/ {
>=C2=A0 =C2=A0 trashed(get_reg2())
>=C2=A0 }
> -/^i?mul [^,]*$/=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0{ trashed("eax"); trashed("edx"= ) }
> -/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) }
> -/^i?div/=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 { trashed("eax"); trashed(= "edx") }
> +/^i?mul [^,]*$/=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 { trashed("eax"); trashed("edx"= ;) }
> +/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) }
> +/^i?div/=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0{ trashed("eax"); tr= ashed("edx") }
>=C2=A0 /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/=C2=A0 { trashed= (get_reg()) }
>=C2=A0 /cpuid/ { trashed("eax"); trashed("ebx"); tr= ashed("ecx"); trashed("edx") }

Clever. At first I didn't see how this was fixing anything,= with the
.* still there, but given that you strip comments and extra
whitespace, anchoring to the end with $ seems to work.

While seeing them separately was useful for seeing how you fixed the
bug, patches 1 and 2 should be merged for commit. All patch 2 is doing
is fixing a bug that patch 1 introduces; together they just form a
non-buggy version of "fix operand order". I can take care of the<= br> merging though.

Let me do it. I'll = merge them and send another patch series.=C2=A0

One other thing I noticed for future improvement: your patterns don't seem to catch instructions that modify just the low byte or half of a
register. These are fairly uncommon in musl's i386 asm, but for
x86_64, I would estimate a good 50% of register usage uses the 32-bit
half (%e..) of a register rather than the full %r.., and your current
script fails to mark these clobbers at all. Probably the regex should
be something like %[er]?([abcd][xlh]|si|di|bp|...) - I don't recall
the right form for the numbered x86_64 registers' low parts right off,<= br> though.

Thanks for bringing this out. I'= ;ll fix this too and send again.

Alex=C2=A0
<= /div>
--089e013a28225b8aed0521ec8497--