From: Mikael Magnusson
To: Peter Stephenson
Cc: "Zsh Hackers' List"
Subject: Re: PATCH: parse from even deeper in hell
Date: Fri, 20 Feb 2015 04:33:07 +0100
References: <20150219101315.477f7f95@pwslap01u.europe.root.pri> <20150219220311.7dfdc4ec@ntlworld.com>
List-Id: Zsh Workers List
X-Seq: 34580

On Fri, Feb 20, 2015 at 4:22 AM, Mikael Magnusson wrote:
> On Fri, Feb 20, 2015 at 4:16 AM, Mikael Magnusson wrote:
>> On Thu, Feb 19, 2015 at 11:03 PM, Peter Stephenson
>> wrote:
>>> On Thu, 19 Feb 2015 22:47:12 +0100
>>> Mikael Magnusson wrote:
>>>> I get a crapton of "bad(2) wordsplit reading history:"
with this >>>> patch. It seems like all the failed lines have metafied characters in >>>> them, if that's a hint. Most don't contain any syntax characters at >>>> all, for example: >>>> hist.c:3499: bad(2) wordsplit reading history: mp3info =E5=A5=BD=E3= =81=8D=E3=81=AB=E3=81=AA=E3=82=8A\M-c\M-^A=E3=81=84.mp3 >>>> at: =E5=A5=BD=E3=81=8D=E3=81=AB=E3=81=AA=E3=82=8A\M-c\M-^A=E3=81=84.mp= 3s >>>> word: =E5=A5=BD=E3=81=8D=E3=81=AB=E3=81=AA=E3=82=8A\M-c\M-^A=E3=81=84.= mp3 >>> >>> Unless I'm missing something, I don't think you've said what the real >>> characters you're expecting are. The broken ones aren't much use for >>> testing. >>> >>>> The (2) means it's the second of the two bad=3D1; assignments >>>> triggering. >>> >>> At line 3490? >> >> Yes. >> >>>> I'm also not sure why the utf8 is slightly mishandled in the output >>>> there. It has at least been unmetafied, the raw string in the history >>>> file is more or less: >>>> mp3info =E5=A5=BD=E3=81=83=EF=BF=BD=E3=81=AB=E3=81=AA=E3=82=83=EF=BF= =BD=E3=81=9F=E3=81=83=EF=BF=BD.mp3 >>> >>> So those aren't actually valid characters? Does that mean metafied >>> characters are getting into the history? I've made it necessary for tw= o >>> more bytes to be metafied, so if the shell was expecting them to be >>> metafied in the history file they won't be. The bytes are 0x9e and >>> 0x9f. I guess we could special case those, but do we really output >>> metafied characters to the history file? >> >> The actual line in the history is >> mp3info =E5=A5=BD=E3=81=8D=E3=81=AB=E3=81=AA=E3=82=8A=E3=81=9F=E3=81=84.= mp3 >> but in the history _file_, it's stored metafied, which is hard to >> paste into an email. I'm not sure why pasting the original string >> didn't occur to me. AFAIK, history files have always been metafied. >> I'm not sure why the =E3=81=9F is mangled in the error message is what I= tried >> to say originally. The final byte is 9f which I suppose is an esc with >> the 8th bit set. 
>> Maybe something is trying to double unmetafy? Running
>> it through unmetafy() twice doesn't cause any problems though...
>
> Just looked at the debug code and found out about ZSH_DEBUG_LOG, turns
> out there's also a 0x8A just before the \M-c\M-^

Rerunning the original command seems to produce a different metafied
string than what was in the history before. What's weird is that it
does import correctly into the session from both lines... The line
from running it again also does not cause the wordsplit error.
grepping both of them into my unmetafy program also produces identical
utf8 strings.

This is the one causing a problem,
mp3info M-eM-%M-=M-cM-^AM-^CM--M-cM-^AM-+M-cM-^AM-*M-cM-^BM-^CM-*M-cM-^AM-^_M-cM-^AM-^CM-$.mp3
and this is what we store now which is fine,
mp3info M-eM-%M-=M-cM-^AM-^CM--M-cM-^AM-+M-cM-^AM-*M-cM-^BM-^CM-*M-cM-^AM-^CM-?M-cM-^AM-^CM-$.mp3

Any idea why I have a bunch of history entries stored differently,
that do unmetafy to the correct string, but are parsed weirdly with a
patch that changes how $(( is parsed? I don't quite see the connection
yet :).

-- 
Mikael Magnusson
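
For reference, the two stored forms quoted above differ in only one
place: the old entry has the byte 0x9f raw (M-^_), while the new one
has it as M-^C M-? (0x83 0xbf). The sketch below is a rough
illustration of why both decode to the same bytes. It assumes the usual
zsh convention that the Meta marker is 0x83 and that a metafied byte is
stored as Meta followed by the byte XOR 32. The names unmetafy_sketch,
old_entry, new_entry and same_after_unmetafy are made up for this
example, and this is not the code in zsh itself.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: Meta is 0x83 and a metafied byte is stored as
 * Meta followed by (byte ^ 32). */
#define META 0x83

/* Hypothetical helper: decode len metafied bytes from in into out
 * (out must hold at least len bytes). Returns the decoded length. */
size_t unmetafy_sketch(const unsigned char *in, size_t len,
                       unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        if (in[i] == META && i + 1 < len)
            out[n++] = in[++i] ^ 32;  /* drop Meta, restore the byte */
        else
            out[n++] = in[i];         /* ordinary byte, copied as-is */
    }
    return n;
}

/* Tail of the old entry: M-c M-^A M-^_ (0x9f stored raw). */
static const unsigned char old_entry[] = { 0xe3, 0x81, 0x9f };
/* Same character as stored now: M-c M-^A M-^C M-? (0x9f metafied). */
static const unsigned char new_entry[] = { 0xe3, 0x81, 0x83, 0xbf };

/* Returns 1 if both stored forms decode to the same byte sequence. */
int same_after_unmetafy(void)
{
    unsigned char a[8], b[8];
    size_t na = unmetafy_sketch(old_entry, sizeof old_entry, a);
    size_t nb = unmetafy_sketch(new_entry, sizeof new_entry, b);
    return na == nb && memcmp(a, b, na) == 0;
}
```

Under these assumptions both forms decode to e3 81 9f, the UTF-8
encoding of た, which matches the observation that both lines unmetafy
to identical strings. It says nothing about why the raw form trips the
wordsplit check.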