From: Alexey Tourbin <at@altlinux.ru>
To: zsh-workers@sunsite.dk
Subject: Re: compaudit problem
Date: Thu, 19 Oct 2006 04:17:23 +0400
Message-ID: <20061019001723.GT11317@localhost.localdomain>
In-Reply-To: <20061018182019.62809029.pws@csr.com>
On Wed, Oct 18, 2006 at 06:20:19PM +0100, Peter Stephenson wrote:
> Alexey Tourbin <at@altlinux.ru> wrote:
> > Thanks for the clue. git-bisect now blames 22544.
>
> That patch made the shell smarter about finding the end of
> special types of string known to the shell (identifiers in particular),
> using the multibyte code.
>
> I wonder if it's part of the problem Andrey noted? At some points the
> string we apply this to may contain tokenized characters, which
> aren't valid multibyte characters. Since the string must be metafied,
> these are easy to detect.
>
> The simplest fix is just to ensure we don't try to handle these as
> multibyte characters, telling the caller they're invalid. Most callers
> will just handle it as a single-byte character and move on, which
> is the right thing to do; some callers which really need valid characters
> will abort, but they shouldn't be getting a tokenized string. So
> this might actually work. If not, we need to be smarter, but probably at a
> higher level.
>
> We need some fix like this even if it isn't the root of the present
> problem. (If I could reproduce that it ought now to be easy to trace.)
>
> Index: Src/utils.c
> ===================================================================
> RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
> retrieving revision 1.142
> diff -u -r1.142 utils.c
> --- Src/utils.c 10 Oct 2006 09:37:19 -0000 1.142
> +++ Src/utils.c 18 Oct 2006 17:09:16 -0000
> @@ -4003,6 +4003,21 @@
>          *wcp = (wint_t)(*s == Meta ? s[1] ^ 32 : *s);
>          return 1 + (*s == Meta);
>      }
> +    /*
> +     * We have to handle tokens here, since we may be looking
> +     * through a tokenized input. Obviously this isn't
> +     * a valid multibyte character, so just return WEOF
> +     * and let the caller handle it as a single character.
> +     *
> +     * TODO: I've a sneaking suspicion we could do more here
> +     * to prevent the caller always needing to handle invalid
> +     * characters specially, but sometimes it may need to know.
> +     */
> +    if (itok(*s)) {
> +        if (wcp)
> +            *wcp = EOF;
> +        return 1;
> +    }
>  
>      ret = MB_INVALID;
>      for (ptr = s; *ptr; ) {
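
[Illustration, not part of the original message.] The approach described in the quoted text (report a token byte as an invalid character, consume exactly one byte, and let the caller carry on) can be sketched in standalone C roughly as follows. Everything prefixed sk_ or SK_ is hypothetical, and the byte values are illustrative assumptions rather than zsh's actual definitions. The sketch handles single bytes only and leaves out the real multibyte conversion. It stores WEOF, as the comment in the patch describes, whereas the patch as posted writes EOF.

```c
#include <stddef.h>
#include <wchar.h>

/* Illustrative stand-ins; assumptions, not zsh's real definitions. */
#define SK_META      0x83u  /* assumed escape byte: next byte is stored XOR 32 */
#define SK_TOK_FIRST 0x84u  /* assumed first byte of the token range */
#define SK_TOK_LAST  0xa1u  /* assumed last byte of the token range */

/* Hypothetical stand-in for itok(): is this byte a tokenized character? */
static int sk_is_token(unsigned char c)
{
    return c >= SK_TOK_FIRST && c <= SK_TOK_LAST;
}

/*
 * Examine the character at the start of a metafied string.
 * Returns the number of bytes consumed (0 at end of string) and, if
 * wcp is non-NULL, stores the decoded character, or WEOF when the
 * byte is a token and therefore not a valid character.
 */
static size_t sk_charlen(const char *s, wint_t *wcp)
{
    const unsigned char *u = (const unsigned char *)s;

    if (*u == '\0') {
        if (wcp)
            *wcp = WEOF;
        return 0;
    }
    if (sk_is_token(*u)) {
        /* Token: report "invalid" but still advance by one byte. */
        if (wcp)
            *wcp = WEOF;
        return 1;
    }
    if (*u == SK_META && u[1] != '\0') {
        /* Metafied byte: undo the XOR to recover the original value. */
        if (wcp)
            *wcp = (wint_t)(u[1] ^ 32);
        return 2;
    }
    if (wcp)
        *wcp = (wint_t)*u;
    return 1;
}

/*
 * A caller of the "handle it as a single-byte character and move on"
 * kind: it counts characters and never aborts on a token.
 */
static size_t sk_count_chars(const char *s)
{
    size_t n = 0;

    while (*s) {
        s += sk_charlen(s, NULL);
        n++;
    }
    return n;
}
```

A caller that really needs valid characters would instead test the stored value against WEOF and stop there, which corresponds to the "some callers will abort" case in the quoted text.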
Thanks Peter! This patch resolves the problem.
(I quote the whole message because apparently it was not CC'ed to
zsh-workers.)
Unfortunately I don't quite understand the unicode issues in zsh. I build
the zsh rpm package because I use it (and a few others use it, too). The
latest stable 4.2 release had problems in a utf8 console, so I decided
to move to the then-current cvs snapshot. I got my first decently working
utf8-enabled zsh with the 20050926 snapshot.
So for now, feedback is just about the only thing I can provide.
This will change as I grok the zsh code.
BTW, git archive is available at
git://git.altlinux.org/people/at/packages/zsh.git
The "master" branch is for my own cooking, but the "cvs" branch, as well
as "zsh-4_0-patches" and "zsh-4_2-patches", has pristine zsh sources.
I verified the "cvs" branch against a checkout, and it's almost zero-diff
(the only exception is a very old Completion/Core/_closequotes, which is
in the archive but not in the checkout). I used Keith Packard's "parsecvs"
(with my changes, some of which are already merged into mainline).
> --
> Peter Stephenson <pws@csr.com> Software Engineer
> CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
> Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
>
>
> To access the latest news from CSR copy this link into a web browser: http://www.csr.com/email_sig.php