From: Wayne Davison <wayned@users.sourceforge.net>
To: Bart Schaefer <schaefer@brasslantern.com>
Cc: zsh-workers@sunsite.dk
Subject: Re: bug in completion/expansion of files with LANG=C
Date: Sun, 8 Jan 2006 00:06:21 -0800
Message-ID: <20060108080621.GA32692@dot.blorf.net>
In-Reply-To: <1060108055620.ZM15382@candle.brasslantern.com>

[-- Attachment #1: Type: text/plain, Size: 1448 bytes --]

On Sun, Jan 08, 2006 at 05:56:20AM +0000, Bart Schaefer wrote:
> I prefer the \M- form because it gives you some clue what you should do
> to generate the equivalent value from the keyboard.

Fair enough -- let's just leave it alone, then.

As for my patch in the grandparent email, I noticed some problems with
it:  the manpage for mbrtowc() says that the state of the mbstate_t
object is undefined after the function returns -1, so the code should
reset it to a known state.  When the function returns -2, it means the
code scanned to the end of the string without finding the end of a wide
character, so perhaps we should treat all the remaining characters as
invalid?  I'm not certain that's the correct thing to do, so I'll leave
the code handling -2 the same way as -1 for now.  Finally, I wasn't
setting the right visible width for the \M-... string (I had mistakenly
hardwired it to "1").
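
A minimal sketch of the error handling described above, outside of zsh: a
loop that resets the mbstate_t after a failed mbrtowc() call and then skips
one byte, treating -2 the same way as -1. The names scan_multibyte() and
struct scan_result are hypothetical and exist only for this illustration;
the real code is in the attached patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical result type: how many characters converted cleanly and
 * how many bytes had to be skipped as unconvertible. */
struct scan_result {
    size_t converted;
    size_t invalid;
};

/* Hypothetical helper (not zsh code): walk len bytes of s. */
static struct scan_result
scan_multibyte(const char *s, size_t len)
{
    struct scan_result r = { 0, 0 };
    mbstate_t ps;
    wchar_t c;

    assert(s != NULL);
    memset(&ps, 0, sizeof ps);          /* start in the initial state */

    while (len > 0) {
        size_t ret = mbrtowc(&c, s, len, &ps);

        if (ret == (size_t)-1 || ret == (size_t)-2) {
            /* After -1 the state is undefined, so reset it.  -2 is
             * handled identically here: consume a single byte. */
            memset(&ps, 0, sizeof ps);
            ret = 1;
            r.invalid++;
        } else {
            /* Converting '\0' returns 0, but it still occupies a byte. */
            if (ret == 0)
                ret = 1;
            r.converted++;
        }
        s += ret;
        len -= ret;
    }
    return r;
}
```

Calling scan_multibyte("a\0b", 3) walks all three bytes, including the
embedded NUL, instead of stopping or looping on the zero return value.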

While twiddling these things I noticed a couple of other things that I
think could be improved:

1. It looks to me like the code in wcs_nicechar() that calls
wcswidth(&c, 1) could really just call wcwidth(c), right?  If not,
what am I missing?

2. The code in mb_niceformat() calls strlen() on the "fmt" string
returned by wcs_nicechar(), but it seems to me that it could just use
the width that wcs_nicechar() returned, right?
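
To illustrate point 1, here is a small sketch comparing the two calls for a
single wide character. The wrapper names width_via_wcswidth() and
width_via_wcwidth() are hypothetical, and the _XOPEN_SOURCE definition is an
assumption about what the C library needs in order to declare these
functions.

```c
/* Assumption: some C libraries only declare wcwidth()/wcswidth() when
 * an X/Open feature-test macro is set before the first include. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <wchar.h>

/* Width of one character, asked for as a string of length 1. */
static int
width_via_wcswidth(wchar_t c)
{
    return wcswidth(&c, 1);
}

/* Width of one character, asked for directly. */
static int
width_via_wcwidth(wchar_t c)
{
    return wcwidth(c);
}
```

For a single character the two should agree: both report 0 for L'\0' and
both report -1 for a non-printable character, so the direct call only
avoids taking the address and passing a length.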

Attached is an updated version of my patch that fixes the aforementioned
bugs and implements the 2 improvements.

..wayne..

[-- Attachment #2: mb_niceformat.patch --]
[-- Type: text/plain, Size: 1992 bytes --]

--- Src/utils.c	15 Dec 2005 14:51:41 -0000	1.108
+++ Src/utils.c	8 Jan 2006 07:55:56 -0000
@@ -375,7 +375,7 @@ wcs_nicechar(wchar_t c, size_t *widthp, 
     }
 
     if (widthp)
-	*widthp = (s - buf) + wcswidth(&c, 1);
+	*widthp = (s - buf) + wcwidth(c);
     if (swidep)
 	*swidep = s;
     for (mbptr = mbstr; ret; s++, mbptr++, ret--) {
@@ -3446,8 +3446,8 @@ niceztrlen(char const *s)
 mod_export size_t
 mb_niceformat(const char *s, FILE *stream, char **outstrp, int heap)
 {
-    size_t l = 0, newl, ret;
-    int umlen, outalloc, outleft;
+    size_t l = 0, outlen, outleft, ret;
+    int umlen, outalloc;
     wchar_t c;
     char *ums, *ptr, *fmt, *outstr, *outptr;
     mbstate_t ps;
@@ -3473,31 +3473,31 @@ mb_niceformat(const char *s, FILE *strea
     while (umlen > 0) {
 	ret = mbrtowc(&c, ptr, umlen, &ps);
 
-	if (ret == (size_t)-1 || ret == (size_t)-2)
-	{
-	    /*
-	     * We're a bit stuck here.  I suppose we could
-	     * just stick with \M-... for the individual bytes.
-	     */
-	    break;
-	}
-	/*
-	 * careful in case converting NULL returned 0: NULLs are real
-	 * characters for us.
-	 */
-	if (c == L'\0' && ret == 0)
+	if (ret != (size_t)-1 && ret != (size_t)-2) {
+	    /* Careful:  converting '\0' returns 0, but a '\0' is a
+	     * real character for us, so we should consume 1 byte. */
+	    if (c == L'\0')
+		ret = 1;
+
+	    fmt = wcs_nicechar(c, &outlen, NULL);
+	} else {
+	    /* Get ps out of its undefined state. */
+	    memset(&ps, 0, sizeof ps);
 	    ret = 1;
+
+	    /* The byte didn't convert, so output it as a \M-... sequence. */
+	    fmt = nicechar(*(unsigned char*)ptr);
+	    outlen = strlen(fmt);
+	}
+
 	umlen -= ret;
 	ptr += ret;
-
-	fmt = wcs_nicechar(c, &newl, NULL);
-	l += newl;
+	l += outlen;
 
 	if (stream)
 	    zputs(fmt, stream);
 	if (outstr) {
 	    /* Append to output string */
-	    int outlen = strlen(fmt);
 	    if (outlen >= outleft) {
 		/* Reallocate to twice the length */
 		int outoffset = outptr - outstr;
