zsh-workers
 help / color / mirror / code / Atom feed
* changing ZLE_CHAR_T?
@ 2005-10-28 22:58 Wayne Davison
  2005-10-29  9:59 ` Peter Stephenson
  0 siblings, 1 reply; 5+ messages in thread
From: Wayne Davison @ 2005-10-28 22:58 UTC (permalink / raw)
  To: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 957 bytes --]

One thing I don't like about the current definition of ZLE_CHAR_T is
that, in the non-multibyte code, you don't end up with a ZLE_STRING_T
if you take the address of a ZLE_CHAR_T variable (due to ZLE_CHAR_T
being an "int" and ZLE_STRING_T being an "unsigned char *").  Since
we have a ZLE_INT_T for those variables that need to be able to hold
a ZLEEOF value, I thought it would be cleaner to change ZLE_CHAR_T.

For my first try, I changed ZLE_CHAR_T into an "unsigned char", which
caused a bunch of problems with comparisons against char constants,
such as "Comma", because the strings in the rest of zsh are based on
"char", not "unsigned char".  I then decided to change ZLE_CHAR_T to
be a "char" and to also change ZLE_STRING_T into a "char *".  This
change worked with just a few other tweaks to the code, and all the
tests pass.

However, I won't commit this until I hear some assent that this is a
desired change.  Attached is the patch.

..wayne..

[-- Attachment #2: char-typedef.patch --]
[-- Type: text/plain, Size: 2576 bytes --]

--- Src/zsh.h	28 Oct 2005 22:14:22 -0000	1.81
+++ Src/zsh.h	28 Oct 2005 22:49:25 -0000
@@ -1435,7 +1435,7 @@ struct histent {
 #ifdef MULTIBYTE_SUPPORT
     wchar_t *zle_text;		/* the edited history line          */
 #else
-    unsigned char *zle_text;	/* the edited history line          */
+    char *zle_text;		/* the edited history line          */
 #endif
     int zle_len;		/* length of zle_text */
     time_t stim;		/* command started time (datestamp) */
--- Src/Zle/zle.h	28 Oct 2005 17:34:33 -0000	1.21
+++ Src/Zle/zle.h	28 Oct 2005 22:49:25 -0000
@@ -84,14 +84,14 @@ typedef wint_t   ZLE_INT_T;
 
 #else  /* Not MULTIBYTE_SUPPORT: old single-byte code */
 
-typedef int ZLE_CHAR_T;
-typedef unsigned char *ZLE_STRING_T;
+typedef char ZLE_CHAR_T;
+typedef char *ZLE_STRING_T;
 typedef int ZLE_INT_T;
-#define ZLE_CHAR_SIZE	sizeof(unsigned char)
+#define ZLE_CHAR_SIZE	sizeof(char)
 
 /* Leave character or string as is, but string must be unsigned char * */
 #define ZWC(c)	c
-#define ZWS(s)	(unsigned char *)s
+#define ZWS(s)	(char *)s
 
 #define ZLEEOF	EOF
 
--- Src/Zle/zle_misc.c	28 Oct 2005 17:34:33 -0000	1.30
+++ Src/Zle/zle_misc.c	28 Oct 2005 22:49:25 -0000
@@ -60,18 +60,14 @@ doinsert(ZLE_STRING_T zstr, int len)
 mod_export int
 selfinsert(UNUSED(char **args))
 {
-#ifdef MULTIBYTE_SUPPORT
-    /* wint_t and wchar_t not neccessarily the same size */
-    wchar_t tmp;
+    ZLE_CHAR_T tmp;
 
+#ifdef MULTIBYTE_SUPPORT
     if (!lastchar_wide_valid)
 	getrestchar(lastchar);
-    tmp = lastchar_wide;
-    doinsert(&tmp, 1);
-#else
-    unsigned char s = lastchar;
-    doinsert(&s, 1);
 #endif
+    tmp = LASTFULLCHAR;
+    doinsert(&tmp, 1);
     return 0;
 }
 
@@ -89,7 +85,7 @@ fixunmeta(void)
      * selfinsertunmeta is intrinsically problematic
      * with multibyte input.
      */
-    lastchar_wide = (ZLE_CHAR_T)lastchar;
+    lastchar_wide = (ZLE_INT_T)lastchar;
     lastchar_wide_valid = 1;
 #endif
 }
@@ -1098,7 +1094,7 @@ makesuffixstr(char *f, char *s, int n)
 
 /**/
 mod_export void
-iremovesuffix(ZLE_CHAR_T c, int keep)
+iremovesuffix(ZLE_INT_T c, int keep)
 {
     if (suffixfunc) {
 	Eprog prog = getshfunc(suffixfunc);
--- Src/Zle/zle_utils.c	28 Oct 2005 17:34:33 -0000	1.30
+++ Src/Zle/zle_utils.c	28 Oct 2005 22:49:25 -0000
@@ -170,7 +170,7 @@ zlelineasstring(ZLE_STRING_T instr, int 
 #ifdef MULTIBYTE_SUPPORT
 	unsigned char *strp = (unsigned char *)s;
 #else
-	unsigned char *strp = instr;
+	unsigned char *strp = (unsigned char *)instr;
 #endif
 	unsigned char *stopcs = strp + outcs;
 	unsigned char *stopll = strp + outll;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: changing ZLE_CHAR_T?
  2005-10-28 22:58 changing ZLE_CHAR_T? Wayne Davison
@ 2005-10-29  9:59 ` Peter Stephenson
  2005-10-31 21:45   ` Wayne Davison
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2005-10-29  9:59 UTC (permalink / raw)
  To: Zsh hackers list

Wayne Davison wrote:
> One thing I don't like about the current definition of ZLE_CHAR_T is
> that, in the non-multibyte code, you don't end up with a ZLE_STRING_T
> if you take the address of a ZLE_CHAR_T variable (due to ZLE_CHAR_T
> being an "int" and ZLE_STRING_T being an "unsigned char *").  Since
> we have a ZLE_INT_T for those variables that need to be able to hold
> a ZLEEOF value, I thought it would be cleaner to change ZLE_CHAR_T.

This is potentially a good idea, but you will need to be careful since
those "int" definitions for the variables now marked as ZLE_CHAR_T have
been around a very long time and it's not necessarily clear if they will
hold an EOF or not, though if they do they should certainly now be
ZLE_INT_T.  The distinction is quite recent and not necessarily
consistently applied throughout.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page still at http://www.pwstephenson.fsnet.co.uk/


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: changing ZLE_CHAR_T?
  2005-10-29  9:59 ` Peter Stephenson
@ 2005-10-31 21:45   ` Wayne Davison
  2005-11-01  4:31     ` Wayne Davison
  0 siblings, 1 reply; 5+ messages in thread
From: Wayne Davison @ 2005-10-31 21:45 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 2288 bytes --]

On Sat, Oct 29, 2005 at 10:59:00AM +0100, Peter Stephenson wrote:
> you will need to be careful since those "int" definitions for the
> variables now marked as ZLE_CHAR_T have been around a very long time
> and it's not necessarily clear if they will hold an EOF or not, though
> if they do they should certainly now be ZLE_INT_T.

I looked at all the ZLE_CHAR_T variables while making my patch in order
to check for size differences (which is why a couple spots got changed
to use ZLE_INT_T).

One other potential problem area is less-than and greater-than
comparisons of character values now that the char may be signed.  I just
looked through the code to check for problems, and while I didn't see
any bugs, I did see a couple things that could be improved (such as
using idigit() instead of doing a range-comparison between '0' & '9').

Yet another potential problem area is calling functions such as
islower() with a signed character.  I looked through the code, and found
quite a few problems not related to my patch, including the calling of
some non-wide isTYPE() functions on wide characters in the zleline
array.  Another thing I noticed was that the wide version of ZC_iblank()
was calling iswspace(), which would return true for a newline, unlike
the normal iblank(), which does not (that is reserved for inblank()).

So, attached is a patch that fixes the problems I discovered:

I decided to add some new iTYPE() macros, namely iascii(), ilower(),
iprint(), and iupper().  I didn't modify inittyptab() to try to define
these values (for a number of reasons).  Instead, they ensure that
STOUC() gets called on the arg before we call the system's version of
these functions (aside: we could switch some of the existing macros over
to this method, such as idigit(), and free up some bits in the typtab[]
array).

I also added any missing ZC_iTYPE macros in zle.h, just so they'd all be
there.  The wide-char version of ZC_iblank now calls a new function,
wcsiblank(), that properly excludes '\n' from returning true.  However,
both ZC_iblank and ZC_inblank still make use of iswspace(), so they
return true for a carriage return ('\r') and a vertical tab ('\v') where
iblank and inblank do not.  I didn't try to change that one way or the
other.

Comments welcomed.

..wayne..

[-- Attachment #2: isTYPE.patch --]
[-- Type: text/plain, Size: 10868 bytes --]

--- Src/utils.c	28 Oct 2005 17:34:33 -0000	1.101
+++ Src/utils.c	31 Oct 2005 21:22:57 -0000
@@ -2619,6 +2619,16 @@ inittyptab(void)
 
 
 #ifdef MULTIBYTE_SUPPORT
+/* A wide-character version of the iblank() macro. */
+/**/
+mod_export int
+wcsiblank(wint_t wc)
+{
+    if (iswspace(wc) && wc != L'\n')
+	return 1;
+    return 0;
+}
+
 /*
  * iword() macro extended to support wide characters.
  */
@@ -2641,7 +2651,7 @@ wcsiword(wchar_t c)
     if (len == 0) {
 	/* NULL is special */
 	return iword(0);
-    } else if (len == 1 && isascii(*outstr)) {
+    } else if (len == 1 && iascii(*outstr)) {
 	return iword(*outstr);
     } else {
 	return iswalnum(c) || wcschr(wordchars_wide, c);
@@ -2673,7 +2683,7 @@ wcsiident(wchar_t c)
     if (len == 0) {
 	/* NULL is special */
 	return 0;
-    } else if (len == 1 && isascii(*outstr)) {
+    } else if (len == 1 && iascii(*outstr)) {
 	return iident(*outstr);
     } else {
 	/* not currently allowed, see above */
--- Src/ztype.h	13 Oct 2005 16:30:14 -0000	1.2
+++ Src/ztype.h	31 Oct 2005 21:22:57 -0000
@@ -58,3 +58,8 @@
 #define imeta(X) _icom(X,IMETA)
 #define iwsep(X) _icom(X,IWSEP)
 #define inull(X) _icom(X,INULL)
+
+#define iascii(X) isascii(STOUC(X))
+#define ilower(X) islower(STOUC(X))
+#define iprint(X) isprint(STOUC(X))
+#define iupper(X) isupper(STOUC(X))
--- Src/Modules/zftp.c	6 Feb 2005 20:36:45 -0000	1.37
+++ Src/Modules/zftp.c	31 Oct 2005 21:22:58 -0000
@@ -713,8 +713,7 @@ zfgetmsg(void)
 
     zfgetline(line, 256, tmout);
     ptr = line;
-    if (zfdrrrring || !isdigit(STOUC(*ptr)) || !isdigit(STOUC(ptr[1])) || 
-	!isdigit(STOUC(ptr[2]))) {
+    if (zfdrrrring || !idigit(*ptr) || !idigit(ptr[1]) || !idigit(ptr[2])) {
 	/* timeout, or not talking FTP.  not really interested. */
 	zcfinish = 2;
 	if (!zfclosing)
@@ -914,7 +913,7 @@ zfopendata(char *name, union tcp_sockadd
 		goto bad_epsv;
 	    while(ptr != end && *ptr == '0')
 		ptr++;
-	    if(ptr == end || (end-ptr) > 5 || !isdigit(STOUC(*ptr)))
+	    if(ptr == end || (end-ptr) > 5 || !idigit(*ptr))
 		goto bad_epsv;
 	    memcpy(portbuf, ptr, (end-ptr));
 	    portbuf[end-ptr] = 0;
@@ -937,7 +936,7 @@ zfopendata(char *name, union tcp_sockadd
 	     * lastmsg already has the reply code expunged.
 	     */
 	    for (ptr = lastmsg; *ptr; ptr++)
-		if (isdigit(STOUC(*ptr)))
+		if (idigit(*ptr))
 		    break;
 	    if (sscanf(ptr, "%d,%d,%d,%d,%d,%d",
 		       nums, nums+1, nums+2, nums+3, nums+4, nums+5) != 6) {
@@ -1103,11 +1102,11 @@ zfgetdata(char *name, char *rest, char *
 	char *ptr = strstr(lastmsg, "bytes");
 	zfstatusp[zfsessno] |= ZFST_NOSZ|ZFST_TRSZ;
 	if (ptr) {
-	    while (ptr > lastmsg && !isdigit(STOUC(*ptr)))
+	    while (ptr > lastmsg && !idigit(*ptr))
 		ptr--;
-	    while (ptr > lastmsg && isdigit(STOUC(ptr[-1])))
+	    while (ptr > lastmsg && idigit(ptr[-1]))
 		ptr--;
-	    if (isdigit(STOUC(*ptr))) {
+	    if (idigit(*ptr)) {
 		zfstatusp[zfsessno] &= ~ZFST_NOSZ;
 		if (getsize) {
 		    off_t sz = zstrtol(ptr, NULL, 10);
--- Src/Modules/zselect.c	2 Jun 2004 22:15:01 -0000	1.6
+++ Src/Modules/zselect.c	31 Oct 2005 21:22:58 -0000
@@ -42,7 +42,7 @@ handle_digits(char *nam, char *argptr, f
     int fd;
     char *endptr;
 
-    if (!isdigit(STOUC(*argptr))) {
+    if (!idigit(*argptr)) {
 	zwarnnam(nam, "expecting file descriptor: %s", argptr, 0);
 	return 1;
     }
--- Src/Zle/zle.h	28 Oct 2005 17:34:33 -0000	1.21
+++ Src/Zle/zle.h	31 Oct 2005 21:22:58 -0000
@@ -70,13 +70,18 @@ typedef wint_t   ZLE_INT_T;
 #define ZMB_nicewidth(s)	mb_niceformat(s, NULL, NULL, 0)
 
 /* Functions that operate on ZLE_CHAR_T. */
-#define ZC_iblank iswspace
+#define ZC_ialpha iswalpha
+#define ZC_iblank wcsiblank
 #define ZC_icntrl iswcntrl
+#define ZC_idigit iswdigit
 #define ZC_iident wcsiident
+#define ZC_ilower iswlower
+#define ZC_inblank iswspace
+#define ZC_iupper iswupper
+#define ZC_iword wcsiword
 
 #define ZC_tolower towlower
 #define ZC_toupper towupper
-#define ZC_iword  wcsiword
 
 #define ZC_nicechar(c) wcs_nicechar(c, NULL, NULL)
 
@@ -129,13 +134,18 @@ static inline int ZS_strncmp(ZLE_STRING_
 #endif
 
 /* Functions that operate on ZLE_CHAR_T. */
+#define ZC_ialpha ialpha
 #define ZC_iblank iblank
 #define ZC_icntrl icntrl
+#define ZC_idigit idigit
 #define ZC_iident iident
+#define ZC_ilower ilower
+#define ZC_inblank inblank
+#define ZC_iupper iupper
+#define ZC_iword iword
 
 #define ZC_tolower tulower
 #define ZC_toupper tuupper
-#define ZC_iword   iword
 
 #define LASTFULLCHAR	lastchar
 
--- Src/Zle/zle_move.c	31 Oct 2005 21:19:30 -0000	1.7
+++ Src/Zle/zle_move.c	31 Oct 2005 21:22:58 -0000
@@ -459,7 +459,7 @@ int
 vifirstnonblank(UNUSED(char **args))
 {
     zlecs = findbol();
-    while (zlecs != zlell && iblank(zleline[zlecs]))
+    while (zlecs != zlell && ZC_iblank(zleline[zlecs]))
 	zlecs++;
     return 0;
 }
--- Src/Zle/zle_vi.c	28 Oct 2005 17:34:33 -0000	1.12
+++ Src/Zle/zle_vi.c	31 Oct 2005 21:22:59 -0000
@@ -487,7 +487,8 @@ vireplace(UNUSED(char **args))
 int
 vireplacechars(UNUSED(char **args))
 {
-    int ch, n = zmult;
+    ZLE_INT_T ch;
+    int n = zmult;
 
     startvichange(1);
     /* check argument range */
@@ -507,7 +508,7 @@ vireplacechars(UNUSED(char **args))
 	return 1;
     }
     /* do change */
-    if (ch == '\r' || ch == '\n') {
+    if (ch == ZWC('\r') || ch == ZWC('\n')) {
 	/* <return> handled specially */
 	zlecs += n - 1;
 	backkill(n - 1, 0);
@@ -570,9 +571,9 @@ vioperswapcase(UNUSED(char **args))
 	oldcs = zlecs;
 	/* swap the case of all letters within range */
 	while (zlecs < c2) {
-	    if (islower(zleline[zlecs]))
+	    if (ZC_ilower(zleline[zlecs]))
 		zleline[zlecs] = ZC_toupper(zleline[zlecs]);
-	    else if (isupper(zleline[zlecs]))
+	    else if (ZC_iupper(zleline[zlecs]))
 		zleline[zlecs] = ZC_tolower(zleline[zlecs]);
 	    zlecs++;
 	}
@@ -788,9 +789,9 @@ vijoin(UNUSED(char **args))
     if ((x = findeol()) == zlell)
 	return 1;
     zlecs = x + 1;
-    for (x = 1; zlecs != zlell && iblank(zleline[zlecs]); zlecs++, x++);
+    for (x = 1; zlecs != zlell && ZC_iblank(zleline[zlecs]); zlecs++, x++);
     backdel(x);
-    if (zlecs && iblank(zleline[zlecs-1]))
+    if (zlecs && ZC_iblank(zleline[zlecs-1]))
 	zlecs--;
     else {
 	spaceinline(1);
@@ -810,9 +811,9 @@ viswapcase(UNUSED(char **args))
 	return 1;
     eol = findeol();
     while (zlecs < eol && n--) {
-	if (islower(zleline[zlecs]))
+	if (ZC_ilower(zleline[zlecs]))
 	    zleline[zlecs] = ZC_toupper(zleline[zlecs]);
-	else if (isupper(zleline[zlecs]))
+	else if (ZC_iupper(zleline[zlecs]))
 	    zleline[zlecs] = ZC_tolower(zleline[zlecs]);
 	zlecs++;
     }
@@ -830,11 +831,7 @@ vicapslockpanic(UNUSED(char **args))
     statusline = ZWS("press a lowercase key to continue");
     statusll = ZS_strlen(statusline);
     zrefresh();
-#ifdef MULTIBYTE_SUPPORT
-    while (!iswlower(getfullchar(0)));
-#else
-    while (!islower(getfullchar(0)));
-#endif
+    while (!ZC_ilower(getfullchar(0)));
     statusline = NULL;
     return 0;
 }
--- Src/Zle/zle_word.c	9 Sep 2005 20:34:43 -0000	1.6
+++ Src/Zle/zle_word.c	31 Oct 2005 21:22:59 -0000
@@ -72,11 +72,11 @@ viforwardword(char **args)
 	    while (zlecs != zlell && iident(zleline[zlecs]))
 		zlecs++;
 	else
-	    while (zlecs != zlell && !iident(zleline[zlecs]) && !iblank(zleline[zlecs]))
+	    while (zlecs != zlell && !iident(zleline[zlecs]) && !ZC_iblank(zleline[zlecs]))
 		zlecs++;
 	if (wordflag && !n)
 	    return 0;
-	while (zlecs != zlell && (iblank(zleline[zlecs]) || zleline[zlecs] == '\n'))
+	while (zlecs != zlell && ZC_inblank(zleline[zlecs]))
 	    zlecs++;
     }
     return 0;
@@ -96,11 +96,11 @@ viforwardblankword(char **args)
 	return ret;
     }
     while (n--) {
-	while (zlecs != zlell && !iblank(zleline[zlecs]))
+	while (zlecs != zlell && !ZC_iblank(zleline[zlecs]))
 	    zlecs++;
 	if (wordflag && !n)
 	    return 0;
-	while (zlecs != zlell && iblank(zleline[zlecs]))
+	while (zlecs != zlell && ZC_iblank(zleline[zlecs]))
 	    zlecs++;
     }
     return 0;
@@ -139,9 +139,9 @@ viforwardblankwordend(UNUSED(char **args
     if (n < 0)
 	return 1;
     while (n--) {
-	while (zlecs != zlell && iblank(zleline[zlecs + 1]))
+	while (zlecs != zlell && ZC_iblank(zleline[zlecs + 1]))
 	    zlecs++;
-	while (zlecs != zlell && !iblank(zleline[zlecs + 1]))
+	while (zlecs != zlell && !ZC_iblank(zleline[zlecs + 1]))
 	    zlecs++;
     }
     if (zlecs != zlell && virangeflag)
@@ -163,14 +163,14 @@ viforwardwordend(char **args)
 	return ret;
     }
     while (n--) {
-	if (iblank(zleline[zlecs + 1]))
-	    while (zlecs != zlell && iblank(zleline[zlecs + 1]))
+	if (ZC_iblank(zleline[zlecs + 1]))
+	    while (zlecs != zlell && ZC_iblank(zleline[zlecs + 1]))
 		zlecs++;
 	if (iident(zleline[zlecs + 1]))
 	    while (zlecs != zlell && iident(zleline[zlecs + 1]))
 		zlecs++;
 	else
-	    while (zlecs != zlell && !iident(zleline[zlecs + 1]) && !iblank(zleline[zlecs + 1]))
+	    while (zlecs != zlell && !iident(zleline[zlecs + 1]) && !ZC_iblank(zleline[zlecs + 1]))
 		zlecs++;
     }
     if (zlecs != zlell && virangeflag)
@@ -214,13 +214,13 @@ vibackwardword(char **args)
 	return ret;
     }
     while (n--) {
-	while (zlecs && iblank(zleline[zlecs - 1]))
+	while (zlecs && ZC_iblank(zleline[zlecs - 1]))
 	    zlecs--;
 	if (iident(zleline[zlecs - 1]))
 	    while (zlecs && iident(zleline[zlecs - 1]))
 		zlecs--;
 	else
-	    while (zlecs && !iident(zleline[zlecs - 1]) && !iblank(zleline[zlecs - 1]))
+	    while (zlecs && !iident(zleline[zlecs - 1]) && !ZC_iblank(zleline[zlecs - 1]))
 		zlecs--;
     }
     return 0;
@@ -240,9 +240,9 @@ vibackwardblankword(char **args)
 	return ret;
     }
     while (n--) {
-	while (zlecs && iblank(zleline[zlecs - 1]))
+	while (zlecs && ZC_iblank(zleline[zlecs - 1]))
 	    zlecs--;
-	while (zlecs && !iblank(zleline[zlecs - 1]))
+	while (zlecs && !ZC_iblank(zleline[zlecs - 1]))
 	    zlecs--;
     }
     return 0;
@@ -304,13 +304,13 @@ vibackwardkillword(UNUSED(char **args))
 	return 1;
 /* this taken from "vibackwardword" */
     while (n--) {
-	while ((x > lim) && iblank(zleline[x - 1]))
+	while ((x > lim) && ZC_iblank(zleline[x - 1]))
 	    x--;
 	if (iident(zleline[x - 1]))
 	    while ((x > lim) && iident(zleline[x - 1]))
 		x--;
 	else
-	    while ((x > lim) && !iident(zleline[x - 1]) && !iblank(zleline[x - 1]))
+	    while ((x > lim) && !iident(zleline[x - 1]) && !ZC_iblank(zleline[x - 1]))
 		x--;
     }
     backkill(zlecs - x, 1);
@@ -398,7 +398,7 @@ capitalizeword(UNUSED(char **args))
 	first = 1;
 	while (zlecs != zlell && !ZC_iword(zleline[zlecs]))
 	    zlecs++;
-	while (zlecs != zlell && ZC_iword(zleline[zlecs]) && !isalpha(zleline[zlecs]))
+	while (zlecs != zlell && ZC_iword(zleline[zlecs]) && !ZC_ialpha(zleline[zlecs]))
 	    zlecs++;
 	while (zlecs != zlell && ZC_iword(zleline[zlecs])) {
 	    zleline[zlecs] = (first) ? ZC_toupper(zleline[zlecs]) :

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: changing ZLE_CHAR_T?
  2005-10-31 21:45   ` Wayne Davison
@ 2005-11-01  4:31     ` Wayne Davison
  2005-11-01 10:20       ` Peter Stephenson
  0 siblings, 1 reply; 5+ messages in thread
From: Wayne Davison @ 2005-11-01  4:31 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]

FYI, I went ahead and committed the ZLE_CHAR_T changes with a few
extra minor modifications from the posted patch.

On Mon, Oct 31, 2005 at 01:45:38PM -0800, Wayne Davison wrote:
> (aside: we could switch some of the existing macros over to this
> method, such as idigit(), and free up some bits in the typtab[]
> array).

It also allows the set of characters for things like isalpha() to
flex with the OS's definition.

Attached is a diff that changes idigit(), ialpha(), and ialnum().
I almost changed icntrl() too, but I noticed that zsh is currently
treating chars 128 - 159 as control characters (something that the
normal iscntrl() function does not do), so I left it alone for now.
This does point out an inconsistency in multibyte mode: the ZC_icntrl()
macro (which currently uses iswcntrl()) will not return true for these
extended chars.  Do we need this?

In an attempt to make multibyte ZC_iblank() and ZC_inblank() work the
same as the non-multibyte versions, this patch chooses to extend the
non-multibyte versions of the functions by giving \r and \v the IBLANK
and INBLANK bits.

I'm not going to commit this, so feel free to let me know if this is
not a good way to go.

..wayne..

[-- Attachment #2: typtab.patch --]
[-- Type: text/plain, Size: 2574 bytes --]

--- Src/utils.c	1 Nov 2005 02:50:24 -0000	1.102
+++ Src/utils.c	1 Nov 2005 04:06:03 -0000
@@ -2534,9 +2534,9 @@ inittyptab(void)
 	typtab[t0] = typtab[t0 + 128] = ICNTRL;
     typtab[127] = ICNTRL;
     for (t0 = '0'; t0 <= '9'; t0++)
-	typtab[t0] = IDIGIT | IALNUM | IWORD | IIDENT | IUSER;
+	typtab[t0] = IIDENT | IUSER | IWORD;
     for (t0 = 'a'; t0 <= 'z'; t0++)
-	typtab[t0] = typtab[t0 - 'a' + 'A'] = IALPHA | IALNUM | IIDENT | IUSER | IWORD;
+	typtab[t0] = typtab[t0 - 'a' + 'A'] = IIDENT | IUSER | IWORD;
 #ifndef MULTIBYTE_SUPPORT
     /*
      * This really doesn't seem to me the right thing to do when
@@ -2547,12 +2547,14 @@ inittyptab(void)
      * should disappear into history...
      */
     for (t0 = 0240; t0 != 0400; t0++)
-	typtab[t0] = IALPHA | IALNUM | IIDENT | IUSER | IWORD;
+	typtab[t0] = IIDENT | IUSER | IWORD;
 #endif
     typtab['_'] = IIDENT | IUSER;
     typtab['-'] = IUSER;
     typtab[' '] |= IBLANK | INBLANK;
     typtab['\t'] |= IBLANK | INBLANK;
+    typtab['\r'] |= IBLANK | INBLANK;
+    typtab['\v'] |= IBLANK | INBLANK;
     typtab['\n'] |= INBLANK;
     typtab['\0'] |= IMETA;
     typtab[STOUC(Meta)  ] |= IMETA;
--- Src/ztype.h	1 Nov 2005 02:50:22 -0000	1.3
+++ Src/ztype.h	1 Nov 2005 04:06:03 -0000
@@ -27,13 +27,13 @@
  *
  */
 
-#define IDIGIT   (1 <<  0)
-#define IALNUM   (1 <<  1)
+#define UNUSED0  (1 <<  0)
+#define UNUSED1  (1 <<  1)
 #define IBLANK   (1 <<  2)
 #define INBLANK  (1 <<  3)
 #define ITOK     (1 <<  4)
 #define ISEP     (1 <<  5)
-#define IALPHA   (1 <<  6)
+#define UNUSED6  (1 <<  6)
 #define IIDENT   (1 <<  7)
 #define IUSER    (1 <<  8)
 #define ICNTRL   (1 <<  9)
@@ -42,14 +42,12 @@
 #define IMETA    (1 << 12)
 #define IWSEP    (1 << 13)
 #define INULL    (1 << 14)
+#define UNUSED15 (1 << 15)
 #define _icom(X,Y) (typtab[STOUC(X)] & Y)
-#define idigit(X) _icom(X,IDIGIT)
-#define ialnum(X) _icom(X,IALNUM)
 #define iblank(X) _icom(X,IBLANK)	/* blank, not including \n */
 #define inblank(X) _icom(X,INBLANK)	/* blank or \n */
 #define itok(X) _icom(X,ITOK)
 #define isep(X) _icom(X,ISEP)
-#define ialpha(X) _icom(X,IALPHA)
 #define iident(X) _icom(X,IIDENT)
 #define iuser(X) _icom(X,IUSER)	/* username char */
 #define icntrl(X) _icom(X,ICNTRL)
@@ -59,7 +57,10 @@
 #define iwsep(X) _icom(X,IWSEP)
 #define inull(X) _icom(X,INULL)
 
+#define ialnum(X) isalnum(STOUC(X))
+#define ialpha(X) isalpha(STOUC(X))
 #define iascii(X) isascii(STOUC(X))
+#define idigit(X) isdigit(STOUC(X))
 #define ilower(X) islower(STOUC(X))
 #define iprint(X) isprint(STOUC(X))
 #define iupper(X) isupper(STOUC(X))

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: changing ZLE_CHAR_T?
  2005-11-01  4:31     ` Wayne Davison
@ 2005-11-01 10:20       ` Peter Stephenson
  0 siblings, 0 replies; 5+ messages in thread
From: Peter Stephenson @ 2005-11-01 10:20 UTC (permalink / raw)
  To: zsh-workers

Wayne Davison <wayned@users.sourceforge.net> wrote:
> I almost changed icntrl() too, but I noticed that zsh is currently
> treating chars 128 - 159 as control characters (something that the
> normal iscntrl() function does not do), so I left it alone for now.
> This does point out an inconsistency in multibyte mode: the ZC_icntrl()
> macro (which currently uses iswcntrl()) will not return true for these
> extended chars.  Do we need this?

I think we probably just need to let the old logic largely do what it
always did to avoid surprises (if the same thing can be done better by
making the code more consistent that's fine), and have the new code use the
system macros wherever possible.

The old code made wild guesses on the assumption that it couldn't rely on
system tests, which was probably true when it was originally written.
Characters 160 on are marked as alphanumeric for the same reason.  I don't
see any point being too clever with it now; it might as well stay
consistent with the way it's always been.

Now we have the ability to do things properly in the new code we should do
that wherever possible.  Consistency with the old logic isn't really a big
issue for once---proper multibyte support is bound to do a lot of things
differently.

We will gradually want to extend the use of iswalnum() etc, which might
well mean the current set of i-macros is gradually phased out of the
multibyte version.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


This message has been scanned for viruses by BlackSpider MailControl - www.blackspider.com


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2005-11-01 10:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-10-28 22:58 changing ZLE_CHAR_T? Wayne Davison
2005-10-29  9:59 ` Peter Stephenson
2005-10-31 21:45   ` Wayne Davison
2005-11-01  4:31     ` Wayne Davison
2005-11-01 10:20       ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).