zsh-workers
* PATCH: autoconf test for multibyte support
@ 2006-08-03 18:06 Peter Stephenson
  2006-08-04 14:50 ` Peter Stephenson
  0 siblings, 1 reply; 2+ messages in thread
From: Peter Stephenson @ 2006-08-03 18:06 UTC (permalink / raw)
  To: Zsh hackers list

This should do a better job of working out whether multibyte support can
be enabled, by looking for all the functions we use.  I may have missed
some; see if the list of functions jogs any recollections.

If this works I will need to change some of the installation
documentation.
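
As a rough illustration of the approach (not part of the patch itself):
the new test loops over the function names, records any that the probe
cannot find, and enables multibyte support only if that record stays
empty.  The plain-shell sketch here mimics that logic; `probe` and
`$available` are hypothetical stand-ins for AC_CHECK_FUNC and for
whatever the C library actually provides.

```shell
# Hypothetical stand-in for AC_CHECK_FUNC: succeeds if the name
# appears in the space-separated list $available.
probe() {
  case " $available " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the names that fail the probe, each preceded by a space,
# accumulated the same way the patch builds zfuncs_absent.
missing_funcs() {
  absent=
  for f in "$@"; do
    probe "$f" || absent="$absent $f"
  done
  printf '%s\n' "$absent"
}

# Print "yes" only if nothing is missing, otherwise "no".
decide() {
  if test x"$(missing_funcs "$@")" = x; then
    echo yes
  else
    echo no
  fi
}
```

For instance, with available="mbrtowc wcrtomb wcwidth", calling
`decide mbrtowc wcwidth` reports yes, while `decide mbrtowc wmemset`
reports no because wmemset is recorded as absent.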

RCS file: /cvsroot/zsh/zsh/configure.ac,v
retrieving revision 1.55
diff -u -r1.55 configure.ac
--- configure.ac	19 Jun 2006 10:55:26 -0000	1.55
+++ configure.ac	3 Aug 2006 17:51:52 -0000
@@ -1122,7 +1122,7 @@
 	       pcre_compile pcre_study pcre_exec \
 	       nl_langinfo \
 	       erand48 open_memstream \
-	       wctomb mbrtowc wcrtomb iconv \
+	       wctomb iconv \
 	       grantpt unlockpt ptsname \
 	       htons ntohs)
 AC_FUNC_STRCOLL
@@ -2079,33 +2079,34 @@
    fi
 fi
 
-dnl ---------------------
-dnl multibyte ZLE support
-dnl ---------------------
+dnl -----------------
+dnl multibyte support
+dnl -----------------
 AC_ARG_ENABLE(multibyte,
-AC_HELP_STRING([--enable-multibyte], [support multibyte chars in the zsh line editor]),
-[zsh_cv_c_zle_unicode_support=$enableval],
-[AC_CACHE_CHECK(if the system adequately supports multibyte chars,
- zsh_cv_c_zle_unicode_support,
-  [AC_TRY_COMPILE([
-#ifdef HAVE_LOCALE_H
-# include <locale.h>
-#endif
-   ], [
-#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) \
- && defined(HAVE_MBRTOWC) && defined(HAVE_WCRTOMB) \
- && defined (__STDC_ISO_10646__)
-    /* All is well */
-#else
-# error Not supported.
-#endif
-  ],
-    zsh_cv_c_zle_unicode_support=yes,
-    zsh_cv_c_zle_unicode_support=no)])
+AC_HELP_STRING([--enable-multibyte], [support multibyte characters]),
+[zsh_cv_c_unicode_support=$enableval],
+[AC_CACHE_VAL(zsh_cv_c_unicode_support,
+  AC_MSG_NOTICE([checking for functions supporting multibyte characters])
+  [zfuncs_absent=
+   for zfunc in iswalnum iswcntrl iswdigit iswgraph iswlower iswprint \
+iswpunct iswspace iswupper iswxdigit mbrlen mbrtowc towupper towlower \
+wcschr wcscpy wcslen wcsncmp wcsncpy wcrtomb wcwidth wmemchr wmemcmp \
+wmemcpy wmemmove wmemset; do
+     AC_CHECK_FUNC($zfunc,
+     [:], [zfuncs_absent="$zfuncs_absent $zfunc"])
+    done
+    if test x"$zfuncs_absent" = x; then
+      AC_MSG_NOTICE([all functions found, multibyte support enabled])
+      zsh_cv_c_unicode_support=yes
+    else
+      AC_MSG_NOTICE([missing functions, multibyte support disabled])
+      zsh_cv_c_unicode_support=no
+    fi
+  ])
 ])
 AH_TEMPLATE([MULTIBYTE_SUPPORT],
 [Define to 1 if you want support for multibyte character sets.])
-if test x$zsh_cv_c_zle_unicode_support = xyes; then
+if test x$zsh_cv_c_unicode_support = xyes; then
   AC_DEFINE(MULTIBYTE_SUPPORT)
 fi
 

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


To access the latest news from CSR copy this link into a web browser:  http://www.csr.com/email_sig.php



* Re: PATCH: autoconf test for multibyte support
  2006-08-03 18:06 PATCH: autoconf test for multibyte support Peter Stephenson
@ 2006-08-04 14:50 ` Peter Stephenson
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Stephenson @ 2006-08-04 14:50 UTC (permalink / raw)
  To: zsh-workers

Peter Stephenson <pws@csr.com> wrote:
> If this works I will need to change some of the installation
> documentation.

This changes some documentation.

I'm only guessing that it works on Cygwin; all I know is that it compiles
with the same code that works everywhere else.
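
One way to confirm what configure decided is to look for the
MULTIBYTE_SUPPORT definition in the generated header.  A minimal sketch,
assuming the header is called config.h and that AC_DEFINE writes the
usual `#define MULTIBYTE_SUPPORT 1` line; the helper name
multibyte_enabled is made up for this example.

```shell
# Succeed if the given configuration header defines MULTIBYTE_SUPPORT.
# $1: path to the header (config.h is an assumed name).
multibyte_enabled() {
  grep -q '^#define MULTIBYTE_SUPPORT 1' "$1"
}

# Possible use in the build directory after running configure:
#   ./configure                       # let the function probe decide
#   ./configure --disable-multibyte   # force the support off
#   multibyte_enabled config.h && echo "multibyte support compiled in"
```

When the support is off, autoconf normally leaves a commented-out
`#undef` line in the header instead, so the helper fails in that case.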

Index: INSTALL
===================================================================
RCS file: /cvsroot/zsh/zsh/INSTALL,v
retrieving revision 1.25
diff -u -r1.25 INSTALL
--- INSTALL	16 Feb 2006 14:28:54 -0000	1.25
+++ INSTALL	4 Aug 2006 14:44:06 -0000
@@ -264,37 +264,32 @@
 ---------------------------
 
 Support for multibyte character sets that extend ASCII, such as UTF-8, is
-under development but the code in the line editor is sufficiently stable to
-be turned on by default in environments that provide full ISO 10646 support
-including the preprocessor definition __STDC_ISO_10646__.  In principle
-this definition does not guarantee the full environment, but in practice
-systems with this defined also provide suitable library support.  The shell
-does not probe for all the features, so on other systems use of multibyte
-support must be explicitly enabled when it is available.
+now reasonably close to complete, except that combining characters are not
+handled properly (some assistance with this problem would be appreciated).
+The configuration script should turn on multibyte support on all systems
+where it can be compiled successfully.
 
 The support can be explicitly enabled or disable with --enable-multibyte or
---disable-multibyte.  Reports of systems where multibyte support was not
-enabled by default but --enable-multibyte resulted in a usable shell would
-be appreciated.  The developers are not aware of any need to use
+--disable-multibyte.  The developers are not aware of any need to use
 --disable-multibyte and this should be reported as a bug.  Currently
-multibyte mode is believed to work automatically on:
+multibyte mode is believed to work on at least the following:
 
   - All(?) current GNU/Linux distributions
-
-and to work when configured with --enable-multibyte on:
-
   - OS X 10.4.3 (problems have been reported with multibyte characters
     in HFS file names)
   - NetBSD 2.0.2
   - Solaris 8+ (inputting multibyte characters from the keyboard doesn't
     work in some installations).
+  - Cygwin (though use of multibyte characters is somewhat non-standard).
 
-The main shell is not yet aware of multibyte characters, so for example the
-length of a scalar parameter will return the number of bytes, not
-characters, and pattern tests likewise treat single bytes as if they were
-characters.  This means that pattern tests such as ? and [[:alpha:]] do not
-work correctly with characters in multibyte character sets beyond the ASCII
-subset.
+The corresponding shell option MULTIBYTE is now on by default in all
+emulation modes when multibyte support is enabled.  Turning it off is not
+recommended unless there is a particular need to examine single bytes
+regardless of the locale.  As the line editor bases its behaviour on the
+locale regardless of the option (in order to correspond to the displayed
+character set), the option should be left on during the execution of
+user-defined editor and completion widgets so that the behaviour
+corresponds to that of builtin widgets.
 
 See chapter 5 in the FAQ for some notes on multibyte input.
 
Index: MACHINES
===================================================================
RCS file: /cvsroot/zsh/zsh/MACHINES,v
retrieving revision 1.3
diff -u -r1.3 MACHINES
--- MACHINES	21 Mar 2006 19:19:07 -0000	1.3
+++ MACHINES	4 Aug 2006 14:44:07 -0000
@@ -180,9 +180,7 @@
 SGI: IRIX 6.5
 	Should build `out-of-the-box'; however, if using the native
 	compiler, "cc" rather than "c99" is recommended.  Compilation
-	with gcc is also reported to work.  Multibyte is supported,
-	for example:
-           CC=cc ./configure --enable-multibyte
+	with gcc is also reported to work.  Multibyte is supported.
 
 	On 6.5.2, zsh malloc routines are reported not to work; also
 	full optimization (cc -O3 -OPT:Olimit=0) causes problems.
Index: NEWS
===================================================================
RCS file: /cvsroot/zsh/zsh/NEWS,v
retrieving revision 1.10
diff -u -r1.10 NEWS
--- NEWS	28 Feb 2006 12:20:43 -0000	1.10
+++ NEWS	4 Aug 2006 14:44:08 -0000
@@ -5,27 +5,31 @@
 Major changes between versions 4.2 and 4.3
 ------------------------------------------
 
-- There is support for multibyte character sets in the line editor,
-  though not the main shell.  See Multibyte Character Support in INSTALL.
+- There is support for multibyte character sets.  This is now reasonably
+  close to complete, although Unicode combining characters don't work
+  properly.  See Multibyte Character Support in INSTALL.
 
 - The shell can now run an installation function for a new user
-  (one with no .zshrc, .zshenv, .zprofile or .zlogin file) without
-  any additional setting up by the administrator.
+  (a user with no .zshrc, .zshenv, .zprofile or .zlogin file) without
+  any additional setting up by the administrator.  See "THE ZSH/NEWUSER
+  MODULE" in the zshmodules manual page.
 
 - The manual now has a Roadmap section (manual page zshroadmap) to
   give new users an indication of the most interesting parts of the
   manual.
 
-- New option PROMPT_SP, on by default, to work around the problem that the
-  line editor can overwrite output with no newline at the end.
+- New option PROMPT_SP (on by default): works around the problem that the
+  line editor can overwrite output with no newline at the end.  See the
+  zshoptions manual page.
 
 - New option HIST_SAVE_BY_COPY (on by default): history is saved by
-  copying and renaming instead of directly overwriting.
+  copying and renaming instead of directly overwriting.  See the
+  zshoptions manual page.
 
 - New redirection syntax e.g. {myfd}>file opens a new file descriptor
   and stores the number in $myfd, so that >&$myfd will work.  Chosen
   not to break existing code (and to be compatible with proposals for the
-  Korn shell).
+  Korn shell).  See the section REDIRECTION in the zshmisc manual page.
 
 - Substitutions of the form ${var:-"$@"}, ${var:+"$@"} and similar where
   word-splitting is applied to the text after the :- or :+ (in particular,
@@ -36,20 +40,28 @@
 - New Posix-style zsh-specific tests [[:IDENT:]], [[:IFS:]],
   [[:IFSSPACE:]], [[:WORD:]] test if character can appear in identifier,
   is an IFS character, is an IFS whitespace character, or is considered
-  as part of a word (is alphanumeric or appears in $WORDCHARS).  Note
-  the pattern code doesn't yet handle multibyte characters.
+  as part of a word (is alphanumeric or appears in $WORDCHARS).  These
+  work correctly on multibyte characters if the appropriate support
+  is present.  See the section FILENAME GENERATION in the zshexpn
+  manual page.
 
 - The idiom =(<<<...) is optimised so that the shell internally turns
   the ... into the contents of a file whose name is then substituted.
+  The syntax has always been usable by means of the NULLCMD feature,
+  but previously it generated an intermediate process; it has now
+  been rewritten along the same lines as the optimisation for $(<...)
+  that inserts a file into the command line without the use of an
+  external programme.
 
 - Supplied functions catch and throw provide limited support for
   exception handling using the `{ ... } always { ... }' syntax.
+  See the section EXCEPTION HANDLING in the zshcontrib manual page.
 
 - Signals now accept the SIG as part of the name for compatibility with
   other shells.
 
 - Editor function argument-base allows non-decimal arguments for
-  editor widgets.
+  editor widgets.  See the entry in the zshzle manual page.
 
 - As always, there are many enhancements to completion functions.
 
Index: README
===================================================================
RCS file: /cvsroot/zsh/zsh/README,v
retrieving revision 1.35
diff -u -r1.35 README
--- README	2 Aug 2006 17:16:38 -0000	1.35
+++ README	4 Aug 2006 14:44:09 -0000
@@ -54,7 +54,8 @@
 assumed all such octets were allowed in identifiers, however the POSIX
 standard does not allow such characters in identifiers.  The older
 behaviour is still obtained with --disable-multibyte in effect.
-With --enable-multibyte set there are three possible cases:
+With --enable-multibyte in effect (this is now the default anywhere
+it is supported) there are three possible cases:
   MULTIBYTE option unset:  only ASCII characters are allowed; the
     shell does not attempt to identify non-ASCII characters at all.
   MULTIBYTE option set, POSIX_IDENTIFIERS option unset: in addition
-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



