* PATCH: autoconf test for multibyte support
From: Peter Stephenson @ 2006-08-03 18:06 UTC
To: Zsh hackers list
This should do a better job of working out whether multibyte support can
be enabled by looking for all the functions we use. I may have missed
some out; see if the list jogs any recollections.
If this works I will need to change some of the installation
documentation.
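As a rough illustration of the idea, here is the same "probe each function, collect the missing ones" pattern in plain shell rather than autoconf macros. The names collect_absent and fake_probe are made up for this sketch and do not appear in configure.ac; the real test uses AC_CHECK_FUNC as the probe.

```shell
# Hypothetical helper: report which of the given names fail the probe.
# $1 is the name of a probe command; the remaining arguments are the
# names to test.
collect_absent() {
    probe=$1
    shift
    absent=
    for name in "$@"; do
        "$probe" "$name" || absent="$absent $name"
    done
    # Print the accumulated list (empty when everything was found).
    printf '%s\n' "${absent# }"
}

# Stand-in probe for illustration only: pretends wcwidth is missing.
fake_probe() {
    test "$1" != wcwidth
}

missing=$(collect_absent fake_probe mbrtowc wcrtomb wcwidth)
if test x"$missing" = x; then
    echo "multibyte support enabled"
else
    echo "multibyte support disabled (missing: $missing)"
fi
```

A single missing function is enough to turn the feature off, which is the behaviour the patch aims for.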
RCS file: /cvsroot/zsh/zsh/configure.ac,v
retrieving revision 1.55
diff -u -r1.55 configure.ac
--- configure.ac 19 Jun 2006 10:55:26 -0000 1.55
+++ configure.ac 3 Aug 2006 17:51:52 -0000
@@ -1122,7 +1122,7 @@
pcre_compile pcre_study pcre_exec \
nl_langinfo \
erand48 open_memstream \
- wctomb mbrtowc wcrtomb iconv \
+ wctomb iconv \
grantpt unlockpt ptsname \
htons ntohs)
AC_FUNC_STRCOLL
@@ -2079,33 +2079,34 @@
fi
fi
-dnl ---------------------
-dnl multibyte ZLE support
-dnl ---------------------
+dnl -----------------
+dnl multibyte support
+dnl -----------------
AC_ARG_ENABLE(multibyte,
-AC_HELP_STRING([--enable-multibyte], [support multibyte chars in the zsh line editor]),
-[zsh_cv_c_zle_unicode_support=$enableval],
-[AC_CACHE_CHECK(if the system adequately supports multibyte chars,
- zsh_cv_c_zle_unicode_support,
- [AC_TRY_COMPILE([
-#ifdef HAVE_LOCALE_H
-# include <locale.h>
-#endif
- ], [
-#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) \
- && defined(HAVE_MBRTOWC) && defined(HAVE_WCRTOMB) \
- && defined (__STDC_ISO_10646__)
- /* All is well */
-#else
-# error Not supported.
-#endif
- ],
- zsh_cv_c_zle_unicode_support=yes,
- zsh_cv_c_zle_unicode_support=no)])
+AC_HELP_STRING([--enable-multibyte], [support multibyte characters]),
+[zsh_cv_c_unicode_support=$enableval],
+[AC_CACHE_VAL(zsh_cv_c_unicode_support,
+ AC_MSG_NOTICE([checking for functions supporting multibyte characters])
+ [zfuncs_absent=
+ for zfunc in iswalnum iswcntrl iswdigit iswgraph iswlower iswprint \
+iswpunct iswspace iswupper iswxdigit mbrlen mbrtowc towupper towlower \
+wcschr wcscpy wcslen wcsncmp wcsncpy wcrtomb wcwidth wmemchr wmemcmp \
+wmemcpy wmemmove wmemset; do
+ AC_CHECK_FUNC($zfunc,
+ [:], [zfuncs_absent="$zfuncs_absent $zfunc"])
+ done
+ if test x"$zfuncs_absent" = x; then
+ AC_MSG_NOTICE([all functions found, multibyte support enabled])
+ zsh_cv_c_unicode_support=yes
+ else
+ AC_MSG_NOTICE([missing functions, multibyte support disabled])
+ zsh_cv_c_unicode_support=no
+ fi
+ ])
])
AH_TEMPLATE([MULTIBYTE_SUPPORT],
[Define to 1 if you want support for multibyte character sets.])
-if test x$zsh_cv_c_zle_unicode_support = xyes; then
+if test x$zsh_cv_c_unicode_support = xyes; then
AC_DEFINE(MULTIBYTE_SUPPORT)
fi
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
To access the latest news from CSR copy this link into a web browser: http://www.csr.com/email_sig.php
* Re: PATCH: autoconf test for multibyte support
From: Peter Stephenson @ 2006-08-04 14:50 UTC
To: zsh-workers
Peter Stephenson <pws@csr.com> wrote:
> If this works I will need to change some of the installation
> documentation.
This changes some documentation.
I'm only guessing it works on Cygwin; all I know is that it compiles with
the same code that works everywhere else.
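For anyone wanting to check what configure decided on a given system, something like the following should do. It is only a sketch: the helper name multibyte_enabled is made up, and it assumes the generated header is config.h in the build directory. It relies on the fact that AC_DEFINE with a single argument defines the symbol to 1.

```shell
# Hypothetical helper: succeed if the given config header defines
# MULTIBYTE_SUPPORT to 1.
multibyte_enabled() {
    grep -q '^#define MULTIBYTE_SUPPORT 1' "$1"
}

# Example use after running ./configure in the build directory
# (the config.h location is an assumption).
if test -f config.h; then
    if multibyte_enabled config.h; then
        echo "multibyte support enabled"
    else
        echo "multibyte support disabled"
    fi
fi
```

If the result is not what you expect, rerunning configure with --enable-multibyte or --disable-multibyte overrides the automatic test.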
Index: INSTALL
===================================================================
RCS file: /cvsroot/zsh/zsh/INSTALL,v
retrieving revision 1.25
diff -u -r1.25 INSTALL
--- INSTALL 16 Feb 2006 14:28:54 -0000 1.25
+++ INSTALL 4 Aug 2006 14:44:06 -0000
@@ -264,37 +264,32 @@
---------------------------
Support for multibyte character sets that extend ASCII, such as UTF-8, is
-under development but the code in the line editor is sufficiently stable to
-be turned on by default in environments that provide full ISO 10646 support
-including the preprocessor definition __STDC_ISO_10646__. In principle
-this definition does not guarantee the full environment, but in practice
-systems with this defined also provide suitable library support. The shell
-does not probe for all the features, so on other systems use of multibyte
-support must be explicitly enabled when it is available.
+now reasonably close to complete, except that combining characters are not
+handled properly (some assistance with this problem would be appreciated).
+The configuration script should turn on multibyte support on all systems
+where it can be compiled successfully.
The support can be explicitly enabled or disable with --enable-multibyte or
---disable-multibyte. Reports of systems where multibyte support was not
-enabled by default but --enable-multibyte resulted in a usable shell would
-be appreciated. The developers are not aware of any need to use
+--disable-multibyte. The developers are not aware of any need to use
--disable-multibyte and this should be reported as a bug. Currently
-multibyte mode is believed to work automatically on:
+multibyte mode is believed to work on at least the following:
- All(?) current GNU/Linux distributions
-
-and to work when configured with --enable-multibyte on:
-
- OS X 10.4.3 (problems have been reported with multibyte characters
in HFS file names)
- NetBSD 2.0.2
- Solaris 8+ (inputting multibyte characters from the keyboard doesn't
work in some installations).
+ - Cygwin (though use of multibyte characters is somewhat non-standard).
-The main shell is not yet aware of multibyte characters, so for example the
-length of a scalar parameter will return the number of bytes, not
-characters, and pattern tests likewise treat single bytes as if they were
-characters. This means that pattern tests such as ? and [[:alpha:]] do not
-work correctly with characters in multibyte character sets beyond the ASCII
-subset.
+The corresponding shell option MULTIBYTE is now on by default in all
+emulation modes when multibyte support is enabled. Turning it off is not
+recommended unless there is a particular need to examine single bytes
+regardless of the locale. As the line editor bases its behaviour on the
+locale regardless of the option (in order to correspond to the displayed
+character set), the option should be left on during the execution of
+user-defined editor and completion widgets so that the behaviour
+corresponds to that of builtin widgets.
See chapter 5 in the FAQ for some notes on multibyte input.
Index: MACHINES
===================================================================
RCS file: /cvsroot/zsh/zsh/MACHINES,v
retrieving revision 1.3
diff -u -r1.3 MACHINES
--- MACHINES 21 Mar 2006 19:19:07 -0000 1.3
+++ MACHINES 4 Aug 2006 14:44:07 -0000
@@ -180,9 +180,7 @@
SGI: IRIX 6.5
Should build `out-of-the-box'; however, if using the native
compiler, "cc" rather than "c99" is recommended. Compilation
- with gcc is also reported to work. Multibyte is supported,
- for example:
- CC=cc ./configure --enable-multibyte
+ with gcc is also reported to work. Multibyte is supported.
On 6.5.2, zsh malloc routines are reported not to work; also
full optimization (cc -O3 -OPT:Olimit=0) causes problems.
Index: NEWS
===================================================================
RCS file: /cvsroot/zsh/zsh/NEWS,v
retrieving revision 1.10
diff -u -r1.10 NEWS
--- NEWS 28 Feb 2006 12:20:43 -0000 1.10
+++ NEWS 4 Aug 2006 14:44:08 -0000
@@ -5,27 +5,31 @@
Major changes between versions 4.2 and 4.3
------------------------------------------
-- There is support for multibyte character sets in the line editor,
- though not the main shell. See Multibyte Character Support in INSTALL.
+- There is support for multibyte character sets. This is now reasonably
+ close to complete, although Unicode combining characters don't work
+ properly. See Multibyte Character Support in INSTALL.
- The shell can now run an installation function for a new user
- (one with no .zshrc, .zshenv, .zprofile or .zlogin file) without
- any additional setting up by the administrator.
+ (a user with no .zshrc, .zshenv, .zprofile or .zlogin file) without
+ any additional setting up by the administrator. See "THE ZSH/NEWUSER
+ MODULE" in the zshmodules manual page.
- The manual now has a Roadmap section (manual page zshroadmap) to
give new users an indication of the most interesting parts of the
manual.
-- New option PROMPT_SP, on by default, to work around the problem that the
- line editor can overwrite output with no newline at the end.
+- New option PROMPT_SP (on by default): works around the problem that the
+ line editor can overwrite output with no newline at the end. See the
+ zshoptions manual page.
- New option HIST_SAVE_BY_COPY (on by default): history is saved by
- copying and renaming instead of directly overwriting.
+ copying and renaming instead of directly overwriting. See the
+ zshoptions manual page.
- New redirection syntax e.g. {myfd}>file opens a new file descriptor
and stores the number in $myfd, so that >&$myfd will work. Chosen
not to break existing code (and to be compatible with proposals for the
- Korn shell).
+ Korn shell). See the section REDIRECTION in the zshmisc manual page.
- Substitutions of the form ${var:-"$@"}, ${var:+"$@"} and similar where
word-splitting is applied to the text after the :- or :+ (in particular,
@@ -36,20 +40,28 @@
- New Posix-style zsh-specific tests [[:IDENT:]], [[:IFS:]],
[[:IFSSPACE:]], [[:WORD:]] test if character can appear in identifier,
is an IFS character, is an IFS whitespace character, or is considered
- as part of a word (is alphanumeric or appears in $WORDCHARS). Note
- the pattern code doesn't yet handle multibyte characters.
+ as part of a word (is alphanumeric or appears in $WORDCHARS). These
+ work correctly on multibyte characters if the appropriate support
+ is present. See the section FILENAME GENERATION in the zshexpn
+ manual page.
- The idiom =(<<<...) is optimised so that the shell internally turns
the ... into the contents of a file whose name is then substituted.
+ The syntax has always been usable by means of the NULLCMD feature,
+ but previously it generated an intermediate process; it has now
+ been rewritten along the same lines as the optimisation for $(<...)
+ that inserts a file into the command line without the use of an
+ external programme.
- Supplied functions catch and throw provide limited support for
exception handling using the `{ ... } always { ... }' syntax.
+ See the section EXCEPTION HANDLING in the zshcontrib manual page.
- Signals now accept the SIG as part of the name for compatibility with
other shells.
- Editor function argument-base allows non-decimal arguments for
- editor widgets.
+ editor widgets. See the entry in the zshzle manual page.
- As always, there are many enhancements to completion functions.
Index: README
===================================================================
RCS file: /cvsroot/zsh/zsh/README,v
retrieving revision 1.35
diff -u -r1.35 README
--- README 2 Aug 2006 17:16:38 -0000 1.35
+++ README 4 Aug 2006 14:44:09 -0000
@@ -54,7 +54,8 @@
assumed all such octets were allowed in identifiers, however the POSIX
standard does not allow such characters in identifiers. The older
behaviour is still obtained with --disable-multibyte in effect.
-With --enable-multibyte set there are three possible cases:
+With --enable-multibyte in effect (this is now the default anywhere
+it is supported) there are three possible cases:
MULTIBYTE option unset: only ASCII characters are allowed; the
shell does not attempt to identify non-ASCII characters at all.
MULTIBYTE option set, POSIX_IDENTIFIERS option unset: in addition