Date: Fri, 4 Aug 2006 15:50:31 +0100
From: Peter Stephenson
To: zsh-workers@sunsite.dk
Subject: Re: PATCH: autoconf test for multibyte support
Message-Id: <20060804155031.f9ffb5c2.pws@csr.com>
In-Reply-To: <200608031806.k73I6Ea2017321@news01.csr.com>
References: <200608031806.k73I6Ea2017321@news01.csr.com>
Organization: Cambridge Silicon Radio

Peter Stephenson wrote:
> If this works I will need to change some of the installation
> documentation.

This changes some documentation. I'm only guessing it works on Cygwin,
all I know is it compiles with the same code that works everywhere else.

Index: INSTALL
===================================================================
RCS file: /cvsroot/zsh/zsh/INSTALL,v
retrieving revision 1.25
diff -u -r1.25 INSTALL
--- INSTALL 16 Feb 2006 14:28:54 -0000 1.25
+++ INSTALL 4 Aug 2006 14:44:06 -0000
@@ -264,37 +264,32 @@
 ---------------------------

 Support for multibyte character sets that extend ASCII, such as UTF-8, is
-under development but the code in the line editor is sufficiently stable to
-be turned on by default in environments that provide full ISO 10646 support
-including the preprocessor definition __STDC_ISO_10646__. In principle
-this definition does not guarantee the full environment, but in practice
-systems with this defined also provide suitable library support. The shell
-does not probe for all the features, so on other systems use of multibyte
-support must be explicitly enabled when it is available.
+now reasonably close to complete, except that combining characters are not
+handled properly (some assistance with this problem would be appreciated).
+The configuration script should turn on multibyte support on all systems
+where it can be compiled successfully.

 The support can be explicitly enabled or disable with --enable-multibyte or
---disable-multibyte. Reports of systems where multibyte support was not
-enabled by default but --enable-multibyte resulted in a usable shell would
-be appreciated. The developers are not aware of any need to use
+--disable-multibyte. The developers are not aware of any need to use
 --disable-multibyte and this should be reported as a bug. Currently
-multibyte mode is believed to work automatically on:
+multibyte mode is believed to work on at least the following:

   - All(?) current GNU/Linux distributions
-
-and to work when configured with --enable-multibyte on:
-
   - OS X 10.4.3 (problems have been reported with multibyte characters
     in HFS file names)
   - NetBSD 2.0.2
   - Solaris 8+ (inputting multibyte characters from the keyboard
     doesn't work in some installations).
+  - Cygwin (though use of multibyte characters is somewhat non-standard).

-The main shell is not yet aware of multibyte characters, so for example the
-length of a scalar parameter will return the number of bytes, not
-characters, and pattern tests likewise treat single bytes as if they were
-characters. This means that pattern tests such as ? and [[:alpha:]] do not
-work correctly with characters in multibyte character sets beyond the ASCII
-subset.
+The corresponding shell option MULTIBYTE is now on by default in all
+emulation modes when multibyte support is enabled. Turning it off is not
+recommended unless there is a particular need to examine single bytes
+regardless of the locale. As the line editor bases its behaviour on the
+locale regardless of the option (in order to correspond to the displayed
+character set), the option should be left on during the execution of
+user-defined editor and completion widgets so that the behaviour
+corresponds to that of builtin widgets.

 See chapter 5 in the FAQ for some notes on multibyte input.

Index: MACHINES
===================================================================
RCS file: /cvsroot/zsh/zsh/MACHINES,v
retrieving revision 1.3
diff -u -r1.3 MACHINES
--- MACHINES 21 Mar 2006 19:19:07 -0000 1.3
+++ MACHINES 4 Aug 2006 14:44:07 -0000
@@ -180,9 +180,7 @@
 SGI: IRIX 6.5
     Should build `out-of-the-box'; however, if using the native
     compiler, "cc" rather than "c99" is recommended. Compilation
-    with gcc is also reported to work. Multibyte is supported,
-    for example:
-      CC=cc ./configure --enable-multibyte
+    with gcc is also reported to work. Multibyte is supported.
     On 6.5.2, zsh malloc routines are reported not to work; also
     full optimization (cc -O3 -OPT:Olimit=0) causes problems.

Index: NEWS
===================================================================
RCS file: /cvsroot/zsh/zsh/NEWS,v
retrieving revision 1.10
diff -u -r1.10 NEWS
--- NEWS 28 Feb 2006 12:20:43 -0000 1.10
+++ NEWS 4 Aug 2006 14:44:08 -0000
@@ -5,27 +5,31 @@
 Major changes between versions 4.2 and 4.3
 ------------------------------------------

-- There is support for multibyte character sets in the line editor,
-  though not the main shell. See Multibyte Character Support in INSTALL.
+- There is support for multibyte character sets. This is now reasonably
+  close to complete, although Unicode combining characters don't work
+  properly. See Multibyte Character Support in INSTALL.

 - The shell can now run an installation function for a new user
-  (one with no .zshrc, .zshenv, .zprofile or .zlogin file) without
-  any additional setting up by the administrator.
+  (a user with no .zshrc, .zshenv, .zprofile or .zlogin file) without
+  any additional setting up by the administrator. See "THE ZSH/NEWUSER
+  MODULE" in the zshmodules manual page.

 - The manual now has a Roadmap section (manual page zshroadmap) to
   give new users an indication of the most interesting parts of the manual.

-- New option PROMPT_SP, on by default, to work around the problem that the
-  line editor can overwrite output with no newline at the end.
+- New option PROMPT_SP (on by default): works around the problem that the
+  line editor can overwrite output with no newline at the end. See the
+  zshoptions manual page.

 - New option HIST_SAVE_BY_COPY (on by default): history is saved by
-  copying and renaming instead of directly overwriting.
+  copying and renaming instead of directly overwriting. See the
+  zshoptions manual page.

 - New redirection syntax e.g. {myfd}>file opens a new file descriptor
   and stores the number in $myfd, so that >&$myfd will work. Chosen
   not to break existing code (and to be compatible with proposals for the
-  Korn shell).
+  Korn shell). See the section REDIRECTION in the zshmisc manual page.

 - Substitutions of the form ${var:-"$@"}, ${var:+"$@"} and similar where
   word-splitting is applied to the text after the :- or :+ (in particular,
@@ -36,20 +40,28 @@
 - New Posix-style zsh-specific tests [[:IDENT:]], [[:IFS:]],
   [[:IFSSPACE:]], [[:WORD:]] test if character can appear in identifier,
   is an IFS character, is an IFS whitespace character, or is considered
-  as part of a word (is alphanumeric or appears in $WORDCHARS). Note
-  the pattern code doesn't yet handle multibyte characters.
+  as part of a word (is alphanumeric or appears in $WORDCHARS). These
+  works correctly on multibyte characters if the appropriate support
+  is present. See the section FILENAME GENERATION in the zshexpn
+  manual page.

 - The idiom =(<<<...) is optimised so that the shell internally turns
   the ... into the contents of a file whose name is then substituted.
+  The syntax has always been usable by means of the NULLCMD feature,
+  but previously it generated an intermediate process; it has now
+  been rewritten along the same lines as the optimisation for $(<...)
+  that inserts a file into the command line without the use of an
+  external programme.

 - Supplied functions catch and throw provide limited support for
   exception handling using the `{ ... } always { ... }' syntax.
+  See the section EXCEPTION HANDLING in the zshcontrib manual page.

 - Signals now accept the SIG as part of the name for compatibility
   with other shells.

 - Editor function argument-base allows non-decimal arguments for
-  editor widgets.
+  editor widgets. See the entry in the zshzle manual page.

 - As always, there are many enhancements to completion functions.

Index: README
===================================================================
RCS file: /cvsroot/zsh/zsh/README,v
retrieving revision 1.35
diff -u -r1.35 README
--- README 2 Aug 2006 17:16:38 -0000 1.35
+++ README 4 Aug 2006 14:44:09 -0000
@@ -54,7 +54,8 @@
 assumed all such octets were allowed in identifiers, however the POSIX
 standard does not allow such characters in identifiers. The older
 behaviour is still obtained with --disable-multibyte in effect.
-With --enable-multibyte set there are three possible cases:
+With --enable-multibyte in effect (this is now the default anywhere
+it is supported) there are three possible cases:
   MULTIBYTE option unset: only ASCII characters are allowed; the
   shell does not attempt to identify non-ASCII characters at all.
   MULTIBYTE option set, POSIX_IDENTIFIERS option unset: in addition

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070