* Re: PATCH: assume "enhanced goodness" when --multibyte-enable
@ 2005-12-15 11:52 Oliver Kiddle
2005-12-15 12:09 ` Peter Stephenson
0 siblings, 1 reply; 4+ messages in thread
From: Oliver Kiddle @ 2005-12-15 11:52 UTC (permalink / raw)
To: zsh-workers
Peter wrote:
> In utils.c we don't enable the full multibyte code for converting
> characters unless __STDC_ISO_10646__ is turned on. However, everywhere
> in zle we simply trust that if --multibyte-enable is turned on
> everything just works. That includes wctomb(), which is all we need
> for character conversion.
This doesn't make sense to me. With MULTIBYTE_SUPPORT enabled are you
just assuming that wchar_t is UCS-4 everywhere? I don't understand how
that'll work if you have a system which has perfectly good multibyte
support but uses some other encoding for wchar_t. I think FreeBSD is
such a system and older versions of many Unix systems do that. Do you
actually know for sure that Solaris 8 is using UCS-4 for wchar_t? The
fact that it doesn't define __STDC_ISO_10646__ would imply that it does
not. I'd suspect it is similar but different which is why it sort-of
seems to work.
(The code in utils.c which you have just enabled with MULTIBYTE_SUPPORT
casts a 4 byte integer into a wchar_t: something that can only work
when wchar_t is implemented as UCS-4. __STDC_ISO_10646__ is supposed to be a
reliable way to determine if wchar_t is UCS-4.)
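A minimal sketch of the point above, assuming a hypothetical helper named
codepoint_to_multibyte() (it is not taken from utils.c): the cast from a
code point to wchar_t is only attempted when the implementation promises,
via __STDC_ISO_10646__, that wchar_t values are ISO 10646 code points.

```c
#include <limits.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper for illustration only, not the zsh code.
 * Converts a Unicode code point to the current locale's multibyte
 * encoding.  buf must have room for at least MB_LEN_MAX bytes.
 * Returns the number of bytes written, or -1 on failure.
 */
static int
codepoint_to_multibyte(unsigned long ucs4, char *buf)
{
#ifdef __STDC_ISO_10646__
    /* wchar_t holds ISO 10646 code points, so the cast is meaningful. */
    if (ucs4 > (unsigned long)WCHAR_MAX)
        return -1;
    return wctomb(buf, (wchar_t)ucs4);
#else
    /*
     * The encoding of wchar_t is unknown here, so casting a code point
     * to wchar_t could silently produce the wrong character.  This
     * sketch simply refuses; a real implementation would need some
     * other conversion route.
     */
    (void)ucs4;
    (void)buf;
    return -1;
#endif
}
```

On a system that does not define the macro this always fails, which is
the conservative behaviour the original guard in utils.c was aiming for.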
Oliver
* Re: PATCH: assume "enhanced goodness" when --multibyte-enable
2005-12-15 11:52 PATCH: assume "enhanced goodness" when --multibyte-enable Oliver Kiddle
@ 2005-12-15 12:09 ` Peter Stephenson
0 siblings, 0 replies; 4+ messages in thread
From: Peter Stephenson @ 2005-12-15 12:09 UTC (permalink / raw)
To: Zsh hackers list
Oliver Kiddle wrote:
> Peter wrote:
> > In utils.c we don't enable the full multibyte code for converting
> > characters unless __STDC_ISO_10646__ is turned on. However, everywhere
> > in zle we simply trust that if --multibyte-enable is turned on
> > everything just works. That includes wctomb(), which is all we need
> > for character conversion.
>
> This doesn't make sense to me. With MULTIBYTE_SUPPORT enabled are you
> just assuming that wchar_t is UCS-4 everywhere?
ish...
> I don't understand how that'll work if you have a system which has
> perfectly good multibyte support but uses some other encoding for
> wchar_t.
It might well not, but up to now I've been assuming we need to know how
to convert it. --enable-multibyte just says "go ahead and assume this
works". Unless we can probe for what to do with a wchar_t I've been
assuming we're kind of stuck.
However, the assumptions we rely on are a bit different in the code for
converting Unicode characters and in the rest of zle, so quite likely
they shouldn't be tied...
In converting \U/\u sequences, as you say, we really need fully paid up
UCS-4.
In the rest of zle, we need wchar_t to be an integer which overlaps
with ASCII in positions 0 to 127, and we only need that in some places.
(A lot of the time we can work on the pre-converted multibyte string,
since that *must* have ASCII as a subset, and it's probably possible to
do that everywhere by additional conversions.) I don't think it
necessarily has to be exactly UCS-4 and most of the time it probably
works if it isn't. So maybe the change is wrong.
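As a rough illustration of the weaker requirement (wchar_t agreeing with
ASCII in the low positions), here is a sketch of a runtime probe. The
function name wchar_overlaps_ascii() is made up for this example and
nothing like it exists in zsh as far as this thread shows; it only checks
the current locale.

```c
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical runtime probe, for illustration only.
 * Returns 1 if every byte in the range 1..127 converts to the wchar_t
 * with the same numeric value in the current locale, 0 otherwise.
 * Position 0 is skipped because C already requires the null byte to
 * map to the null wide character.
 */
static int
wchar_overlaps_ascii(void)
{
    int c;

    /* Reset any shift state before probing. */
    (void)mbtowc(NULL, NULL, 0);

    for (c = 1; c < 128; c++) {
        char mb = (char)c;
        wchar_t wc = 0;

        /* Each ASCII byte should be a complete one-byte character. */
        if (mbtowc(&wc, &mb, 1) != 1 || wc != (wchar_t)c)
            return 0;
    }
    return 1;
}
```

A probe like this says nothing about characters above 127, so it could
not replace the __STDC_ISO_10646__ test for the \U/\u conversion.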
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* PATCH: assume "enhanced goodness" when --multibyte-enable
@ 2005-12-14 18:22 Peter Stephenson
2005-12-14 19:06 ` Wayne Davison
0 siblings, 1 reply; 4+ messages in thread
From: Peter Stephenson @ 2005-12-14 18:22 UTC (permalink / raw)
To: Zsh hackers list
In utils.c we don't enable the full multibyte code for converting
characters unless __STDC_ISO_10646__ is turned on. However, everywhere
in zle we simply trust that if --multibyte-enable is turned on
everything just works. That includes wctomb(), which is all we need
for character conversion.
Hence I think we need to make the same assumption in utils.c, too. This
makes things (in particular insert-{composed,unicode}-char) work better
on Solaris 8.
Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.105
diff -u -r1.105 utils.c
--- Src/utils.c 30 Nov 2005 16:35:33 -0000 1.105
+++ Src/utils.c 14 Dec 2005 18:17:29 -0000
@@ -3918,7 +3918,7 @@
}
#endif
-# if defined(HAVE_NL_LANGINFO) && defined(CODESET) && !defined(__STDC_ISO_10646__)
+# if defined(HAVE_NL_LANGINFO) && defined(CODESET) && !defined(__STDC_ISO_10646__) && !defined(MULTIBYTE_SUPPORT)
/* Convert a character from UCS4 encoding to UTF-8 */
/**/
@@ -3984,7 +3984,7 @@
char svchar = '\0';
int meta = 0, control = 0;
int i;
-#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) && defined(__STDC_ISO_10646__)
+#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) && (defined(__STDC_ISO_10646__) || defined(MULTIBYTE_SUPPORT))
wint_t wval;
size_t count;
#else
@@ -4093,7 +4093,7 @@
*misc = wval;
return s+1;
}
-#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) && defined(__STDC_ISO_10646__)
+#if defined(HAVE_WCHAR_H) && defined(HAVE_WCTOMB) && (defined(__STDC_ISO_10646__) || defined(MULTIBYTE_SUPPORT))
count = wctomb(t, (wchar_t)wval);
if (count == (size_t)-1) {
zerr("character not in range", NULL, 0);
--
Peter Stephenson <pws@csr.com> Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK Tel: +44 (0)1223 692070
* Re: PATCH: assume "enhanced goodness" when --multibyte-enable
2005-12-14 18:22 Peter Stephenson
@ 2005-12-14 19:06 ` Wayne Davison
0 siblings, 0 replies; 4+ messages in thread
From: Wayne Davison @ 2005-12-14 19:06 UTC (permalink / raw)
To: Peter Stephenson; +Cc: Zsh hackers list
On Wed, Dec 14, 2005 at 06:22:25PM +0000, Peter Stephenson wrote:
> Hence I think we need to make the same assumption in utils.c, too.
That certainly seems right to me.
The first hunk in your diff makes me wonder: why isn't ucs4toutf8()
declared as static? It is only used inside utils.c, and it sometimes
doesn't even get defined.
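For reference, a file-local encoder of this kind might look roughly like
the sketch below. The name matches the function under discussion, but the
signature and body are assumptions made for illustration and are not the
code in utils.c; this version also limits itself to the four-byte UTF-8
range and rejects surrogates.

```c
#include <stddef.h>

/*
 * Hypothetical sketch, not the zsh implementation.
 * Encodes the code point wval as UTF-8 into dest, which must have room
 * for at least 4 bytes.  Returns the number of bytes written, or 0 if
 * wval is not a valid Unicode scalar value.
 * Declared static because it is only meant to be used in this file.
 */
static size_t
ucs4toutf8(char *dest, unsigned long wval)
{
    if (wval < 0x80) {
        dest[0] = (char)wval;
        return 1;
    }
    if (wval < 0x800) {
        dest[0] = (char)(0xC0 | (wval >> 6));
        dest[1] = (char)(0x80 | (wval & 0x3F));
        return 2;
    }
    if (wval < 0x10000) {
        /* Surrogate code points cannot be encoded. */
        if (wval >= 0xD800 && wval <= 0xDFFF)
            return 0;
        dest[0] = (char)(0xE0 | (wval >> 12));
        dest[1] = (char)(0x80 | ((wval >> 6) & 0x3F));
        dest[2] = (char)(0x80 | (wval & 0x3F));
        return 3;
    }
    if (wval < 0x110000) {
        dest[0] = (char)(0xF0 | (wval >> 18));
        dest[1] = (char)(0x80 | ((wval >> 12) & 0x3F));
        dest[2] = (char)(0x80 | ((wval >> 6) & 0x3F));
        dest[3] = (char)(0x80 | (wval & 0x3F));
        return 4;
    }
    return 0;
}
```

With static linkage the compiler can also warn when the preprocessor
conditions leave the function defined but unused.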
..wayne..