zsh-workers
* PATCH: multibyte configuration
@ 2006-01-06 11:33 Peter Stephenson
  2006-01-07  2:45 ` Danek Duvall
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-01-06 11:33 UTC (permalink / raw)
  To: Zsh hackers list

This probes for wcswidth() and assumes the value 1 if the function isn't
available, which at least means the code compiles on OpenBSD with
--enable-multibyte, though it still didn't seem to work.

Danek Duvall tried out a simplified test for multibyte handling on
Solaris which wasn't working for me, but it was for him, so I still have
no clue what's happening there.  Any reports of working zsh with
--enable-multibyte (and real multibyte characters, I know that at least
it compiles and runs) on Solaris 8+ would be useful.

Index: configure.ac
===================================================================
RCS file: /cvsroot/zsh/zsh/configure.ac,v
retrieving revision 1.46
diff -u -r1.46 configure.ac
--- configure.ac	9 Dec 2005 19:20:02 -0000	1.46
+++ configure.ac	6 Jan 2006 11:28:36 -0000
@@ -1121,7 +1121,7 @@
 	       pcre_compile pcre_study pcre_exec \
 	       nl_langinfo \
 	       erand48 open_memstream \
-	       wctomb mbrtowc wcrtomb iconv \
+	       wctomb mbrtowc wcrtomb wcswidth iconv \
 	       grantpt unlockpt ptsname \
 	       htons ntohs)
 AC_FUNC_STRCOLL
Index: Src/system.h
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/system.h,v
retrieving revision 1.36
diff -u -r1.36 system.h
--- Src/system.h	15 Dec 2005 14:51:41 -0000	1.36
+++ Src/system.h	6 Jan 2006 11:28:39 -0000
@@ -703,6 +703,10 @@
  */
 # include <wchar.h>
 # include <wctype.h>
+#ifndef HAVE_WCSWIDTH
+/* wcswidth is missing on OpenBSD: assume single-width characters */
+#define wcswidth(x, y)	(1)
+#endif
 #endif
 #ifdef HAVE_LANGINFO_H
 #  include <langinfo.h>



* Re: PATCH: multibyte configuration
  2006-01-06 11:33 PATCH: multibyte configuration Peter Stephenson
@ 2006-01-07  2:45 ` Danek Duvall
  2006-01-07  8:04   ` Danek Duvall
  2006-01-07 13:22   ` Peter Stephenson
  0 siblings, 2 replies; 9+ messages in thread
From: Danek Duvall @ 2006-01-07  2:45 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Jan 06, 2006 at 11:33:06AM +0000, Peter Stephenson wrote:

> Danek Duvall tried out a simplified test for multibyte handling on
> Solaris which wasn't working for me, but it was for him, so I still have
> no clue what's happening there.  Any reports of working zsh with
> --enable-multibyte (and real multibyte characters, I know that at least
> it compiles and runs) on Solaris 8+ would be useful.

I'm working on Nevada (Solaris 11) build 25, using the Sun Studio 10
compiler, building -dev-2 plus the patch in 22124 (though I haven't tested
that functionality yet, so it probably doesn't matter).

For the purposes of this message, working is defined as

  - being able to insert a multibyte character onto the commandline with
    insert-composed-char

  - echo a string of such characters into a file named with such
    characters, via "echo <string> > <string>"

  - verify with ls that the file was created with the correct name

  - verify with cat the file contents are correct
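
The last three steps can be sketched outside the line editor.  This is
only a rough illustration, assuming a POSIX shell, a file system that
accepts UTF-8 names and a scratch directory from mktemp; the variable
names are made up here, and it does not exercise the composing widget
itself.

```shell
# Build a two-character multibyte string: U+00A3 twice, as raw UTF-8 bytes.
name=$(printf '\302\243\302\243')

# Scratch directory so nothing in the working directory is touched.
dir=$(mktemp -d)

# Echo the string into a file named with the same string.
echo "$name" > "$dir/$name"

# ls should report that name (some ls implementations may substitute
# "?" for non-ASCII bytes, so compare by eye as well).
listed=$(ls "$dir")

# cat should return exactly the content that was written.
contents=$(cat "$dir/$name")

rm -r "$dir"
```

If the shell and file system pass the bytes through untouched,
"$contents" and "$listed" should both equal "$name".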

I can confirm that simply with --enable-multibyte, it does *not* work.  In
particular, attempting to use insert-composed-char, I get

    insert-composed-char:1: cannot do charset conversion

The screen is also a little wonky.  The error message above is on the
command-line I was trying to insert the composed character into (at the
position where the cursor was), and the "execute: insert-..." line is still
present below it, but with my cursor where it would be if the prompt had
been printed and the line as it was before I started the enterprise.

I've confirmed that this message (there are three identical ones) is the
third of the three -- which should never have happened.  It seems that all
the autoconf symbols in that cluster of ifdefs are defined -- HAVE_WCHAR_H,
HAVE_WCTOMB, HAVE_NL_LANGINFO and HAVE_ICONV, but what's missing is CODESET
(and apparently __STDC_ISO_10646__ which I believe *should* be defined, but
I haven't investigated why yet).

Compiling utils.c with cc -H, I notice that langinfo.h is never included.
I don't know why that is, and Peter, you'd probably be better than I in
tracking down the maze of ifdefs that control that.  But when I add

    #include <langinfo.h>
    #include <iconv.h>

after the other #includes in utils.c, it compiles successfully.  Actually,
it doesn't, but for unrelated reasons -- I have to #undef
HAVE_VARIABLE_LENGTH_ARRAYS in config.h, which autoconf correctly sets
because of its test, but something about the way that outstr in wcsiword()
is defined doesn't allow it to be dereferenced.

Once those changes are made, it works, as defined above.

Tomorrow, I'll give it a shot on S10 and S9, and maybe try to track down
the auxiliary issues.

Thanks,
Danek


* Re: PATCH: multibyte configuration
  2006-01-07  2:45 ` Danek Duvall
@ 2006-01-07  8:04   ` Danek Duvall
  2006-01-07 13:22   ` Peter Stephenson
  1 sibling, 0 replies; 9+ messages in thread
From: Danek Duvall @ 2006-01-07  8:04 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Jan 06, 2006 at 06:45:38PM -0800, Danek Duvall wrote:

> For the purposes of this message, working is defined as
> 
>   - being able to insert a multibyte character onto the commandline with
>     insert-composed-char
> 
>   - echo a string of such characters into a file named with such
>     characters, via "echo <string> > <string>"
> 
>   - verify with ls that the file was created with the correct name
> 
>   - verify with cat the file contents are correct

I should have added that I played with some other widgets as well:

  - delete-char-or-list shows the file correctly

  - menu completion works, cycling to and past it

  - menu selection works, too

  - as does all the basic commandline editing -- moving the cursor through
    it, inserting other characters before it and in the middle, etc.

Danek


* Re: PATCH: multibyte configuration
  2006-01-07  2:45 ` Danek Duvall
  2006-01-07  8:04   ` Danek Duvall
@ 2006-01-07 13:22   ` Peter Stephenson
  2006-01-07 16:44     ` Danek Duvall
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-01-07 13:22 UTC (permalink / raw)
  To: Zsh hackers list

Danek Duvall wrote:
> For the purposes of this message, working is defined as
> 
>   - being able to insert a multibyte character onto the commandline with
>     insert-composed-char
> 
>   - echo a string of such characters into a file named with such
>     characters, via "echo <string> > <string>"
> 
>   - verify with ls that the file was created with the correct name
> 
>   - verify with cat the file contents are correct

My big problem is with inputting characters from the keyboard: I have
the Euro and pound sterling, but if you have only ASCII characters that
could be difficult.  The pound (what I call pound, not what I call hash)
is 0xc2 0xa3 in UTF-8.  It ought to be possible to convince xterm into
inserting it with some translations trickery:

XTerm*VT100.Translations: #override \
Shift <KeyPress> F9: string(0xc2) string(0xa3)

did the trick (I tried that under Linux where I know the pound sign
works).
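
For anyone wanting to check the byte sequence before wiring it into
xterm, here is a small sketch, assuming a POSIX shell with printf, od and
tr available (the variable name is made up for illustration):

```shell
# Emit U+00A3 (pound sterling) as its two UTF-8 bytes, written in octal
# (\302 = 0xc2, \243 = 0xa3), then dump them as hex with spacing removed.
pound_hex=$(printf '\302\243' | od -An -tx1 | tr -d ' \n')

# Prints: c2a3
echo "$pound_hex"
```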

> I can confirm that simply with --enable-multibyte, it does *not* work.  In
> particular, attempting to use insert-composed-char, I get
> 
>     insert-composed-char:1: cannot do charset conversion

Yes, this is the langinfo.h problem.  I think this is fixed by the patch
in zsh-workers/22085.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page still at http://www.pwstephenson.fsnet.co.uk/


* Re: PATCH: multibyte configuration
  2006-01-07 13:22   ` Peter Stephenson
@ 2006-01-07 16:44     ` Danek Duvall
  0 siblings, 0 replies; 9+ messages in thread
From: Danek Duvall @ 2006-01-07 16:44 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Sat, Jan 07, 2006 at 01:22:11PM +0000, Peter Stephenson wrote:

> My big problem is with inputting characters from the keyboard: I have
> the Euro and pound sterling, but if you have only ASCII characters that
> could be difficult.  The pound (what I call pound, not what I call hash)
> is 0xc2 0xa3 in UTF-8.  It ought to be possible to convince xterm into
> inserting it with some translations trickery:
> 
> XTerm*VT100.Translations: #override \
> Shift <KeyPress> F9: string(0xc2) string(0xa3)
> 
> did the trick (I tried that under Linux where I know the pound sign
> works).

Okay.  Got that to work on Linux, then, over an ssh connection to work, it
worked just fine on my Solaris box (again, that's with the patch in 22124).
It'll probably have to wait until Monday, when I can get in to work and try
it at the console (and see if I can't make use of that compose key I have).

> >     insert-composed-char:1: cannot do charset conversion
> 
> Yes, this is the langinfo.h problem.  I think this is fixed by the patch
> in zsh-workers/22085.

Yep, thanks.

Danek


* PATCH: _hosts
@ 2006-02-03 15:46 Peter Stephenson
  2006-02-03 19:45 ` Danek Duvall
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-02-03 15:46 UTC (permalink / raw)
  To: Zsh hackers list

I've just run (for the second time) across
  zstyle -e '*' hosts 'reply=($hosts)'
not working because the variable hosts is used locally in the _hosts
function.  This converts it to _hosts.  We need to watch out for more
like this.
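
The clash can be imitated without the completion system.  A rough
sketch, assuming a shell that supports "local" (zsh, bash, dash);
style_code, lookup_shadowed and lookup_renamed are invented names, and
eval merely stands in for the way zstyle -e evaluates the style's value
inside the calling function:

```shell
hosts="alpha beta"            # the user's global list
style_code='reply=$hosts'     # stands in for the body given to zstyle -e

lookup_shadowed() {
  local hosts=""              # a local of the same name hides the global
  eval "$style_code"
  printf '%s\n' "$reply"
}

lookup_renamed() {
  local _hosts=""             # renamed local: the global stays visible
  eval "$style_code"
  printf '%s\n' "$reply"
}

shadowed=$(lookup_shadowed)   # empty: the evaluated code saw the local
renamed=$(lookup_renamed)     # "alpha beta": the evaluated code saw the global
```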

Even more, we need namespaces:  I run across problems like this again
and again when writing functions for TCP handling, which I use
frequently.  It's compounded by the inability to pass back values from
shell functions in positional parameters.  Hmm, we could think of a
syntax for the latter...  Hmmhmm, being able to insert a parameter into
the enclosing scope regardless of values in the current scope would fix
it, but it's just the sort of feature that makes the already horrific
parameter code even worse.  We've been talking about proper namespaces
for ages.

I've also made the function search ~/.ssh/known_hosts for host names
(and strip out IPv4 dot addresses).

I've also converted a conditional array assignment into an unconditional
one since I didn't see why the former was necessary, but maybe someone
can explain.

Index: Completion/Unix/Type/_hosts
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Type/_hosts,v
retrieving revision 1.5
diff -u -r1.5 _hosts
--- Completion/Unix/Type/_hosts	24 Oct 2005 17:09:11 -0000	1.5
+++ Completion/Unix/Type/_hosts	3 Feb 2006 15:36:42 -0000
@@ -1,21 +1,34 @@
 #compdef ftp rwho rup xping traceroute host aaaa zone mx ns soa txt
 
-local expl hosts tmp
+# avoid calling variable "hosts", it's an obvious candidate for use in
+#  zstyle -e '*' hosts 'reply=($hosts)'
+local expl _hosts tmp
 
-if ! zstyle -a ":completion:${curcontext}:hosts" hosts hosts; then
-  (( $+_cache_hosts )) ||
-      if (( ${+commands[getent]} )); then
-	: ${(A)_cache_hosts:=${(s: :)${(ps:\t:)${(f)~~"$(_call_program hosts getent hosts 2>/dev/null)"}##[:blank:]#[^[:blank:]]#}}}
-      else
-        : ${(A)_cache_hosts:=${(s: :)${(ps:\t:)${${(f)~~"$(</etc/hosts)"}%%\#*}##[:blank:]#[^[:blank:]]#}}}
-	if (( ${+commands[ypcat]} )) &&
-	    tmp=$(_call_program hosts ypcat hosts.byname 2>/dev/null); then
-          _cache_hosts+=( ${=${(f)tmp}##[:blank:]#[^[:blank:]]#} ) # If you use YP
-	fi
+if ! zstyle -a ":completion:${curcontext}:hosts" hosts _hosts; then
+  if (( $+_cache_hosts == 0 )); then
+    # uniquify
+    typeset -gUa _cache_hosts
+    if (( ${+commands[getent]} )); then
+      # pws: we were using the horrible ": ${(A)...:=}" syntax to assign
+      # to _cache_hosts, overriding the typeset as well as being unreadable
+      # and having obscure splitting behaviour.  Why?  We've just
+      # tested _cache_hosts doesn't exist.
+      _cache_hosts=(${(s: :)${(ps:\t:)${(f)~~"$(_call_program hosts getent hosts 2>/dev/null)"}##[:blank:]#[^[:blank:]]#}})
+    else
+      _cache_hosts=(${(s: :)${(ps:\t:)${${(f)~~"$(</etc/hosts)"}%%\#*}##[:blank:]#[^[:blank:]]#}})
+      if (( ${+commands[ypcat]} )) &&
+    	tmp=$(_call_program hosts ypcat hosts.byname 2>/dev/null); then
+        _cache_hosts+=( ${=${(f)tmp}##[:blank:]#[^[:blank:]]#} ) # If you use YP
       fi
+    fi
 
-  hosts=( "$_cache_hosts[@]" )
+    if [[ -r ~/.ssh/known_hosts ]]; then
+      _cache_hosts+=( $(sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//p' ~/.ssh/known_hosts) )
+    fi
+  fi
+
+  _hosts=( "$_cache_hosts[@]" )
 fi
 
 _wanted hosts expl host \
-    compadd -M 'm:{a-zA-Z}={A-Za-z} r:|.=* r:|=*' -a "$@" - hosts
+    compadd -M 'm:{a-zA-Z}={A-Za-z} r:|.=* r:|=*' -a "$@" - _hosts

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


* Re: PATCH: _hosts
  2006-02-03 15:46 PATCH: _hosts Peter Stephenson
@ 2006-02-03 19:45 ` Danek Duvall
  2006-02-03 22:40   ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Danek Duvall @ 2006-02-03 19:45 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Feb 03, 2006 at 03:46:08PM +0000, Peter Stephenson wrote:

> I've also made the function search ~/.ssh/known_hosts for host names
> (and strip out IPv4 dot addresses).
>
> [ ... ]
>
>     sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//p' ~/.ssh/known_hosts

Might want to strip out IPv6 addresses as well:

  -e '/^[0-9a-f]\{0,4\}:/d'

And I'm not sure I see what the "/p" is doing there, other than doubling
each entry.  Would

    ${${${(u)${(f)"$(<~/.ssh/known_hosts)"}%%[ ,]*}:#(#s)[0-9]##.[0-9]##.[0-9]##.[0-9]##(#e)}:#(#s)[0-9a-f:]##(#e)}

be any better (if more arcane)?
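
The doubling is easy to reproduce.  A minimal sketch, assuming POSIX sed
and two made-up lines standing in for ~/.ssh/known_hosts (the variable
names are hypothetical as well):

```shell
# Hypothetical sample data in known_hosts layout.
sample='host.example.com,192.0.2.1 ssh-rsa AAAA
192.0.2.7 ssh-rsa BBBB'

# As in the patch: without -n, the p flag prints the line once and the
# automatic print at the end of the cycle prints it again.
with_p=$(printf '%s\n' "$sample" |
  sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//p')

# Dropping the p flag leaves one copy of each remaining name.
without_p=$(printf '%s\n' "$sample" |
  sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//')
```

Here with_p holds host.example.com twice and without_p holds it once; the
line starting with a dotted address is deleted in both cases.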

Danek


* Re: PATCH: _hosts
  2006-02-03 19:45 ` Danek Duvall
@ 2006-02-03 22:40   ` Peter Stephenson
  2006-02-03 22:48     ` Danek Duvall
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-02-03 22:40 UTC (permalink / raw)
  To: Zsh hackers list

Danek Duvall wrote:
> Might want to strip out IPv6 addresses as well:
> 
>   -e '/^[0-9a-f]\{0,4\}:/d'
> 
> And I'm not sure I see what the "/p" is doing there, other than doubling
> each entry.  Would
> 
>     ${${${(u)${(f)"$(<~/.ssh/known_hosts)"}%%[ ,]*}:#(#s)[0-9]##.[0-9]##.[0-9]##.[0-9]##(#e)}:#(#s)[0-9a-f:]##(#e)}
> 
> be any better (if more arcane)?

It matches the rest of the completion system...

The /p was because I started with perl -ne before I decided that sed was
more portable.

> _users calls
> 
>     zstyle -a ":completion:${curcontext}:" users users
> 
> compared to _hosts, which calls
> 
>     zstyle -a ":completion:${curcontext}:hosts" hosts hosts
> 
> Is there some reason that _hosts uses the hosts tag, but _users doesn't use
> the users tag?

I think it must be an oversight, given that the manual specifically
mentions the existence of a users tag (as well as a users style).

I agree it's also inconsistent that you can't skip the userdirs
completion.  There could probably be better caching of user names if
default completion is used (and, really, _hosts ought to use the cache
system, though the disadvantage is it has to be explicitly enabled).

I'll commit the following...

Index: Completion/Unix/Type/_hosts
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Type/_hosts,v
retrieving revision 1.6
diff -u -r1.6 _hosts
--- Completion/Unix/Type/_hosts	3 Feb 2006 16:32:15 -0000	1.6
+++ Completion/Unix/Type/_hosts	3 Feb 2006 22:34:33 -0000
@@ -23,7 +23,7 @@
     fi
 
     if [[ -r ~/.ssh/known_hosts ]]; then
-      _cache_hosts+=( $(sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//p' ~/.ssh/known_hosts) )
+      _cache_hosts+=(${${${(u)${(f)"$(<~/.ssh/known_hosts)"}%%[ ,]*}:#(#s)[0-9]##.[0-9]##.[0-9]##.[0-9]##(#e)}:#(#s)[0-9a-f:]##(#e)})
     fi
   fi
 
Index: Completion/Unix/Type/_users
===================================================================
RCS file: /cvsroot/zsh/zsh/Completion/Unix/Type/_users,v
retrieving revision 1.5
diff -u -r1.5 _users
--- Completion/Unix/Type/_users	8 Jun 2005 12:45:36 -0000	1.5
+++ Completion/Unix/Type/_users	3 Feb 2006 22:34:33 -0000
@@ -2,7 +2,9 @@
 
 local expl users
 
-zstyle -a ":completion:${curcontext}:" users users &&
-    _wanted users expl user compadd "$@" -a - users && return 0
+if zstyle -a ":completion:${curcontext}:users" users users; then
+    _wanted users expl user compadd "$@" -a - users
+    return 0
+fi
 
 _wanted users expl user compadd "$@" -k - userdirs

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page still at http://www.pwstephenson.fsnet.co.uk/


* Re: PATCH: _hosts
  2006-02-03 22:40   ` Peter Stephenson
@ 2006-02-03 22:48     ` Danek Duvall
  0 siblings, 0 replies; 9+ messages in thread
From: Danek Duvall @ 2006-02-03 22:48 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh hackers list

On Fri, Feb 03, 2006 at 10:40:45PM +0000, Peter Stephenson wrote:

> > Is there some reason that _hosts uses the hosts tag, but _users doesn't use
> > the users tag?
> 
> I think it must be an oversight, given that the manual specifically
> mentions the existence of a users tag (as well as a users style).

Yup.  Though changing it now will break anyone who had it working with a
style with an empty tag.

> I'll commit the following...

Cool, thanks.  Just one nitty change:

> Index: Completion/Unix/Type/_hosts
> ===================================================================
> RCS file: /cvsroot/zsh/zsh/Completion/Unix/Type/_hosts,v
> retrieving revision 1.6
> diff -u -r1.6 _hosts
> --- Completion/Unix/Type/_hosts	3 Feb 2006 16:32:15 -0000	1.6
> +++ Completion/Unix/Type/_hosts	3 Feb 2006 22:34:33 -0000
> @@ -23,7 +23,7 @@
>      fi
>  
>      if [[ -r ~/.ssh/known_hosts ]]; then
> -      _cache_hosts+=( $(sed -e '/^[0-9]*\.[0-9]*\.[0-9]*\.[0-9]/d' -e 's/[ ,].*//p' ~/.ssh/known_hosts) )
> +      _cache_hosts+=(${${${(u)${(f)"$(<~/.ssh/known_hosts)"}%%[ ,]*}:#(#s)[0-9]##.[0-9]##.[0-9]##.[0-9]##(#e)}:#(#s)[0-9a-f:]##(#e)})
  +      _cache_hosts+=(${${${(u)${(f)"$(<~/.ssh/known_hosts)"}%%[ ,#]*}:#(#s)[0-9]##.[0-9]##.[0-9]##.[0-9]##(#e)}:#(#s)[0-9a-f:]##(#e)})

That is, remove everything after the first # as well, since the file can
have comments like that.
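
The effect of adding # to the bracket can be imitated with sed.  This is
a sketch only, assuming POSIX sed and made-up sample data; it is an
analogue of the zsh expansion, not the expansion itself:

```shell
# Hypothetical sample data: one comment line, two host entries.
sample='# a comment line
host.example.com,192.0.2.1 ssh-rsa AAAA
other.example.org ssh-rsa BBBB'

# Cut each line at the first space, comma or #, then drop lines left empty.
names=$(printf '%s\n' "$sample" | sed -e 's/[ ,#].*//' -e '/^$/d')
printf '%s\n' "$names"
```

Without # in the bracket the comment line would be cut at its first
space instead, leaving a bare "#" among the host names.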

Thanks,
Danek


end of thread, other threads:[~2006-02-03 22:48 UTC | newest]

Thread overview: 9+ messages
2006-01-06 11:33 PATCH: multibyte configuration Peter Stephenson
2006-01-07  2:45 ` Danek Duvall
2006-01-07  8:04   ` Danek Duvall
2006-01-07 13:22   ` Peter Stephenson
2006-01-07 16:44     ` Danek Duvall
2006-02-03 15:46 PATCH: _hosts Peter Stephenson
2006-02-03 19:45 ` Danek Duvall
2006-02-03 22:40   ` Peter Stephenson
2006-02-03 22:48     ` Danek Duvall
