zsh-users
* multibyte implementation (for filename completion with umlauts)
@ 2018-09-20 10:14 Andy Spiegl
  2018-09-20 16:05 ` Daniel Shahaf
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Spiegl @ 2018-09-20 10:14 UTC
  To: zsh-users

Another question from the past:  :-)

Back in 2011 we found that a filename with umlauts (öäü) can't be completed
case-insensitively because zsh doesn't handle multibyte characters correctly
in the completion code, and fixing that is a pretty complex issue:
 http://www.zsh.org/mla/users/2011/msg00015.html

Ever since then I have lived with it more or less happily.  All I could find
today in the docs is a TODO entry:
 * support for multibyte characters, including UTF-8

So I suppose the multibyte problem in the completion code is still there, right?

Actually, all I'd like to achieve is case insensitivity for umlauts, like this:
 zstyle ':completion:*' matcher-list 'm:{A-ZÄÖÜa-zäöü}={a-zäöüA-ZÄÖÜ}'

Any ideas?

Thanks so much!
 Andy

-- 
 The pre-release of the alpha for the forthcoming beta of the
 pre-build of the next point release should be out tonight.  (Graeme Devine)


* Re: multibyte implementation (for filename completion with umlauts)
  2018-09-20 10:14 multibyte implementation (for filename completion with umlauts) Andy Spiegl
@ 2018-09-20 16:05 ` Daniel Shahaf
  2018-09-25 16:47   ` Andy Spiegl
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Shahaf @ 2018-09-20 16:05 UTC
  To: zsh-users

Andy Spiegl wrote on Thu, Sep 20, 2018 at 12:14:03 +0200:
> Actually, all I'd like to achieve is case insensitivity for umlauts like that:
>  zstyle ':completion:*' matcher-list 'm:{A-ZÄÖÜa-zäöü}={a-zäöüA-ZÄÖÜ}'

I don't have a fix for you, but
.
    zstyle \* matcher-list 'm:{A-Za-z}={a-zA-Z} m:ae=Ä'
.
works for me for completing «fooae<TAB>» to «FOOÄBAR».  It doesn't work
if I use «ä» instead of «ae».  Therefore, how about writing a widget
that does s/ẍ/xe/ (for x in a,o,u,A,O,U) before attempting completion.
Would that be a workaround?

Cheers,

Daniel


* Re: multibyte implementation (for filename completion with umlauts)
  2018-09-20 16:05 ` Daniel Shahaf
@ 2018-09-25 16:47   ` Andy Spiegl
  2018-09-25 19:15     ` Daniel Shahaf
  0 siblings, 1 reply; 4+ messages in thread
From: Andy Spiegl @ 2018-09-25 16:47 UTC
  To: zsh-users

Thanks Daniel.

> Therefore, how about writing a widget
> that does s/ẍ/xe/ (for x in a,o,u,A,O,U) before attempting completion.
> Would that be a workaround?
Yes, that sounds like a (pretty wicked hehe) workaround.

What would a widget like that look like?
I am really bad at zsh regexp and search/replace strings. :-(

While playing around I found a completer behaviour I cannot explain:
% touch testüng
% ls testün<TAB>   --> "ls testüng"
% ls testÜn<TAB>   --> BINGs
% ls Testün<TAB>   --> also just BINGs

Shouldn't
 zstyle ':completion:*' matcher-list 'm:{A-ZÄÖÜa-zäöü}={a-zäöüA-ZÄÖÜ}'
take care of the capital T?
Apparently this one multibyte character screws up the whole completion process?

However, if I add
 zstyle ':completion:*' completer _complete _match _correct
then "ls Testün" and even "ls testÜn" are corrected/completed fine.

I am a bit lost. :-)

Thanks,
 Andy

-- 
 If you believe everything you read, better not read.
   (Japanese Proverb)


* Re: multibyte implementation (for filename completion with umlauts)
  2018-09-25 16:47   ` Andy Spiegl
@ 2018-09-25 19:15     ` Daniel Shahaf
  0 siblings, 0 replies; 4+ messages in thread
From: Daniel Shahaf @ 2018-09-25 19:15 UTC
  To: zsh-users

> > Therefore, how about writing a widget
> > that does s/ẍ/xe/ (for x in a,o,u,A,O,U) before attempting completion.
> > Would that be a workaround?
> Yes, that sounds like a (pretty wicked hehe) workaround.
> 

Unfortunately, I just now see that while «fooae<TAB>» does complete to
«FOOÄBAR», «fooaeb<TAB>» does not.

> What would a widget like that look like?
> I am really bad at zsh regexp and search/replace strings. :-(

The string manipulation is the easy part:

    % () { local s=$1; s=${s//ö/oe}; typeset -p s } fööbar
    typeset s=foeoebar
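
Extending that one substitution to all six umlauts could look roughly like the
sketch below.  The function name umlauts_to_digraphs is made up for this
example, and the Ä -> Ae style mapping is just one reading of the s/ẍ/xe/ idea,
not something zsh provides.

```shell
# Hypothetical helper (name invented for this sketch): replace each umlaut
# with its two-letter spelling, using the same ${var//pattern/replacement}
# expansion as the one-liner above.
umlauts_to_digraphs() {
  local s=$1
  s=${s//ä/ae}; s=${s//ö/oe}; s=${s//ü/ue}
  s=${s//Ä/Ae}; s=${s//Ö/Oe}; s=${s//Ü/Ue}
  printf '%s\n' "$s"
}

umlauts_to_digraphs 'fööbar'
```

Printing the result keeps the helper easy to try at the prompt; inside a widget
it would probably be nicer to hand the result back in a variable instead.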

The trickier part is how to change only the last word on the command line.
I can think of a few options here, but I'm not sure what to recommend; perhaps
manipulate the end of $LBUFFER, consulting $words[CURRENT] to find the
beginning of the word before the cursor?  As you say later, perhaps using
_correct, or something modeled after it, would be better.
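
One way the "manipulate the end of $LBUFFER" option might be sketched is shown
below.  It is only an illustration under several assumptions: the names
rewrite_last_word and complete_with_digraphs are invented, the "last word" is
simply whatever follows the last space (so quoted or escaped spaces are not
handled, unlike a $words[CURRENT] based approach), and a matcher along the
lines of m:ae=Ä is assumed to be configured already.

```shell
# Sketch only: all function names here are hypothetical.

# Rewrite umlauts in the last blank-separated word of $1 and leave the rest
# of the string untouched.  The result is returned in REPLY rather than
# printed, so nothing is lost to command substitution.
rewrite_last_word() {
  local buf=$1 word head pair
  word=${buf##* }        # text after the last space (whole buffer if none)
  head=${buf%"$word"}    # everything in front of that word
  for pair in ä:ae ö:oe ü:ue Ä:Ae Ö:Oe Ü:Ue; do
    word=${word//${pair%%:*}/${pair#*:}}
  done
  REPLY=$head$word
}

# Only set up the widget in an interactive zsh.
if [[ -n ${ZSH_VERSION-} && -o interactive ]]; then
  complete_with_digraphs() {
    rewrite_last_word "$LBUFFER"   # text to the left of the cursor
    LBUFFER=$REPLY
    zle expand-or-complete         # then run the usual TAB completion
  }
  zle -N complete_with_digraphs
  bindkey '^I' complete_with_digraphs
fi
```

With this loaded, typing «ls fööb<TAB>» would first turn the line into
«ls foeoeb» and then attempt completion.  Given the «fooaeb<TAB>» observation
above, completion may well still fail in that case, so this is a starting point
for experiments rather than a finished workaround.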

> While playing around I found a completer behaviour I cannot explain:
> % touch testüng
> % ls testün<TAB>   --> "ls testüng"
> % ls testÜn<TAB>   --> BINGs
> % ls Testün<TAB>   --> also just BINGs
> 
> Shouldn't
>  zstyle ':completion:*' matcher-list 'm:{A-ZÄÖÜa-zäöü}={a-zäöüA-ZÄÖÜ}'
> take care of the capital T?
> Apparently this one multibyte character screws up the whole completion process?
> 

The capital T example works for me if I change the order to m:{A-Za-zÄÖÜäöü}={a-zA-ZäöüÄÖÜ}.
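
Spelled out as a complete style, that reordered specification might look like
the line below.  The ':completion:*' context is taken from the original
command; whether it behaves the same with other zsh versions or locales is
untested.

```shell
# ASCII ranges first, the individual umlauts after them, on both sides.
zstyle ':completion:*' matcher-list 'm:{A-Za-zÄÖÜäöü}={a-zA-ZäöüÄÖÜ}'
```

After that, «ls Testün<TAB>» from the example above is the case to retry.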

> However, if I add
>  zstyle ':completion:*' completer _complete _match _correct
> then "ls Testün" and even "ls testÜn" are corrected/completed fine.
