zsh-workers
* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
       [not found] <20080426110003.GA16650@implementation>
@ 2008-04-26 15:05 ` Clint Adams
  2008-04-26 15:25   ` Samuel Thibault
                     ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Clint Adams @ 2008-04-26 15:05 UTC
  To: zsh-workers; +Cc: Samuel Thibault, 478019

On Sat, Apr 26, 2008 at 12:00:03PM +0100, Samuel Thibault wrote:
> Hello,
> 
> On a french keyboard, '|' is typed by using alt-gr, and the non-breaking
> space is often typed by using alt-gr space. That often leads to this:
> 
> € echo a | grep a
> zsh: command not found:  grep
> 
> Because zsh looks for a " grep" command, with leading non-breaking space
> because my thumb remained a bit too long on the alt-gr key.
> 
> This doesn't happen with bash, because bash treats non-breaking space as
> a word separator.  Could zsh do the same? (currently, I have defined
> alias  grep=grep
> alias  vi=vi
> ...)

Having locale-based (and multibyte) word separators sounds like a nightmare
to me, but maybe someone has some ideas.



* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
  2008-04-26 15:05 ` Bug#478019: zsh: Should handle non-breaking space as word separator Clint Adams
@ 2008-04-26 15:25   ` Samuel Thibault
  2008-04-26 18:41   ` Peter Stephenson
  2008-04-26 19:09   ` Stephane Chazelas
  2 siblings, 0 replies; 4+ messages in thread
From: Samuel Thibault @ 2008-04-26 15:25 UTC
  To: zsh-workers, 478019

Clint Adams wrote on Sat 26 Apr 2008 16:05:48 +0100:
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.

iswspace()?

Samuel



* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
  2008-04-26 15:05 ` Bug#478019: zsh: Should handle non-breaking space as word separator Clint Adams
  2008-04-26 15:25   ` Samuel Thibault
@ 2008-04-26 18:41   ` Peter Stephenson
  2008-04-26 19:09   ` Stephane Chazelas
  2 siblings, 0 replies; 4+ messages in thread
From: Peter Stephenson @ 2008-04-26 18:41 UTC
  To: zsh-workers; +Cc: 478019

On Sat, 26 Apr 2008 16:05:48 +0100
Clint Adams <schizo@debian.org> wrote:
> On Sat, Apr 26, 2008 at 12:00:03PM +0100, Samuel Thibault wrote:
> > On a french keyboard, '|' is typed by using alt-gr, and the non-breaking
> > space is often typed by using alt-gr space. That often leads to this:
> > 
> > € echo a | grep a
> > zsh: command not found:  grep
> > 
> > Because zsh looks for a " grep" command, with leading non-breaking space
> > because my thumb remained a bit too long on the alt-gr key.
> > 
> > This doesn't happen with bash, because bash treats non-breaking space as
> > a word separator.  Could zsh do the same? (currently, I have defined
> > alias  grep=grep
> > alias  vi=vi
> > ...)
> 
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.

I tend to agree with this.  It's doable, and the standard (SUS 2004)
supports the idea (see under LC_CTYPE) although it's a little bit
two-faced (only ASCII space characters are listed as requiring quoting,
for example).  However,

- I've been resisting having to convert the byte stream into anything
  else for basic shell parsing.  I've got far better things to do
  than make the shell slower and buggier for a feature of doubtful
  general utility.
- Having basic syntactic elements depending on the locale is really
  nasty.  We have one such kludge ourselves, (NO_)POSIX_IDENTIFIERS,
  which is mostly a sop to traditional pre-multibyte zsh behaviour.  I
  would actively discourage people from assuming this sort of behaviour.
- It seems to me somewhat ludicrous to make a change specifically so
  that arguments can be separated by a "non-breaking" space.  Is it
  or isn't it breakable?
- This isn't a general solution to mistyping anyway.  You might be able
  to fix alt-gr space with xmodmap or the terminal emulator translation
  table.
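
The xmodmap route mentioned in the last point might look roughly like the
sketch below.  It assumes the space bar is X keycode 65 (common on PC
keyboards, but check with xev) and that filling the first six keysym
columns covers the alt-gr level of the layout in use.  Both are
assumptions, and space_keymap_expr is a hypothetical helper name.

```shell
# Build an xmodmap expression that maps every level of one key to a plain
# space, so that alt-gr space no longer yields a non-breaking space.
# ASSUMPTION: keycode 65 is the space bar; verify with xev(1).
space_keymap_expr() {
  printf 'keycode %s = space space space space space space\n' "${1:-65}"
}

# Print the expression.  To apply it in a running X session, one would use:
#   xmodmap -e "$(space_keymap_expr)"
space_keymap_expr
```

This only changes what the keyboard sends; the shell's parsing is left
untouched.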

(Yes, I know "a little bit two-faced" is meaningless, strictly
speaking.  I stopped speaking strictly years ago now.)

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/



* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
  2008-04-26 15:05 ` Bug#478019: zsh: Should handle non-breaking space as word separator Clint Adams
  2008-04-26 15:25   ` Samuel Thibault
  2008-04-26 18:41   ` Peter Stephenson
@ 2008-04-26 19:09   ` Stephane Chazelas
  2 siblings, 0 replies; 4+ messages in thread
From: Stephane Chazelas @ 2008-04-26 19:09 UTC
  To: zsh-workers, Samuel Thibault, 478019

On Sat, Apr 26, 2008 at 04:05:48PM +0100, Clint Adams wrote:
> On Sat, Apr 26, 2008 at 12:00:03PM +0100, Samuel Thibault wrote:
> > Hello,
> > 
> > On a french keyboard, '|' is typed by using alt-gr, and the non-breaking
> > space is often typed by using alt-gr space. That often leads to this:
> > 
> > € echo a | grep a
> > zsh: command not found:  grep
> > 
> > Because zsh looks for a " grep" command, with leading non-breaking space
> > because my thumb remained a bit too long on the alt-gr key.
> > 
> > This doesn't happen with bash, because bash treats non-breaking space as
> > a word separator.  Could zsh do the same? (currently, I have defined
> > alias  grep=grep
> > alias  vi=vi
> > ...)
> 
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.

Having shell syntax that depends on the environment looks
like a very bad idea to me (think of scripts!).

There are already problems like that, such as

case $x in
  ([a-z]) ...;;
esac

which is locale-dependent, although in most cases that's not what you
want. Working around it in a POSIX-compliant way is a nightmare, like:

LC_ALL=C command eval 'case $x in ([a-z]) ... esac'
(which doesn't even work in some shells because of bugs).
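
One way to sidestep the locale-dependent range, sketched here, is to
spell the set out instead of writing [a-z].  is_ascii_lower is a
hypothetical helper name; the point is that an explicit bracket list
should match only the characters listed, whatever the collation order
of the current locale.

```shell
# Return success only if $1 is exactly one ASCII lowercase letter.
# The letters are listed explicitly, so no locale-dependent range is used.
is_ascii_lower() {
  case $1 in
    [abcdefghijklmnopqrstuvwxyz]) return 0 ;;
    *) return 1 ;;
  esac
}

x=q
if is_ascii_lower "$x"; then
  echo "$x is an ASCII lowercase letter"
fi
```

It is more verbose than the range, but it needs neither LC_ALL=C nor eval.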

Another bad example, which causes more harm than benefit:

In ksh93, the decimal point is locale-dependent, so you can't
do:

float Pi=3.14159265359

which is a syntax error in some locales:
$ LC_ALL=fr_FR ksh93 -c 'float Pi=3.141592653589'
ksh93[1]: typeset: 3.141592653589: arithmetic syntax error

This one is even harder to overcome:

$ LC_ALL=fr_FR ksh93 -c '
  LC_ALL=C command float Pi=3.141592653589; print $Pi'

$ LC_ALL=fr_FR ksh93 -c 'LC_ALL=C command eval float Pi=3.141592653589
 print $((Pi))'
ksh93: line 2: 3.141592653589: arithmetic syntax error

$ LC_ALL=fr_FR ksh93 -c 'in_C_locale() { typeset LC_ALL=C; eval "$@"; }
in_C_locale float Pi=3.141592653589; echo $LC_ALL;  print $((Pi))'
C
3.141592653589

All of which look like bugs to me.
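
For commands and builtins, as opposed to syntax the shell parses
itself, a subshell can confine the locale override.  Unlike the
in_C_locale function above, which leaves LC_ALL set to C afterwards,
this minimal POSIX sh sketch does not change the caller's locale.
run_in_c_locale is a hypothetical name, and because it runs in a
subshell, variables set by the command are lost, so it would not help
with the float assignment itself.

```shell
# Run a command with LC_ALL=C without touching the caller's locale.
# The parentheses start a subshell, so the assignment is discarded on return.
run_in_c_locale() {
  (
    LC_ALL=C
    export LC_ALL
    "$@"
  )
}

# printf parses and prints the number with "." as the decimal point here,
# whatever LC_NUMERIC is set to in the calling shell.
run_in_c_locale printf '%.2f\n' 3.14159
```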

Anyway, my point was to say that it's a bad idea to have the
syntax of the shell dependent on the locale.

-- 
Stéphane


