* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
2008-04-26 15:05 ` Bug#478019: zsh: Should handle non-breaking space as word separator Clint Adams
@ 2008-04-26 15:25 ` Samuel Thibault
2008-04-26 18:41 ` Peter Stephenson
2008-04-26 19:09 ` Stephane Chazelas
2 siblings, 0 replies; 4+ messages in thread
From: Samuel Thibault @ 2008-04-26 15:25 UTC (permalink / raw)
To: zsh-workers, 478019
Clint Adams, on Sat 26 Apr 2008 16:05:48 +0100, wrote:
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.
iswspace()?
Samuel
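
For readers who want to see what a locale-driven test would report, here is a minimal shell-level sketch. It is not zsh's tokenizer and it does not call iswspace() itself; it uses the [[:space:]] pattern class, which is assumed here to consult the same locale character class. The helper name is_locale_space is made up for illustration.

```shell
# is_locale_space (hypothetical helper): succeed if $1 is a single
# character that the current locale classifies as whitespace.
is_locale_space() {
    case $1 in
        ([[:space:]]) return 0 ;;
        (*) return 1 ;;
    esac
}

# U+00A0 NO-BREAK SPACE encoded as UTF-8 (bytes 0xC2 0xA0).
nbsp=$(printf '\302\240')

if is_locale_space "$nbsp"; then
    echo "NBSP is in the locale's space class"
else
    echo "NBSP is not in the locale's space class"
fi
```

The answer depends on the locale and the C library. Some libraries appear to leave no-break spaces out of the space class on purpose, so the test may well say no even in a UTF-8 locale.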
* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
From: Peter Stephenson @ 2008-04-26 18:41 UTC (permalink / raw)
To: zsh-workers; +Cc: 478019
On Sat, 26 Apr 2008 16:05:48 +0100
Clint Adams <schizo@debian.org> wrote:
> On Sat, Apr 26, 2008 at 12:00:03PM +0100, Samuel Thibault wrote:
> > On a french keyboard, '|' is typed by using alt-gr, and the non-breaking
> > space is often typed by using alt-gr space. That often leads to this:
> >
> > € echo a | grep a
> > zsh: command not found: grep
> >
> > Because zsh looks for a " grep" command, with leading non-breaking space
> > because my thumb remained a bit too long on the alt-gr key.
> >
> > This doesn't happen with bash, because bash treats non-breaking space as
> > a word separator. Could zsh do the same? (currently, I have defined
> > alias grep=grep
> > alias vi=vi
> > ...)
>
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.
I tend to agree with this. It's doable, and the standard (SUS 2004)
supports the idea (see under LC_CTYPE) although it's a little bit
two-faced (only ASCII space characters are listed as requiring quoting,
for example). However,
- I've been resisting having to convert the byte stream into anything
else for basic shell parsing. I've got far better things to do
than make the shell slower and buggier for a feature of doubtful
general utility.
- Having basic syntactic elements depending on the locale is really
nasty. We have one such kludge ourselves, (NO_)POSIX_IDENTIFIERS,
which is mostly a sop to traditional pre-multibyte zsh behaviour. I
would actively discourage people from assuming this sort of behaviour.
- It seems to me somewhat ludicrous making a change specifically so
that arguments can be separated by a "non-breaking" space. Is it
or isn't it breakable?
- This isn't a general solution to mistyping anyway. You might be able
to fix alt-gr space with xmodmap or the terminal emulator translation
table.
(Yes, I know "a little bit two-faced" is meaningless, strictly
speaking. I stopped speaking strictly years ago now.)
--
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/
* Re: Bug#478019: zsh: Should handle non-breaking space as word separator
From: Stephane Chazelas @ 2008-04-26 19:09 UTC (permalink / raw)
To: zsh-workers, Samuel Thibault, 478019
On Sat, Apr 26, 2008 at 04:05:48PM +0100, Clint Adams wrote:
> On Sat, Apr 26, 2008 at 12:00:03PM +0100, Samuel Thibault wrote:
> > Hello,
> >
> > On a french keyboard, '|' is typed by using alt-gr, and the non-breaking
> > space is often typed by using alt-gr space. That often leads to this:
> >
> > € echo a | grep a
> > zsh: command not found: grep
> >
> > Because zsh looks for a " grep" command, with leading non-breaking space
> > because my thumb remained a bit too long on the alt-gr key.
> >
> > This doesn't happen with bash, because bash treats non-breaking space as
> > a word separator. Could zsh do the same? (currently, I have defined
> > alias grep=grep
> > alias vi=vi
> > ...)
>
> Having locale-based (and multibyte) word separators sounds like a nightmare
> to me, but maybe someone has some ideas.
Having shell syntax that depends on the environment looks
like a very bad idea to me (think of scripts!).
There are already problems like that, such as
case $x in
([a-z]) ...;;
esac
which is locale-dependent, while in most cases that's not what you
want. And working around it in a POSIX way is a nightmare, like:
LC_ALL=C command eval 'case $x in ([a-z]) ... esac'
(which doesn't even work in some shells because of bugs).
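
One way to sidestep the range problem without touching LC_ALL is to spell the set out, since a bracket expression that lists individual characters should not involve collation order the way a range does. A minimal sketch; the function name is_ascii_lower is made up for illustration:

```shell
# is_ascii_lower (hypothetical helper): succeed only if $1 is exactly
# one of the 26 ASCII lowercase letters. Every character is listed
# explicitly, so no locale-dependent range such as [a-z] is used.
is_ascii_lower() {
    case $1 in
        ([abcdefghijklmnopqrstuvwxyz]) return 0 ;;
        (*) return 1 ;;
    esac
}

x=q
if is_ascii_lower "$x"; then
    echo "$x is an ASCII lowercase letter"
fi
```

It is verbose, but it needs neither eval nor a locale override, so the shell bugs mentioned above do not come into play.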
Another bad example, which causes more harm than benefit:
in ksh93, the decimal point is locale-dependent, so you can't
do:
float Pi=3.14159265359
which is a syntax error in some locales:
$ LC_ALL=fr_FR ksh93 -c 'float Pi=3.141592653589'
ksh93[1]: typeset: 3.141592653589: arithmetic syntax error
This one is even harder to overcome:
$ LC_ALL=fr_FR ksh93 -c '
LC_ALL=C command float Pi=3.141592653589; print $Pi'
$ LC_ALL=fr_FR ksh93 -c 'LC_ALL=C command eval float Pi=3.141592653589
print $((Pi))'
ksh93: line 2: 3.141592653589: arithmetic syntax error
$ LC_ALL=fr_FR ksh93 -c 'in_C_locale() { typeset LC_ALL=C; eval "$@"; }
in_C_locale float Pi=3.141592653589; echo $LC_ALL; print $((Pi))'
C
3.141592653589
All of which look like bugs to me.
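
The function trick above relies on ksh93's typeset. A more portable variant of the same idea, sketched here under the assumption that a subshell is acceptable, is to confine the locale override to a parenthesized function body. It is shown with printf, whose number parsing is locale-sensitive in some shells, rather than with ksh93's float; c_printf is a made-up name.

```shell
# c_printf (hypothetical helper): run printf with the C locale in force.
# The function body is a subshell, so the LC_ALL change cannot leak
# into the caller.
c_printf() (
    LC_ALL=C
    export LC_ALL
    printf "$@"
)

c_printf '%.3f\n' 3.14159    # prints 3.142
```

The cost is that nothing assigned inside the subshell survives, so this suits formatting or parsing a value, not declaring a variable as in the float case.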
Anyway, my point was to say that it's a bad idea to have the
syntax of the shell dependent on the locale.
--
Stéphane