zsh-workers
* Echoing of 8-bit-characters broken after 4.3.2?
@ 2009-02-28  8:35 Wolfgang Hukriede
  2009-02-28 17:53 ` Bart Schaefer
  0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Hukriede @ 2009-02-28  8:35 UTC
  To: zsh-workers

After upgrading from 4.3.2 to 4.3.9 I currently have two zsh sessions
open, one of each version, each in its own xterm window. In 4.3.2,
when I type any 8-bit character (from the Latin-1 set), it is echoed
back correctly. In 4.3.9 this no longer works.

E.g.:

        4.3.9> echo Le dictionnaire fran<00e7>ais-anglais
        Le dictionnaire franc,ais-anglais

where "c," is a "c" with cedilla. As can be seen, this does not appear
to be a problem with xterm, since the character comes out fine in the
command's output; only the shell refuses to echo it on the command line.

All zsh dotfiles in both sessions are identical. No LC_* variables are
set (but export LC_CTYPE=ISO8859-1 does not help either). I've also
tried setopt PRINT_EIGHT_BIT, to no avail. This is on FreeBSD 6.4.
Unicode is not used, and I currently do not intend to use it.

Is there a work-around or solution except downgrading again?

Greetings and thanks, Wolfgang


* Re: Echoing of 8-bit-characters broken after 4.3.2?
@ 2009-02-28 19:52 Wolfgang Hukriede
  2009-02-28 20:40 ` Andrey Borzenkov
  0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Hukriede @ 2009-02-28 19:52 UTC
  To: zsh-workers

Bart wrote:
> Try
>
> export LANG=is_IS.ISO8859-1
>
> I discovered this by tab-completing values for LANG.

Ok, many thanks, this works so far. I had to change the export to
LANG=IS.ISO8859-1 though since otherwise `date' talks in a language
which is unknown to me:

  > date
  lau 28 feb 2009 19:40:54 CET

No, this isn't OSX, but FreeBSD 6.4 (with ports that are only a few
days old).
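
The Icelandic `date' output above is a side effect of LANG, which
supplies the default for every locale category at once. A minimal
sketch of a narrower workaround follows; it has not been tried on the
system discussed here. It assumes that `locale -a' lists Latin-1
locales under names of the form en_US.ISO8859-1; the helper name
pick_latin1_locale and the language prefix "en" are illustrative and
do not come from this thread. Only LC_CTYPE is set, so the date format
and message language keep their defaults.

```shell
# Hypothetical helper: read locale names on stdin (e.g. from `locale -a`)
# and print the first one of the form <lang>_<TERRITORY>.ISO8859-1 whose
# language prefix is $1. Returns non-zero if nothing matches.
pick_latin1_locale() {
    grep -E "^${1}_[A-Z][A-Z]\.ISO8859-1\$" | head -n 1 | grep .
}

# Usage (assumes such a locale is installed): change only the
# character-set category and leave LANG unset.
#   ctype=$(locale -a | pick_latin1_locale en) && export LC_CTYPE=$ctype
```

Whether a given name is accepted depends on the locales installed on
the machine, so checking the output of `locale -a' first is advisable.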

> The multibyte character handling on OSX appears to be particularly
> sensitive to the LANG setting (see my previous mail to Wolfgang).
> At the same time, OSX doesn't appear to export a LANG value (or at
> least it doesn't on my iMac at work).

FreeBSD does not export the variable either, but why should it?

> I can't precisely reproduce the above; I get things like
>
> schaefer<263> touch x<00c3><00c3><00c3>x
>
> or
>
> schaefer<263> touch xinsert-composed-char:180: character not in range
>
> before I ever get as far as creating the file.  Maybe there's some
> additional character munging happening in transit of the email so
> I'm not using the correct input.

That is not the case here. Only the echoing of the character fails
unless LANG is set.
Tab completion worked in 4.3.2 and still works with LANG set.

> Wolfgang, if you're reading this, something that I forgot to mention in
> my reply to you is that sometime during 4.3.x zsh began to pay closer
> attention to characters that are absent from the declared LANG character
> set and to either refuse to process them at all, or to render them as
> digits surrounded by angle brackets.  It no longer blindly passes those
> characters around unprocessed, so things that "worked" before because
> xterm dealt with the processing will now appear to "fail" because the
> shell is trying harder to do the right thing internally.

Yes, I suspected as much. But what is the benefit of it? Perhaps to
make certain the shell can assume Unicode as the default? Would an
explicit setopt (to remove the ambiguity) not be a viable, or better,
alternative?

Looking up "man 1 locale" I found the BUGS section below. Might this be
significant?

  DESCRIPTION
       ...
       -m      Print names of all available charmaps.

  BUGS
       Since FreeBSD does not support charmaps in their POSIX meaning, locale
       emulates the -m option using the CODESETs listing of all available
       locales.


* Re: Echoing of 8-bit-characters broken after 4.3.2?
@ 2009-02-28 20:31 Wolfgang Hukriede
  2009-02-28 20:49 ` Andrey Borzenkov
  0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Hukriede @ 2009-02-28 20:31 UTC
  To: zsh-workers

Andrey wrote:
> Wolfgang, what happens if you explicitly disable multibyte support
> (--disable-multibyte) during build?

I did not know about that option. I will report back asap.

Btw, I have to correct myself with respect to LANG, since I wrote:
> change the export to
> LANG=IS.ISO8859-1 though since otherwise `date' talks in a language
> which is unknown to me ...

True, but setting LANG to "is_IS.ISO8859-1" once and then setting it
to anything else seems to do the trick as well:

  > export LANG=is_IS.ISO8859-1
  > date
  lau 28 feb 2009 21:17:50 CET

  > export LANG=nada
  > date
  Sat Feb 28 21:24:36 CET 2009

Eight-bit-chars still work.

  > unset LANG

Again, eight-bit-chars still work.

Seems dubious to me.


* Re: Echoing of 8-bit-characters broken after 4.3.2?
@ 2009-02-28 22:00 Wolfgang Hukriede
  2009-03-01  0:12 ` Phil Pennock
  0 siblings, 1 reply; 12+ messages in thread
From: Wolfgang Hukriede @ 2009-02-28 22:00 UTC
  To: zsh-workers

Andrey wrote:
> Because this is established standard to define your character set
> properties. Without it applications should assume C (or POSIX) locale
> that basically corresponds to standard ASCII.

Should the character set properties not be set by LC_CTYPE? As far as
I can tell LANG sets more than that? Do I understand correctly that
LANG is zsh-specific? (On my box, man 3 setlocale does not have it.)
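
For reference, LANG is not zsh-specific: POSIX defines it as the
fallback for any locale category that has no LC_* variable of its own,
with LC_ALL overriding everything. The sketch below spells out that
lookup order for the character-set category. The function name
effective_ctype is made up for illustration, and the function only
reports which variable would win; it does not check that the value
names a locale installed on the system.

```shell
# Report the value that decides the LC_CTYPE category, following the
# POSIX precedence: LC_ALL first, then LC_CTYPE, then LANG. If none of
# them is set to a non-empty value, the "C" locale applies.
effective_ctype() {
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "${LC_CTYPE:-}" ]; then
        printf '%s\n' "$LC_CTYPE"
    elif [ -n "${LANG:-}" ]; then
        printf '%s\n' "$LANG"
    else
        printf '%s\n' C
    fi
}
```

With no locale variables set, as in the original report, this prints
"C", which matches the ASCII-only behaviour Andrey describes.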

> So I would be surprised if
> zsh were the only program that had issues with non-ASCII characters.

At least Emacs passes them through without ado. There's only one other
program that I have had problems with in that respect. (That one is
Unicode-only.) Looks like more will come...

> FreeBSD could provide some other means to define locale though.

Not that I know of.

> Because blindly emitting arbitrary character sequence to terminal may
> have completely undefined effects and screw up display to the point that
> you need hard reset (town legend also is that you can cause your terminal
> to echo back any sequence like "rm -rf" as input back to shell ...)

Urban legends aside, this may be. Otoh... I've been using zsh for at
least 10 years, almost exclusively and quite intensely, and have used
8-bit characters all the time (all on xterms), but I have never
experienced any display distortion. This is probably due to the fact
that filenames are mostly under one's own control. I did suffer display
distortion from reading emails, but the shell could not have
done anything about that. Correctness of vt100 control sequences
cannot be monitored either.

Therefore I think the pass-through of eight-bit characters should be
configurable. But I still do not understand how I am supposed to do
that (without triggering side effects). Why is PRINT_EIGHT_BIT
restricted to affecting tab completion only?



Thread overview: 12+ messages
2009-02-28  8:35 Echoing of 8-bit-characters broken after 4.3.2? Wolfgang Hukriede
2009-02-28 17:53 ` Bart Schaefer
2009-02-28 19:07   ` Andrey Borzenkov
2009-02-28 19:19     ` Bart Schaefer
2009-02-28 19:29       ` Andrey Borzenkov
2009-02-28 19:52 Wolfgang Hukriede
2009-02-28 20:40 ` Andrey Borzenkov
2009-02-28 20:31 Wolfgang Hukriede
2009-02-28 20:49 ` Andrey Borzenkov
2009-02-28 23:01   ` Bart Schaefer
2009-02-28 22:00 Wolfgang Hukriede
2009-03-01  0:12 ` Phil Pennock