zsh-workers
* [BUG] Unicode variables can be exported and are exported metafied
@ 2014-12-18 18:19 ZyX
  2014-12-18 19:29 ` Peter Stephenson
  0 siblings, 1 reply; 14+ messages in thread
From: ZyX @ 2014-12-18 18:19 UTC
  To: zsh-workers

Consider the following input (zsh -f, UTF-8 locale):

zyx-desktop% ус=1
zyx-desktop% export ус
zyx-desktop% env | grep '=1' | grep '^[^A-Z]'
у�с=1
zyx-desktop% env | grep '=1' | grep '^[^A-Z]' | hexdump -C
00000000  d1 83 a3 d1 81 3d 31 0a                           |.....=1.|
00000008
zyx-desktop% echo ус=1 | hexdump -C
00000000  d1 83 d1 81 3d 31 0a                              |....=1.|
00000007

As you can see, a variable named `ус` can be exported (I am not sure whether that is a bug or not). However, its 0x83 byte, the last byte of the UTF-8 encoding of the first Unicode code point in the variable name, is written with zsh's `Meta` escape in the `env` output. That is clearly a bug, assuming that exporting a Unicode-named variable is not.
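
The extra byte can be reproduced outside the shell. The following Python sketch is an illustration, not zsh's actual code. It assumes that the `Meta` byte is 0x83 and that an escaped byte is stored as `Meta` followed by the byte XOR 0x20, which is consistent with the `83 a3` pair in the hexdump above. The set of bytes that get escaped (`needs_meta`) is an assumption; zsh derives it from its internal character-type table, and the exact range may differ between versions.

```python
META = 0x83  # zsh's Meta escape byte, as seen in the hexdump


def needs_meta(b: int) -> bool:
    # Assumption: NUL and the byte range 0x83..0xa2 are escaped.
    return b == 0 or 0x83 <= b <= 0xA2


def metafy(raw: bytes) -> bytes:
    """Escape each special byte as Meta followed by (byte XOR 0x20)."""
    out = bytearray()
    for b in raw:
        if needs_meta(b):
            out += bytes([META, b ^ 0x20])
        else:
            out.append(b)
    return bytes(out)


name = "\u0443\u0441".encode("utf-8")  # the variable name from the report
metafied = metafy(name)
print(name.hex(" "))      # bytes of the name as typed
print(metafied.hex(" "))  # bytes as they appeared in the env output
```

Only the 0x83 byte of the first character falls in the assumed range, so only that byte gains a companion; 0x81 in the second character passes through unchanged.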



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-18 18:19 [BUG] Unicode variables can be exported and are exported metafied ZyX
@ 2014-12-18 19:29 ` Peter Stephenson
  2014-12-18 19:47   ` Peter Stephenson
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Stephenson @ 2014-12-18 19:29 UTC
  To: zsh-workers

On Thu, 18 Dec 2014 21:19:25 +0300
ZyX <kp-pav@yandex.ru> wrote: 
> You see here that variable named `ус` can be exported (not sure
> whether it is a bug or not),

I *think* that's a "don't do that unless you actually need to..." but
feel free to present evidence otherwise.

> but its 0x83 byte which is the last byte of the first unicode
> codepoint that forms the variable name represented as UTF-8 is using
> zsh `Meta` escape in the `env` output (which clearly is a bug assuming
> the fact that unicode variable is exported is not).

Yes, indeed.

diff --git a/Src/params.c b/Src/params.c
index 79088d1..b87598a 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -4357,7 +4357,18 @@ arrfixenv(char *s, char **t)
 int
 zputenv(char *str)
 {
+    char *ptr;
     DPUTS(!str, "Attempt to put null string into environment.");
+    /*
+     * The environment uses NULL-terminated strings, so just
+     * unmetafy and ignore the length.
+     */
+    for (ptr = str; *ptr && *ptr != Meta; ptr++)
+	;
+    if (*ptr == Meta) {
+	str = dupstring(str);
+	unmetafy(str, NULL);
+    }
 #ifdef USE_SET_UNSET_ENV
     /*
      * If we are using unsetenv() to remove values from the
@@ -4366,7 +4377,6 @@ zputenv(char *str)
      * Unfortunately this is a slightly different interface
      * from what zputenv() assumes.
      */
-    char *ptr;
     int ret;
 
     for (ptr = str; *ptr && *ptr != '='; ptr++)

pws



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-18 19:29 ` Peter Stephenson
@ 2014-12-18 19:47   ` Peter Stephenson
  2014-12-18 19:58     ` Bart Schaefer
  0 siblings, 1 reply; 14+ messages in thread
From: Peter Stephenson @ 2014-12-18 19:47 UTC
  To: zsh-workers

On Thu, 18 Dec 2014 19:29:17 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On Thu, 18 Dec 2014 21:19:25 +0300
> ZyX <kp-pav@yandex.ru> wrote: 
> > but its 0x83 byte which is the last byte of the first unicode
> > codepoint that forms the variable name represented as UTF-8 is using
> > zsh `Meta` escape in the `env` output (which clearly is a bug assuming
> > the fact that unicode variable is exported is not).
> 
> Yes, indeed.

And on import to the shell.

Currently averaging two patches per fix...

diff --git a/Src/params.c b/Src/params.c
index b87598a..1c51afd 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -641,7 +641,7 @@ split_env_string(char *env, char **name, char **value)
     if (!env || !name || !value)
 	return 0;
 
-    tenv = strcpy(zhalloc(strlen(env) + 1), env);
+    tenv = metafy(env, strlen(env), META_HEAPDUP);
     for (str = tenv; *str && *str != '='; str++)
 	;
     if (str != tenv && *str == '=') {

pws



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-18 19:47   ` Peter Stephenson
@ 2014-12-18 19:58     ` Bart Schaefer
  2014-12-18 20:09       ` Peter Stephenson
                         ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Bart Schaefer @ 2014-12-18 19:58 UTC
  To: Peter Stephenson; +Cc: Zsh hackers list


On Dec 18, 2014 11:48 AM, "Peter Stephenson" <p.w.stephenson@ntlworld.com>
wrote:
>
> On Thu, 18 Dec 2014 19:29:17 +0000
> Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> > On Thu, 18 Dec 2014 21:19:25 +0300
> > ZyX <kp-pav@yandex.ru> wrote:
> > > but its 0x83 byte which is the last byte of the first unicode
> > > codepoint that forms the variable name represented as UTF-8 is using
> > > zsh `Meta` escape in the `env` output (which clearly is a bug assuming
> > > the fact that unicode variable is exported is not).
> >
> > Yes, indeed.
>
> And on import to the shell.

Are we sure it's even "legal" to export Unicode variable names?  Internally
we can kinda ignore POSIX as we choose, but the environment crosses those
boundaries.


* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-18 19:58     ` Bart Schaefer
@ 2014-12-18 20:09       ` Peter Stephenson
       [not found]       ` <54933513.6010501@case.edu>
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Peter Stephenson @ 2014-12-18 20:09 UTC
  To: Zsh hackers list

On Thu, 18 Dec 2014 11:58:11 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Are we sure it's even "legal" to export Unicode variable names?  Internally
> we can kinda ignore POSIX as we choose, but the environment crosses those
> boundaries.

As far as I can see it's described as not portable, which is obviously
correct, but not actually a violation of anything.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html

However...  AARGH we were already metafying and unmetafying the value, but
not the identifier...  So now we're doing it twice.

I will give it a bit more consideration before posting anything else.
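
Why doing the conversion twice is harmful can be shown with a small Python sketch. It is a model under assumptions, not zsh's implementation: `Meta` is taken to be 0x83, an escaped byte is stored as `Meta` plus the byte XOR 0x20, and the escaped range (NUL and 0x83..0xa2) is a guess at what zsh's character table marks. Both function names mirror the zsh helpers but are reimplemented here for illustration.

```python
META = 0x83


def metafy(raw: bytes) -> bytes:
    out = bytearray()
    for b in raw:
        # Assumed escape set: NUL and 0x83..0xa2.
        if b == 0 or 0x83 <= b <= 0xA2:
            out += bytes([META, b ^ 0x20])
        else:
            out.append(b)
    return bytes(out)


def unmetafy(s: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(s):
        if s[i] == META and i + 1 < len(s):
            out.append(s[i + 1] ^ 0x20)  # undo the XOR on the escaped byte
            i += 2
        else:
            out.append(s[i])
            i += 1
    return bytes(out)


raw = bytes.fromhex("d183d181")   # UTF-8 bytes of the two-letter name
once = metafy(raw)
twice = metafy(once)

print(unmetafy(once) == raw)      # a single round trip is lossless
print(unmetafy(twice).hex(" "))   # escaping twice leaves a Meta pair behind
print(unmetafy(raw).hex(" "))     # unescaping raw bytes corrupts them
```

Neither operation is idempotent: each must be applied exactly once per boundary crossing, which is why converting the whole `name=value` string on top of an already converted value goes wrong.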

pws



* Fwd: Re: [BUG] Unicode variables can be exported and are exported metafied
       [not found]       ` <54933513.6010501@case.edu>
@ 2014-12-18 20:20         ` Bart Schaefer
  0 siblings, 0 replies; 14+ messages in thread
From: Bart Schaefer @ 2014-12-18 20:20 UTC
  To: Zsh hackers list


Assuming it's OK to forward this.
---------- Forwarded message ----------
From: "Chet Ramey" <chet.ramey@case.edu>
Date: Dec 18, 2014 12:12 PM
Subject: Re: [BUG] Unicode variables can be exported and are exported metafied
To: "Bart Schaefer" <schaefer@brasslantern.com>
Cc: <chet.ramey@case.edu>

On 12/18/14, 2:58 PM, Bart Schaefer wrote:

> Are we sure it's even "legal" to export Unicode variable names?  Internally
> we can kinda ignore POSIX as we choose, but the environment crosses those
> boundaries.

Yes, it is.

"Other characters may be permitted by an implementation; applications
shall tolerate the presence of such names."

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/


* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-18 19:58     ` Bart Schaefer
  2014-12-18 20:09       ` Peter Stephenson
       [not found]       ` <54933513.6010501@case.edu>
@ 2014-12-19  9:29       ` Christoph (Stucki) von Stuckrad
  2014-12-19 18:17       ` Christoph (Stucki) von Stuckrad
  3 siblings, 0 replies; 14+ messages in thread
From: Christoph (Stucki) von Stuckrad @ 2014-12-19  9:29 UTC
  To: zsh-workers

On Thu, 18 Dec 2014, Bart Schaefer wrote:

> Are we sure it's even "legal" to export Unicode variable names?  Internally
> we can kinda ignore POSIX as we choose, but the environment crosses those
> boundaries.

Independent of being 'legal', to me it seems dangerous!

Compare the 'working as written' example:

~$ M='surprise; : ' MÄRCHEN=story sh -c 'echo $MÄRCHEN'
story

to running it with any of the other shells I keep around
(bash, dash, ash, sash; ksh and csh untested), where
you always get:

..................................vvvv
~$ M='surprise; : ' MÄRCHEN=story bash -c 'echo $MÄRCHEN'
surprise; : ÄRCHEN

This opens interesting new ways to introduce security-sensitive
changes into an environment: a program checks the contents of the
UTF-8-named variable, but the data is really injected through the
truncated name, which might be passed unchecked!

So PLEASE DO NOT EXPORT these!
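
The difference between the two outputs above comes down to where the parser decides a variable name ends. The Python sketch below models that with two hypothetical expanders (`expand_ascii_only` and `expand_unicode` are illustrative names, not any shell's real code): one accepts only ASCII letters, digits and underscore in names, the other accepts any character Python's regex engine classifies as a word character.

```python
import re

# Name = ASCII letter or underscore, then ASCII letters, digits, underscores.
ASCII_NAME = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")
# Name = any Unicode letter or underscore, then any Unicode word characters.
UNICODE_NAME = re.compile(r"\$([^\W\d]\w*)")


def expand_ascii_only(text: str, env: dict) -> str:
    return ASCII_NAME.sub(lambda m: env.get(m.group(1), ""), text)


def expand_unicode(text: str, env: dict) -> str:
    return UNICODE_NAME.sub(lambda m: env.get(m.group(1), ""), text)


env = {"M": "surprise; : ", "M\u00c4RCHEN": "story"}
script = "$M\u00c4RCHEN"

print(expand_unicode(script, env))     # the whole name is looked up
print(expand_ascii_only(script, env))  # only $M is looked up, the rest is literal
```

The same environment and the same script text therefore yield different data depending on which rule the receiving shell applies.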

Stucki


-- 
Christoph von Stuckrad      * * |nickname |Mail <stucki@mi.fu-berlin.de> \
Freie Universitaet Berlin   |/_*|'stucki' |Tel(Mo.,Mi.):+49 30 838-75 459|
Mathematik & Informatik EDV |\ *|if online|  (Di,Do,Fr):+49 30 77 39 6600|
Takustr. 9 / 14195 Berlin   * * |on IRCnet|Fax(home):   +49 30 77 39 6601/



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-19 18:17       ` Christoph (Stucki) von Stuckrad
@ 2014-12-19 20:13         ` Павлов Николай Александрович
  2014-12-19 21:21         ` Peter Stephenson
  1 sibling, 0 replies; 14+ messages in thread
From: Павлов Николай Александрович @ 2014-12-19 20:13 UTC
  To: Christoph (Stucki) von Stuckrad, zsh-workers

On December 19, 2014 9:17:37 PM EAT, "Christoph (Stucki) von Stuckrad" <stucki@mi.fu-berlin.de> wrote:
>On Thu, 18 Dec 2014, Bart Schaefer wrote:
>
>> Are we sure it's even "legal" to export Unicode variable names?  Internally
>> we can kinda ignore POSIX as we choose, but the environment crosses those
>> boundaries.
>
>Independend of being 'legal' to me it seems dangerous!
>
>Comparing the 'working as written' example:
>
>~$ M='surprise; : ' MÄRCHEN=story sh -c 'echo $MÄRCHEN'
>story
>
>to running it with all the other shells I keep around
>(bash, dash, ash, sash - untested ksh and csh)
>you always get:
>
>..................................vvvv
>~$ M='surprise; : ' MÄRCHEN=story bash -c 'echo $MÄRCHEN'
>surprise; : ÄRCHEN
>
>Which gives interesting new ways to introduce security-sensitive
>changes into environments by letting a Program check the
>UTF8-named-Variable for its contents, but really inserting data
>by the broken-part-name, which might be passed unchecked!
>
>So PLEASE DO NOT EXPORT these !
>
>Stucki

I really do not see any problem here. If someone already has "surprise" in $M in the environment where they run the scripts they typed, then they have much bigger problems.

* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-19 18:17       ` Christoph (Stucki) von Stuckrad
  2014-12-19 20:13         ` Павлов Николай Александрович
@ 2014-12-19 21:21         ` Peter Stephenson
  2014-12-19 22:44           ` ZyX
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Stephenson @ 2014-12-19 21:21 UTC
  To: zsh-workers

On Fri, 19 Dec 2014 19:17:37 +0100
"Christoph (Stucki) von Stuckrad" <stucki@mi.fu-berlin.de> wrote:
> On Thu, 18 Dec 2014, Bart Schaefer wrote:
> 
> > Are we sure it's even "legal" to export Unicode variable names?
> > Internally we can kinda ignore POSIX as we choose, but the
> > environment crosses those boundaries.
> 
> Independend of being 'legal' to me it seems dangerous!

Well, this seems to be controversial.  But it's not clear how useful such
variables are anyway.

This backs off yesterday's mess and ignores environment variable names
with characters with the top bit set.  We'll see if anyone trips over it.

pws

diff --git a/Src/params.c b/Src/params.c
index 1c51afd..b8e0c42 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -641,9 +641,17 @@ split_env_string(char *env, char **name, char **value)
     if (!env || !name || !value)
 	return 0;
 
-    tenv = metafy(env, strlen(env), META_HEAPDUP);
-    for (str = tenv; *str && *str != '='; str++)
-	;
+    tenv = strcpy(zhalloc(strlen(env) + 1), env);
+    for (str = tenv; *str && *str != '='; str++) {
+	if (STOUC(*str) >= 128) {
+	    /*
+	     * We'll ignore environment variables with names not
+	     * from the portable character set since we don't
+	     * know of a good reason to accept them.
+	     */
+	    return 0;
+	}
+    }
     if (str != tenv && *str == '=') {
 	*str = '\0';
 	*name = tenv;
@@ -4357,18 +4365,7 @@ arrfixenv(char *s, char **t)
 int
 zputenv(char *str)
 {
-    char *ptr;
     DPUTS(!str, "Attempt to put null string into environment.");
-    /*
-     * The environment uses NULL-terminated strings, so just
-     * unmetafy and ignore the length.
-     */
-    for (ptr = str; *ptr && *ptr != Meta; ptr++)
-	;
-    if (*ptr == Meta) {
-	str = dupstring(str);
-	unmetafy(str, NULL);
-    }
 #ifdef USE_SET_UNSET_ENV
     /*
      * If we are using unsetenv() to remove values from the
@@ -4377,11 +4374,21 @@ zputenv(char *str)
      * Unfortunately this is a slightly different interface
      * from what zputenv() assumes.
      */
+    char *ptr;
     int ret;
 
-    for (ptr = str; *ptr && *ptr != '='; ptr++)
+    for (ptr = str; *ptr && STOUC(*ptr) < 128 && *ptr != '='; ptr++)
 	;
-    if (*ptr) {
+    if (STOUC(*ptr) >= 128) {
+	/*
+	 * Environment variables not in the portable character
+	 * set are non-standard and we don't really know of
+	 * a use for them.
+	 *
+	 * We'll disable until someone complains.
+	 */
+	return 1;
+    } else if (*ptr) {
 	*ptr = '\0';
 	ret = setenv(str, ptr+1, 1);
 	*ptr = '=';
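
The import-side rule in this patch can be summarised with a short Python sketch. It is a loose model of the C logic, not a translation: the function name is borrowed from the patch, but the return convention (`None` for "ignore this entry") and the use of `bytes` are choices made here for illustration.

```python
def split_env_string(entry: bytes):
    """Split b"name=value"; return None if the entry should be ignored."""
    name, sep, value = entry.partition(b"=")
    if not sep or not name:
        # No '=' at all, or an empty name: nothing usable.
        return None
    if any(b >= 128 for b in name):
        # Name contains a byte with the top bit set: skip the variable.
        return None
    return name, value


print(split_env_string(b"PATH=/bin"))
print(split_env_string("\u0443\u0441=1".encode("utf-8")))
print(split_env_string(b"GREETING=" + "\u0443\u0441".encode("utf-8")))
```

Only the name is restricted; values may still contain arbitrary non-ASCII bytes, as the last call shows.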



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-19 21:21         ` Peter Stephenson
@ 2014-12-19 22:44           ` ZyX
  2014-12-20  0:13             ` Stephane Chazelas
  0 siblings, 1 reply; 14+ messages in thread
From: ZyX @ 2014-12-19 22:44 UTC
  To: Peter Stephenson, zsh-workers



20.12.2014, 00:54, "Peter Stephenson" <p.w.stephenson@ntlworld.com>:
> On Fri, 19 Dec 2014 19:17:37 +0100
> "Christoph (Stucki) von Stuckrad" <stucki@mi.fu-berlin.de> wrote:
>>  On Thu, 18 Dec 2014, Bart Schaefer wrote:
>>>  Are we sure it's even "legal" to export Unicode variable names?
>>>  Internally we can kinda ignore POSIX as we choose, but the
>>>  environment crosses those boundaries.
>>  Independend of being 'legal' to me it seems dangerous!
>
> Well, this seems to be controversial.  But it's not clear how useful such
> variables are anyway.
>
> This backs off yesterday's mess and ignores environment variable names
> with characters with the top bit set.  We'll see if anyone trips over it.

According to

> "Other characters may be permitted by an implementation; applications
> shall tolerate the presence of such names."

> … shall tolerate …

such environment variables can be used for testing software for standard conformance.

I think, though, that `env` is more likely to be used for such testing, because zsh's support for such environment variables is not only locale-dependent but also restricted to whatever libc considers an alphanumeric character:

 zyx  ~  env «»=10 python -c 'import os; print(os.environ["«»"])'
10
 zyx  ~  «»=10 python -c 'import os; print(os.environ["«»"])'
zsh: command not found: «»=10

(something weird like `$'\n'` also works as an “environment variable name” for `env`).
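
For conformance testing without going through any shell syntax at all, the environment can also be built programmatically. The following Python sketch assumes a POSIX system (it relies on `os.environb`, which is not available on Windows) and passes the name as raw bytes so that no locale decoding is involved on either side.

```python
import os
import subprocess
import sys

# UTF-8 bytes of the two guillemet characters used as a name above.
name = "\u00ab\u00bb".encode("utf-8")

# Copy the current environment as bytes and add the unusual name.
env = dict(os.environb)
env[name] = b"10"

# The child looks the variable up by its raw bytes and writes the value.
child_code = (
    "import os, sys; sys.stdout.write(os.environb[%r].decode('ascii'))" % name
)
result = subprocess.run(
    [sys.executable, "-c", child_code],
    env=env,
    capture_output=True,
    check=True,
)
value = result.stdout.decode("ascii")
print(value)
```

This exercises only whether the child tolerates the name in its environment, which is the property the quoted POSIX sentence is about.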

---

By the way, support status for shells found on my system:

code:

    абв=1 $SHELL -c 'echo $абв'

tcsh: Illegal variable name
ksh: echoes 1
mksh: echoes $абв
fish: echoes 1
busybox with ash: echoes $абв
busybox with hush (commit ad0d009e0c1968a14f17189264d3aa8008ea2e3b): echoes $абв
rcsh: syntax error near (decimal -48)
bash: echoes $абв
dash: echoes $абв
zsh: echoes 1

    абв=1 $SHELL /c 'echo %абв%'

wine cmd.exe: echoes ^[[?1h^[=1 followed by CRNL

Summary:

Syntax error: tcsh, rcsh
$абв: mksh, busybox ash, busybox hush, bash, dash
1: ksh, fish, zsh, wine cmd.exe


* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-19 22:44           ` ZyX
@ 2014-12-20  0:13             ` Stephane Chazelas
  2014-12-20  9:27               ` ZyX
  0 siblings, 1 reply; 14+ messages in thread
From: Stephane Chazelas @ 2014-12-20  0:13 UTC
  To: ZyX; +Cc: Peter Stephenson, zsh-workers

2014-12-20 01:44:10 +0300, ZyX:
[...]
[about making the shell syntax locale-dependent]
> bash: echoes $абв
[...]

You'd have gotten a different behaviour in a single-byte locale.

Related discussion:

http://thread.gmane.org/gmane.comp.shells.bash.bugs/22367

Personally, I think I'd rather shell variable names be limited
to ASCII [:alnum:]_.

Making the shell syntax locale-dependent is a problem in scripts
(and we're already affected by that to some extent with
utilities) because there, it's the *author*'s locale that
matters, not the user's.

(Or else, you could do as rc does and allow anything allowed in env
vars: rc allows any name as long as it's not empty and doesn't
contain `=`.)

-- 
Stephane



* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-20  0:13             ` Stephane Chazelas
@ 2014-12-20  9:27               ` ZyX
  2014-12-20 10:08                 ` Stephane Chazelas
  0 siblings, 1 reply; 14+ messages in thread
From: ZyX @ 2014-12-20  9:27 UTC
  To: Stephane Chazelas; +Cc: Peter Stephenson, zsh-workers

20.12.2014, 03:22, "Stephane Chazelas" <stephane.chazelas@gmail.com>:
> 2014-12-20 01:44:10 +0300, ZyX:
> [...]
> [about making the shell syntax locale-dependant]
>>  bash: echoes $абв
>
> [...]
>
> You'd have gotten a different behaviour in a single-byte locale.
>
> Related discussion:
>
> http://thread.gmane.org/gmane.comp.shells.bash.bugs/22367
>
> Personally, I think I'd rather shell variable names be limited
> to ASCII [:alnum:]_.
>
> Making the shell syntax locale dependant is a problem in scripts
> (and we're already affected by that to some extent with
> utilities) because there, it's the *author*'s locale that
> matters, not the user's.
>
> (Or else, you could do like rc and allow anything allowed in env
> vars (rc allows any name as long as it's not empty and doesn't
> contain `=`).

`абв=1 rcsh -c 'echo $абв'` emits “line 1: syntax error near (decimal -48)”. Is this a difference between “A reimplementation of the Plan 9 shell” (http://rc-shell.slackmatic.org/), version 1.7.2, which is what the main Portage tree provides, and the original Plan 9 shell? Or should I simply use a different syntax (`rcsh -c 'echo $PATH'` works, though)?

By the way, `env` allows empty name.

--

Found this syntax: `абв=1 rcsh -c 'echo $(абв)'` echoes 1. But `абв=1 rcsh -c 'env; абв=10; env; echo $(абв)'` shows that rcsh removes `абв` from the environment and puts a variable named `__d0__b0__d0__b1__d0__b2` there instead.
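
The mangled name looks like a per-byte hex encoding of the UTF-8 form of the original. The Python sketch below reproduces that one observation; the rule it implements (keep ASCII letters, digits and underscore, write every other byte as `__` plus two lowercase hex digits) is a guess inferred from this single name, and `rc_mangle` is a hypothetical helper, not rc's actual function.

```python
def rc_mangle(name: str) -> str:
    """Guess at the exported form of an rc variable name."""
    out = []
    for b in name.encode("utf-8"):
        ch = chr(b)
        if b < 128 and (ch.isalnum() or ch == "_"):
            out.append(ch)            # plain ASCII name characters are kept
        else:
            out.append(f"__{b:02x}")  # every other byte becomes __hh
    return "".join(out)


print(rc_mangle("\u0430\u0431\u0432"))  # the three-letter Cyrillic name
print(rc_mangle("PATH"))
```

How rc treats underscores or punctuation in names is not covered by the observation above, so the sketch should not be relied on for those cases.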


* Re: [BUG] Unicode variables can be exported and are exported metafied
  2014-12-20  9:27               ` ZyX
@ 2014-12-20 10:08                 ` Stephane Chazelas
  0 siblings, 0 replies; 14+ messages in thread
From: Stephane Chazelas @ 2014-12-20 10:08 UTC
  To: ZyX; +Cc: Peter Stephenson, zsh-workers

2014-12-20 12:27:27 +0300, ZyX:
[...]
> > (Or else, you could do like rc and allow anything allowed in env
> > vars (rc allows any name as long as it's not empty and doesn't
> > contain `=`).
> 
> `абв=1 rcsh -c 'echo $абв'` emits “line 1: syntax error near
> (decimal -48)”. Is it the difference between “A
> reimplementation of the Plan 9 shell”
> (http://rc-shell.slackmatic.org/) version 1.7.2 that can be
> found in the main portage tree and the original Plan9 shell?
> Or maybe I should simply use different syntax (rcsh -c 'echo
> $PATH' works though)?
> 
> --
> 
> Found this syntax: `абв=1 rcsh -c 'echo $(абв)'` echoes 1. But
> `абв=1 rcsh -c 'env; абв=10; env; echo $(абв)'` shows that
> rcsh removes `абв` from the environment and places variable
> named `__d0__b0__d0__b1__d0__b2` there instead.
[...]

Sorry,

I should have tried. I suppose rc, being from the 80s, uses the
8th bit for parsing/tokenizing, like other shells from that time.

But the idea remains. In `rc`, you can do:

'my var
(with all sorts of characters)' = whatever
echo $'my var
(with all sorts of characters)'

And assuming it were extended to 8-bit bytes, that means you could
have code that works regardless of the locale of the user, since
all variable names are allowed. That also simplifies interaction
with env vars (though you still have problems with special
parameters like $1, $*...).

That would mean changing zsh syntax though and I don't think
it's really worth it.

> By the way, `env` allows empty name.

Yes, and you can pass env strings to execve() without `=` in
them.

-- 
Stephane

