From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 21498 invoked by alias); 19 Dec 2014 22:44:16 -0000
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Workers List
List-Post:
List-Help:
X-Seq: 34016
Received: (qmail 7906 invoked from network); 19 Dec 2014 22:44:13 -0000
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,
    DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,T_FSL_HELO_BARE_IP_2
    autolearn=ham version=3.3.2
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail;
    t=1419029051; bh=vo3wi0CVW6kOCGTQGNQj+ykwTmenYxeTrTn2evbG23M=;
    h=From:To:In-Reply-To:References:Subject:Date;
    b=ThDIiL8hYanh5w4mMBekOVz7R19ddtMBbEih7ROGjPnCuDsaGc9nz5eaEql50HvN5
     VVGgAF4cDpvzTzDkEEelDxSfnPVFmOtkdEKm0plPrAc5B/EFEV1685FD/9U+HMx1Uw
     ZHr0V7jNLJsdwmG4emFYu86PBp55HyB4RduqvyLo=
From: ZyX
To: Peter Stephenson, "zsh-workers@zsh.org"
In-Reply-To: <20141219212125.1e1fea6b@ntlworld.com>
References: <1054131418926765@web2o.yandex.ru>
    <20141218192917.4df5324b@pws-pc.ntlworld.com>
    <20141218194758.329bd9ef@pws-pc.ntlworld.com>
    <20141219181652.GA3996@localhost.mi.fu-berlin.de>
    <20141219212125.1e1fea6b@ntlworld.com>
Subject: Re: [BUG] Unicode variables can be exported and are exported metafied
MIME-Version: 1.0
Message-Id: <798181419029050@web11g.yandex.ru>
X-Mailer: Yamail [ http://yandex.ru ] 5.0
Date: Sat, 20 Dec 2014 01:44:10 +0300
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset=utf-8

20.12.2014, 00:54, "Peter Stephenson":
> On Fri, 19 Dec 2014 19:17:37 +0100
> "Christoph (Stucki) von Stuckrad" wrote:
>>  On Thu, 18 Dec 2014, Bart Schaefer wrote:
>>>  Are we sure it's even "legal" to export Unicode variable names?
>>>  Internally we can kinda ignore POSIX as we choose, but the
>>>  environment crosses those boundaries.
>>  Independent of being 'legal', to me it seems dangerous!
>
> Well, this seems to be controversial.  But it's not clear how useful such
> variables are anyway.
>
> This backs off yesterday's mess and ignores environment variable names
> with characters with the top bit set.  We'll see if anyone trips over it.

According to
> "Other characters may be permitted by an implementation; applications
> shall tolerate the presence of such names."
>

Given the "shall tolerate" wording, such environment variables can be used to test software for standards conformance. I think, though, that `env` is more likely to be used for such testing, because zsh support for such environment variables is not only locale-dependent but also restricted to whatever libc considers an alphanumeric character:

    zyx ~ % env «»=10 python -c 'import os; print(os.environ["«»"])'
    10
    zyx ~ % «»=10 python -c 'import os; print(os.environ["«»"])'
    zsh: command not found: «»=10

(Something weird like `$'\n'` also works as an “environment variable name” for `env`.)
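To illustrate why `env` is the more permissive tool here: at the libc level the name is just a byte string. As far as I know, POSIX `setenv()` only rejects an empty name or one containing `=`. The following is a rough sketch, not anything taken from zsh; `env_roundtrip` is a name made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose setenv() under strict -std modes */
#endif
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of zsh): store value under name with
 * setenv() and check that getenv() hands the same value back.
 * Returns 1 on a successful round trip, 0 otherwise.
 */
int env_roundtrip(const char *name, const char *value)
{
    const char *got;

    /* setenv() fails for an empty name or a name containing '='. */
    if (setenv(name, value, 1) != 0)
        return 0;

    got = getenv(name);
    return got != NULL && strcmp(got, value) == 0;
}
```

Calling `env_roundtrip("\xc2\xab\xc2\xbb", "10")` (the UTF-8 bytes of «») should succeed on common libcs, without any locale or alphanumeric check getting in the way.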
---

By the way, support status for shells found on my system:

code: абв=1 $SHELL -c 'echo $абв'

tcsh: Illegal variable name
ksh: echoes 1
mksh: echoes $абв
fish: echoes 1
busybox with ash: echoes $абв
busybox with hush (commit ad0d009e0c1968a14f17189264d3aa8008ea2e3b): echoes $абв
rcsh: syntax error near (decimal -48)
bash: echoes $абв
dash: echoes $абв
zsh: echoes 1

абв=1 $SHELL /c 'echo %абв%'

wine cmd.exe: echoes ^[[?1h^[=1 followed by CRNL

Summary:

Syntax error: tcsh, rcsh
$абв: mksh, busybox ash, busybox hush, bash, dash
1: ksh, fish, zsh, wine cmd.exe

>
> pws
>
> diff --git a/Src/params.c b/Src/params.c
> index 1c51afd..b8e0c42 100644
> --- a/Src/params.c
> +++ b/Src/params.c
> @@ -641,9 +641,17 @@ split_env_string(char *env, char **name, char **value)
>      if (!env || !name || !value)
>          return 0;
>
> -    tenv = metafy(env, strlen(env), META_HEAPDUP);
> -    for (str = tenv; *str && *str != '='; str++)
> -        ;
> +    tenv = strcpy(zhalloc(strlen(env) + 1), env);
> +    for (str = tenv; *str && *str != '='; str++) {
> +        if (STOUC(*str) >= 128) {
> +            /*
> +             * We'll ignore environment variables with names not
> +             * from the portable character set since we don't
> +             * know of a good reason to accept them.
> +             */
> +            return 0;
> +        }
> +    }
>      if (str != tenv && *str == '=') {
>          *str = '\0';
>          *name = tenv;
> @@ -4357,18 +4365,7 @@ arrfixenv(char *s, char **t)
>  int
>  zputenv(char *str)
>  {
> -    char *ptr;
>      DPUTS(!str, "Attempt to put null string into environment.");
> -    /*
> -     * The environment uses NULL-terminated strings, so just
> -     * unmetafy and ignore the length.
> -     */
> -    for (ptr = str; *ptr && *ptr != Meta; ptr++)
> -        ;
> -    if (*ptr == Meta) {
> -        str = dupstring(str);
> -        unmetafy(str, NULL);
> -    }
>  #ifdef USE_SET_UNSET_ENV
>      /*
>       * If we are using unsetenv() to remove values from the
> @@ -4377,11 +4374,21 @@ zputenv(char *str)
>       * Unfortunately this is a slightly different interface
>       * from what zputenv() assumes.
>       */
> +    char *ptr;
>      int ret;
>
> -    for (ptr = str; *ptr && *ptr != '='; ptr++)
> +    for (ptr = str; *ptr && STOUC(*ptr) < 128 && *ptr != '='; ptr++)
>          ;
> -    if (STOUC(*ptr) >= 128) {
> +    if (STOUC(*ptr) >= 128) {
> +        /*
> +         * Environment variables not in the portable character
> +         * set are non-standard and we don't really know of
> +         * a use for them.
> +         *
> +         * We'll disable until someone complains.
> +         */
> +        return 1;
> +    } else if (*ptr) {
>          *ptr = '\0';
>          ret = setenv(str, ptr+1, 1);
>          *ptr = '=';
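
For anyone who wants to try the top-bit test from the patch outside of zsh, here is a minimal standalone sketch. It is not the zsh code: `env_name_is_7bit` is a name invented for this example, and a plain cast to `unsigned char` stands in for what I take `STOUC()` to do.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the check added to split_env_string():
 * scan the name part of a "NAME=value" string and return 1 only if
 * the name is non-empty, is followed by '=', and contains no byte
 * with the top bit set. Returns 0 otherwise.
 */
int env_name_is_7bit(const char *env)
{
    const unsigned char *start = (const unsigned char *)env;
    const unsigned char *p;

    if (env == NULL)
        return 0;

    for (p = start; *p != '\0' && *p != '='; p++) {
        if (*p >= 128)
            return 0;   /* top bit set: ignore this variable */
    }

    /* Same shape as the patch: need a non-empty name and an '='. */
    return p != start && *p == '=';
}
```

Only the name is scanned, so `X=абв` still passes while `абв=1` is rejected, which matches the behaviour the patch describes.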