zsh-workers
 help / color / mirror / code / Atom feed
* namespaces limitation
@ 2023-05-19  0:13 Oliver Kiddle
  2023-05-23 17:10 ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Oliver Kiddle @ 2023-05-19  0:13 UTC (permalink / raw)
  To: Zsh workers

I came across the following when using the new namespaces:

  integer .var.d=0
  (( .var.d++ ))
  zsh: bad floating point constant

It'd be good if it could identify this as a namespace variable.

And should this be allowed or should it complain about the 3 part not
being an identifier?

  integer .3=67

Oliver


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: namespaces limitation
  2023-05-19  0:13 namespaces limitation Oliver Kiddle
@ 2023-05-23 17:10 ` Bart Schaefer
  2023-05-24 16:41   ` Oliver Kiddle
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 2023-05-23 17:10 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Zsh workers

On Thu, May 18, 2023 at 5:14 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> I came across the following when using the new namespaces:
>
>   integer .var.d=0
>   (( .var.d++ ))
>   zsh: bad floating point constant

There are three (mainly) places where I elected not to dive straight
in to changing the meaning of "identifier" ...
1) math variables
2) math function names
3) word completions

#3 is tricky (as in zle_tricky) because itype_end() is often being
used to determine whether a character is valid in an identifier, not
whether an entire word is an identifier; it would probably be wrong or
break some other constraint to skip across "." in that context.

#2 is ... well, function naming in general is a bit hard to deal with.
For example, out of math context, you can name a function pretty much
anything (using dots, slashes, hyphens, plus signs) but if you want to
use the function with the (+CMD) glob qualifier, the name has to be
strictly an identifier (alphanum and underscore).

#1 (and likely also #2) requires changing the math lexer, which I
didn't want to risk breaking until the rest had been thoroughly
tested.  E.g., --

> And should this be allowed or should it complain about the 3 part not
> being an identifier?
>
>   integer .3=67

Yes, the first character of a namespace identifier should probably be
an alphabetic.  Or an underscore?  That might really complicate math
lexing, given that we allow underscores in numeric constants.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: namespaces limitation
  2023-05-23 17:10 ` Bart Schaefer
@ 2023-05-24 16:41   ` Oliver Kiddle
  2023-05-25 22:09     ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Oliver Kiddle @ 2023-05-24 16:41 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh workers

Bart Schaefer wrote:
> Yes, the first character of a namespace identifier should probably be
> an alphabetic.  Or an underscore?  That might really complicate math
> lexing, given that we allow underscores in numeric constants.

What's allowed in math context can be (and already is) more restrictive
that what it's possible to declare.

One option might be to disallow bare namespaces, so

  integer .var=56

would be disallowed. Where .3 just looks really wrong, .3.var feels
innocuous.

In ksh93:

  $ .var=23
  $ typeset -p .var
  .var=23
  $ namespace var { val=78; }
  $ typeset -p .var
  namespace var
  {
	  val=78
  }

Also questionable is the following which currently works:

  .a.=ddd

The ksh error for this is 'no parent' even if you create an "a"
namespace first. We also currently allow a.=ddd but I think that is less
questionable. We allow empty association elements.

Oliver


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: namespaces limitation
  2023-05-24 16:41   ` Oliver Kiddle
@ 2023-05-25 22:09     ` Bart Schaefer
  2023-05-25 23:13       ` Bart Schaefer
  2023-06-22 17:49       ` [PATCH] " Bart Schaefer
  0 siblings, 2 replies; 6+ messages in thread
From: Bart Schaefer @ 2023-05-25 22:09 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Zsh workers

On Wed, May 24, 2023 at 9:41 AM Oliver Kiddle <opk@zsh.org> wrote:
>
> One option might be to disallow bare namespaces, so
>
>   integer .var=56
>
> would be disallowed.

I had it that way at one point and backed off because ksh allows it.

> Where .3 just looks really wrong, .3.var feels
> innocuous.

Strictly speaking (to the extent that any of this is strict),
namespaces are supposed to otherwise obey the rules for identifiers,
which with the exception of positional parameters are not allowed to
begin with a digit ... so I think I'll just go with that.

I still haven't worked out how to deal correctly with .var.3x -- ksh
allows .var.3 but rejects .var.3x so although it's a simple !idigit()
test to disallow anything starting with a digit, it's more work to
allow numbers but not strings that begin with a number.  Currently
that's handled in isident() at a point which precedes the check for
presence of a namespace.  Hrm.

> Also questionable is the following which currently works:
>
>   .a.=ddd
>
> The ksh error for this is 'no parent'

That probably should be rejected, yeah.

> We also currently allow a.=ddd but I think that is less
> questionable. We allow empty association elements.

I'm still not entirely sold on the x.y <=> x[y] mapping idea, but if
we follow JavaScript's (somewhat questionable) lead on this, empty
elements have to use bracket-notation rather than dot-notation.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: namespaces limitation
  2023-05-25 22:09     ` Bart Schaefer
@ 2023-05-25 23:13       ` Bart Schaefer
  2023-06-22 17:49       ` [PATCH] " Bart Schaefer
  1 sibling, 0 replies; 6+ messages in thread
From: Bart Schaefer @ 2023-05-25 23:13 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: Zsh workers

On Thu, May 25, 2023 at 3:09 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> I still haven't worked out how to deal correctly with .var.3x -- ksh
> allows .var.3 but rejects .var.3x

Also, paramsubst() doesn't apply the same rules in the same way as
assignments, so getting both .var.3x=y and ${.var.3x} to be rejected
is annoyingly complex.

Even rejecting .var.3x=y in zsh produces "not an identifier" whereas
ksh attempts to run a command of that name ... that may not be worth
tackling.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] Re: namespaces limitation
  2023-05-25 22:09     ` Bart Schaefer
  2023-05-25 23:13       ` Bart Schaefer
@ 2023-06-22 17:49       ` Bart Schaefer
  1 sibling, 0 replies; 6+ messages in thread
From: Bart Schaefer @ 2023-06-22 17:49 UTC (permalink / raw)
  To: Zsh workers

[-- Attachment #1: Type: text/plain, Size: 2405 bytes --]

On Thu, May 18, 2023 at 5:14 PM Oliver Kiddle <opk@zsh.org> wrote:
>
>   integer .var.d=0
>   (( .var.d++ ))
>   zsh: bad floating point constant
>
> It'd be good if it could identify this as a namespace variable.

This should be correctly handled after this patch, test is included.
If you find other broken edge cases please point them out.

On Thu, May 25, 2023 at 3:09 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Wed, May 24, 2023 at 9:41 AM Oliver Kiddle <opk@zsh.org> wrote:
> >
> >   integer .var=56
> >
> > would be disallowed.
>
> I had it that way at one point and backed off because ksh allows it.

Patch below leaves this unchanged (so still allowed) and adds a test for it.

> Strictly speaking (to the extent that any of this is strict),
> namespaces are supposed to otherwise obey the rules for identifiers,
> which with the exception of positional parameters are not allowed to
> begin with a digit ... so I think I'll just go with that.
>
> I still haven't worked out how to deal correctly with .var.3x

The below enforces that identifiers following a namespace prefix must
either begin with a non-digit or must consist entirely of digits.  For
possible future enhancements, this does not apply when the identifier
before the first "." does not itself begin with a ".", that is,
${var.3x} is allowed.

I haven't determined how/where to document that latter special case.
Perhaps it's not necessary to do so, given that omitting the first "."
is explicitly discouraged.

> > Also questionable is the following which currently works:
> >
> >   .a.=ddd

This is rejected by the patch (not an identifier / bad substitution).
I think the extra checks add very little if any overhead to the
unaffected cases but I haven't run benchmarks.

> > We also currently allow a.=ddd but I think that is less
> > questionable. We allow empty association elements.
>
> I'm still not entirely sold on the x.y <=> x[y] mapping idea, but if
> we follow JavaScript's (somewhat questionable) lead on this, empty
> elements have to use bracket-notation rather than dot-notation.

I've not [yet] enforced the bracket-notation requirement, so as a
special case of the ${var.3x} case above, ${var.} is also allowed.  A
point against following JS is that JS doesn't have the braces around
the reference to delineate the empty element.

[-- Attachment #2: namespace-limiting.txt --]
[-- Type: text/plain, Size: 3928 bytes --]

diff --git a/Src/math.c b/Src/math.c
index 12c8d6f6b..a060181ed 100644
--- a/Src/math.c
+++ b/Src/math.c
@@ -641,7 +641,9 @@ zzlex(void)
 		return MINUSEQ;
 	    }
 	    if (unary) {
-		if (idigit(*ptr) || *ptr == '.') {
+		if (idigit(*ptr) ||
+		    (*ptr == '.' &&
+		     (idigit(ptr[1]) || !itype_end(ptr, INAMESPC, 0)))) {
 		    int ctype = lexconstant();
 		    if (ctype == NUM)
 		    {
@@ -835,7 +837,9 @@ zzlex(void)
 	case Dnull:
 	    break;
 	default:
-	    if (idigit(*--ptr) || *ptr == '.')
+	    if (idigit(*--ptr) ||
+		(*ptr == '.' &&
+		 (idigit(ptr[1]) || !itype_end(ptr, INAMESPC, 0))))
 		return lexconstant();
 	    if (*ptr == '#') {
 		if (*++ptr == '\\' || *ptr == '#') {
@@ -857,7 +861,7 @@ zzlex(void)
 		}
 		cct = 1;
 	    }
-	    if ((ie = itype_end(ptr, IIDENT, 0)) != ptr) {
+	    if ((ie = itype_end(ptr, INAMESPC, 0)) != ptr) {
 		int func = 0;
 		char *p;
 
diff --git a/Src/params.c b/Src/params.c
index 021d341e8..2b0837e03 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -1226,6 +1226,26 @@ isident(char *s)
     if (!*s)			/* empty string is definitely not valid */
 	return 0;
 
+    /* This partly duplicates code in itype_end(), but we need to
+     * distinguish the leading namespace at this point to check the
+     * correctness of the identifier that follows
+     */
+    if (*s == '.') {
+	if (idigit(s[1]))
+	    return 0;	/* Namespace must not start with a digit */
+	/* Reject identifiers beginning with a digit in namespaces.
+	 * Move this out below this block to also reject v.1x form.
+	 */
+	if ((ss = itype_end(s + (*s == '.'), IIDENT, 0))) {
+	    if (*ss == '.') {
+		if (!ss[1])
+		    return 0;
+		if (idigit(ss[1]))
+		    s = ss + 1;
+	    }
+	}
+    }
+
     if (idigit(*s)) {
 	/* If the first character is `s' is a digit, then all must be */
 	for (ss = ++s; *ss; ss++)
@@ -2148,7 +2168,12 @@ fetchvalue(Value v, char **pptr, int bracks, int flags)
 	    pm = (Param) paramtab->getnode2(paramtab, *t == '0' ? "0" : t);
 	else
 	    pm = (Param) paramtab->getnode(paramtab, *t == '0' ? "0" : t);
-	if (sav)
+	if (!pm && *t == '.' && !isident(t)) {
+	    /* badly formed namespace reference */
+	    if (sav)
+		*s = sav;
+	    return NULL;
+	} else if (sav)
 	    *s = sav;
 	*pptr = s;
 	if (!pm || ((pm->node.flags & PM_UNSET) &&
diff --git a/Test/K02parameter.ztst b/Test/K02parameter.ztst
index 8a1be1e36..0b1a8dd4a 100644
--- a/Test/K02parameter.ztst
+++ b/Test/K02parameter.ztst
@@ -100,7 +100,55 @@ F:Braces are required
 >.k02.array
 >characters
 
+  k.=empty
   k.2=test
-  print ${k.2}
+  print ${k.} ${k.2}
 0:Parse without leading dot (future proofing)
->test
+>empty test
+
+  .k=OK
+  print ${.k}
+0:Bare namespace is usable (ksh compatibility)
+>OK
+
+  .k.=empty
+1:Namespace must precede an identifier, assignment
+?(eval):1: not an identifier: .k.
+
+  typeset .k.=empty
+1:Namespace must precede an identifier, typeset
+?(eval):typeset:1: not valid in this context: .k.
+
+  print ${.k.}
+1:Namespace must precede an identifier, reference
+?(eval):1: bad substitution
+
+  .2.b=not
+1:Namespace identifier must not begin with a digit, assignment
+?(eval):1: not an identifier: .2.b
+
+  typeset .2.b=not
+1:Namespace identifier must not begin with a digit, typeset
+?(eval):typeset:1: not valid in this context: .2.b
+
+  print ${.2.b}
+1:Namespace identifier must not begin with a digit, reference
+?(eval):1: bad substitution
+
+  .not.2b=question
+1:Identifier starting with a digit must be all digits, assignment
+?(eval):1: not an identifier: .not.2b
+
+  typeset .not.2b=question
+1:Identifier starting with a digit must be all digits, typeset
+?(eval):typeset:1: not valid in this context: .not.2b
+
+  print ${.not.2b}
+1:Identifier starting with a digit must be all digits, reference
+?(eval):1: bad substitution
+
+  integer .var.d=0
+  float .var.f=.2
+  print $((.var.x = ++.var.d - -.var.f))
+0:Namespaces in math context
+>1.2

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-06-22 17:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-19  0:13 namespaces limitation Oliver Kiddle
2023-05-23 17:10 ` Bart Schaefer
2023-05-24 16:41   ` Oliver Kiddle
2023-05-25 22:09     ` Bart Schaefer
2023-05-25 23:13       ` Bart Schaefer
2023-06-22 17:49       ` [PATCH] " Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).