This way the logic makes more sense:

  typeset var <- NULL(on)
  unset var <- NULL(off)
  var='' <- NULL(off)

NULL means: declared, no value, but still valid even with PM_UNSET.

No functional changes.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 Src/builtin.c | 2 +-
 Src/params.c  | 4 ++--
 Src/subst.c   | 2 +-
 Src/zsh.h     | 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Src/builtin.c b/Src/builtin.c
index 1e950f122..46d3962bb 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -2837,7 +2837,7 @@ bin_typeset(char *name, char **argv, LinkList assigns, Options ops, int func)
             unqueue_signals();
             return 1;
         } else if (pm) {
-            if ((!(pm->node.flags & PM_UNSET) || pm->node.flags & PM_DECLARED)
+            if ((!(pm->node.flags & PM_UNSET) || pm->node.flags & PM_NULL)
                 && (locallevel == pm->level || !(on & PM_LOCAL))) {
                 if (pm->node.flags & PM_TIED) {
                     if (PM_TYPE(pm->node.flags) != PM_SCALAR) {
diff --git a/Src/params.c b/Src/params.c
index c09a3eccf..1c587872b 100644
--- a/Src/params.c
+++ b/Src/params.c
@@ -2094,7 +2094,7 @@ fetchvalue(Value v, char **pptr, int bracks, int flags)
         *s = sav;
         *pptr = s;
         if (!pm || ((pm->node.flags & PM_UNSET) &&
-                    !(pm->node.flags & PM_DECLARED)))
+                    !(pm->node.flags & PM_NULL)))
             return NULL;
         if (v)
             memset(v, 0, sizeof(*v));
@@ -3625,7 +3625,7 @@ unsetparam_pm(Param pm, int altflag, int exp)
     else
         altremove = NULL;

-    pm->node.flags &= ~PM_DECLARED; /* like ksh, not like bash */
+    pm->node.flags &= ~PM_NULL; /* like ksh, not like bash */
     if (!(pm->node.flags & PM_UNSET))
         pm->gsu.s->unsetfn(pm, exp);
     if (pm->env)
diff --git a/Src/subst.c b/Src/subst.c
index 8731297f7..7ac2dd47a 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -2541,7 +2541,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
              * Handle the (t) flag: value now becomes the type
              * information for the parameter.
              */
-            if (v && v->pm && ((v->pm->node.flags & PM_DECLARED) ||
+            if (v && v->pm && ((v->pm->node.flags & PM_NULL) ||
                                !(v->pm->node.flags & PM_UNSET))) {
                 int f = v->pm->node.flags;

diff --git a/Src/zsh.h b/Src/zsh.h
index 6d7f517c6..c68b47383 100644
--- a/Src/zsh.h
+++ b/Src/zsh.h
@@ -1929,10 +1929,10 @@ struct tieddata {
                                    made read-only by the user */
 #define PM_READONLY_SPECIAL (PM_SPECIAL|PM_READONLY|PM_RO_BY_DESIGN)
 #define PM_DONTIMPORT (1<<22) /* do not import this variable */
-#define PM_DECLARED (1<<22) /* explicitly named with typeset */
+#define PM_NULL (1<<22) /* declared but null */
 #define PM_RESTRICTED (1<<23) /* cannot be changed in restricted mode */
 #define PM_UNSET (1<<24) /* has null value */
-#define PM_DECLAREDNULL (PM_DECLARED|PM_UNSET)
+#define PM_DECLAREDNULL (PM_NULL|PM_UNSET)
 #define PM_REMOVABLE (1<<25) /* special can be removed from paramtab */
 #define PM_AUTOLOAD (1<<26) /* autoloaded from module */
 #define PM_NORESTORE (1<<27) /* do not restore value of local special */
--
2.30.0.rc2
On Mon, Dec 28, 2020 at 2:13 PM Felipe Contreras
<felipe.contreras@gmail.com> wrote:
>
> This way the logic makes more sense:
>
> typeset var <- NULL(on)
> unset var <- NULL(off)
> var='' <- NULL(off)
>
> NULL means: declared, no value, but still valid even with PM_UNSET.
>
> No functional changes.

This is now getting a little bit weedy since only the names are under
discussion, but ... if we go all the way back to the original
discussion about this, the point was that

  typeset var
  print ${var-unset}

should output "unset". Correct?

Consequently my thought was that PM_UNSET should be assigned for that
case. Using your terminology above, NULL(on) implies UNSET(on). To
me, that means that if there is a bit pattern named PM_NULL, it should
include the bit pattern for PM_UNSET. In retrospect I could just have
used PM_NULL instead of PM_DECLAREDNULL but I was seeking to make it
obvious that there were two bits in the pattern when it was used as a
mask.

As one last stab at this, since neither PM_DECLARED nor PM_IMPLIED is
satisfactory, what about PM_DEFAULT ? And scrap PM_DECLAREDNULL for
just PM_NULL.

  #define PM_NULL (PM_DEFAULT|PM_UNSET)

This yields

  typeset var <- NULL(on) <- DEFAULT(on), UNSET(on)
  unset var <- DEFAULT(off), UNSET(on) <- NULL(off)
  var='' <- DEFAULT(off), UNSET(off) <- NULL(off)

??
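The three transitions in the PM_DEFAULT proposal above can be modelled as plain bit operations. The following is only a rough sketch under stated assumptions: PM_DEFAULT is the name floated in this message and not an existing zsh flag, its bit position is borrowed from the patch's (1<<22), and the helper functions (after_typeset, after_unset, after_assign, is_null, check_states) are hypothetical names invented for illustration, not zsh internals.

```c
#include <assert.h>

/* Hypothetical stand-ins for the flags under discussion. The bit
 * positions mirror the patch; PM_DEFAULT is only a proposed name. */
#define PM_DEFAULT (1 << 22) /* declared, nothing assigned yet */
#define PM_UNSET   (1 << 24) /* parameter counts as unset */
#define PM_NULL    (PM_DEFAULT | PM_UNSET)

/* typeset var: DEFAULT(on), UNSET(on) */
static int after_typeset(int flags)
{
    return flags | PM_NULL;
}

/* unset var: DEFAULT(off), UNSET(on) */
static int after_unset(int flags)
{
    return (flags & ~PM_DEFAULT) | PM_UNSET;
}

/* var='': DEFAULT(off), UNSET(off) */
static int after_assign(int flags)
{
    return flags & ~PM_NULL;
}

/* NULL(on) only when both bits of the mask are set. */
static int is_null(int flags)
{
    return (flags & PM_NULL) == PM_NULL;
}

/* Walks through the three rows of the table above. */
static void check_states(void)
{
    int flags = after_typeset(0);
    assert(is_null(flags));

    flags = after_unset(flags);
    assert(!is_null(flags) && (flags & PM_UNSET));

    flags = after_assign(after_typeset(0));
    assert(!is_null(flags) && !(flags & PM_UNSET));
}
```

Calling check_states() from a main() exercises all three rows. Comparing against the full mask in is_null() is the point of naming the two-bit pattern: a test for either bit alone would also match a plain unset parameter.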
On Sat, Jan 2, 2021 at 7:18 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Mon, Dec 28, 2020 at 2:13 PM Felipe Contreras
> <felipe.contreras@gmail.com> wrote:
> >
> > This way the logic makes more sense:
> >
> > typeset var <- NULL(on)
> > unset var <- NULL(off)
> > var='' <- NULL(off)
> >
> > NULL means: declared, no value, but still valid even with PM_UNSET.
> >
> > No functional changes.
>
> This is now getting a little bit weedy since only the names are under
> discussion, but ...

Yes. In this particular patch that's the only consideration, but I'm
using that only as the starting point. I'm still unsure about the rest
of the logic, but I haven't checked all the cases of PM_UNSET yet.

So, yes... For now.

> if we go all the way back to the original
> discussion about this, the point was that
>
> typeset var
> print ${var-unset}
>
> should output "unset". Correct?

Yes, but that's only *one* of the considerations. It's still not
perfectly clear what "typeset -p var" should output.

Using "${var-unset}" is using loaded language. If we use "${var-foo}",
then yes; 100% in agreement.

> Consequently my thought was that PM_UNSET should be assigned for that
> case. Using your terminology above, NULL(on) implies UNSET(on). To
> me, that means that if there is a bit pattern named PM_NULL, it should
> include the bit pattern for PM_UNSET. In retrospect I could just have
> used PM_NULL instead of PM_DECLAREDNULL but I was seeking to make it
> obvious that there were two bits in the pattern when it was used as a
> mask.

I mean NULL(on) implies "${var-foo}" returns "foo", yes.

> As one last stab at this, since neither PM_DECLARED nor PM_IMPLIED is
> satisfactory, what about PM_DEFAULT ? And scrap PM_DECLAREDNULL for
> just PM_NULL.
>
> #define PM_NULL (PM_DEFAULT|PM_UNSET)
>
> This yields
>
> typeset var <- NULL(on) <- DEFAULT(on), UNSET(on)

Makes sense. I think it's debatable whether or not this is really
"unset" (what "typeset -p var" shows), but OK.

> unset var <- DEFAULT(off), UNSET(on) <- NULL(off)

Nope. The value hasn't changed, it still has the "default" value. Now
it's 100% sure it's really "unset" though.

> var='' <- DEFAULT(off), UNSET(off) <- NULL(off)

Yes.

I agree with all these.

I think this is playing ring-around-the-rosy; you are trying to find a
word that signifies that no value has been assigned, even if the
variable is "set", but that's not "default", since the default can be
"".

What we want is to change the default value from an empty string (""),
to a non-value, which in computer science usually is NIL. It's not
"default", it's not "designed"; it's "no value".

Doing "unset var" should not change the value in my opinion, but that's
probably another patch.

I think we should start by separating meaning from behavior, so how
about this script:

----
[[ -n ${ZSH_VERSION-} ]] && setopt posixbuiltins

function check {
  echo -n "$1: "
  test -n "${var+on}" && echo -n 'A(on)' || echo -n 'A(off)'
  echo -n ', '
  test -n "$(typeset -p var)" && echo -n 'B(on)' || echo -n 'B(off)'
  echo
}

function f {
  local var
  check 'local var'
  unset var
  check 'unset var'
}

f
----

With that script we get:

current zsh:
  local var: A(on), B(on)
  unset var: A(off), B(off)

bash:
  local var: A(off), B(on)
  unset var: A(off), B(on)

ksh:
  local var: A(off), B(off)
  unset var: A(off), B(off)

patched zsh (Bart):
  local var: A(off), B(on)
  unset var: A(off), B(off)

patched zsh (Felipe):
  local var: A(off), B(on)
  unset var: A(off), B(off)

So it seems your code and my code agree with the behavior of both A
and B. The only unknown is what A and B mean.

Agreed?

Cheers.

--
Felipe Contreras
On Sat, Jan 2, 2021 at 6:38 PM Felipe Contreras
<felipe.contreras@gmail.com> wrote:
>
> On Sat, Jan 2, 2021 at 7:18 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > if we go all the way back to the original
> > discussion about this, the point was that
> >
> > typeset var
> > print ${var-unset}
> >
> > should output "unset". Correct?
>
> Yes, but that's only *one* of the considerations. It's still not
> perfectly clear what "typeset -p var" should output.

Hmm, sorry, I thought that was a solved problem. Except for some
special cases like "readonly var", I thought it was pretty clear that
"typeset -p var" should output a semantically identical command to
that which declared the variable in the first place. (Assuming
POSIX_BUILTINS, of course.)

> > unset var <- DEFAULT(off), UNSET(on) <- NULL(off)
>
> Nope. The value hasn't changed, it still has the "default" value.

"Default" here does not refer to the value (or at least, not to the
value alone). A different example might be more obvious; if I do

  integer var
  unset var

then "var" is no longer an integer. That is the default that has
changed.

> I think this is playing ring-around-the-rosy; you are trying to find a
> word that signifies that no value has been assigned, even if the
> variable is "set"

No, that's not it. I'm trying to find a word that describes the STATE
of the variable, independent of its value. It happens that the "spec"
that we're importing from posix-ish shells means that this particular
state is always paired with the state of "unset-ness" but regardless
of your arguments of functional equivalence, neither of these states
is an actual value of NULL.

> So It seems your code and my code agree with the behavior of both A
> and B. The only unknown is what A and B mean.
>
> Agreed?

Yes, although I would not say "unknown". More like "unnamed". Also,
your script doesn't observe that "current zsh: B(on)" does not mean
the same thing that "patched zsh: B(on)" means (at least for my patch
and I think for yours).
Bart Schaefer wrote on Sun, Jan 03, 2021 at 10:26:48 -0800:
> No, that's not it. I'm trying to find a word that describes the STATE
> of the variable, independent of its value. It happens that the "spec"
> that we're importing from posix-ish shells means that this particular
> state is always paired with the state of "unset-ness" but regardless
> of your arguments of functional equivalence, neither of these states
> is an actual value of NULL.

Could you summarize the bits that need to be named and the corresponding shell
language incantations/semantics?

Is this anything like using «struct foo **p» in C to denote a single parameter
that has three possible states:

.
    (!p)
    (p && !*p)
    (p && *p)

> > So It seems your code and my code agree with the behavior of both A
> > and B. The only unknown is what A and B mean.
> >
> > Agreed?
>
> Yes, although I would not say "unknown". More like "unnamed". Also,
> your script doesn't observe that "current zsh: B(on)" does not mean
> the same thing that "patched zsh: B(on)" means (at least for my patch
> and I think for yours).
On Sun, Jan 3, 2021 at 10:17 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Could you summarize the bits that need to be named and the corresponding shell
> language incantations/semantics?

There has been a series of threads about this, but to try to summarize:

In current zsh,

  typeset -i var
  typeset -p var
  print ${var-foo}

produces the output

  typeset -i var=0
  0

Note that integer declaration (-i) is for example purposes only
(because then $var is visibly zero rather than empty string); the
flags to typeset don't matter, only the fact that there is no
assignment in the command is significant.

This disagrees with e.g. bash/ksh, so the proposal is that with
POSIXBUILTINS set, the same three commands would output

  typeset -i var
  foo

That is, even though a parameter named "var" has been declared, and
that declaration can be regurgitated, "dereferencing" $var produces a
result equivalent to examining a variable that is unset.

Felipe has argued that this is functionally equivalent to having
assigned a NULL value to the name "var". Internally, zsh does not
have a representation for a NULL value (although with a number of
changes, it could do so for non-numeric types where the parameter
union contains a pointer, and we spent a while discussing whether
those were the only cases that require implementation).

My approach has been to create an additional flag (originally called
PM_DECLARED) which represents the state immediately following

  typeset -i var

and then to bitwise-OR that with PM_UNSET to minimize differences in
code that tests for unset-ness.

So the "bits that need to be named" are:
1) the bit representing "remember that this was declared but no value
was assigned"
2) the combination of that with PM_UNSET that represents "functionally
behaves like NULL"

We could of course simply never name #2 and always write out the
bitwise-OR, but that seems cumbersome.

As I understand it, the objection to PM_DECLARED for #1 is that the
name implies that only "unset var" should ever turn that bit off
again, but the implementation requires that assignment also turns it
off. Similar objections of English language semantics conflicting
with the implementation have been raised to other names I've
suggested.

> Is this anything like using «struct foo **p» in C to denote a single parameter
> that has three possible states

Sort of, except that (!p && *p) is actually "valid". The problem is
that ${var-foo} resolves as if (!p) but ${var} resolves as if *p=""
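The mismatch described at the end of this message (${var-foo} behaving as if nothing were there, while ${var} behaves like an empty string) can be illustrated with a minimal C sketch. It assumes a scalar parameter and reuses the bit values from the patch. expand_plain, expand_or_alt and check_declared_only are hypothetical helpers made up for this illustration and are not functions in the zsh source.

```c
#include <assert.h>
#include <string.h>

/* Bit values copied from the patch; everything else is illustrative. */
#define PM_DECLARED (1 << 22) /* bit #1: declared, nothing assigned */
#define PM_UNSET    (1 << 24)

/* ${var}: an unset parameter expands to the empty string. */
static const char *expand_plain(int flags, const char *value)
{
    return (flags & PM_UNSET) ? "" : value;
}

/* ${var-alt}: an unset parameter expands to the alternative. */
static const char *expand_or_alt(int flags, const char *value,
                                 const char *alt)
{
    return (flags & PM_UNSET) ? alt : value;
}

/* The state right after "typeset var" under the proposal. */
static void check_declared_only(void)
{
    int flags = PM_DECLARED | PM_UNSET;

    assert(strcmp(expand_plain(flags, "ignored"), "") == 0);
    assert(strcmp(expand_or_alt(flags, "ignored", "foo"), "foo") == 0);
}
```

Neither helper looks at PM_DECLARED, which matches the stated goal of OR-ing bit #1 with PM_UNSET so that code testing for unset-ness needs few changes.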
Bart Schaefer wrote on Mon, 04 Jan 2021 21:57 +00:00:
> On Sun, Jan 3, 2021 at 10:17 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Could you summarize the bits that need to be named and the corresponding shell
> > language incantations/semantics?
>
> There has been a series of threads about this, but to try to summarize:
>
> In current zsh,
>   typeset -i var
>   typeset -p var
>   print ${var-foo}
> produces the output
>   typeset -i var=0
>   0
⋮
> This disagrees with e.g. bash/ksh, so the proposal is that with
> POSIXBUILTINS set, the same three commands would output
>   typeset -i var
>   foo
>

Thanks for the summary.

> So the "bits that need to be named" are:
> 1) the bit representing "remember that this was declared but no value
> was assigned"
> 2) the combination of that with PM_UNSET that represents "functionally
> behaves like NULL"
>
> We could of course simply never name #2 and always write out the
> bitwise-OR, but that seems cumbersome.
>

To be clear, (2) would generally be used as testing whether _either_
PM_UNSET or the bit from #1 is set, right?

How about, for #1, PM_BEEN_ASSIGNED or PM_INITIALIZED?

As to the combination, my first inclination would have been to leave it
unnamed so that it's obvious PM_UNSET is being inspected, but if the
combination merits being named, then perhaps PM_HAS_VALUE(pm).

> As I understand it, the objection to PM_DECLARED for #1 is that the
> name implies that only "unset var" should ever turn that bit off
> again, but the implementation requires that assignment also turns it
> off. Similar objections of English language semantics conflicting
> with the implementation have been raised to other names I've
> suggested.

*nod*

Cheers,

Daniel
On Wed, Jan 6, 2021 at 8:02 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> Bart Schaefer wrote on Mon, 04 Jan 2021 21:57 +00:00:
> > So the "bits that need to be named" are:
> > 1) the bit representing "remember that this was declared but no value
> > was assigned"
> > 2) the combination of that with PM_UNSET that represents "functionally
> > behaves like NULL"
>
> To be clear, (2) would generally be used as testing whether _either_
> PM_UNSET or the bit from #1 is set, right?

Most often it's used for changing the value of both bits at once, not
testing. The bits are almost always tested independently.

> How about, for #1, PM_BEEN_ASSIGNED or PM_INITIALIZED?

The latter was already rejected. Both of these arguably describe the
opposite of the actual state, that is, PM_HAS_NOT_BEEN_ASSIGNED would
be more accurate ... but Felipe has essentially argued that after
"unset foo" the variable still has not been assigned, so why clear a
bit with that name?

PM_DECLARED_BUT_NEITHER_ASSIGNED_NOR_UNSET is just too verbose, and
shortening it to just the first word got us into this discussion in
the first place.

> As to the combination, my first inclination would have been to leave it
> unnamed so that it's obvious PM_UNSET is being inspected, but if the
> combination merits being named, then perhaps PM_HAS_VALUE(pm).

The reason for doing it the way I did is because (I presumed) most
cases would never examine bit #1 because they are already examining
PM_UNSET by itself.

PM_HAS_VALUE(pm) is actually also backwards. It would usually be
PM_HAS_NO_VALUE(pm). But there's actually exactly one such test.
Bart Schaefer wrote on Wed, Jan 06, 2021 at 09:33:49 -0800:
> On Wed, Jan 6, 2021 at 8:02 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > Bart Schaefer wrote on Mon, 04 Jan 2021 21:57 +00:00:
> > > So the "bits that need to be named" are:
> > > 1) the bit representing "remember that this was declared but no value
> > > was assigned"
> > > 2) the combination of that with PM_UNSET that represents "functionally
> > > behaves like NULL"
> >
> > To be clear, (2) would generally be used as testing whether _either_
> > PM_UNSET or the bit from #1 is set, right?
>
> Most often it's used for changing the value of both bits at once, not
> testing. The bits are almost always tested independently.

*nod*

> > How about, for #1, PM_BEEN_ASSIGNED or PM_INITIALIZED?
>
> The latter was already rejected. Both of these arguably describe the
> opposite of the actual state, that is, PM_HAS_NOT_BEEN_ASSIGNED would
> be more accurate ... but Felipe has essentially argued that after
> "unset foo" the variable still has not been assigned, so why clear a
> bit with that name?

After «unset», PM_UNSET would be set, and I don't immediately see why bit #1
should be tested at all if PM_UNSET is set. If the «unset» is followed by
an assignment and/or (re-)declaration, the value of bit #1 can then be set
properly, and PM_UNSET cleared. Makes sense?

> PM_DECLARED_BUT_NEITHER_ASSIGNED_NOR_UNSET is just too verbose, and
> shortening it to just the first word got us into this discussion in
> the first place.
>
> > As to the combination, my first inclination would have been to leave it
> > unnamed so that it's obvious PM_UNSET is being inspected, but if the
> > combination merits being named, then perhaps PM_HAS_VALUE(pm).
>
> The reason for doing it the way I did is because (I presumed) most
> cases would never examine bit #1 because they are already examining
> PM_UNSET by itself.
>
> PM_HAS_VALUE(pm) is actually also backwards. It would usually be
> PM_HAS_NO_VALUE(pm). But there's actually exactly one such test.

So long as we don't have «!PM_HAS_NO_VALUE(pm)» ☺
On Thu, Jan 7, 2021 at 7:48 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> > > > 1) the bit representing "remember that this was declared but no value
> > > > was assigned"
> > > > 2) the combination of that with PM_UNSET that represents "functionally
> > > > behaves like NULL"
>
> After «unset», PM_UNSET would be set, and I don't immediately see why bit #1
> should be tested at all if PM_UNSET is set.

Bit #1 is tested for "typeset -p var". If PM_UNSET is set and bit #1 is
not, then "typeset -p" outputs nothing; but we want typeset -p to output a
declaration (with no assignment).

> If the «unset» is followed by
> an assignment and/or (re-)declaration, the value of bit #1 can then be set
> properly, and PM_UNSET cleared. Makes sense?

Bit #1 has to be cleared on explicit unset, and is irrelevant on assignment
because PM_UNSET is cleared on assignment. However, it's most convenient
to clear bit #1 on assignment because that eliminates one special case on
"typeset var=value" (as opposed to "typeset var").
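The role of bit #1 in "typeset -p" and the two places it gets cleared can be sketched roughly as follows. This is an illustration under assumptions, not the zsh implementation: format_typeset_p, flags_after_unset, flags_after_assign and check_typeset_p are hypothetical helpers, and the printed form is simplified (no type flags, no quoting of the value).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Bit values copied from the patch; the helpers below are made up. */
#define PM_DECLARED (1 << 22) /* bit #1 */
#define PM_UNSET    (1 << 24)

/* Writes a simplified "typeset -p name" line into buf and returns its
 * length; an unset, undeclared parameter produces no output. */
static int format_typeset_p(char *buf, size_t size, const char *name,
                            const char *value, int flags)
{
    if (!(flags & PM_UNSET))
        return snprintf(buf, size, "typeset %s=%s", name, value);
    if (flags & PM_DECLARED)
        return snprintf(buf, size, "typeset %s", name);
    if (size > 0)
        buf[0] = '\0';
    return 0;
}

/* Explicit unset has to clear bit #1 and sets PM_UNSET. */
static int flags_after_unset(int flags)
{
    return (flags & ~PM_DECLARED) | PM_UNSET;
}

/* Assignment clears PM_UNSET; clearing bit #1 as well is a convenience. */
static int flags_after_assign(int flags)
{
    return flags & ~(PM_DECLARED | PM_UNSET);
}

static void check_typeset_p(void)
{
    char buf[64];
    int flags = PM_DECLARED | PM_UNSET; /* state after "typeset var" */

    format_typeset_p(buf, sizeof buf, "var", "", flags);
    assert(strcmp(buf, "typeset var") == 0);

    format_typeset_p(buf, sizeof buf, "var", "", flags_after_unset(flags));
    assert(buf[0] == '\0');

    format_typeset_p(buf, sizeof buf, "var", "value",
                     flags_after_assign(flags));
    assert(strcmp(buf, "typeset var=value") == 0);
}
```

The first branch never consults bit #1, which is why the bit is irrelevant once PM_UNSET has been cleared by an assignment.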
On Wed, Jan 6, 2021, at 12:33 PM, Bart Schaefer wrote:
> On Wed, Jan 6, 2021 at 8:02 AM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > How about, for #1, PM_BEEN_ASSIGNED or PM_INITIALIZED?
>
> The latter was already rejected. Both of these arguably describe the
> opposite of the actual state, that is, PM_HAS_NOT_BEEN_ASSIGNED would
> be more accurate ... but Felipe has essentially argued that after
> "unset foo" the variable still has not been assigned, so why clear a
> bit with that name?
>
> PM_DECLARED_BUT_NEITHER_ASSIGNED_NOR_UNSET is just too verbose, and
> shortening it to just the first word got us into this discussion in
> the first place.
Looks like this petered out without a decision one way or the other?
vq
On Sat, Mar 27, 2021 at 12:25 PM Lawrence Velázquez <vq@larryv.me> wrote:
>
> Looks like this petered out without a decision one way or the other?

Since we resolved this down to arguing about internal names that are
never visible to the user, IMO the primary remaining question is
whether it's acceptable to make the user-visible behavior dependent on
the POSIX_BUILTINS option.

Perhaps the internal naming argument can be resolved by using
PM_DEFAULTED (thus making it definitely a verb rather than "default"
which could be interpreted as a noun).
Bart Schaefer wrote:
> Since we resolved this down to arguing about internal names that are
> never visible to the user, IMO the primary remaining question is
> whether it's acceptable to make the user-visible behavior dependent on
> the POSIX_BUILTINS option.

It seems fairly self-contained and could have its own option. typeset
isn't a builtin. posix compatibility options aren't really improvements
but someone might prefer this behaviour. But I don't have a strong view
- whatever you think is good.

> Perhaps the internal naming argument can be resolved by using
> PM_DEFAULTED (thus making it definitely a verb rather than "default"
> which could be interpreted as a noun).

I'm fine with that but I wouldn't especially object to DECLARED or
whatever it was before. Some other ideas: ONLYDECLARED, DECLONLY,
UNASSIGNED, INITIATED, CONCEIVED, SCOPED.

Was there anything else outstanding like (t) output perhaps?

I don't feel strongly about either of these naming issues and would be
happy for this to move forward regardless of the outcome on them.

Oliver
On Sun, Mar 28, 2021 at 5:44 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> Bart Schaefer wrote:
> > IMO the primary remaining question is
> > whether it's acceptable to make the user-visible behavior dependent on
> > the POSIX_BUILTINS option.
>
> It seems fairly self-contained and could have its own option. typeset
> isn't a builtin. posix compatibility options aren't really improvements
> but someone might prefer this behaviour.

(I'm reading that as "typeset isn't a POSIX builtin"). There has at
least been discussion about standardizing "local" on austin-group, and
given that "local" is an alias for typeset, this (or related) behavior
might become a POSIX compatibility thing in the future.

I'm also somewhat concerned that choosing a descriptive name for a new
option is going to spawn another argument. TYPESET_DOES_NOT_SET ?

As mentioned long ago, it could also be an emulation-mode thing,
although that makes it a lot more difficult to access at a scripting
level.

> Was there anything else outstanding like (t) output perhaps?

I believe I have dealt properly with ${(t)var}. I'll add something to
the doc about ${emptystr[(i)]}, because that's a weird case even
without this patch.
Bart Schaefer wrote:
> On Sun, Mar 28, 2021 at 5:44 PM Oliver Kiddle <opk@zsh.org> wrote:
> > It seems fairly self-contained and could have its own option. typeset
> > isn't a builtin. posix compatibility options aren't really improvements
> > but someone might prefer this behaviour.
>
> (I'm reading that as "typeset isn't a POSIX builtin"). There has at

I meant it more in the sense of "typeset is a reserved word". I know
that's only true to a limited extent and a future POSIX standardisation
of local would likely only cover functionality that works as a builtin.

I don't really have a strong opinion but would like to see the work
finished off and pushed.

> I'm also somewhat concerned that choosing a descriptive name for a new
> option is going to spawn another argument. TYPESET_DOES_NOT_SET ?

I can understand that, it's never easy to name these things.

> As mentioned long ago, it could also be an emulation-mode thing,
> although that makes it a lot more difficult to access at a scripting
> level.

I'd be fine with that too if you prefer. If you think you might want to
change how it is controlled later, an internal macro would make that
easier. But I'm not sure backward compatibility concerns would ever
allow that anyway.

Oliver
On Sat, Apr 10, 2021 at 2:58 PM Oliver Kiddle <opk@zsh.org> wrote:
>
> I don't really have a strong opinion but would like to see the work
> finished off and pushed.

I did another forced-push on the declarednull branch after rebase on
master (though a couple of other master changes flowed in almost
immediately after that).

> never easy to name these things.

A problem with TYPESET_DOES_NOT_SET is that it leads to the
double-negative NO_TYPESET_DOES_NOT_SET, which isn't as bad as
NO_NO_whatever, but argues for an option that is treated as "on" for
the existing behavior. Bleah.

If I do add a separate option, do you think it's OK to combine that
with POSIX_BUILTINS for the tests in the (new) E03posix.ztst file?
Otherwise all the tests in that file will have to be repeated twice
(with and without the new option).