zsh-workers
* zstyle: "more specific" patterns and *-components
@ 2020-04-27 10:14 Daniel Shahaf
  2020-04-27 16:10 ` Roman Perepelitsa
  2020-04-27 18:54 ` dana
  0 siblings, 2 replies; 7+ messages in thread
From: Daniel Shahaf @ 2020-04-27 10:14 UTC
  To: zsh-workers

What would you expect

    zstyle ':foo:bar:*' lorem world
    zstyle ':foo:*:baz:*' lorem hello
    zstyle -s ':foo:bar:baz:qux' lorem REPLY && print $REPLY

to print?

For reference, the documentation specifies:

> A pattern is considered to be more specific
> than another if it contains more components (substrings separated by
> colons) or if the patterns for the components are more specific, where 
> simple strings are considered to be more specific than patterns and
> complex patterns are considered to be more specific than the pattern
> `tt(*)'.  A `tt(*)' in the pattern will match zero or more characters
> in the context; colons are not treated specially in this regard.
> If two patterns are equally specific, the tie is broken in favour of
> the pattern that was defined first.

(This part of the documentation was recently changed in users/24656, by
me, but I didn't intend to change its meaning, only to clarify it.)

---

Currently, that prints "world", and would print "hello" if the first
two lines were reordered.  That's because setstypat() gives a weight
of 0 to colon-separated pattern components that consist of a single
asterisk and nothing else: the two patterns are considered equally
specific, so the first one defined wins.

However, going by the documentation I expected ':foo:*:baz:*' to be
considered more specific than ':foo:bar:*' (because it contains more
components: 'three literal strings and two asterisks' is more than
'three literal strings and one asterisk'), and therefore, 'hello' to be
printed regardless of the order of the first two lines.
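
To make the tie concrete, here is a minimal C sketch of the scoring as described above. It is not the code of setstypat(): the names style_weight and component_weight are hypothetical, and the weights of 2 for a literal component and 1 for a non-trivial pattern are assumptions (this message only states that a lone asterisk scores 0). The set of characters treated as pattern characters is likewise an assumption.

```c
#include <string.h>

/* Assumed weights; only the 0 for a lone `*' is stated in the thread. */
enum { W_STAR_ONLY = 0, W_PATTERN = 1, W_LITERAL = 2 };

/* Score a single component of length len (not NUL-terminated). */
static int component_weight(const char *comp, size_t len)
{
    size_t i;

    if (len == 1 && comp[0] == '*')
        return W_STAR_ONLY;
    for (i = 0; i < len; i++)
        if (strchr("(|*[<?#^", comp[i]))   /* assumed pattern characters */
            return W_PATTERN;
    return W_LITERAL;                      /* includes the empty string */
}

/* Sum the weights of all colon-separated components of pat. */
int style_weight(const char *pat)
{
    int weight = 0;
    const char *start = pat;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        weight += component_weight(start, len);
        if (!end)
            break;
        start = end + 1;
    }
    return weight;
}
```

Under these assumptions style_weight(":foo:bar:*") and style_weight(":foo:*:baz:*") both come to 6, so the order of definition decides which style is returned.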

WDYT?

Cheers,

Daniel

P.S. The three literal strings are ("" "foo" "bar") in the first
pattern and ("" "foo" "baz") in the second pattern.


* Re: zstyle: "more specific" patterns and *-components
  2020-04-27 10:14 zstyle: "more specific" patterns and *-components Daniel Shahaf
@ 2020-04-27 16:10 ` Roman Perepelitsa
  2020-04-27 17:38   ` Mikael Magnusson
  2020-04-28 19:06   ` Daniel Shahaf
  2020-04-27 18:54 ` dana
  1 sibling, 2 replies; 7+ messages in thread
From: Roman Perepelitsa @ 2020-04-27 16:10 UTC
  To: Daniel Shahaf; +Cc: Zsh hackers list

On Mon, Apr 27, 2020 at 12:15 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
>
> WDYT?

My reading of the documentation matches yours, so I also expected the
behavior you expected (which isn't the actual behavior). I also find
the documented behavior preferable to the actual behavior.

My experience with zstyle is minuscule, so appraising my opinion at 2
cents would be too generous.

Roman.


* Re: zstyle: "more specific" patterns and *-components
  2020-04-27 16:10 ` Roman Perepelitsa
@ 2020-04-27 17:38   ` Mikael Magnusson
  2020-04-28 19:06   ` Daniel Shahaf
  1 sibling, 0 replies; 7+ messages in thread
From: Mikael Magnusson @ 2020-04-27 17:38 UTC
  To: Roman Perepelitsa; +Cc: Daniel Shahaf, Zsh hackers list

On 4/27/20, Roman Perepelitsa <roman.perepelitsa@gmail.com> wrote:
> On Mon, Apr 27, 2020 at 12:15 PM Daniel Shahaf <d.s@daniel.shahaf.name>
> wrote:
>>
>> WDYT?
>
> My reading of the documentation matches yours, so I also expected the
> behavior you expected (which isn't the actual behavior). I also find
> the documented behavior preferable to the actual behavior.
>
> My experience with zstyle is minuscule, so appraising my opinion at 2
> cents would be too generous.

Another view is that in both cases you specified the values of two
'elements', the fact that they are adjacent in one case shouldn't (one
could argue) affect the specificity of the pattern. (this isn't
necessarily my view).

-- 
Mikael Magnusson


* Re: zstyle: "more specific" patterns and *-components
  2020-04-27 10:14 zstyle: "more specific" patterns and *-components Daniel Shahaf
  2020-04-27 16:10 ` Roman Perepelitsa
@ 2020-04-27 18:54 ` dana
  1 sibling, 0 replies; 7+ messages in thread
From: dana @ 2020-04-27 18:54 UTC
  To: Daniel Shahaf; +Cc: zsh-workers

On 27 Apr 2020, at 05:14, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> However, going by the documentation I expected ':foo:*:baz:*' to be
> considered more specific than ':foo:bar:*' (because it contains more
> components: 'three literal strings and two asterisks' is more than
> 'three literal strings and one asterisk'), and therefore, 'hello' to be
> printed regardless of the order of the first two lines.

I've actually had something about this in my drafts for a few months now.
Pasting here in full:

  Re: workers/45413, i was going to change ":completion:${curcontext%}:*" to
  ":completion:*:${service}:*", reasoning that that would be the best way to
  ensure that the fall-back style doesn't override the user's.

  But that isn't actually guaranteed — when calculating the weight of a style,
  zstyle adds 0 for each component consisting of only *, such that
  :foo:*:bar and :foo:*:*:*:bar are equally weighted, and which one wins
  depends on the order they were defined.

  The documentation says:

    For ordering of comparisons, patterns are searched from most specific to
    least specific, and patterns that are equally specific keep the order in
    which they were defined. A pattern is considered to be more specific than
    another if it contains more components (substrings separated by colons)

  I suppose * isn't really a 'substring' in this context, but it still seems
  like the pattern with more :*: should win based on there being more
  components, doesn't it?

  I'm guessing that * is weighted 0 so that :foo:* doesn't have more weight
  than :foo:, but could it work better? For example, might it work to change
  the weighting to this:

    0  First consecutive *-only component (first * in :foo:*:*:bar*)
    1  Subsequent consecutive *-only component (second * in :foo:*:*:bar*)
    2  Pattern component (bar* in :foo:*:*:bar*)
    3  Literal component (foo in :foo:*:*:bar*)

  ?

But i don't think my suggested change will fix the case you described. Maybe
give each *-only component a weight of 1 unless it's at the very end? Haven't
really thought about it since i wrote that, so there might be other
considerations.
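
For illustration, the 0/1/2/3 weighting table from the draft above could be sketched in C roughly as follows. This is only one reading of the proposal: proposed_weight and has_pattern_char are hypothetical names, "subsequent consecutive" is taken to mean a lone `*` that directly follows another lone `*`, the leading empty component is counted as a literal, and the set of pattern characters is an assumption.

```c
#include <string.h>

/* Return 1 if the component contains an (assumed) pattern character. */
static int has_pattern_char(const char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (strchr("(|*[<?#^", s[i]))
            return 1;
    return 0;
}

/* Score pat with the proposed weights:
 * 0 first lone `*' of a run, 1 each further lone `*' in that run,
 * 2 pattern component, 3 literal component. */
int proposed_weight(const char *pat)
{
    int weight = 0;
    int prev_star = 0;          /* was the previous component a lone `*'? */
    const char *start = pat;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        if (len == 1 && start[0] == '*') {
            weight += prev_star ? 1 : 0;
            prev_star = 1;
        } else {
            weight += has_pattern_char(start, len) ? 2 : 3;
            prev_star = 0;
        }
        if (!end)
            break;
        start = end + 1;
    }
    return weight;
}
```

With this reading, :foo:*:bar scores 9 and :foo:*:*:*:bar scores 11, so those two are no longer tied. However, :foo:bar:* and :foo:*:baz:* both score 9, which is consistent with the remark that the proposal does not resolve the case from the original message.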

dana



* Re: zstyle: "more specific" patterns and *-components
  2020-04-27 16:10 ` Roman Perepelitsa
  2020-04-27 17:38   ` Mikael Magnusson
@ 2020-04-28 19:06   ` Daniel Shahaf
  2020-04-28 19:30     ` dana
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Shahaf @ 2020-04-28 19:06 UTC
  To: Zsh hackers list

Roman Perepelitsa wrote on Mon, 27 Apr 2020 18:10 +0200:
> On Mon, Apr 27, 2020 at 12:15 PM Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> >
> > WDYT?
> 
> My reading of the documentation matches yours, so I also expected the
> behavior you expected (which isn't the actual behavior). I also find
> the documented behavior preferable to the actual behavior.
> 

I'm not sure what I prefer, actually.  The documentation disagrees with
the implementation, but I'm wary of changing the implementation to match
the documentation because that seems like it could introduce
hard-to-diagnose behaviour changes upon upgrading.

I reviewed my own settings (via the output of «zstyle» without
arguments) and found only one case in which a context could be matched
by two different patterns, and even that was a bug in my zshrc (one of
the patterns was insufficiently specific), so maybe I'm being overly
careful here.

Does anyone else have zstyle settings such that a single lookup (of
a particular style under a particular context) could be matched by
multiple patterns?

Mikael Magnusson wrote on Mon, 27 Apr 2020 19:38 +0200:
> On 4/27/20, Roman Perepelitsa <roman.perepelitsa@gmail.com> wrote:
> > On Mon, Apr 27, 2020 at 12:15 PM Daniel Shahaf <d.s@daniel.shahaf.name>
> > wrote:
> >>
> >> WDYT?
> >
> > My reading of the documentation matches yours, so I also expected the
> > behavior you expected (which isn't the actual behavior). I also find
> > the documented behavior preferable to the actual behavior.
> >
> > My experience with zstyle is minuscule, so appraising my opinion at 2
> > cents would be too generous.
> 
> Another view is that in both cases you specified the values of two
> 'elements', the fact that they are adjacent in one case shouldn't (one
> could argue) affect the specificity of the pattern. (this isn't
> necessarily my view).

If we change the implementation to match the documentation and Alice
wishes for both patterns to be considered equally specific, she will be
able to achieve that by explicitly specifying all colons in the patterns:

    zstyle ':foo:bar:*:*' lorem world
    zstyle ':foo:*:baz:*' lorem hello

On the other hand, if we change the documentation to match the
implementation and Alice wants one of the patterns to be considered more
specific regardless of the order the two patterns are defined in, she
can achieve that by using a pattern other than «*» that matches the same
things «*» would:

    zstyle ':foo:bar:*'      lorem world
    zstyle ':foo:*:baz:*(|)' lorem hello

(This works because the pattern «*(|)» is considered more specific — and
gets a higher score — than the pattern «*».)

dana wrote on Mon, 27 Apr 2020 13:54 -0500:
>   I'm guessing that * is weighted 0 so that :foo:* doesn't have more weight
>   than :foo:, but could it work better? For example, might it work to change
>   the weighting to this:
> 
>     0  First consecutive *-only component (first * in :foo:*:*:bar*)
>     1  Subsequent consecutive *-only component (second * in :foo:*:*:bar*)
>     2  Pattern component (bar* in :foo:*:*:bar*)
>     3  Literal component (foo in :foo:*:*:bar*)
> 
>   ?
> 
> But i don't think my suggested change will fix the case you described. Maybe
> give each *-only component a weight of 1 unless it's at the very end? Haven't
> really thought about it since i wrote that, so there might be other
> considerations

Firstly, shouldn't we choose between the implemented semantics and the
documented semantics?  Changing to a third semantics seems too
xkcd.com/927/-ey to me: it risks causing breakage _both_ to people who
relied on the documented behaviour and to people who relied on the
implemented behaviour.

I don't immediately see the rationale behind your proposed scoring.  The
two patterns you mention, «:foo» and «:foo:*», don't seem parallel to
each other, since there's no possible context that can be matched by
both of them.

Regarding the option of changing the implementation to match the
documentation, I interpret the documentation to mean a pattern that has
more colons than another is always considered more specific, period.
That could be implemented as a «struct { unsigned number_of_colons;
unsigned components_specificity_score; } weight;» with lexicographic
comparisons.  The following is equivalent to that —

diff --git a/Src/Modules/zutil.c b/Src/Modules/zutil.c
index 7d9bf05d6..4868bee7e 100644
--- a/Src/Modules/zutil.c
+++ b/Src/Modules/zutil.c
@@ -367,6 +367,7 @@ setstypat(Style s, char *pat, Patprog prog, char **vals, int eval)
 
 	if (*str == ':') {
 	    /* Yet another component. */
+	    weight += (1 << 16);
 
 	    first = 1;
 	    weight += tmp;

— though I'd call it only a proof of concept, since 32767 components in
a zstyle context ought to be enough for anybody :P
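
As a rough sketch of why the struct form and the one-line patch order patterns the same way, the following C fragment compares both representations. The struct fields are the ones named above; weight_cmp and packed_weight are hypothetical names, and the equivalence is assumed to hold only while the per-component score stays below 1 << 16.

```c
/* Two-part weight as described in the message. */
struct weight {
    unsigned number_of_colons;
    unsigned components_specificity_score;
};

/* Lexicographic comparison: colons first, then the component score.
 * Returns <0, 0 or >0, in the manner of strcmp(). */
int weight_cmp(struct weight a, struct weight b)
{
    if (a.number_of_colons != b.number_of_colons)
        return a.number_of_colons < b.number_of_colons ? -1 : 1;
    if (a.components_specificity_score != b.components_specificity_score)
        return a.components_specificity_score <
               b.components_specificity_score ? -1 : 1;
    return 0;
}

/* Packed form mirroring "weight += (1 << 16)" per colon.  It gives the
 * same ordering only while the score is smaller than 1 << 16. */
unsigned long packed_weight(struct weight w)
{
    return ((unsigned long)w.number_of_colons << 16)
           + w.components_specificity_score;
}
```

For example, a weight of {3, 6} compares lower than {4, 6} and also lower than {4, 2} in both representations: an extra colon outweighs any difference in the component score.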

This patch breaks the test I posted in 45722, as expected.

Given the feedback so far, though, perhaps I should just commit this
patch?

Cheers,

Daniel

(I suppose we could use «1 << (CHAR_BIT * sizeof(weight) / 2)» to let
people on 64-bit systems use absurdly long zstyle contexts…)


* Re: zstyle: "more specific" patterns and *-components
  2020-04-28 19:06   ` Daniel Shahaf
@ 2020-04-28 19:30     ` dana
  2020-04-28 20:20       ` Daniel Shahaf
  0 siblings, 1 reply; 7+ messages in thread
From: dana @ 2020-04-28 19:30 UTC
  To: Daniel Shahaf; +Cc: Zsh hackers list

On 28 Apr 2020, at 14:06, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> I don't immediately see the rationale behind your proposed scoring.  The
> two patterns you mention, «:foo» and «:foo:*», don't seem parallel to
> each other, since there's no possible context that can be matched by
> both of them.

Well, the example i gave was :foo: vs :foo:*, not :foo vs :foo:*.

But you're right, anyway. I was too focussed on my issue with the * patterns
to see the much simpler and more general fix re: the number of components.
Just looking at it, i think what you're suggesting addresses everything, ty

dana



* Re: zstyle: "more specific" patterns and *-components
  2020-04-28 19:30     ` dana
@ 2020-04-28 20:20       ` Daniel Shahaf
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Shahaf @ 2020-04-28 20:20 UTC
  To: zsh-workers

dana wrote on Tue, 28 Apr 2020 19:30 +00:00:
> On 28 Apr 2020, at 14:06, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> > I don't immediately see the rationale behind your proposed scoring.  The
> > two patterns you mention, «:foo» and «:foo:*», don't seem parallel to
> > each other, since there's no possible context that can be matched by
> > both of them.
> 
> Well, the example i gave was :foo: vs :foo:*, not :foo vs :foo:*.
> 

Ah, sorry.

My guess for why * is weighted zero is what Mikael said.  Note that
«:foo:» is scored as 3 constant strings ("" "foo" ""), so it outscores
«:foo:*» and would do so even if the pattern «*» weren't special cased
compared to other patterns.

> But you're right, anyway. I was too focussed on my issue with the * patterns
> to see the much simpler and more general fix re: the number of components.
> Just looking at it, i think what you're suggesting addresses everything, ty

*nod*

Cheers,

Daniel
