* VARARR in pattern code
       [not found] <CAHAq8pF=hB4sfx+Fe6nfnbJ8W7E9r9e_mHytBdu=Oy_6CWukJA@mail.gmail.com>
@ 2014-09-08 14:10 ` Peter Stephenson
  2014-09-08 14:24   ` Peter Stephenson
  0 siblings, 1 reply; 2+ messages in thread
From: Peter Stephenson @ 2014-09-08 14:10 UTC (permalink / raw)
  To: Zsh Hackers' List

While looking at the problem with repeated *'s, I notice that inside the
pattern code for closures --- *'s, #'s and ##'s --- there's a VARARR.

		/*
		 * Array to record the start of characters for
		 * backtracking.
		 */
		VARARR(char, charstart, patinend-patinput);

If you're interested, that was added to fix a very similar problem with
pathological backtracking involving negated matches with "~" or "^".
It's otherwise a strange thing to have in pattern matching code (and
it may be why the performance with multiple "*"s was quite so bad).

We just made all VARARRs use heap allocation.  It occurs to me this one
can be hit many times when backtracking through a pattern with a lot of
closures.  I wonder if this one should be a special case --- zalloc, if
that's efficient enough?  I haven't done any experiments, so I may be
being alarmist.
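
One way to picture such a special case is a scratch buffer that lives on
the stack for short inputs and falls back to an allocator only for long
ones.  The sketch below is purely illustrative and is not taken from the
zsh source: struct scratch, scratch_get(), scratch_release() and the
SMALL_SCRATCH threshold are all hypothetical names, and malloc()/free()
stand in for whichever allocator would really be used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: inputs up to this size stay on the stack. */
#define SMALL_SCRATCH 256

/* Hypothetical scratch buffer; declare one as a local variable. */
struct scratch {
    char small[SMALL_SCRATCH];  /* inline storage for the common case */
    char *buf;                  /* points at small[] or heap memory */
    int on_heap;                /* non-zero if buf must be freed */
};

/* Return a zeroed buffer of len bytes, or NULL if allocation fails. */
static char *
scratch_get(struct scratch *sc, size_t len)
{
    assert(sc != NULL);
    if (len <= sizeof(sc->small)) {
        sc->buf = sc->small;
        sc->on_heap = 0;
    } else {
        /* malloc() is a stand-in for the real allocator here. */
        sc->buf = malloc(len);
        sc->on_heap = (sc->buf != NULL);
    }
    if (sc->buf)
        memset(sc->buf, 0, len);
    return sc->buf;
}

/* Release the buffer; only the heap case has anything to free. */
static void
scratch_release(struct scratch *sc)
{
    assert(sc != NULL);
    if (sc->on_heap)
        free(sc->buf);
    sc->buf = NULL;
    sc->on_heap = 0;
}
```

Whether something like this is worth it depends on how long the strings
being matched typically are, which would need measuring.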

It might be possible to optimise the use of charstart out entirely
in some cases.

pws



* Re: VARARR in pattern code
  2014-09-08 14:10 ` VARARR in pattern code Peter Stephenson
@ 2014-09-08 14:24   ` Peter Stephenson
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Stephenson @ 2014-09-08 14:24 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 08 Sep 2014 15:10:37 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> While looking at the problem with repeated *'s, I notice that inside the
> pattern code for closures --- *'s, #'s and ##'s --- there's a VARARR.
> 
> 		/*
> 		 * Array to record the start of characters for
> 		 * backtracking.
> 		 */
> 		VARARR(char, charstart, patinend-patinput);
> 
> If you're interested, that was added to fix a very similar problem with
> pathological backtracking involving negated matches with "~" or "^".

Hmm... I think I tell a lie.  I think this one was added for multibyte
mode --- when backtracking, the only way I know of to guarantee you're
going back a whole character is either to scan the entire string from
the start, or remember, and this does the latter.
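
The "remember" approach can be sketched roughly as follows.  This is a
minimal illustration, not the code in pattern.c: it assumes the record is
a per-byte flag array (one plausible reading of charstart), and the
helper names mark_char_starts() and back_one_char() are made up for the
example.  Only the standard mbrlen() call is real.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: scan s[0..len) forwards in the current locale
 * and set charstart[i] to 1 where a character begins, 0 on the
 * trailing bytes of a multibyte character.  Returns the number of
 * characters found.
 */
static size_t
mark_char_starts(const char *s, size_t len, char *charstart)
{
    mbstate_t mbs;
    size_t i = 0, nchars = 0;

    memset(&mbs, 0, sizeof(mbs));
    memset(charstart, 0, len);
    while (i < len) {
        size_t n = mbrlen(s + i, len - i, &mbs);
        if (n == (size_t)-1 || n == (size_t)-2 || n == 0) {
            /* Invalid, incomplete or NUL: treat as a single byte. */
            memset(&mbs, 0, sizeof(mbs));
            n = 1;
        }
        charstart[i] = 1;
        i += n;
        nchars++;
    }
    return nchars;
}

/*
 * Step back one whole character from byte offset pos (pos > 0),
 * consulting the record instead of rescanning from the start.
 */
static size_t
back_one_char(const char *charstart, size_t pos)
{
    assert(pos > 0);
    do {
        pos--;
    } while (pos > 0 && !charstart[pos]);
    return pos;
}
```

The cost is one byte of record per byte of input, which is why the array
is sized by the length of the remaining input.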

There may be some more efficient way of backtracking through multibyte
strings known to cognoscenti.
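
For UTF-8 specifically, one such way does exist, because continuation
bytes always have the bit pattern 10xxxxxx and can be skipped when moving
backwards.  The sketch below assumes valid UTF-8 input and uses a
hypothetical helper name, utf8_back_one_char().  It does not help with
other multibyte encodings, so it is not a general replacement for the
record.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: step back one character from byte offset pos
 * (pos > 0) in a valid UTF-8 string, without any side record.
 * Bytes of the form 10xxxxxx are continuation bytes, so keep moving
 * back until a byte that is not one of those is reached.
 */
static size_t
utf8_back_one_char(const unsigned char *s, size_t pos)
{
    assert(pos > 0);
    do {
        pos--;
    } while (pos > 0 && (s[pos] & 0xC0) == 0x80);
    return pos;
}
```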

The allocation I originally thought this one was is actually a
zshcalloc() around line 2841 (post patch), so not subject to VARARR(),
and as it's only used by the exclusion code it doesn't slow down
standard glob matches.

pws

