* VARARR in pattern code
From: Peter Stephenson @ 2014-09-08 14:10 UTC
To: Zsh Hackers' List

While looking at the problem with repeated *'s, I notice that inside the
pattern code for closures --- *'s, #'s and ##'s --- there's a VARARR.

    /*
     * Array to record the start of characters for
     * backtracking.
     */
    VARARR(char, charstart, patinend-patinput);

If you're interested, that was added to fix a very similar problem with
pathological backtracking involving negated matches with "~" or "^".
It's otherwise a strange thing to have in pattern matching code (and it
may be why the performance with multiple "*"s was quite so bad).

We just made all VARARR's heap allocation.  It occurs to me this one can
be hit a lot of times when backtracking through a pattern with a lot of
closures.  I wonder if this one should be a special case --- zalloc if
efficient enough?  I haven't done any experiments so may be being
alarmist.

It might be possible to optimise the use of charstart out entirely in
some cases.

pws
* Re: VARARR in pattern code
From: Peter Stephenson @ 2014-09-08 14:24 UTC
To: Zsh Hackers' List

On Mon, 08 Sep 2014 15:10:37 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> While looking at the problem with repeated *'s, I notice that inside the
> pattern code for closures --- *'s, #'s and ##'s --- there's a VARARR.
>
>     /*
>      * Array to record the start of characters for
>      * backtracking.
>      */
>     VARARR(char, charstart, patinend-patinput);
>
> If you're interested, that was added to fix a very similar problem with
> pathological backtracking involving negated matches with "~" or "^".

Hmm... I think I tell a lie.  I think this one was added for multibyte
mode --- when backtracking, the only way I know of to guarantee you're
going back a whole character is either to scan the entire string from
the start, or remember, and this does the latter.  There may be some
more efficient way of backtracking through multibyte strings known to
cognoscenti.

The allocation I originally thought this one was is actually a
zshcalloc() around line 2841 (post patch), so not subject to VARARR(),
and as it's only used by the exclusion code it doesn't slow down
standard glob matches.

pws
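The "remember" approach described above can be sketched roughly as follows. This is an illustration only, not the zsh code: the helper names (utf8_char_len, mark_char_starts, back_one_char) are hypothetical, the sketch assumes well-formed UTF-8 instead of going through the locale's multibyte functions, and it fills the array in a separate pass purely to keep the example short. The only idea taken from the thread is a per-byte array that records where characters start, so that backtracking can step back a whole character without rescanning from the start of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Length in bytes of the UTF-8 character introduced by lead byte c.
 * ASSUMPTION: input is well-formed UTF-8; anything unrecognised is
 * treated as a single-byte character.
 */
static size_t utf8_char_len(unsigned char c)
{
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/*
 * Fill charstart[0..len-1]: 1 where a character begins, 0 for bytes
 * in the middle of a character.  (Hypothetical helper; one flag per
 * input byte, like the char array in the quoted declaration.)
 */
static void mark_char_starts(const char *s, size_t len, char *charstart)
{
    size_t i = 0;

    memset(charstart, 0, len);
    while (i < len) {
        size_t n = utf8_char_len((unsigned char)s[i]);

        if (n > len - i)
            n = len - i;        /* truncated sequence at end of input */
        charstart[i] = 1;
        i += n;
    }
}

/*
 * Backtrack from byte offset pos to the start of the previous
 * character by consulting the remembered flags.
 */
static size_t back_one_char(const char *charstart, size_t pos)
{
    assert(pos > 0);
    do {
        pos--;
    } while (pos > 0 && !charstart[pos]);
    return pos;
}
```

For the 7-byte string made of "a", U+00E9, U+20AC and "z", the flags come out as 1 1 0 1 0 0 1, and repeated calls to back_one_char starting from offset 7 visit offsets 6, 3, 1 and 0. The cost is one flag byte per input byte for each array, which is why how the array is allocated matters when the closure code is entered many times.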