zsh-workers
From: Bart Schaefer <schaefer@brasslantern.com>
To: zsh-workers@zsh.org
Subject: Re: Re[2]: High memory usage on // substitution in one situation, normal usage in other
Date: Sun, 21 May 2017 16:43:22 -0700
Message-ID: <170521164322.ZM5074@torch.brasslantern.com>
In-Reply-To: <201705212227360937.046E4951@gateway.core.mpy.ch>

On May 21, 10:27pm, Manuel Presnitz wrote:
}
} I can reproduce the extraordinary high memory usage ...
} but only if EXTENDEDGLOB is set.

OK, that helps (and I should have caught that this is required for (#b)
to be meaningful).

Let's consider the two examples:

1) arr2 is an associative array mapping 200000 integers to underscores.
   Flattened into a string, you get alternating number-strings and
   underscores separated by spaces.

2) arr is an array of 900000 single underscores.  Flattened it's a
   string of underscores separated by spaces.

Now let's look more closely at the pattern:

    [^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']##

Look at the rightmost $'...' expression:  $'0\037'

That has a "0" character in it.  Probably a typo, but it means that
every time there is a zero digit in $__text, the // recursion has to
save state before proceeding.  This doesn't happen when the text is
nothing but underscores.  The state that is being saved includes the
entire tail of the string from that point onward, so the first
example requires approximately 1.8MB/2 * (the number of zeroes in
all integers from 1 to 200000), whereas the second never recurs
(the ## consumes the entire string).
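
To make the effect of the stray "0" concrete, here is a small C sketch
that models the bracket expression as a set of bytes and measures how
far a repeat over the negated class can run before it has to stop.
This is only an illustration, not zsh's pattern code; the names
in_listed_set and negated_span are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model of the bracket expression from the message:
 *   [^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']
 * Returns 1 if c is one of the listed bytes, i.e. a byte that the
 * negated class refuses to match.
 */
int in_listed_set(unsigned char c)
{
    if (c >= 003 && c <= 007) return 1;   /* $'\03'-$'\07'   */
    if (c >= 013 && c <= 014) return 1;   /* $'\013'-$'\014' */
    if (c >= 016 && c <= 031) return 1;   /* $'\016'-$'\031' */
    if (c == '0') return 1;               /* the stray "0" in $'0\037' */
    if (c == 037) return 1;               /* the \037 in $'0\037' */
    return 0;
}

/*
 * Length of the leading run of s that the negated class can match,
 * i.e. the offset at which a greedy repeat over it must stop.
 */
size_t negated_span(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n] != '\0' && !in_listed_set((unsigned char)s[n]))
        n++;
    return n;
}
```

Under this model negated_span("_ _ _ _") covers all 7 bytes, while
negated_span("9 _ 10 _") stops at offset 5, on the zero in "10".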

The memory is being allocated here:

		/*
		 * Array to record the start of characters for
		 * backtracking.
		 */
		VARARR(char, charstart, patinend-patinput);
		memset(charstart, 0, patinend-patinput);

(pattern.c:3263)
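
For a feel of the numbers in the estimate, the following C sketch
counts the zero digits in 1..n and adds up the tail length at every
zero in a flattened string.  It is a back-of-the-envelope model under
the assumption that each zero costs one buffer the size of the
remaining input; it is not what pattern.c does, the function names are
invented here, and it does not model when each buffer is released.

```c
#include <assert.h>
#include <stddef.h>

/* Count the '0' digits in the decimal forms of 1..n (brute force). */
unsigned long zero_digits_upto(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 1; i <= n; i++)
        for (unsigned long v = i; v > 0; v /= 10)
            if (v % 10 == 0)
                total++;
    return total;
}

/*
 * Simplified model: every '0' in s triggers one buffer the size of
 * the remaining input (compare patinend-patinput in the excerpt).
 * Returns the sum of those sizes over the first len bytes of s.
 */
size_t tail_bytes_at_zeroes(const char *s, size_t len)
{
    size_t total = 0;
    assert(s != NULL);
    for (size_t i = 0; i < len; i++)
        if (s[i] == '0')
            total += len - i;
    return total;
}
```

zero_digits_upto(200000) comes to 88894, which is the multiplier in
the 1.8MB/2 formula above.  For a string made only of underscores and
spaces, tail_bytes_at_zeroes() returns 0, matching the second example.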

