zsh-workers
* High memory usage on // substitution in one situation, normal usage in other
From: Sebastian Gniazdowski @ 2017-05-21  7:28 UTC
  To: zsh-workers

Hello,
The following causes about 55 GB of VIRT memory usage (the process gets killed), while RES memory stays at 1.8 MB, which is roughly equal to ${#__text}.

local -A arr2
for (( i=1; i<=200000; i++ )); do arr2[$i]="_"; done
elems=( "${(kv@)arr2}" )
__text=${elems[*]}
echo ${#__text}
1688894
__text="${__text//(#b)([^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']##)/${(q)match[1]}}"

The following works smoothly:

arr=( ${(s::)${(r:900000::_:)empty}} )
__text=${arr[*]}
echo ${#__text}
1799999
__text="${__text//(#b)([^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']##)/${(q)match[1]}}"

I wonder why there is a difference between those two scenarios, why VIRT rises while RES doesn't, and why VIRT rises at all.

--
Sebastian Gniazdowski
psprint /at/ zdharma.org


* Re: High memory usage on // substitution in one situation, normal usage in other
From: Bart Schaefer @ 2017-05-21 20:03 UTC
  To: zsh-workers

On May 21,  9:28am, Sebastian Gniazdowski wrote:
}
} following will cause like 55 GB of VIRT memory usage (process will
} be killed), having RES memory at 1.8 MB, which roughly equals to
} ${#__text}.

I can't reproduce this.  The for loop takes a lot longer to create the
array than does the simple assignment, but neither example uses a huge
amount of memory.

Since you had a "local -A arr2" in your first example, I ran each of
the examples in an anon function in a fresh shell.  I also tried the
second example both with "local -a arr" added, and as you posted it,
but that didn't really make any difference.

On my system, the associative arr2 case looks like this:

swap    free
412780  81244
(skip gradual shrinking during "for" loop)
412780  33260
412780  33260
412780  58668

The second like this:

swap    free
412780  81420
412780  38092
412780  36748
412780  36772
412780  31524
412780  22692
412780  66532

The $__text value is global in both cases so the delta between the first
and last "free" numbers is the memory consumed by that plus any unused
space in the last unpopped zsh heap block.
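
As a rough worked example of that delta, using only the first and last
"free" figures from the two tables above (the units are whatever the
sampling tool reported, which is not stated here; the variable names are
made up for illustration):

```shell
# First and last "free" samples from the associative-array (arr2) run.
assoc_delta=$(( 81244 - 58668 ))
# First and last "free" samples from the plain-array (arr) run.
plain_delta=$(( 81420 - 66532 ))
echo "arr2 run: $assoc_delta"
echo "arr run:  $plain_delta"
```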

In both cases the // expression is almost instantaneous, much faster than
creating the $__text string in the first place.  Neither uses any swap.
My config.h defines HAVE_MMAP 1, so the zsh heap is allocated as an
anonymous mapfile, but I don't have 55GB free on the filesystem so I
don't think that can be a factor.


* Re[2]: High memory usage on // substitution in one situation, normal usage in other
From: Manuel Presnitz @ 2017-05-21 20:47 UTC
  To: zsh-workers

Good evening, Ladies and Gentlemen,

I can reproduce the extraordinarily high memory usage (until the process is killed) of Sebastian's first code example with zsh-5.3.1-88-gd8c66e6 running on Linux, but only if EXTENDEDGLOB is set.

Best regards,
Manuel.




Bart Schaefer wrote on 21.05.2017 at 13:03:

> On May 21,  9:28am, Sebastian Gniazdowski wrote:
> }
> } following will cause like 55 GB of VIRT memory usage (process will
> } be killed), having RES memory at 1.8 MB, which roughly equals to
> } ${#__text}.
>
> I can't reproduce this.  The for loop takes a lot longer to create the
> array than does the simple assignment, but neither example uses a huge
> amount of memory.
>
> Since you had a "local -A arr2" in your first example, I ran each of
> the examples in an anon function in a fresh shell.  I also tried the
> second example both with "local -a arr" added, and as you posted it,
> but that didn't really make any difference.
>
> On my system, the associative arr2 case looks like this:
>
> swap    free
> 412780  81244
> (skip gradual shrinking during "for" loop)
> 412780  33260
> 412780  33260
> 412780  58668
>
> The second like this:
>
> swap    free
> 412780  81420
> 412780  38092
> 412780  36748
> 412780  36772
> 412780  31524
> 412780  22692
> 412780  66532
>
> The $__text value is global in both cases so the delta between the first
> and last "free" numbers is the memory consumed by that plus any unused
> space in the last unpopped zsh heap block.
>
> In both cases the // expression is almost instantaneous, much faster than
> creating the $__text string in the first place.  Neither uses any swap.
> My config.h defines HAVE_MMAP 1, so the zsh heap is allocated as an
> anonymous mapfile, but I don't have 55GB free on the filesystem so I
> don't think that can be a factor.




* Re: Re[2]: High memory usage on // substitution in one situation, normal usage in other
From: Bart Schaefer @ 2017-05-21 23:43 UTC
  To: zsh-workers

On May 21, 10:27pm, Manuel Presnitz wrote:
}
} I can reproduce the extraordinary high memory usage ...
} but only if EXTENDEDGLOB is set.

OK, that helps (and I should have caught that this is required for (#b)
to be meaningful).

Let's consider the two examples:

1) arr2 is an associative array mapping 200000 integers to underscores.
   Flattened into a string, you get alternating number-strings and
   underscores separated by spaces.

2) arr is an array of 900000 single underscores.  Flattened it's a
   string of underscores separated by spaces.

Now let's look more closely at the pattern:

    [^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']##

Look at the rightmost $'...' expression:  $'0\037'

That has a "0" character in it.  Probably a typo, but it means that
every time there is a zero digit in $__text, the // recursion has to
save state before proceeding.  This doesn't happen when the text is
nothing but underscores.  The state that is being saved includes the
entire tail of the string from that point onward, so in the first
example it requires approximately 1.8MB/2*(the number of zeroes in
all integers from 1 to 200000), and in the second case it never does
recur (the ## consumes the entire string).
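
A quick way to see the stray character, as a minimal sketch that only
inspects the quoted string and does not run the substitution itself.
The names `typo` and `fixed` are made up for illustration, and `fixed`
is only the presumed intent:

```shell
# Last element of the bracket expression as posted: a literal "0"
# followed by the control character with octal code 037.
typo=$'0\037'
# Presumed intent: the control character alone.
fixed=$'\037'
# Print the length of each string in characters.
echo "posted: ${#typo}  presumed intent: ${#fixed}"
```

With the single-character form, "0" is no longer excluded by the negated
bracket expression, so the ## can run across the digits as well.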

The memory is being allocated here:

		/*
		 * Array to record the start of characters for
		 * backtracking.
		 */
		VARARR(char, charstart, patinend-patinput);
		memset(charstart, 0, patinend-patinput);

(pattern.c:3263)


* Re: Re[2]: High memory usage on // substitution in one situation, normal usage in other
From: Sebastian Gniazdowski @ 2017-05-22  3:31 UTC
  To: Bart Schaefer, zsh-workers

On 22 May 2017 at 01:43:52, Bart Schaefer (schaefer@brasslantern.com) wrote:
> Now let's look more closely at the pattern:
>  
> [^$'\03'-$'\07'$'\013'-$'\014'$'\016'-$'\031'$'0\037']##
>  
> Look at the rightmost $'...' expression: $'0\037'
>  
> That has a "0" character in it. Probably a typo, but it means that
> every time there is a zero digit in $__text, the // recursion has to
> save state before proceeding. This doesn't happen when the text is
> nothing but underscores. The state that is being saved includes the
> entire tail of the string from that point onward, so in the first
> example it requires approximately 1.8MB/2*(the number of zeroes in
> all integers from 1 to 200000), and in the second case it never does
> recur (the ## consumes the entire string).

Thanks, I have included that fix in my ZUI library; without this resolution it could hang there for a very long time. It is still quite interesting why VIRT rises while RES actually drops; below is an example htop report:

  PID USER     PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
50574 sgniazdo  31   0 55.2G 1319M     0 R 93.0 16.1  0:00.49 zsh-5.3.1-dev-0 -i

--
Sebastian Gniazdowski
psprint /at/ zdharma.org


* Re: High memory usage on // substitution in one situation, normal usage in other
From: Daniel Shahaf @ 2017-05-22  8:50 UTC
  To: zsh-workers

Bart Schaefer wrote on Sun, 21 May 2017 16:43 -0700:
> The state that is being saved includes the entire tail of the string from
> that point onward, so in the first example it requires approximately
> 1.8MB/2*(the number of zeroes in all integers from 1 to 200000),

The figure in parentheses is:

    % print {1..200000} | tr -dc 0 | wc -c
    88894

(So we can compute the exact predicted memory use and compare it to the
observed use.)
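
That prediction can be sketched as follows, assuming (as in the quoted
approximation) that each zero digit saves on average half of the string.
`len` is the ${#__text} value reported in the first message, `seq` stands
in for zsh's `print {1..200000}`, and the variable names are made up for
illustration:

```shell
# Count the "0" digits in the decimal integers 1..200000.
zeros=$(seq 1 200000 | tr -dc 0 | wc -c | tr -d ' ')

# Length of $__text as reported in the original post.
len=1688894

# Estimate: every zero saves, on average, half of the string.
bytes=$(( zeros * len / 2 ))
echo "zeros=$zeros bytes=$bytes (about $(( bytes / 1000000000 )) GB)"
```

With the rounded 1.8MB figure instead of the exact length, the same
product comes to roughly 80 GB.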


* Re: High memory usage on // substitution in one situation, normal usage in other
From: Bart Schaefer @ 2017-05-22 15:00 UTC
  To: zsh-workers

On May 22,  8:50am, Daniel Shahaf wrote:
}
} > 1.8MB/2*(the number of zeroes in all integers from 1 to 200000),
} 
}     88894
} 
} (So we can compute the exact predicted memory use and compare it to the
} observed use.)

I'd be surprised to find that anyone has a machine with 80GB of virtual
memory that it's worth dedicating to testing this.  (Also that's a really
rough estimate, could be low by as much as a factor of six.)


* Re: High memory usage on // substitution in one situation, normal usage in other
From: Christian Neukirchen @ 2017-05-23 15:33 UTC
  To: zsh-workers

Bart Schaefer <schaefer@brasslantern.com> writes:

> On May 22,  8:50am, Daniel Shahaf wrote:
> }
> } > 1.8MB/2*(the number of zeroes in all integers from 1 to 200000),
> } 
> }     88894
> } 
> } (So we can compute the exact predicted memory use and compare it to the
> } observed use.)
>
> I'd be surprised to find that anyone has a machine with 80GB of virtual
> memory that it's worth dedicating to testing this.  (Also that's a really
> rough estimate, could be low by as much as a factor of six.)

Challenge accepted. ;)  (The machine idles anyway...)

Linux deka 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

zsh 5.1.1 (x86_64-ubuntu-linux-gnu) with EXTENDEDGLOB on:

  PID USER      PRI - -  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 2841 neukirche  20 - -   0 81.4G 77.7G  3064 R 99.8 58.1  8:05.40 zsh big
 2841 neukirche  20 - -   0 82.1G 75.9G  2872 R 18.5 56.8  8:26.14 zsh big
 2841 neukirche  20 - -   0 87.7G 80.8G  2872 R 99.8 60.5  9:58.80 zsh big
 2841 neukirche  20 - -   0 91.2G 84.3G  2872 R 99.7 63.1 11:01.29 zsh big
 2841 neukirche  20 - -   0 93.5G 85.3G  2872 R 100. 63.8 11:54.64 zsh big
 2841 neukirche  20 - -   0   95G 85.7G  2872 R 99.7 64.1 12:49.51 zsh big
 2841 neukirche  20 - -   0   97G 86.2G  2872 R 99.7 64.5 13:45.90 zsh big
 2841 neukirche  20 - -   0  100G 86.8G  2872 R 99.7 65.0 15:14.28 zsh big
 2841 neukirche  20 - -   0  102G 87.4G  2664 R 100. 65.4 16:24.38 zsh big
 2841 neukirche  20 - -   0  105G 88.0G  2664 R 99.7 65.9 17:45.16 zsh big
 2841 neukirche  20 - -   0  107G 88.8G  2664 R 99.0 66.4 19:09.09 zsh big
 2841 neukirche  20 - -   0  109G 89.3G  2664 R 99.5 66.8 20:26.41 zsh big
 2841 neukirche  20 - -   0  116G 93.5G  2664 R 99.0 70.0 24:52.41 zsh big
 2841 neukirche  20 - -   0  118G 92.5G  2664 D 43.4 69.3 27:12.57 zsh big
 2841 neukirche  20 - -   0  122G   95G  2664 R 100. 71.4 30:23.67 zsh big
 2841 neukirche  20 - -   0  128G   98G  2664 R 30.8 73.7 39:08.38 zsh big
 2841 neukirche  20 - -   0  132G  101G  2640 R 99.7 75.7 47:25.95 zsh big

Then the box started to swap as it only has 134G RAM, and I stopped it.

Perhaps the result is interesting nevertheless.

-- 
Christian Neukirchen  <chneukirchen@gmail.com>  http://chneukirchen.org

