* Memory usage of history?
From: Dominik Vogt @ 2016-06-24 13:47 UTC
  To: Zsh Users; +Cc: Robin Dapp

Could someone please explain the implications of having a large
history file?  Does an interactive zsh read the history file into
private memory upon startup, or how does this work?  Is there a
way to reduce memory and cpu consumption somehow?  (A colleague
says his zshs use 200 MB memory each with a history size of a
million lines).

Ciao

Dominik ^_^  ^_^

--

Dominik Vogt
IBM Germany

* Re: Memory usage of history?
From: Eric Cook @ 2016-06-24 22:57 UTC
  To: zsh-users

On 06/24/2016 09:47 AM, Dominik Vogt wrote:
> Could someone please explain the implications of having a large
> history file?

It uses more disk space.

> Does an interactive zsh read the history file into
> private memory upon startup, or how does this work?

zsh reads $HISTSIZE lines from $HISTFILE upon startup. $HISTSIZE may be
set to the same value as $SAVEHIST, which controls the number of lines
kept in $HISTFILE.

> Is there a way to reduce memory and cpu consumption somehow?  (A colleague
> says his zshs use 200 MB memory each with a history size of a
> million lines).

Reducing the number of lines you tell zsh to keep in memory via HISTSIZE
would help.
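
To make the three parameters concrete, a minimal ~/.zshrc fragment might
look like the sketch below. The numbers and the history file path are
assumptions picked for illustration; nobody in the thread proposed these
exact values.

```shell
# Illustrative values only (assumptions, not taken from the thread).
HISTFILE=~/.zsh_history   # file the history is loaded from and saved to
HISTSIZE=20000            # maximum number of entries kept in memory
SAVEHIST=20000            # maximum number of entries written to $HISTFILE
```

Lowering HISTSIZE is the setting that bounds what each running shell
holds in memory; SAVEHIST only bounds the file.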

* Re: Memory usage of history?
From: Bart Schaefer @ 2016-06-25  1:47 UTC
  To: Zsh Users, Robin Dapp

On Fri, Jun 24, 2016 at 6:47 AM, Dominik Vogt <vogt@linux.vnet.ibm.com> wrote:
>
> (A colleague
> says his zshs use 200 MB memory each with a history size of a
> million lines).

To expand on Eric's answer, zsh reads the entire $HISTFILE and retains
the last $HISTSIZE entries.  So a large $HISTFILE also slows down
startup, even if it doesn't consume lots of memory.

I can't imagine anyone having a million useful lines of history.  A
few tens of thousands at most.  Things he might consider that would
allow him to reduce SAVEHIST and/or HISTSIZE without losing too much
information:

* Set the hist_ignore_all_dups option, if he doesn't already.
* Set the hist_save_no_dups option, similarly.
* Define a zshaddhistory function to filter out commands that are
  unlikely to be used again.

If he isn't already ignoring / not saving duplicates, an interesting
experiment might be to add hist_ignore_all_dups without changing
HISTSIZE, then run zsh and see how many lines of history actually end
up being retained.
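
The two options and the suggested experiment could be tried roughly as
follows. This is a sketch for ~/.zshrc, not something posted in the
thread; the counting command in the comment is one possible way to see
how many entries were retained.

```shell
# ~/.zshrc fragment (zsh only).
setopt hist_ignore_all_dups   # a new entry removes any older duplicate of it
setopt hist_save_no_dups      # omit older duplicates when writing $HISTFILE

# The experiment: leave HISTSIZE untouched, start a new interactive zsh,
# and count the entries that are still in the list, for example with:
#   fc -l 1 | wc -l
```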

* Re: Memory usage of history?
From: Nikolay Aleksandrovich Pavlov (ZyX) @ 2016-06-25 17:33 UTC
  To: Bart Schaefer, Zsh Users, Robin Dapp

25.06.2016, 04:49, "Bart Schaefer" <schaefer@brasslantern.com>:
> On Fri, Jun 24, 2016 at 6:47 AM, Dominik Vogt <vogt@linux.vnet.ibm.com> wrote:
>>  (A colleague
>>  says his zshs use 200 MB memory each with a history size of a
>>  million lines).
>
> To expand on Eric's answer, zsh reads the entire $HISTFILE and retains
> the last $HISTSIZE entries. So a large $HISTFILE also slows down
> startup, even if it doesn't consume lots of memory.
>
> I can't imagine anyone having a million useful lines of history. A
> few tens of thousands at most. Things he might consider that would
> allow him to reduce SAVEHIST and/or HISTSIZE without losing too much
> information:
> * Set the hist_ignore_all_dups option, if he doesn't already.
> * Set the hist_save_no_dups option, similarly.
> * Define a zshaddhistory function to filter out commands that are
> unlikely to be used again.
>
> If he isn't already ignoring / not saving duplicates, an interesting
> experiment might be to add hist_ignore_all_dups without changing
> HISTSIZE, then run zsh and see how many lines of history actually end
> up being retained.

Actually there may be a better solution: consider the case where zsh

1. allows saving user-defined metadata in the history file and
2. gives the user control over what exactly will be removed.

Specifically, the first may be used to save information about:

1. How often the command is used (total number of uses; anything else,
   like “uses per month”, would be harder to determine).
2. Time it took to type the command (when it was typed for the first
   time): the time between the first self-insert (or $*BUFFER
   modification, if it was constructed by a widget) and accept-line.
3. Last time the command was run.
4. Time it took the command to finish (average among all runs).
5. What the exit code was (a hash mapping exit code to the number of
   times it occurred).

The second is supposed to be a function like `zshhistkey` that returns
basically the same thing as the function used for `(o+)`: it accepts a
history entry with attached metadata (passed through arguments or via a
local parameter that is an associative array; metadata saved by
EXTENDED_HISTORY should also be passed) and saves something in $REPLY.
History entries with the lowest values in $REPLY will be removed.

On this basis it would be possible to construct a more useful filter. I
guess the first three would be enough (when removing history lines,
remove the least often used and then the fastest to type commands first,
but always keep commands typed during the last hour: `zshhistkey() { REPLY="$(printf "%u-%020u-%020.2g" $(( $(date +%s) - $metadata[last_run_time] < 60 * 60 ? 1 : 0 )) $metadata[num_runs] $metadata[type_duration])" }`).
EXTENDED_HISTORY already provides 4 (though I do not think it provides
an “average”) and 3, but I do not find that very useful (especially 4;
3 is needed to protect the most recent commands).

Without something like this, the “set $HISTSIZE and $SAVEHIST to a
rather large number” strategy (in addition to the options you
mentioned) is the best option; I personally have both set to 65536. I
have no idea how one may construct a “zshaddhistory function” that
“filters out commands that are unlikely to be used again” without
somehow knowing what these stats will be in the future.
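
The key construction in that proposal can be tried on its own, outside
of any history hook. The sketch below is purely illustrative: zsh has no
`zshhistkey` hook and no `$metadata` array, so the helper name `histkey`
and its four positional arguments are assumptions standing in for the
proposed metadata.

```shell
# Hypothetical helper; zsh provides no zshhistkey hook or $metadata array.
# Arguments (all stand-ins for the proposed metadata):
#   $1 last run time (epoch seconds)   $2 number of runs
#   $3 seconds spent typing            $4 current time (epoch seconds)
histkey() {
    # 1 if the command was run within the last hour, else 0.
    local recent=$(( $4 - $1 < 3600 ? 1 : 0 ))
    # Zero-padded fields let a plain string sort order the keys.
    printf '%u-%020u-%020.2g\n' "$recent" "$2" "$3"
}

histkey 1000 3 12 2000      # run 1000 s ago: key starts with "1-"
histkey 1000 3 12 100000    # run long ago:   key starts with "0-"
```

Under the proposal, entries whose key sorts lowest would be dropped
first, so anything from the last hour outranks every older entry
regardless of its other fields.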

* Re: Memory usage of history?
From: Bart Schaefer @ 2016-06-25 17:46 UTC
  To: ZyX; +Cc: Robin Dapp, Zsh Users

On Jun 25, 2016 10:33 AM, "Nikolay Aleksandrovich Pavlov (ZyX)" <kp-pav@yandex.ru> wrote:
> I have no idea how one may construct a “zshaddhistory function” that
> “filters out commands that are unlikely to be used again” without
> somehow knowing what these stats will be in the future.

As an example, if any command I type contains an argument "foo", "bar"
or "baz", it is probably a throwaway.  There must be other similar
criteria, e.g., why ever save "cd .."?
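
A zshaddhistory hook along those lines might look like the sketch below.
It uses only the throwaway criteria named in the message; the matching
is deliberately simple (whole words separated by spaces), which is an
assumption of this example, and it would also match a command that is
itself named foo, bar or baz.

```shell
# Sketch of a zshaddhistory hook. Returning 1 tells zsh not to save the
# line; returning 0 saves it as usual.
zshaddhistory() {
    # zsh passes the whole command line as $1, trailing newline included.
    local line="${1%$'\n'}"

    # Drop a bare "cd ..".
    case "$line" in
        "cd ..") return 1 ;;
    esac

    # Drop lines containing foo, bar or baz as a space-separated word.
    case " $line " in
        *" foo "*|*" bar "*|*" baz "*) return 1 ;;
    esac

    return 0
}
```

With this in place, `touch foo` and `cd ..` would be skipped, while
`cd ../src` or `ls food` would still be saved.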

* Re: Memory usage of history?
From: Bart Schaefer @ 2016-06-26 23:29 UTC
  To: Zsh Users

On Jun 25,  8:33pm, Nikolay Aleksandrovich Pavlov (ZyX) wrote:
} Subject: Re: Memory usage of history?
}
} 1. allows saving user-defined metadata in history file and

I'm not sure the answer to the history file being too large is to
make it even larger by cramming in all sorts of other data.  This
would be even slower to parse at load time as well.

} 2. allows user to get control over what exactly will be removed.

In addition to all the other stuff I mentioned, I forgot about the
relatively recent addition of the HISTORY_IGNORE variable, which can
be a pattern that matches lines to leave out.  That would be the
best way to handle my "foo is a throwaway" and similar criteria.

} Specifically first may be used to save information about
}
} 1. How often the command is used (total number of uses, anything else like
}    "uses per month" would be harder to determine).
} 2. Time it took command to type (when it was typed for the first time)
}    (time between first self-insert (or $*BUFFER modification if it was
}    constructed by a widget) and accept-line).
} 3. Last time command was run.
} 4. Time it took command to finish (average among all runs).
} 5. What was the exit code (hash exit code - number of times it occurred).

I find these to be very unlikely criteria for deciding what's interesting
in the history.

For one thing, "time it took to type" is going to be really hard to get
right; multi-line commands have multiple accept-line calls, and you'd
have to filter out commands that were recalled from the history or you'd
get an average much too small.

Larger number of uses would be skewed towards really simple things, and
in fact (at least in my own case) the LESS often I use a command, the
more likely I am to want it from the history (unless it's one of those
throwaways I mentioned in another message), because I can remember the
ones I use a lot without zsh's help.  If I use it often enough, I can
make an alias or keybinding for it and not need to search history.

How long the command took to run seems entirely unrelated to whether
it is history-worthy (and also doesn't work with shared/incremental
history).  What would you use the exit code for, except maybe weeding
out typos?

I like Christian Neukirchen's idea of maintaining a daily archive.
Adding a function / keybinding to search through an alternate history
store seems more manageable than either having a huge history always
in memory or a complicated AI for storing only interesting bits.
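
For the HISTORY_IGNORE route, a pattern covering the same throwaway
examples could be set as sketched below. The pattern itself is an
assumption written for this illustration. As far as the zsh manual
describes it, the pattern is applied when the history file is written,
so matching lines may still be present in the running shell's in-memory
list.

```shell
# Assumed pattern, mirroring the throwaway examples from this thread:
# a bare "cd ..", or any line with foo, bar or baz as a later word.
HISTORY_IGNORE='(cd ..|* (foo|bar|baz)|* (foo|bar|baz) *)'
```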

* Re: Memory usage of history?
From: Nikolay Aleksandrovich Pavlov (ZyX) @ 2016-06-27  0:23 UTC
  To: Bart Schaefer, Zsh Users

27.06.2016, 02:30, "Bart Schaefer" <schaefer@brasslantern.com>:
> On Jun 25, 8:33pm, Nikolay Aleksandrovich Pavlov (ZyX) wrote:
> } Subject: Re: Memory usage of history?
> }
> } 1. allows saving user-defined metadata in history file and
>
> I'm not sure the answer to the history file being too large is to
> make it even larger by cramming in all sorts of other data. This
> would be even slower to parse at load time as well.

The idea is that the history file need not be too large, but without
more advanced criteria a big SAVEHIST value is needed so as not to miss
useful but uncommon entries. So adding metadata will reduce the history
size, not because metadata makes each entry smaller, but because a
smaller SAVEHIST is sufficient.

>
> } 2. allows user to get control over what exactly will be removed.
>
> In addition to all the other stuff I mentioned, I forgot about the
> relatively recent addition of the HISTORY_IGNORE variable, which can
> be a pattern that matches lines to leave out. That would be the
> best way to handle my "foo is a throwaway" and similar criteria.

I cannot say I have any such patterns.

>
> } Specifically first may be used to save information about
> }
> } 1. How often the command is used (total number of uses, anything else like
> } "uses per month" would be harder to determine).
> } 2. Time it took command to type (when it was typed for the first time)
> } (time between first self-insert (or $*BUFFER modification if it was
> } constructed by a widget) and accept-line).
> } 3. Last time command was run.
> } 4. Time it took command to finish (average among all runs).
> } 5. What was the exit code (hash exit code - number of times it occurred).
>
> I find these to be very unlikely criteria for deciding what's interesting
> in the history.
>
> For one thing, "time it took to type" is going to be really hard to get
> right; multi-line commands have multiple accept-line calls, and you'd
> have to filter out commands that were recalled from the history or you'd
> get an average much too small.

Filtering out commands that were recalled from history is not hard:
there are not many widgets that do this. Though for commands that were
recalled from history and then modified, it may be practical to add the
“time it took to type” of the original command to that of the modified
one. Also, some prompt %format allows determining whether a command is a
continuation, so saving the previous time and adding it on the next
accept-line is not a problem.

>
> Larger number of uses would be skewed towards really simple things, and
> in fact (at least in my own case) the LESS often I use a command, the
> more likely I am to want it from the history (unless it's one of those
> throwaways I mentioned in another message), because I can remember the
> ones I use a lot without zsh's help. If I use it often enough, I can
> make an alias or keybinding for it and not need to search history.

My main point was to have a custom function that allows adjusting the
criteria. I suggested this because I tend to keep in history some
commands which are rather easy to retype, but which I need fast, and I
do not want to have 100500 aliases for easy commands in my zshrc.

>
> How long the command took to run seems entirely unrelated to whether
> it is history-worthy (and also doesn't work with shared/incremental
> history). What would you use the exit code for, except maybe weeding
> out typos?

The exit code is for typos, and it was put in last place because it is
not very useful for other purposes. How long the command took to run is
the second-least-useful criterion, but it still has something to do with
history: usually I do not want to repeat long-running commands; if
needed, they are the first candidates to be run using `screen`/`tmux`/…,
which will be another history entry.

>
> I like Christian Neukirchen's idea of maintaining a daily archive.
> Adding a function / keybinding to search through an alternate history
> store seems more manageable than either having a huge history always
> in memory or a complicated AI for storing only interesting bits.

So far I am fine with my variant, “just use large SAVEHIST and
HISTSIZE”. I just suggested a way that came to my mind to reduce the
number of entries that need to be stored.

Thread overview: 7+ messages
2016-06-24 13:47 Memory usage of history? Dominik Vogt
2016-06-24 22:57 ` Eric Cook
2016-06-25  1:47 ` Bart Schaefer
2016-06-25 17:33   ` Nikolay Aleksandrovich Pavlov (ZyX)
2016-06-25 17:46     ` Bart Schaefer
2016-06-26 23:29     ` Bart Schaefer
2016-06-27  0:23       ` Nikolay Aleksandrovich Pavlov (ZyX)