Hi,

I recently enabled HIST_EXPIRE_DUPS_FIRST on my systems in order to get
the most out of zsh_history. It worked great; however, on two separate
occasions a large chunk of my history file was lost. I narrowed it down
to HIST_EXPIRE_DUPS_FIRST.

The way I can sometimes reproduce it is to have multiple zsh shells exit
at the very same time. It happens when I terminate a tmux session that
has 10+ zsh instances, or when I reboot my system while I have dozens of
urxvt instances open. This makes all of the zsh instances exit and do
their magic.

However, this happens only if I intentionally litter my .zsh_history so
that zsh actually wants to run this logic, meaning the steps to
reproduce would be:

- Exit multiple zsh instances at the very same time.
- Have .zsh_history big enough (HISTSIZE, SAVEHIST) that it triggers
  this logic.

I had it happen to me (outside of trying to reproduce it) twice: on one
system as root, on another as a non-root user.

Before reporting it I checked all the other things that could lead to
this corruption, and I found a single user reporting the very same
problem a year and a half ago on stackoverflow[1]; he does indeed have
histexpiredupsfirst enabled.

The relevant configuration change that I made to enable this (among a
few others):

 HISTSIZE='128000'
-SAVEHIST='128000'
+SAVEHIST='96000'
+setopt hist_expire_dups_first
 setopt hist_ignore_dups
+setopt hist_ignore_all_dups
+setopt hist_find_no_dups
+setopt hist_save_no_dups

I have since disabled this feature because of those corruptions. I would
love to get back to it though; perhaps adding some locking mechanism
would help here?

[1] https://stackoverflow.com/questions/69434630/closing-multiple-iterm2-tabs-makes-zsh-history-lose-most-of-the-history

-- Piotr.
On Sun, Mar 19, 2023 at 4:56 AM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> They way I can reproduce it sometimes is if multiple zsh shells exit at
> the very same time. It happens when I terminate tmux session that have
> 10+ zsh instances,or when I just reboot my system while I have dozens of
> urxvt instances open.
>
> +SAVEHIST='96000'
> +setopt hist_expire_dups_first
>
> I have disabled this feature since due those corruptions. Would love to
> get back to it though, perhaps adding some locking mechanism would help
> here?
There is a locking mechanism. A couple of things may be happening:
1) Your home directory (or ZDOTDIR) is on a remote filesystem with
asynchronous mounting, so the locking mechanism doesn't always work.
You haven't described your system configuration in any detail, but
from what you have said this one seems unlikely.
2) The locking mechanism doesn't work for some other reason. Make
sure the HIST_SAVE_BY_COPY option is set (it's the default but check
anyway).
3) When exiting a session (or particularly when rebooting), there may
be a limited amount of time for the processes to clean up and exit
before they are killed. With the combination of options you have,
each shell must re-read the history file to merge it with its local
history before writing it back. With multiple shells and tens of
thousands of lines of history, this might take longer than all the
shells are allowed to stay alive.
Try putting something in your .zlogout file that writes a timestamp to
a file (different for each shell, e.g., use $$ in the filename).
Since .zlogout runs after history is saved, if you find any of those
files to be missing, that's a clue that #3 is happening. You could
also look to see if a ".zsh_history.new" file is left behind.
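
A minimal sketch of the timestamp suggestion above, written so that it can be pasted into ~/.zlogout. The directory name and the ZLOGOUT_STAMP_DIR variable are hypothetical choices for illustration, not anything zsh defines. Note that zsh reads .zlogout only when a login shell exits, so shells started as non-login shells would not leave a stamp regardless of how they were terminated.

```shell
# Sketch for ~/.zlogout: leave one timestamp file per exiting shell.
# ZLOGOUT_STAMP_DIR and the default directory are hypothetical names.
stampdir="${ZLOGOUT_STAMP_DIR:-$HOME/.zlogout-stamps}"
mkdir -p "$stampdir"

# $$ is the PID of the exiting shell, so every shell writes its own file.
date '+%Y-%m-%d %H:%M:%S' > "$stampdir/exit.$$"
```

After the next mass exit, comparing the number of files in that directory with the number of shells that were open, and checking whether a .zsh_history.new file exists next to the history file, should show whether some shells were killed before they finished.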
Hi,
On 19/03/2023 17.53, Bart Schaefer wrote:
> On Sun, Mar 19, 2023 at 4:56 AM Piotr Karbowski
> <piotr.karbowski@protonmail.ch> wrote:
>>
>> They way I can reproduce it sometimes is if multiple zsh shells exit at
>> the very same time. It happens when I terminate tmux session that have
>> 10+ zsh instances,or when I just reboot my system while I have dozens of
>> urxvt instances open.
>>
>> +SAVEHIST='96000'
>> +setopt hist_expire_dups_first
>>
>> I have disabled this feature since due those corruptions. Would love to
>> get back to it though, perhaps adding some locking mechanism would help
>> here?
>
> There is a locking mechanism. A couple of things may be happening:
>
> 1) Your home directory (or ZDOTDIR) is on a remote filesystem with
> asynchronous mounting, so the locking mechanism doesn't always work.
> You haven't described your system configuration in any detail, but
> from what you have said this one seems unlikely.
> 2) The locking mechanism doesn't work for some other reason. Make
> sure the HIST_SAVE_BY_COPY option is set (it's the default but check
> anyway).
> 3) When exiting a session (or particularly when rebooting), there may
> be a limited amount of time for the processes to clean up and exit
> before they are killed. With the combination of options you have,
> each shell must re-read the history file to merge it with its local
> history before writing it back. With multiple shells and tens of
> thousands of lines of history, this might take longer than all the
> shells are allowed to stay alive.
>
> Try putting something in your .zlogout file that writes a timestamp to
> a file (different for each shell, e.g., use $$ in the filename).
> Since .zlogout runs after history is saved, if you find any of those
> files to be missing, that's a clue that #3 is happening. You could
> also look to see if a ".zsh_history.new" file is left behind.

Thanks for the response, and apologies for the missing details.

I am running XFS (on top of LVM, on top of dm-crypt). The current setopt
output (without the expire dups first) is:

autopushd
extendedglob
noflowcontrol
histignoredups
histignorespace
interactive
interactivecomments
promptsubst
pushdignoredups
pushdminus
shinstdin

HIST_SAVE_BY_COPY does not seem to be enabled and I am unable to enable
it. Running

    setopt HIST_SAVE_BY_COPY

does not bring this option into the list I see when running setopt.
This is zsh-5.9.

Running unsetopt without params does show nohistsavebycopy there;
however, running

    setopt histsavebycopy

does nothing: it still remains as nohistsavebycopy in the unsetopt
output.

The question then is: why does this feature remain disabled, and why can
it not be toggled on? Even running zsh -dfi to get a shell without zshrc
behaves the same way; this feature remains disabled and cannot be
toggled on.

-- Piotr.
On Sun, Mar 19, 2023 at 11:38 AM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> The HIST_SAVE_BY_COPY does not seems to be enabled and I am unable to
> enable it.
"setopt" with no arguments shows you only the options that differ from
the defaults. So if HIST_SAVE_BY_COPY does not show up, it IS
enabled.
Use "set -o" (with no other arguments) to see the full set of options
and on/off status, or "setopt kshoptionprint" before running "setopt"
alone.
If an option shows up with a "no" prefix in unsetopt output, that
means it is on. There is no third state.
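
As an illustration of the point above, here is a small sketch of two ways to check a single option without reading the "no"-prefixed lists. It assumes zsh is installed and on PATH; the -f flag is used only to skip the user's startup files so that the result reflects the defaults.

```shell
# Query a clean zsh (-f skips the user startup files) so that no
# personal configuration affects the result.
zsh -fc '
  # [[ -o NAME ]] is true when the option is currently set.
  if [[ -o histsavebycopy ]]; then
    print "histsavebycopy is on"
  else
    print "histsavebycopy is off"
  fi
  # "set -o" prints every option together with its on/off status.
  set -o | grep histsavebycopy
'
```

To check the current interactive session rather than a clean shell, the same commands can be typed directly at the prompt without the zsh -fc wrapper.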
Hi,
On 19/03/2023 22.01, Bart Schaefer wrote:
> On Sun, Mar 19, 2023 at 11:38 AM Piotr Karbowski
> <piotr.karbowski@protonmail.ch> wrote:
>>
>> The HIST_SAVE_BY_COPY does not seems to be enabled and I am unable to
>> enable it.
>
> "setopt" with no arguments shows you only the options that differ from
> the defaults. So if HIST_SAVE_BY_COPY does not show up, it IS
> enabled.
>
> Use "set -o" (with no other arguments) to see the full set of options
> and on/off status, or "setopt kshoptionprint" before running "setopt"
> alone.
>
> If an option shows up with a "no" prefix in unsetopt output, that
> means it is on. There is no third state.
That makes sense, thanks. 'set -o' does confirm 'nohistsavebycopy off'.
I looked around in Src/hist.c where it copies the history, and indeed
it uses O_EXCL, so it does do the locking.
I still lost a few thousand lines from the end of .zsh_history again
yesterday, when I hit a bug with openbox+glib-2.76.0 that crashed my
entire X session, with many urxvt and zsh instances in it, and I had to
restore the file from an hourly borgbackup.
With this, I do not know where the problem originates, but I do see
that I am not the only one: the stackoverflow user on macOS hits this
problem as well.
Anything else I could provide to help narrow it down?
-- Piotr.
On Sun, Mar 19, 2023 at 2:17 PM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> Anything else I could provide to help norrowing it down?
Did you try my suggestion of timestamps from .zlogout?
If the shell is being killed from a controlling app or by the OS,
there's not a lot we can do. If that's not what's happening, it would
be helpful to know.
Hi,
On 19/03/2023 22.34, Bart Schaefer wrote:
> On Sun, Mar 19, 2023 at 2:17 PM Piotr Karbowski
> <piotr.karbowski@protonmail.ch> wrote:
>> Anything else I could provide to help norrowing it down?
> Did you try my suggestion of timestamps from .zlogout?
>
> If the shell is being killed from a controlling app or by the OS,
> there's not a lot we can do. If that's not what's happening, it would
> be helpful to know.
Added to .zlogout; I will report back my findings when I can reproduce
it again. However, looking at the code, it writes the data to a new file
and then does a rename. The rename is guaranteed to be atomic, so if zsh
instances are killed I would expect at worst to lose the new history
entries, not the end part of the history file. This looks like something
bad happened and it still renamed the partial(?) file into .zsh_history
at the end.
-- Piotr.
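
The write-then-rename reasoning in the last message can be sketched generically as follows. This is not zsh's actual code from Src/hist.c, only an illustration of the pattern; all paths are hypothetical demo files created in a temporary directory, and the noclobber redirect stands in for opening the file with O_EXCL.

```shell
# Generic write-then-rename sketch; NOT zsh's implementation.
# All file names below are hypothetical demo paths.
workdir=$(mktemp -d)
histfile="$workdir/history"
tmpfile="$histfile.new"

printf '%s\n' 'old entry 1' 'old entry 2' > "$histfile"

# With noclobber (set -C), ">" fails if the target already exists,
# which approximates creating the file with O_CREAT|O_EXCL.
if ( set -C; : > "$tmpfile" ) 2>/dev/null; then
    # Write the complete new contents to the temporary file first.
    { cat "$histfile"; printf '%s\n' 'new entry'; } >> "$tmpfile"
    # rename(2) within one filesystem replaces the target atomically.
    mv "$tmpfile" "$histfile"
else
    echo "another writer already created $tmpfile; skipping" >&2
fi

wc -l < "$histfile"
```

In this pattern, a process killed before the final mv leaves the old file untouched and a stray ".new" file behind, which is why a leftover .zsh_history.new would be a useful clue. It also shows why a truncated history file is surprising: it would require the rename to happen after an incomplete write.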