zsh-users
* The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Piotr Karbowski @ 2023-03-19 11:56 UTC
  To: zsh-users

Hi,

Recently I went ahead and enabled HIST_EXPIRE_DUPS_FIRST on my systems
in order to get the most out of zsh_history. It worked great; however,
on two separate occasions I lost a large chunk of my history file. I
narrowed it down to HIST_EXPIRE_DUPS_FIRST.

The way I can sometimes reproduce it is to have multiple zsh shells exit
at the very same time. It happens when I terminate a tmux session that
has 10+ zsh instances, or when I just reboot my system while I have
dozens of urxvt instances open. This causes all of the zsh instances to
exit and do their magic. However, this happens only if I intentionally
litter my .zsh_history to make zsh actually want to run this logic,
meaning the steps to reproduce would be:

- Exit multiple zsh instances at the very same time.
- Have .zsh_history big enough (HISTSIZE, SAVEHIST) so that it triggers
this logic.

I had it happen to me (outside of trying to reproduce it) twice: on one
system as root, on another as a non-root user.

Before reporting it I checked all the other things that could lead to
this corruption, and I found a single user reporting the very same
problem a year and a half ago on Stack Overflow[1]; he does indeed have
histexpiredupsfirst enabled.

The relevant configuration change that I made to enable this (among a
few others):

      HISTSIZE='128000'
     -SAVEHIST='128000'
     +SAVEHIST='96000'
     +setopt hist_expire_dups_first
      setopt hist_ignore_dups
     +setopt hist_ignore_all_dups
     +setopt hist_find_no_dups
     +setopt hist_save_no_dups

I have since disabled this feature due to those corruptions. I would
love to get back to it though; perhaps adding some locking mechanism
would help here?

[1]
https://stackoverflow.com/questions/69434630/closing-multiple-iterm2-tabs-makes-zsh-history-lose-most-of-the-history

-- Piotr.




* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Bart Schaefer @ 2023-03-19 16:53 UTC
  To: Piotr Karbowski; +Cc: zsh-users

On Sun, Mar 19, 2023 at 4:56 AM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> The way I can sometimes reproduce it is to have multiple zsh shells exit
> at the very same time. It happens when I terminate a tmux session that
> has 10+ zsh instances, or when I just reboot my system while I have
> dozens of urxvt instances open.
>
>      +SAVEHIST='96000'
>      +setopt hist_expire_dups_first
>
> I have since disabled this feature due to those corruptions. I would
> love to get back to it though; perhaps adding some locking mechanism
> would help here?

There is a locking mechanism.  A couple of things may be happening:

1) Your home directory (or ZDOTDIR) is on a remote filesystem with
asynchronous mounting, so the locking mechanism doesn't always work.
You haven't described your system configuration in any detail, but
from what you have said this one seems unlikely.
2) The locking mechanism doesn't work for some other reason.  Make
sure the HIST_SAVE_BY_COPY option is set (it's the default but check
anyway).
3) When exiting a session (or particularly when rebooting), there may
be a limited amount of time for the processes to clean up and exit
before they are killed.  With the combination of options you have,
each shell must re-read the history file to merge it with its local
history before writing it back.  With multiple shells and tens of
thousands of lines of history, this might take longer than all the
shells are allowed to stay alive.

Try putting something in your .zlogout file that writes a timestamp to
a file (different for each shell, e.g., use $$ in the filename).
Since .zlogout runs after history is saved, if you find any of those
files to be missing, that's a clue that #3 is happening.  You could
also look to see if a ".zsh_history.new" file is left behind.
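
A minimal sketch of the check described above, for illustration only.
The directory ~/.zlogout-stamps, the variable STAMP_DIR and both function
names are hypothetical; only $$ and the ".zsh_history.new" name come from
the suggestion itself. It is plain POSIX shell, which zsh also accepts.

```shell
# Hypothetical helpers; STAMP_DIR and the function names are assumptions.
STAMP_DIR="${STAMP_DIR:-$HOME/.zlogout-stamps}"

# Record that this shell got as far as running .zlogout.
write_logout_stamp() {
    mkdir -p "$STAMP_DIR" || return 1
    # One file per shell: $$ is the PID of the exiting shell.
    # ">|" overwrites even if the noclobber option is set.
    date '+%Y-%m-%d %H:%M:%S' >| "$STAMP_DIR/stamp.$$"
}

# Report whether a temporary "<history file>.new" was left behind.
# $1: path of the history file (defaults to ~/.zsh_history).
check_leftover_new() {
    if [ -e "${1:-$HOME/.zsh_history}.new" ]; then
        echo "leftover: ${1:-$HOME/.zsh_history}.new"
    else
        echo "no leftover"
    fi
}
```

One way to use it would be to define these in .zlogout and call
write_logout_stamp there; after the next mass exit, compare the number of
stamp files with the number of shells that were open, then run
check_leftover_new.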



* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Piotr Karbowski @ 2023-03-19 18:38 UTC
  To: Bart Schaefer; +Cc: zsh-users

Hi,

Thanks for the response and apologies for the missing details.

I am running XFS (on top of LVM, on top of dm-crypt).

The current setopt output is (without the expire dups first):

     autopushd
     extendedglob
     noflowcontrol
     histignoredups
     histignorespace
     interactive
     interactivecomments
     promptsubst
     pushdignoredups
     pushdminus
     shinstdin

HIST_SAVE_BY_COPY does not seem to be enabled and I am unable to
enable it. Running

     setopt HIST_SAVE_BY_COPY

does not bring this option into the list I see when running setopt. This
is zsh-5.9. Running unsetopt without parameters does show
nohistsavebycopy there; however, running

     setopt histsavebycopy

does nothing; it still remains as nohistsavebycopy in the unsetopt output.

The question then is: why does this feature remain disabled, and why can
it not be toggled on? Even running zsh -dfi to get a shell without zshrc
behaves the same way; this feature remains disabled and cannot be
toggled on.

-- Piotr.




* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Bart Schaefer @ 2023-03-19 21:01 UTC
  To: Piotr Karbowski; +Cc: zsh-users

On Sun, Mar 19, 2023 at 11:38 AM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> HIST_SAVE_BY_COPY does not seem to be enabled and I am unable to
> enable it.

"setopt" with no arguments shows you only the options that differ from
the defaults.  So if HIST_SAVE_BY_COPY does not show up, it IS
enabled.

Use "set -o" (with no other arguments) to see the full set of options
and on/off status, or "setopt kshoptionprint" before running "setopt"
alone.

If an option shows up with a "no" prefix in unsetopt output, that
means it is on.  There is no third state.
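
As a small illustration of checking one option without reading the
setopt/unsetopt listings, the sketch below asks zsh directly. The
function name zsh_option_state is hypothetical. It starts a fresh
"zsh -f" (no startup files), so it reports the default state, not the
state of your current session; in a live session you would run the inner
test, [[ -o histsavebycopy ]], directly.

```shell
# Hypothetical helper: print "on" or "off" for a zsh option name.
# Uses zsh's "[[ -o NAME ]]" test, which is true when the option is on.
# "zsh -f" skips startup files, so this shows the built-in default.
zsh_option_state() {
    zsh -fc '[[ -o $1 ]] && echo on || echo off' zsh "$1"
}

# Example call (option names are case-insensitive, underscores ignored):
#   zsh_option_state histsavebycopy
```

For the full listing mentioned above, "set -o | grep histsavebycopy" in
the live session narrows the output to the one line of interest.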



* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Piotr Karbowski @ 2023-03-19 21:17 UTC
  To: Bart Schaefer; +Cc: zsh-users

Hi,

That makes sense, thanks. 'set -o' does confirm 'nohistsavebycopy off'.

I looked around in Src/hist.c where it copies the history, and indeed
it does use O_EXCL, so it does do the locking.

I still lost a few thousand lines from the end of .zsh_history again
yesterday, when I hit a bug with openbox+glib-2.76.0 that crashed my
entire X session, with many urxvt and zsh instances in it, and I had to
restore it from an hourly borgbackup.

With this, I do not know where this problem originates, but I do see
that I am not the only one: the Stack Overflow user on macOS hits this
problem as well.

Anything else I could provide to help narrow it down?

-- Piotr.




* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Bart Schaefer @ 2023-03-19 21:34 UTC
  To: Piotr Karbowski; +Cc: zsh-users

On Sun, Mar 19, 2023 at 2:17 PM Piotr Karbowski
<piotr.karbowski@protonmail.ch> wrote:
>
> Anything else I could provide to help narrow it down?

Did you try my suggestion of timestamps from .zlogout?

If the shell is being killed from a controlling app or by the OS,
there's not a lot we can do.  If that's not what's happening, it would
be helpful to know.



* Re: The HIST_EXPIRE_DUPS_FIRST might corrupt and wipe partially history file if many shells exit at the same time
From: Piotr Karbowski @ 2023-03-19 21:53 UTC
  To: Bart Schaefer; +Cc: zsh-users

Hi,

Added to .zlogout; I will report back the findings when I can reproduce
it again. However, looking at the code, it writes the data to a new file
and then does a rename. The rename is guaranteed to be atomic, so if zsh
instances are killed I would expect at worst to lose the new history
entries, not the end part of the history file. This looks like something
bad happened and it still renamed the partial(?) file into .zsh_history
at the end.
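
To make the expectation concrete, here is a rough sketch of the
write-new-then-rename pattern in plain shell. It is NOT the code from
Src/hist.c; the function name save_by_copy and its behaviour on failure
are assumptions made only for illustration. It uses the noclobber option
(set -C) so that creating the ".new" file fails when another writer
already holds it, which is similar in spirit to an O_EXCL open.

```shell
# Illustration only, not zsh's implementation.
# save_by_copy FILE: read new content on stdin and replace FILE with it.
save_by_copy() {
    target=$1
    tmp="$target.new"
    # With noclobber set, ">" refuses to overwrite an existing file, so
    # an existing "$tmp" means another writer is active: give up.
    ( set -C; cat > "$tmp" ) 2>/dev/null || return 1
    # mv within one directory is a rename: readers see either the old
    # file or the complete new one, never a half-written target.
    mv -f "$tmp" "$target"
}
```

With this shape, a writer killed during the cat step leaves the target
untouched and a stale ".new" file behind; the target can only become
short if the ".new" file was already incomplete when the rename ran.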

-- Piotr.


