zsh-workers
 help / color / mirror / code / Atom feed
* HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces
@ 2013-04-16 17:05 Vincent Lefevre
  2013-04-17 16:41 ` Bart Schaefer
  2013-04-18 20:04 ` Peter Stephenson
  0 siblings, 2 replies; 5+ messages in thread
From: Vincent Lefevre @ 2013-04-16 17:05 UTC (permalink / raw)
  To: zsh-workers

I can see the following bug with zsh 5.0.2 under Debian
(zsh/experimental Debian package). Consider:

% setopt HIST_IGNORE_DUPS
% echo " "
% echo "  "

If I type the up arrow, I get

  echo "  "

as expected, but if I type the up arrow a second time, I get

  setopt HIST_IGNORE_DUPS

instead of

  echo " "

and:

% history
    1  setopt HIST_IGNORE_DUPS
    2  echo "  "

Reported on the Debian BTS:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=705555

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces
  2013-04-16 17:05 HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces Vincent Lefevre
@ 2013-04-17 16:41 ` Bart Schaefer
  2013-04-18 20:04 ` Peter Stephenson
  1 sibling, 0 replies; 5+ messages in thread
From: Bart Schaefer @ 2013-04-17 16:41 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 234 bytes --]

The intent is to consider lines that parse the same to be identical even if
the whitespace around words is different.  However, spaces inside quoted
strings should not be treated this way, so the example shown does
demonstrate a bug.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces
  2013-04-16 17:05 HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces Vincent Lefevre
  2013-04-17 16:41 ` Bart Schaefer
@ 2013-04-18 20:04 ` Peter Stephenson
  2013-04-19 12:42   ` Peter Stephenson
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2013-04-18 20:04 UTC (permalink / raw)
  To: zsh-workers

On Tue, 16 Apr 2013 19:05:18 +0200
Vincent Lefevre <vincent@vinc17.net> wrote:
> % setopt HIST_IGNORE_DUPS
> % echo " "
> % echo "  "
> 
> If I type the up arrow, I get
> 
>   echo "  "
> 
> as expected, but if I type the up arrow a second time, I get
> 
>   setopt HIST_IGNORE_DUPS
> 
> instead of
> 
>   echo " "
> 
> and:
> 
> % history
>     1  setopt HIST_IGNORE_DUPS
>     2  echo "  "

This turns out to be hard.  Fixing the basic comparison is
straightforward, and I did, but it turns out that's not good enough.
Then it gets complicated.  After the code has decided whether the line
is identical to the previous one or not, it gets added as a hash node to
the history table.  This has buried into it at quite a low level the
comparison function that compares lines ignoring white space.  I don't
quite understand how the hash business comes into it, given that the
basic point of hend() is to use the history ring and I can't actually
see anything ever being retrieved from the table.

The whole way history is stored in histtab apparently makes it
impossible to take account of white space --- even when retrieving a
node where the only information you've got is the name, which is what's
really stumped me, since the information about words within the line
isn't available at that point --- so unless someone can explain what's
going on here I'm stuck.

It's possible the hash table business is a red herring --- the
comparison behaviour ignoring all white space is present even when
HIST_IGNORE_DUPS isn't set --- and there's a problem somewhere else that
I haven't spotted, but I did confirm that my change caused the lines to
compare differently around line 1273 of hist.c, so it's something
subtle.

My best guess is it's got something to do with freehistdata(), which
appears to be the only way the hash table can affect the history ring.

pws


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces
  2013-04-18 20:04 ` Peter Stephenson
@ 2013-04-19 12:42   ` Peter Stephenson
  2013-04-20  3:01     ` Bart Schaefer
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Stephenson @ 2013-04-19 12:42 UTC (permalink / raw)
  To: zsh-workers

On Thu, 18 Apr 2013 21:04:11 +0100
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> My best guess is it's got something to do with freehistdata(), which
> appears to be the only way the hash table can affect the history ring.

Revised theory: when the new node gets added, it retrieves the old node
(addhistnode() in hashtable.c).  Because of the name mangling it
retrieves the previous line if that's the same up to (any, even
significant) white space, even if HIST_IGNORE_DUPS isn't set.  (This is
what's bamboozling me the most.)  Because there's an old node (at all,
even if it doesn't match completely) it gets marked as HIST_DUP by
addhistnode().  Then logic elsewhere removes it.

It's just occurred to me that while it might get *marked* with HIST_DUP
undconditionally, the logic to remove duplicates only kicks in if
HIST_IGNORE_DUPS is set.  That might explain why NO_HIST_IGNORE_DUPS
works at all.

Is this jogging any memories?  The code is (by strong zsh tradition)
undocumented.

pws


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces
  2013-04-19 12:42   ` Peter Stephenson
@ 2013-04-20  3:01     ` Bart Schaefer
  0 siblings, 0 replies; 5+ messages in thread
From: Bart Schaefer @ 2013-04-20  3:01 UTC (permalink / raw)
  To: zsh-workers

I haven't ever really dug into this code, and the last person who did
so I think was probably Wayne when working on HIST_EXPIRE_DUPS_FIRST
back in 1999 and 2002.  It appears Zoltan did most of the work on the
*DUPS* -related stuff even longer ago than that.

With that caveat ...

On Apr 19,  1:42pm, Peter Stephenson wrote:
} Subject: Re: HIST_IGNORE_DUPS ignores lines which differ by a different nu
}
} On Thu, 18 Apr 2013 21:04:11 +0100
} Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
} > My best guess is it's got something to do with freehistdata(), which
} > appears to be the only way the hash table can affect the history ring.
} 
} Revised theory: when the new node gets added, it retrieves the old node
} (addhistnode() in hashtable.c).  Because of the name mangling it
} retrieves the previous line if that's the same up to (any, even
} significant) white space, even if HIST_IGNORE_DUPS isn't set.  (This is
} what's bamboozling me the most.)

The behavior is supposed to be that the most-recent previous command
line is always retrievable verbatim, regardless of whether history is
ignoring dups or even turned off entirely.  So the editor behavior is
as if HISTSIZE can never be less than 1, even though that line will
never be saved anywhere when HISTSIZE=0.

I vaguely recall from reading the discussions years back that this is
implemented by pushing commands into the "real" history ring only when
the *next* command is accepted, that is, there's some sort of purgatory
where the single previous command goes to wait before being sent to
heaven or hell by the arrival of a new command.

Perhaps that gives some insight into what you're seeing in the code?

} Because there's an old node (at all,
} even if it doesn't match completely) it gets marked as HIST_DUP by
} addhistnode().  Then logic elsewhere removes it.

I would not be surprised to find that the HIST_DUP marker is set for two
reasons:

(1) so HIST_EXPIRE_DUPS_FIRST can work even when NO_HIST_IGNORE_DUPS

(2) to implement HIST_FIND_NO_DUPS even when NO_HIST_IGNORE_DUPS

} It's just occurred to me that while it might get *marked* with HIST_DUP
} undconditionally, the logic to remove duplicates only kicks in if
} HIST_IGNORE_DUPS is set.  That might explain why NO_HIST_IGNORE_DUPS
} works at all.

Don't forget HIST_IGNORE_ALL_DUPS, too ...

Sorry I can't be more helpful.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-04-20  3:01 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-16 17:05 HIST_IGNORE_DUPS ignores lines which differ by a different number of spaces Vincent Lefevre
2013-04-17 16:41 ` Bart Schaefer
2013-04-18 20:04 ` Peter Stephenson
2013-04-19 12:42   ` Peter Stephenson
2013-04-20  3:01     ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).