zsh-workers
* Re: correction hook
@ 2002-04-07 16:51 Felix Rosencrantz
  0 siblings, 0 replies; 7+ messages in thread
From: Felix Rosencrantz @ 2002-04-07 16:51 UTC (permalink / raw)
  To: zsh-workers

This is a response to an old thread from February, based on a draft I
wrote in March.
The topic hasn't come up in a while, so it seems like there wasn't much
interest.
-FR

On Mon, 11 Feb 2002 Bart Schaefer wrote:
>On Mon, 11 Feb 2002, Clint Adams wrote:
>> Sorry, I described a specific case and forgot to mention the more
>> general problems.  They are as follows.
>> 1) In the case of mak/mawk/make, the user has no instances of 'mawk'
>> in his history, but he has recently typed 'make'.  The correction
>> algorithm is unaware of these details.
>
>"Past performance is no guarantee of future results."  Just because a
>software developer tends to type "make" repeatedly, doesn't mean that's a
>good indication of the behavior of some other shell user.

Correction is never going to be 100% right.  The idea is to improve
the odds.  Different scoring strategies (e.g. commands previously
used score higher) will give different results, depending on the
user and the tasks at hand.  I think it would be useful if, just like
completion, correction could use different scoring strategies.

So many times I find I'm upset with correction for offering up commands
I have never used.  So this would be one strategy for finding a good
correction.  But a different user may find that this is not for them.
Maybe that user is very prone to having one of their hands offset on
the keyboard (more becomes nire), so that user might prefer to use a
keyboard hand offset strategy.
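
A minimal sketch of the "commands previously used score higher"
strategy, for illustration only.  It assumes a plain-text history
listing with one command line per line; the function name
pick_by_history is made up, and this is not how zsh's correction works
today.  Among candidates the corrector already considers equally good,
it prefers the one typed most often:

```shell
# Hypothetical helper: break ties between correction candidates by
# counting how often each one starts a line in a history listing.
# Usage: pick_by_history HISTORY_FILE CANDIDATE...
# The body runs in a subshell so its variables do not leak.
pick_by_history() (
    hist_file=$1
    shift
    best=
    best_count=-1
    for cand in "$@"; do
        count=0
        # Only the first word of each history line is compared.
        while read -r first rest; do
            if [ "$first" = "$cand" ]; then
                count=$((count + 1))
            fi
        done < "$hist_file"
        # Strictly greater: on a tie the earlier candidate wins.
        if [ "$count" -gt "$best_count" ]; then
            best=$cand
            best_count=$count
        fi
    done
    printf '%s\n' "$best"
)
```

With a history containing "make all" and "make clean" but no mawk,
`pick_by_history FILE mawk make' prints make.  A user who never types
make would simply get the first candidate back.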


>> 2) When one has CORRECT_ALL set, correction isn't nearly as intelligent
>> as completion. [...]
>> Rather than adding a slew of additional aliases, it would be nice if
>> correction were smart enough [...]
>
>We've discussed before that the correction system could be tied to
>completion.  It'd have to be something equivalent to that in order for it
>to "know" the legitimate arguments to every possible command, and there's
>no point in inventing all of that twice.

Sounds like time to plug the XML completion stuff... :) 
It would be easy to take the XML description of a command and convert
that into a getopt-like description, which could be used to parse the
command line.  And the XML description could still be used to generate
a completion function.  If there is no checker for a command, it just
scores lower than other commands, on the assumption that the command
is not interesting to the user.

>Correction could actually go one step further -- when it finds a command
>name it wants to correct, it could, for each possible correction, attempt
>to check the argument list against the possible completions, and pick the
>correction for which the existing arguments give the best match.  (That
>would be some impressive code.  Anybody looking for a Master's thesis
>project?)

Seems like you have described a good algorithm.  Not sure why it would
have to be as complicated as a Master's thesis.  With some sort of
getopt-parsing, and argument count checks, it might be a useful strategy
for correcting a command.  (Though with a good hook mechanism in place,
it would make it easier for someone to tackle this as a Master's thesis
project.)

I've always been pretty impressed with the results of the
predictive-typing widget you've written.  It works really well in
certain circumstances, all just by calling completion.  Shell typing
prediction has, though, been the subject of a few graduate students'
research (I mention the ones I know of in zsh-workers/12289), so it
might be possible to use some less complicated algorithms and still
get good results.

Also, it might be useful if there were a few more options at the
correction prompt.  What if zsh offered more than one possible
correction (rather than the "I'm feeling lucky" way)?  The user could
select the best choice via a numbered menu or a completion menu.  Maybe
we should allow the user to jump directly into completion from the
correction message.
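
As a rough sketch of the numbered-menu idea (not an existing zsh
feature; the function name correction_menu and the prompt wording are
invented here), something along these lines would list the candidates
and let the user pick one by number, keeping the typed word otherwise:

```shell
# Hypothetical numbered correction prompt.
# Usage: correction_menu TYPED CANDIDATE...
# The menu goes to stderr, the chosen word to stdout.
correction_menu() (
    typed=$1
    shift
    n=0
    for cand in "$@"; do
        n=$((n + 1))
        printf '%d) %s\n' "$n" "$cand" >&2
    done
    printf 'correct "%s" to [1-%d, anything else keeps it]? ' "$typed" "$n" >&2
    read -r choice || :
    # Treat empty or non-numeric answers as "keep what was typed".
    case $choice in
        ''|*[!0-9]*) choice=0 ;;
    esac
    n=0
    for cand in "$@"; do
        n=$((n + 1))
        if [ "$n" -eq "$choice" ]; then
            printf '%s\n' "$cand"
            exit 0
        fi
    done
    # Out-of-range number or declined: leave the word alone.
    printf '%s\n' "$typed"
)
```

For example, `correction_menu mak mawk make' would show both candidates
and return make if the user answers 2.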


On Mon, 11 Feb 2002 Oliver Kiddle wrote:
>I'm not entirely convinced by the correction mechanism because it has
>to interrupt you with its prompt. With the new completion system I get
>any typo in a word I completed corrected by _approximate anyway. I'd be
>more inclined to think about a totally different way of spotting and
>communicating typos such as using the completion system continually and
>underlining possible typos.

I like that idea.  That would be very useful.  Though, if the user
didn't fix the command name, the shell should still prompt about the
problem.  So there would still be the issue of post-enter command line
correction.

Also, there is a difference between correction and approximation.
Command correction uses a scoring system that not only counts errors,
it also looks at which errors are more likely based on a model of the
keyboard.  (At least that's what I remember...)  That is different
from the _correct completer...

I would like to see some improvements on the correction front.  I think
a hook would be useful.  And potentially could provide even better
correction choices.  Also, if we had some way of highlighting problems
as the user typed, that would be useful.

My experience with tcsh and zsh is that zsh does a much better job
at correction than tcsh did.  (It's been a while since I used tcsh.)
I believe the reason is that zsh uses a scoring system that models the
keyboard, while tcsh just scored based on typing error counts.
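
To illustrate the difference between counting errors and modelling the
keyboard, here is a small sketch.  It is not zsh's actual scoring code:
the QWERTY rows, the weights (0 for a match, 1 for a neighbouring key,
3 for anything else) and the function names are all assumptions, and it
only compares words of equal length.

```shell
# Hypothetical cost of typing key $2 when key $1 was meant.
key_cost() {
    if [ "$1" = "$2" ]; then
        echo 0
        return
    fi
    # Neighbours on the same QWERTY row are cheap substitutions.
    for row in qwertyuiop asdfghjkl zxcvbnm; do
        case $row in
            *"$1$2"*|*"$2$1"*) echo 1; return ;;
        esac
    done
    echo 3
}

# Sum the per-character costs of two words, position by position.
# Usage: word_cost TYPED CANDIDATE
word_cost() (
    typed=$1
    cand=$2
    total=0
    while [ -n "$typed" ] && [ -n "$cand" ]; do
        t=${typed%"${typed#?}"}    # first character of each word
        c=${cand%"${cand#?}"}
        total=$((total + $(key_cost "$t" "$c")))
        typed=${typed#?}
        cand=${cand#?}
    done
    echo "$total"
)

word_cost nire more    # prints 2: n/m and i/o are neighbouring keys
word_cost nire wire    # prints 3: n and w are far apart
```

A pure error count would rank "wire" (one wrong character) ahead of
"more" (two wrong characters); with these weights the offset-hand typo
"nire" is closer to "more".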

Throw in past command usage, some flag and argument checking, multiple
choices, and some completion, and correction could be even better.

-FR.






* Re: correction hook
  2002-02-11 16:38   ` Clint Adams
@ 2002-02-11 19:14     ` Bart Schaefer
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Schaefer @ 2002-02-11 19:14 UTC (permalink / raw)
  To: zsh-workers

On Mon, 11 Feb 2002, Clint Adams wrote:

> Someone complained to me that when he mistyped "make" as "mak", zsh
> would spell-correct it to "mawk" instead of "make".  I had asked him for
> a proposed algorithm to solve this, but he had none.

Incidentally, I think this happens because of the order in which the hash
table is traversed when looking for a likely correction.

On Mon, 11 Feb 2002, Peter Stephenson wrote:

> Peter Stephenson wrote:
> > You can add stuff to the hash table with the `hash' command; it's probably
> > a gap that, as far as I can see, you can delete commands you never want to
> > use or see completions or corrections for.
>
> That should say `can't', or this doesn't make sense.

But you *can* delete commands you never want to use.  You can't prevent
them from coming back again if you rehash or if you actually do use them,
but `hash -f ; unhash mawk' will produce the desired effect for the
example above.

On Mon, 11 Feb 2002, Clint Adams wrote:

> Sorry, I described a specific case and forgot to mention the more
> general problems.  They are as follows.
>
> 1) In the case of mak/mawk/make, the user has no instances of 'mawk'
> in his history, but he has recently typed 'make'.  The correction
> algorithm is unaware of these details.

"Past performance is no guarantee of future results."  Just because a
software developer tends to type "make" repeatedly, doesn't mean that's a
good indication of the behavior of some other shell user.

> 2) When one has CORRECT_ALL set, correction isn't nearly as intelligent
> as completion. [...]
> Rather than adding a slew of additional aliases, it would be nice if
> correction were smart enough [...]

We've discussed before that the correction system could be tied to
completion.  It'd have to be something equivalent to that in order for it
to "know" the legitimate arguments to every possible command, and there's
no point in inventing all of that twice.

Correction could actually go one step further -- when it finds a command
name it wants to correct, it could, for each possible correction, attempt
to check the argument list against the possible completions, and pick the
correction for which the existing arguments give the best match.  (That
would be some impressive code.  Anybody looking for a Master's thesis
project?)

> > more inclined to think about a totally different way of spotting and
> > communicating typos such as using the completion system continually and
> > underlining possible typos.
>
> I imagine that would be slow, though quite useful to some.

It's not all that slow.  You can try something like it now with the
predictive typing bindings that are included in the distribution.



* Re: correction hook
  2002-02-11 13:04 ` Oliver Kiddle
@ 2002-02-11 16:38   ` Clint Adams
  2002-02-11 19:14     ` Bart Schaefer
  0 siblings, 1 reply; 7+ messages in thread
From: Clint Adams @ 2002-02-11 16:38 UTC (permalink / raw)
  To: Oliver Kiddle; +Cc: zsh-workers

> Aliases have served me well for the few common typos like this. I have
> reservations about this because this simple function probably doesn't
> go far enough. How might you disable correction for certain words, e.g.
> the destination to a mv command?

Sorry, I described a specific case and forgot to mention the more
general problems.  They are as follows.

1) In the case of mak/mawk/make, the user has no instances of 'mawk'
in his history, but he has recently typed 'make'.  The correction
algorithm is unaware of these details.

2) When one has CORRECT_ALL set, correction isn't nearly as intelligent
as completion.  I have things like
alias cp='nocorrect cp'
alias mkdir='nocorrect mkdir'
alias mv='nocorrect mv'

Rather than adding a slew of additional aliases, it would be nice if
correction were smart enough not to assume that arguments to, for
example, ssh should match local files.

> I'm not entirely convinced by the correction mechanism because it has
> to interrupt you with its prompt. With the new completion system I get
> any typo in a word I completed corrected by _approximate anyway. I'd be
> more inclined to think about a totally different way of spotting and
> communicating typos such as using the completion system continually and
> underlining possible typos.

I imagine that would be slow, though quite useful to some.



* Re: correction hook
  2002-02-11 13:48 ` Peter Stephenson
@ 2002-02-11 13:55   ` Peter Stephenson
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Stephenson @ 2002-02-11 13:55 UTC (permalink / raw)
  To: Zsh hackers list

Peter Stephenson wrote:
> You can add stuff to the hash table with the `hash' command; it's probably
> a gap that, as far as I can see, you can delete commands you never want to
> use or see completions or corrections for.

That should say `can't', or this doesn't make sense.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR Ltd., Science Park, Milton Road,
Cambridge, CB4 0WH, UK                          Tel: +44 (0)1223 392070


**********************************************************************
The information transmitted is intended only for the person or
entity to which it is addressed and may contain confidential 
and/or privileged material. 
Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by 
persons or entities other than the intended recipient is 
prohibited.  
If you received this in error, please contact the sender and 
delete the material from any computer.
**********************************************************************



* Re: correction hook
  2002-02-11  8:08 Clint Adams
  2002-02-11 13:04 ` Oliver Kiddle
@ 2002-02-11 13:48 ` Peter Stephenson
  2002-02-11 13:55   ` Peter Stephenson
  1 sibling, 1 reply; 7+ messages in thread
From: Peter Stephenson @ 2002-02-11 13:48 UTC (permalink / raw)
  To: Zsh hackers list

Clint Adams wrote:
> Someone complained to me that when he mistyped "make" as "mak", zsh
> would spell-correct it to "mawk" instead of "make".  I had asked him for
> a proposed algorithm to solve this, but he had none.

When I wrote the approximate pattern matching stuff, I had it vaguely in
mind that you might want to give different types of error different
priorities, although it doesn't currently support that.  In this case,
unfortunately, that won't help, since they are both the same type of
approximation, adding a missing character.
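
A quick way to see the tie, as a sketch only (the helper name
is_one_insertion is invented here and has nothing to do with the
pattern-matching code itself): both candidates are reached from "mak"
by inserting exactly one character, so weighting by error type cannot
separate them.

```shell
# Hypothetical check: succeeds if LONG is SHORT plus one inserted
# character, i.e. deleting one character from LONG yields SHORT.
# Usage: is_one_insertion SHORT LONG
is_one_insertion() (
    short=$1
    long=$2
    prefix=
    rest=$long
    while [ -n "$rest" ]; do
        tail=${rest#?}
        # Drop the current character and compare what is left.
        if [ "$prefix$tail" = "$short" ]; then
            exit 0
        fi
        prefix=$prefix${rest%"$tail"}
        rest=$tail
    done
    exit 1
)

is_one_insertion mak mawk && echo "mawk: one insertion"
is_one_insertion mak make && echo "make: one insertion"
```

Both lines are printed, which is why some other signal (history,
argument checking, a hook) is needed to prefer "make".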

You can add stuff to the hash table with the `hash' command; it's probably
a gap that, as far as I can see, you can delete commands you never want to
use or see completions or corrections for.  It's also always struck me as a
bit unhelpful that a rehash, implicit or explicit, removes all deliberate
changes to the command hash table.

If spelling correction could use the results from completion, there would
be ways around this, but that's always looked like it would need a major
rewrite.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR Ltd., Science Park, Milton Road,
Cambridge, CB4 0WH, UK                          Tel: +44 (0)1223 392070





* Re: correction hook
  2002-02-11  8:08 Clint Adams
@ 2002-02-11 13:04 ` Oliver Kiddle
  2002-02-11 16:38   ` Clint Adams
  2002-02-11 13:48 ` Peter Stephenson
  1 sibling, 1 reply; 7+ messages in thread
From: Oliver Kiddle @ 2002-02-11 13:04 UTC (permalink / raw)
  To: zsh-workers

 --- Clint Adams <clint@zsh.org> wrote:
> Someone complained to me that when he mistyped "make" as "mak", zsh
> would spell-correct it to "mawk" instead of "make".  I had asked him for
> a proposed algorithm to solve this, but he had none.
>
> The thought then occurred to me that a hook function might be a bit more
> flexible.  With the following patch, one can now do something like
> 
> correctword() {
> [[ "$1" == mak ]] && CORRECT_GUESS=make
> }

I haven't been able to try the patch, but how would this work if the
CORRECT_ALL option is set and there are corrections to be made to more
than one word on the command line?  Perhaps the REPLY array could be
used instead of one scalar so that all words can be set.

> or potentially something more sophisticated that couldn't be
> accomplished as effectively by alias mak=make.

Aliases have served me well for the few common typos like this. I have
reservations about this because this simple function probably doesn't
go far enough. How might you disable correction for certain words, e.g.
the destination to a mv command?

I'm not entirely convinced by the correction mechanism because it has
to interrupt you with its prompt. With the new completion system I get
any typo in a word I completed corrected by _approximate anyway. I'd be
more inclined to think about a totally different way of spotting and
communicating typos such as using the completion system continually and
underlining possible typos.

Oliver




* correction hook
@ 2002-02-11  8:08 Clint Adams
  2002-02-11 13:04 ` Oliver Kiddle
  2002-02-11 13:48 ` Peter Stephenson
  0 siblings, 2 replies; 7+ messages in thread
From: Clint Adams @ 2002-02-11  8:08 UTC (permalink / raw)
  To: zsh-workers

Someone complained to me that when he mistyped "make" as "mak", zsh
would spell-correct it to "mawk" instead of "make".  I had asked him for
a proposed algorithm to solve this, but he had none.

The thought then occurred to me that a hook function might be a bit more
flexible.  With the following patch, one can now do something like

correctword() {
[[ "$1" == mak ]] && CORRECT_GUESS=make
}

or potentially something more sophisticated that couldn't be
accomplished as effectively by alias mak=make.

I'll refrain from committing the patch.  Does anyone have a better way
of solving this problem?

Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.39
diff -u -r1.39 utils.c
--- Src/utils.c	6 Jan 2002 01:07:23 -0000	1.39
+++ Src/utils.c	11 Feb 2002 07:54:28 -0000
@@ -1536,6 +1536,7 @@
     char ic = '\0';
     int ne;
     int preflen = 0;
+    Eprog prog;
 
     if ((histdone & HISTFLAG_NOEXEC) || **s == '-' || **s == '%')
 	return;
@@ -1632,6 +1633,27 @@
 	    guess = *s;
 	    *guess = *best = ztokens[ic - Pound];
 	}
+
+	if ((prog = getshfunc("correctword")) != &dummy_eprog) {
+		
+		char *correct_guess;
+		int osc = sfcontext;
+		LinkList args = NULL;
+		
+		args = newlinklist();
+		addlinknode(args, "correctword");
+		addlinknode(args, dupstring(guess));
+		addlinknode(args, dupstring(best));
+		
+		sfcontext = SFC_HOOK;
+		doshfunc("correctword", prog, args, 0, 1);
+		sfcontext = osc;
+		
+		correct_guess = ztrdup(getsparam("CORRECT_GUESS"));
+		if (correct_guess)
+			best = correct_guess;
+	}
+
 	if (ask) {
 	    if (noquery(0)) {
 		x = 'n';

