From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 11 Nov 1999 11:56:05 +0100
From: Elliott Hughes <Elliott.Hughes@genedata.com>
Subject: [9fans] Brian Kernighan?
Topicbox-Message-UUID: 9bd66c8c-eac8-11e9-9e20-41e7f4b1d025
Message-ID: <19991111105605.ZvNMk1khc8_Q2HthEgYNh9fmQcNEOq6px7trgE-sN-k@z>

forsyth wrote:
> the Cornell PL/1 compiler used a similar approach to
> do spelling correction on keywords as part of a broader
> attempt to repair all obvious errors in the given program; the results
> were amusing if not enlightening, as they so often are with AI.

IBM's jikes Java compiler also tries spelling correction, but its
ideas of proximity have nothing to do with any human's. it's
particularly unfortunate that it doesn't even know about the Java
naming conventions. (not that i necessarily think it should, it's
just that if it's going to try to guess what you meant to type, it
would be better off making educated guesses.)

there's a big difference between correcting simple typos and more
complicated "wrong identifier" errors.

anyway, back to the point: the original questioner might be
interested in "Finding Approximate Matches in Large Lexicons" by
Justin Zobel (jz@cs.rmit.oz.au) and Philip Dart (philip@cs.mu.oz.au),
which was the best paper i found when trying to come up with decent
guesses in a "dict"-like program.

btw, has anyone had better luck than i at getting information about
the CD-ROM version of the OED, with a view to having a Plan 9/Unix
OED "dict"? ever since i read the acme paper with its "futtock"
example, i've been jealous.

-- 
"As the Chinese say, 1001 words is worth more than a picture."
	-- John McCarthy