15 Jan 13:57
Re: Spelling based on frequency and not just distance
Philip Neustrom <philipn <at> gmail.com>
2008-01-15 12:57:18 GMT
The patch attached to this email is better than the previous. Hopefully
somebody can come up with something better entirely, as I'm not totally
happy with what I have -- it tends to suggest things like "plant" for
"plants" and then "plan" for "plant" :)

--Philip

On Jan 15, 2008 1:24 AM, Philip Neustrom <philipn <at> gmail.com> wrote:
> Hey all,
>
> After implementing the new spelling functionality on http://wikispot.org I
> noticed that terms like "wikipeda" weren't yielding spelling suggestions.
> Taking a quick look at the code, it looks like if we find an exact match,
> even if it has a frequency less than another match within the provided
> delta, we don't suggest anything. This is probably fine for sites with
> documents where you can be assured the data is properly spelled -- but not
> suitable for something like a wiki or the web in general.
>
> I did something simple, attached in a patch. Maybe someone has a better
> idea of how to weigh the different options, but my quick fix seemed to give
> much better results than the "give up on exact or edit-distance-closest
> match" code that was there already.
>
> --Philip Neustrom
>
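For anyone following the thread without the patch to hand, the difference between the two behaviours described above can be sketched roughly as follows. This is only an illustration in Python, not the attached patch and not Xapian's actual code: the function names, the `min_ratio` threshold and the `frequency / (distance + 1)` score are all assumptions made up for the example.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def suggest_closest_only(word, freqs, max_distance=2):
    """Behaviour as described in the thread: no suggestion on an exact
    match, otherwise the closest term (frequency only breaks ties)."""
    if word in freqs:
        return None
    best_key, best_term = None, None
    for term, freq in freqs.items():
        d = edit_distance(word, term)
        if d > max_distance:
            continue
        key = (d, -freq)
        if best_key is None or key < best_key:
            best_key, best_term = key, term
    return best_term


def suggest_frequency_aware(word, freqs, max_distance=2, min_ratio=10):
    """Hypothetical alternative: an exact match does not end the search.
    A nearby term is suggested only if it is at least `min_ratio` times
    more frequent than the word itself (assumed weighting)."""
    own_freq = freqs.get(word, 0)
    best_term, best_score = None, 0.0
    for term, freq in freqs.items():
        if term == word:
            continue
        d = edit_distance(word, term)
        if d > max_distance or freq < own_freq * min_ratio:
            continue
        score = freq / (d + 1)  # favour frequent terms, penalise distance
        if score > best_score:
            best_term, best_score = term, score
    return best_term


# Made-up term frequencies: the misspelling exists in the index but is rare.
freqs = {"wikipedia": 500, "wikipeda": 2, "wiki": 300}
print(suggest_closest_only("wikipeda", freqs))
print(suggest_frequency_aware("wikipeda", freqs))
```

The ratio threshold is one possible knob for reining in over-eager suggestions of the "plant" for "plants" kind: when both terms are about equally common, nothing is suggested. How best to weigh frequency against distance is left open here, as it is in the thread.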
_______________________________________________
Xapian-discuss mailing list
Xapian-discuss <at> lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-discuss