From: Philip Neustrom <philipn <at> gmail.com>
Subject: Re: Spelling based on frequency and not just distance
Newsgroups: gmane.comp.search.xapian.general
Date: Tuesday 15th January 2008 12:57:18 UTC
The patch attached to this email is better than the previous one.  Hopefully
somebody can come up with something better entirely, as I'm not totally
happy with what I have -- it tends to suggest things like "plant" for
"plants" and then "plan" for "plant" :)

--Philip

On Jan 15, 2008 1:24 AM, Philip Neustrom < [email protected]> wrote:

> Hey all,
>
> After implementing the new spelling functionality on http://wikispot.org I
> noticed that terms like "wikipeda" weren't yielding spelling suggestions.
> Taking a quick look at the code, it looks like if we find an exact match,
> even if it has a frequency less than another match within the provided
> delta, we don't suggest anything.  This is probably fine for sites with
> documents where you can be assured the data is properly spelled -- but not
> suitable for something like a wiki or the web in general.
>
> I did something simple, attached in a patch.  Maybe someone has a better
> idea of how to weigh the different options, but my quick fix seemed to give
> much better results than the "give up on exact or edit-distance-closest
> match" code that was there already.
>
> --Philip Neustrom
>
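
For illustration only, here is a rough sketch of the idea discussed above:
an exact match in the index does not block a suggestion when a much more
frequent term lies within the edit-distance limit.  This is not the attached
patch and it does not use the Xapian API.  The names suggest, freqs and
min_ratio, the sample frequencies, and the ratio threshold itself are all
assumptions made up for the example; the threshold is just one possible way
to damp the "plants" -> "plant" -> "plan" behaviour.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def suggest(word, freqs, max_distance=2, min_ratio=10):
    """Return a suggested term for `word`, or None to keep it as typed.

    freqs maps each indexed term to its frequency (hypothetical data).
    min_ratio is an assumed knob: a candidate must be at least this many
    times more frequent than `word` itself before it is suggested.
    """
    own_freq = freqs.get(word, 0)
    best, best_key = None, None
    for term, freq in freqs.items():
        if term == word:
            continue
        distance = edit_distance(word, term)
        if distance > max_distance:
            continue
        # An exact match does not end the search; it only raises the bar
        # that a candidate's frequency has to clear.
        if freq < max(own_freq, 1) * min_ratio:
            continue
        key = (distance, -freq)  # prefer closer terms, then more frequent
        if best_key is None or key < best_key:
            best, best_key = term, key
    return best


# Made-up frequencies, loosely modelled on the examples in this thread.
freqs = {"wikipedia": 500, "wikipeda": 2,
         "plant": 120, "plants": 100, "plan": 150}

print(suggest("wikipeda", freqs))  # misspelling that is itself indexed
print(suggest("plants", freqs))    # neighbours are not frequent enough
```

With these sample numbers "wikipeda" gets a suggestion even though it is an
exact match, while "plants" and "plant" are left alone because none of their
neighbours is ten times more frequent.  A real fix would need to pick the
threshold from real index statistics.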
 