Richard Boulton | 17 Mar 09:12

Re: Confused about case-sensitivity


Michael Elsdörfer wrote:
> I have two fields in a xappy index, both using INDEX_FREETEXT, one
> having a "language" option applied, the other not.
> 
> The stemmed fields work as expected, case does not matter at all.
> 
> The unstemmed fields mostly seem to ignore case as well - except for
> the first character:
> 
> For example, the term "gee" in that field will be matched by "GEE",
> "GeE" etc., but not by "gee" or "geE". Even more strange, the case of
> the term I originally indexed doesn't seem to matter either. If the
> term is "buffed", I have to query for "Buffed" to find the document.
> 
> Any idea what might be wrong here? If not, any suggestions on how to
> debug this?

This doesn't seem like the correct behaviour to me...  The intended 
behaviour is that:

  - In the unstemmed case, capitalisation should be entirely irrelevant.

  - In the stemmed case, a query word which is uncapitalised should
    match any word with the same stem, but should give a higher weight
    to an exact match for the word.  A query word which has an initial
    capital is assumed to represent a proper noun, and will only match
    an exact match for the word.
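
As a rough illustration of the two rules above, here is a toy model in
plain Python.  It is only a sketch of the intended behaviour, not how
xappy or Xapian implements it: toy_stem() and match_weight() are
hypothetical helpers, the weights 1 and 2 are arbitrary, and I'm
assuming that an "exact match" compares the lowercased word forms.

```python
def toy_stem(word):
    """Hypothetical, deliberately naive stemmer (not a real one)."""
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip a suffix if at least three characters remain.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def match_weight(query_word, indexed_word, stemmed):
    """Return 0 for no match, 1 for a stem-only match, 2 for an exact match.

    The numeric weights are illustrative only.
    """
    # Assumption: "exact" means equal after lowercasing.
    exact = query_word.lower() == indexed_word.lower()

    if not stemmed:
        # Unstemmed field: capitalisation is entirely irrelevant.
        return 2 if exact else 0

    if query_word[:1].isupper():
        # Initial capital: treated as a proper noun, exact match only.
        return 2 if exact else 0

    if exact:
        # Uncapitalised query, same word: weighted higher.
        return 2
    # Uncapitalised query, different word: match on the stem alone.
    return 1 if toy_stem(query_word) == toy_stem(indexed_word) else 0
```

Under this model, match_weight("gee", "gee", False) and
match_weight("GEE", "gee", False) should give the same result, which is
not what you are reporting for the unstemmed field.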

If you're able to put together a minimal example demonstrating the 
behaviour you're seeing, that would be very helpful - I'll try to look 
into this shortly, and such an example would save me a bit of time.

-- 
Richard

