17 Mar 16:50
Re: Confused about case-sensitivity
Hi Richard,

here's an example: http://dpaste.com/39820/

On my system, the output is:

(term, num results)
buffed 0
Buffed 1
gee 0
Gee 1
GeE 1
shock 1
evolution 0

Normally, and according to your explanation, I would expect to see
exactly one result for each query.

Also (I didn't mention this in my original post), as you can see the
fields "title" and "text" are defined exactly the same way, but appear
to behave differently. The all-lowercase query "shock" finds the
document through the "title" field, while "evolution" through the
"text" field doesn't seem to work.

Thanks for your help,
Michael

On Mar 17, 9:12 am, Richard Boulton <rich...@...> wrote:
> Michael Elsdörfer wrote:
> > I have two fields in a xappy index, both using INDEX_FREETEXT, one
> > having a "language" option applied, the other not.
> >
> > The stemmed fields work as expected, case does not matter at all.
> >
> > The unstemmed fields mostly seem to ignore case as well - except for
> > the first character:
> >
> > For example, the term "gee" in that field will be matched by "GEE",
> > "GeE" etc., but not by "gee" or "geE". Even more strange, the case of
> > the term I originally indexed doesn't seem to matter either. If the
> > term is "buffed", I have to query for "Buffed" to find the document.
> >
> > Any idea what might be wrong here? If not, any suggestions on how to
> > debug this?
>
> This doesn't seem like the correct behaviour to me... The intended
> behaviour is that:
>
>  - In the unstemmed case, capitalisation should be entirely irrelevant.
>
>  - In the stemmed case, a query word which is uncapitalised should
> match any word with the same stem, but should give a higher weight to an
> exact match for the word. A query word which has an initial capital is
> assumed to represent a proper noun, and will only match an exact match
> for the word.
>
> If you're able to put together a minimal example demonstrating the
> behaviour I'm seeing, that would be very helpful - I'll try and look
> into this shortly, and such an example would save me a bit of time.
>
> --
> Richard
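
For reference, the intended matching rules Richard describes can be written down as a small standalone Python sketch. This is only a toy model of the described behaviour, not xappy or Xapian code: `toy_stem`, `match_unstemmed`, `match_stemmed` and the weights 1 and 2 are hypothetical names and values made up for illustration, and "exact match" is assumed here to mean the same word compared without regard to case.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few suffixes."""
    w = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Only strip when at least three characters remain.
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def match_unstemmed(query, doc_words):
    """Unstemmed field: capitalisation is entirely irrelevant."""
    q = query.lower()
    return any(q == w.lower() for w in doc_words)


def match_stemmed(query, doc_words):
    """Stemmed field. Returns a toy weight:
    0 = no match, 1 = same stem only, 2 = exact word.
    """
    exact = match_unstemmed(query, doc_words)  # assumption: exact ignores case
    if query[:1].isupper():
        # Initial capital: treated as a proper noun, exact word only.
        return 2 if exact else 0
    if exact:
        return 2  # exact match gets the higher weight
    q_stem = toy_stem(query)
    if any(q_stem == toy_stem(w) for w in doc_words):
        return 1
    return 0


if __name__ == "__main__":
    words = ["buffed", "gee"]  # hypothetical indexed terms
    for q in ("buffed", "Buffed", "gee", "Gee", "GeE"):
        print(q, match_unstemmed(q, words))
    for q in ("buff", "Buff", "buffed", "Buffed"):
        print(q, match_stemmed(q, words))
```

Under these rules every unstemmed query in the first loop matches, which is the "exactly one result for each query" expectation from the message above. In the stemmed case only the capitalised "Buff" fails to match, because it is not stemmed and "buff" is not an indexed word.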