17 Mar 18:03
Re: Confused about case-sensitivity
Michael Elsdörfer wrote:
> Hi Richard,
>
> here's an example:
>
> http://dpaste.com/39820/

Thanks - that was very helpful.

> Normally, and according to your explanation, I would expect to see
> exactly one result for each query.

Yes, that would be reasonable. I've just done a quick investigation of
what happens, and found the problem; we don't currently cope with mixed
stemming settings correctly. If you try setting all the field actions to
use the same language, or all of them to use no language (so no
stemming), it works as expected. However, when any of the fields have a
stemmer, the query parser fails to build the search terms for those
fields correctly.

I can see a "quick hack" solution, but I'm not certain it won't degrade
performance elsewhere, so I'll do a few tests to check on that. I'm
hoping to have time in the near future to do a clean-up of the way in
which the field settings are set, which will make this kind of conflict
impossible, so I'm not going to spend too much effort on a short-term
solution, though. For now, I suggest you use the same stemming strategy
for all free text fields.

Thanks very much for your feedback - it came at a good time, since I'm
currently thinking about how to do this restructuring.

> Also (I didn't mention this in my original post), as you can see the
> fields "title" and "text" are defined exactly the same way, but appear
> to behave differently. The all-lowercase query "shock" finds the
> document through the "title" field, while "evolution" through the
> "text" field doesn't seem to work.

The search for "evolution" in the text field doesn't work because you
missed the line

    document.fields.append(xappy.Field('text', d.get('text', '')))

in other words, you don't actually add the contents of the text field to
the UnprocessedDocument anywhere! (Easy mistake to make - it took me a
while to spot it...)

--
Richard
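One way to make the missing-field mistake harder to repeat is to build the field list in a loop over the field names rather than writing one append per field. The following is only a rough sketch: `Field` here is a stand-in namedtuple so the snippet runs without xappy installed, and `FIELD_NAMES` and `build_fields` are hypothetical names, not part of xappy. In real code the resulting name/value pairs would be passed to xappy.Field and appended to document.fields as in the line quoted above.

```python
from collections import namedtuple

# Stand-in for xappy.Field so this sketch runs without xappy installed.
Field = namedtuple("Field", ["name", "value"])

# Hypothetical list of every field that has an index action defined.
FIELD_NAMES = ("title", "text")


def build_fields(d, field_names=FIELD_NAMES):
    """Return one Field per configured name, using '' when the key is absent."""
    return [Field(name, d.get(name, "")) for name in field_names]


# Every configured field ends up in the list, so none can be forgotten.
fields = build_fields({"title": "Future Shock", "text": "Notes on evolution"})
```

Keeping the field names in one place also means the same tuple could drive the field action setup, which helps keep the stemming settings identical across all free text fields.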