3 Mar 2003 18:09
Re: get_collapse_count()
Olly Betts <olly <at> survex.com>
2003-03-03 17:09:30 GMT
On Mon, Mar 03, 2003 at 04:11:24PM -0000, Sam Liddicott wrote:
> Olly Betts wrote:
> > Hmm, I'm slightly surprised that this is zero based - maybe it's just
> > me though.  Do other people naturally expect "collapse_count" to
> > include the match itself?  i.e. always be at least 1?
>
> It's meant to be a count of hidden documents which seems fine to be
> zero based if it is expressed as "hidden documents".  If it is
> expressed as "total documents" then it should be 1 based.  I prefer 0
> based.

I can see the logic of both; I just raised the issue since it's generally
better for the semantics to match what most developers naturally expect.

> > But as I've pointed out, the count is at best an approximation.  In
> > fact you can even get 0 when there are collapsed hits with the same
> > collapse value,
>
> how?  The same reason as below?

http://article.gmane.org/gmane.comp.search.xapian.devel/67

> > and non-zero when there aren't.
>
> Do you mean non-zero when a repeated search without collapsing would
> relevance-cutoff any other values?

Yes.  I also said it could happen if you're sorting in bands, though I'm
not sure now what I had in mind.  But it would probably be unwise to build
code on the assumption that a relevance-cutoff is going to be the only way
this can happen.

> > You said we could
> > just explain this in the documentation, but you haven't.  Instead
> > there's an example which leads the reader to believe that the count
> > is completely correct...
>
> Sorry, I explained this in the omega documentation, but you are right
> I should explain it here and have now.

Umm, docs/omegascript.txt doesn't appear to mention that the value is
approximate either.

Cheers,
    Olly
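
For readers following the zero-based versus one-based question, here is a
toy sketch of the two conventions.  It is not Xapian's matcher: the function
name collapse_hits and the (docid, weight, key) tuples are made up for
illustration, and it assumes every candidate document is visited, which is
exactly the assumption the thread says does not hold for the real count.

```python
def collapse_hits(hits):
    """Collapse a ranked hit list on a key (hypothetical helper, not Xapian).

    hits: iterable of (docid, weight, key), already sorted best-first.
    Returns a list of (docid, weight, key, hidden_count), where
    hidden_count is zero based: it counts only the *other* documents
    that shared the key and were hidden behind the kept one.
    """
    kept = {}  # key -> [docid, weight, hidden_count]; dicts keep insertion order
    for docid, weight, key in hits:
        if key in kept:
            # A better-ranked document with this key is already kept,
            # so this one is hidden and only bumps the counter.
            kept[key][2] += 1
        else:
            kept[key] = [docid, weight, 0]
    return [(d, w, k, n) for k, (d, w, n) in kept.items()]


# Made-up sample data: three documents share key "a", one has key "b".
hits = [(1, 0.9, "a"), (2, 0.8, "b"), (3, 0.7, "a"), (4, 0.5, "a")]
collapsed = collapse_hits(hits)

for docid, weight, key, hidden in collapsed:
    # "hidden documents" is zero based; "total documents" is hidden + 1.
    print(docid, key, "hidden:", hidden, "total:", hidden + 1)
```

Under the "hidden documents" reading a match with no duplicates reports 0;
under the "total documents" reading the same match would report 1.  In this
toy version the count is exact only because every hit is examined.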