Olly Betts | 3 Mar 2003 18:09

Re: get_collapse_count()

On Mon, Mar 03, 2003 at 04:11:24PM -0000, Sam Liddicott wrote:
> Olly Betts wrote:
> > Hmm, I'm slightly surprised that this is zero based - maybe it's just
> > me though.  Do other people naturally expect "collapse_count" to
> > include the match itself?  i.e. always be at least 1?
> 
> It's meant to be a count of hidden documents which seems fine to be
> zero based if it is expressed as "hidden documents".  If it is
> expressed as "total documents" then it should be 1 based.  I prefer 0
> based.

I can see the logic of both; I just raised the issue since it's
generally better for the semantics to match what most developers
naturally expect.
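
To make the two conventions concrete, here is a minimal sketch in Python.
It is a toy model, not Xapian code: the collapse() helper and the
(doc_id, collapse_key) tuples are hypothetical names invented for this
illustration, and the real matcher works differently.

```python
def collapse(hits):
    """Collapse ranked (doc_id, collapse_key) hits, keeping the first per key.

    Returns (doc_id, hidden_count) pairs. hidden_count is zero based: it
    counts only the documents hidden behind the one that was kept.
    """
    kept = {}  # collapse_key -> [doc_id, hidden_count], in rank order
    for doc_id, key in hits:
        if key in kept:
            kept[key][1] += 1  # another hit with this key is hidden
        else:
            kept[key] = [doc_id, 0]
    return [(doc_id, hidden) for doc_id, hidden in kept.values()]


# Hypothetical ranked hits: documents 1, 3 and 4 share collapse key "a".
hits = [(1, "a"), (2, "b"), (3, "a"), (4, "a")]
results = collapse(hits)

# Zero based: "hidden documents" behind each kept match.
hidden_counts = [hidden for _, hidden in results]

# One based: "total documents", which includes the match itself.
total_counts = [hidden + 1 for _, hidden in results]
```

Either convention can be derived from the other by adding or subtracting
one, so the choice is mostly about which reading callers expect.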

> > But as I've pointed out, the count is at best an approximation.  In
> > fact you can even get 0 when there are collapsed hits with the same
> > collapse value, 
> 
> how?  The same reason as below?

http://article.gmane.org/gmane.comp.search.xapian.devel/67

> > and non-zero when there aren't.  
> 
> Do you mean non-zero when a repeated search without collapsing would
> relevance-cutoff any other values?

Yes.  I also said it could happen if you're sorting in bands, though
I'm not sure now what I had in mind.  But it would probably be unwise to
build code on the assumption that a relevance-cutoff is going to be the
only way this can happen.
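
The relevance-cutoff case might look something like the sketch below. This
is a toy model in Python under a stated assumption: duplicates are counted
as hidden regardless of weight, while the repeated uncollapsed search
applies a minimum weight. The function names, the weights, and the 0.5
cutoff are all made up for illustration and are not how the matcher is
actually implemented.

```python
def collapsed_search(hits):
    """Toy collapse over (doc_id, key, weight) hits, best first.

    Assumption: every later hit with an already-seen key is counted as
    hidden, whatever its weight.
    """
    kept = {}
    for doc_id, key, _weight in hits:
        if key in kept:
            kept[key]["hidden"] += 1
        else:
            kept[key] = {"doc_id": doc_id, "hidden": 0}
    return kept


def uncollapsed_search(hits, key, min_weight):
    """Repeat the search for one key without collapsing, with a cutoff."""
    return [doc_id for doc_id, k, w in hits if k == key and w >= min_weight]


# Hypothetical hits: document 3 shares key "a" but has a low weight.
hits = [(1, "a", 0.9), (2, "b", 0.8), (3, "a", 0.2)]

reported = collapsed_search(hits)["a"]["hidden"]
visible = uncollapsed_search(hits, "a", min_weight=0.5)
actually_expandable = len(visible) - 1  # exclude the kept match itself
```

In this model the reported count for key "a" is non-zero, yet expanding it
with the cutoff applied shows no further documents, so code that treats the
count as exact would promise results it cannot deliver.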

> > You said we could
> > just explain this in the documentation, but you haven't.  Instead
> > there's an example which leads the reader to believe that the count
> > is completely correct... 
> 
> Sorry, I explained this in the omega documentation, but you are right
> I should explain it here and have now.

Umm, docs/omegascript.txt doesn't appear to mention that the value is
approximate either.

Cheers,
    Olly
