Olly Betts | 6 Mar 02:49
Favicon
Gravatar

Merging stats from multiple databases for expand

In matcher/expandweight.cc we have:

OmExpandBits
operator+(const OmExpandBits &bits1, const OmExpandBits &bits2)
{
    OmExpandBits sum(bits1); 
    sum.multiplier += bits2.multiplier;
    sum.rtermfreq += bits2.rtermfreq;

    // FIXME - try to share this information rather than pick half of it
    if (bits2.dbsize > sum.dbsize) {
        DEBUGLINE(WTCALC, "OmExpandBits::operator+ using second operand: " <<
                  bits2.termfreq << "/" << bits2.dbsize << " instead of " <<
                  bits1.termfreq << "/" << bits1.dbsize);
        sum.termfreq = bits2.termfreq;
        sum.dbsize = bits2.dbsize;
    } else {
        DEBUGLINE(WTCALC, "OmExpandBits::operator+ using first operand: " <<
                  bits1.termfreq << "/" << bits1.dbsize << " instead of " <<
                  bits2.termfreq << "/" << bits2.dbsize);
        // sum already contains the parts of the first operand
    }
    return sum;
}

Why don't we "share this information" by just replacing the "if" by:

    sum.termfreq += bits2.termfreq;
    sum.dbsize += bits2.dbsize;

Am I missing some subtlety here?

I looked at the history of this code, and the essence of this is unchanged
since it was first checked in back in 1999 (and looking at that commit, it
doesn't look like this code came from another file):

http://svn.xapian.org/trunk/xapian-core/matcher/expandweight.cc?view=markup&pathrev=532

Cheers,
    Olly

Gmane