josef.pktd | 28 May 20:53 2009
Picon

Re: Strange discontinuity in noncentral chisquare

On Thu, May 28, 2009 at 1:56 PM, Neal Becker <ndbecker2 <at> gmail.com> wrote:
> def pmiss2 (x, esnodB, N):
>    esno = 10**(0.1 * esnodB) * N
>    var = 1/esno
>    _lambda = 1/(0.5*var)
>
>    return ncx2.cdf (x, 2, _lambda)
>
> x = np.arange (0, 50, 0.1)
> p1 = [pmiss2 (e, 3.5, 24) for e in x]
>
> What's with this strange discontinuity?
> print p1:
> ...
> 3.475382846574262e-21,
>  4.2226227741362447e-21,
>  5.1248671653198949e-21,
>  6.2130949241675783e-21,
>  7.5242411687161146e-21,
>  9.1022970542215721e-21,
>  5.8787514615651664e-09,
>  6.2565279721932619e-09,
>  6.656924144742753e-09,
>  7.0811937641411923e-09,
>  7.5306544171300622e-09,
>  8.0066904400027596e-09,
>  8.5107559880675483e-09,
>  9.044378231221098e-09,
>  9.6091606801490766e-09,
> ...
>

This must be a bug in scipy.special.chndtr

class ncx2_gen(rv_continuous):

    def _cdf(self, x, df, nc):
        return special.chndtr(x,df,nc)

here is the isolated example

>>> ncx2.cdf (np.arange (20, 25, 0.2), 2, 1.07458615e+02)
array([  8.53614872e-23,   1.31445107e-22,   2.01359832e-22,
         3.06896021e-22,   4.65417198e-22,   7.02373115e-22,
         1.05489191e-21,   1.57689175e-21,   2.34632278e-21,
         3.47538283e-21,   5.12486714e-21,   7.52424113e-21,
         5.87875088e-09,   6.65692349e-09,   7.53065368e-09,
         8.51075516e-09,   9.60915975e-09,   1.08390218e-08,
         1.22148313e-08,   1.37525362e-08,   1.54696748e-08,
         1.73854823e-08,   1.95211903e-08,   2.18999761e-08,
         2.45472869e-08])

this uses numerical integration of the pdf, which is slow but doesn't
have the discontinuity

>>> ncx2.veccdf(np.arange (20, 25, 0.2), 2, 1.07458615e+02)
array([  1.21805117e-09,   1.39734318e-09,   1.60114784e-09,
         1.83255976e-09,   2.09503169e-09,   2.39241204e-09,
         2.72898607e-09,   3.10952087e-09,   3.53931444e-09,
         4.02424929e-09,   4.57085090e-09,   5.18635130e-09,
         5.87875834e-09,   6.65693102e-09,   7.53066129e-09,
         8.51076283e-09,   9.60916749e-09,   1.08390296e-08,
         1.22148392e-08,   1.37525442e-08,   1.54696828e-08,
         1.73855268e-08,   1.95212353e-08,   2.19000216e-08,
         2.45473328e-08])

Thanks for reporting, the test suite is not so fine tuned to catch these cases.

The quick fix would be to replace the call to special with the
numerical integration, but this will make the cdf much slower ( for
the cases where it is correct). I did this for some other
distributions.
But for the use as the distribution of a test statistic the jump is
irrelevant, if you reject a hypotheses with 1e-9 or 1e-21 doesn't
really make a difference.

Josef

Gmane