2 Jan 2011 00:45
Re: Some Confusion with Evaluation framework in Weka API
Bernhard Pfahringer <bernhard.pfahringer <at> gmail.com>
2011-01-01 23:45:38 GMT
> Thanks Bernhard for the quick reply. I think I am overlooking some basic idea.
> Please explain what is wrong with my assumption: given the same confusion
> matrix (M1 and M2) in both cases, with the same data and algorithm but with
> different random values (8 and 9, as before), why am I getting two different
> values for AUC and mean absolute error?

AUC is about ranking, using the probs to sort your examples. Accuracy (and
the confusion matrix) depends on a specific threshold. So if your
probabilities "sort" the examples differently in different runs, but each
example stays on the same side of the threshold, you can get the exact same
accuracy, but different AUC values.

> I was curious because with random seed 1 I get an AUC of 0.937 and with 3 I
> get 0.874. The confusion matrix is the same in both cases. Which value
> should I trust, and why?

I suppose you are using a rather "unstable" algorithm, and/or a small number
of examples, and/or have a high number of class values. What you experience
is that cross-validation has some variance as well. If the variance is as
high as it seems in your case, I'd repeat at least ten times with a new seed
each time and take the average. BTW, this is the default for the
Experimenter: 10x10-fold cross-validation (ten repetitions of 10-fold
cross-validation), to get more robust estimates.

hth, Bernhard

---------------------------------------------------------------------
Bernhard Pfahringer, Dept. of Computer Science, University of Waikato
http://www.cs.waikato.ac.nz/~bernhard                  +64 7 838 4041
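
To make the ranking-versus-threshold point concrete, here is a minimal sketch
in plain Java. It does not use the Weka API. The class name, the six
hand-picked scores per "run", and the 0.5 threshold are all assumptions chosen
for illustration, not results from any real experiment. AUC is computed as the
fraction of (positive, negative) pairs that are ranked correctly, and mean
absolute error is simplified to the mean of |score - label| for a two-class
problem.

```java
import java.util.Arrays;

// Illustrative only: hand-picked scores, not output from Weka.
class AucVsConfusionMatrix {

    // Confusion matrix counts {TP, FN, FP, TN} at a fixed threshold.
    static int[] confusion(double[] scores, boolean[] positive, double threshold) {
        int[] m = new int[4];
        for (int i = 0; i < scores.length; i++) {
            boolean predictedPositive = scores[i] >= threshold;
            if (positive[i]) {
                m[predictedPositive ? 0 : 1]++;
            } else {
                m[predictedPositive ? 2 : 3]++;
            }
        }
        return m;
    }

    // AUC as the fraction of (positive, negative) pairs in which the
    // positive example gets the higher score; ties count as half.
    static double auc(double[] scores, boolean[] positive) {
        double correct = 0.0;
        int pairs = 0;
        for (int i = 0; i < scores.length; i++) {
            if (!positive[i]) continue;
            for (int j = 0; j < scores.length; j++) {
                if (positive[j]) continue;
                pairs++;
                if (scores[i] > scores[j]) correct += 1.0;
                else if (scores[i] == scores[j]) correct += 0.5;
            }
        }
        return correct / pairs;
    }

    // Simplified two-class MAE: mean of |score - label|, label being 1 or 0.
    static double meanAbsoluteError(double[] scores, boolean[] positive) {
        double sum = 0.0;
        for (int i = 0; i < scores.length; i++) {
            sum += Math.abs(scores[i] - (positive[i] ? 1.0 : 0.0));
        }
        return sum / scores.length;
    }

    public static void main(String[] args) {
        boolean[] labels = {true, true, true, false, false, false};
        // Two hypothetical runs: every example stays on the same side of
        // 0.5, but the ordering of the scores differs between the runs.
        double[] runA = {0.90, 0.80, 0.40, 0.60, 0.30, 0.20};
        double[] runB = {0.90, 0.55, 0.10, 0.95, 0.45, 0.40};

        for (double[] run : new double[][] {runA, runB}) {
            System.out.println("confusion {TP,FN,FP,TN} = "
                    + Arrays.toString(confusion(run, labels, 0.5))
                    + ", AUC = " + auc(run, labels)
                    + ", MAE = " + meanAbsoluteError(run, labels));
        }
    }
}
```

Both runs give the same confusion matrix (TP=2, FN=1, FP=1, TN=2), but the
first ranks 8 of the 9 positive/negative pairs correctly and the second only
4 of 9, so the AUC values differ. The simplified mean absolute error differs
as well, because it is also computed from the scores rather than from the
thresholded predictions.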