sklearn issues a warning during the tests:

```
sklearn.exceptions.UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in samples with no predicted labels.
```

This is because

- some of the test samples generated in csrank/tests/test_choice_functions.py:trivial_choice_problem have no true positives
- some of the learners predict no positives for some of the generated problems

In both of those cases the f-measure is not properly defined. sklearn assigns 0 and 1 respectively.

How should we deal with this? A metric should be defined for these possibilities. 0 and 1 in those cases seem somewhat reasonable, so maybe we should just silence the warning?
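For what it's worth, the warning is easy to reproduce and to filter in isolation. A minimal sketch (the toy arrays are made up, not taken from the test suite, and the exact fallback behaviour may differ across sklearn versions):

```python
import warnings

from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import f1_score

# Toy multilabel sample where the learner predicts no positive labels.
y_true = [[1, 1]]
y_pred = [[0, 0]]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    score = f1_score(y_true, y_pred, average="samples")

print(score)  # 0.0 -- the value sklearn substitutes
assert any(w.category is UndefinedMetricWarning for w in caught)

# To silence exactly this warning in the test suite instead:
warnings.filterwarnings("ignore", category=UndefinedMetricWarning)
```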
We should avoid the first problem by generating test samples that cannot consist of only negatives. Assigning a 1 in these cases would be sensible in general.
Regarding the second case: Assigning 0 here is sensible, since the learner achieved no true positive.
Note: My version of sklearn (0.20.2) returns 0.0 for both cases.
You're right, sklearn returns 0.0 for both cases. The more I think about this, the less sure I am that defining values for these cases is a good idea. The implementation is also not straightforward, since we would have to do some of the work that we currently outsource to scipy.
Here are the tests I came up with:
1. There are no true positives but some predicted positives; e.g. "infinite recall".

   ```
   >>> f1_measure([[False, False]], [[True, True]])
   0.0
   ```

2. There are no predicted positives but some true positives; e.g. 0 recall, 0 precision.

   ```
   >>> f1_measure([[True, True]], [[False, False]])
   0.0
   ```

3. There are neither true nor predicted positives, e.g. all predictions are correct:

   ```
   >>> f1_measure([[False, False]], [[False, False]])
   1.0
   ```
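For reference, writing out the standard definition F1 = 2·TP / (2·TP + FP + FN) makes the degeneracies explicit: in (1) recall = TP / (TP + FN) = 0/0, in (2) precision = TP / (TP + FP) = 0/0, and in (3) all of TP, FP, and FN vanish, so even the combined fraction is 0/0. Notably, in (1) and (2) the combined form still evaluates to 0, which is presumably why sklearn falls back to 0.0 there.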
(2) and (3) seem pretty clear-cut, but (1) should really depend on how many labels were predicted positive. Should we sidestep the issue by just defining cases (2) and (3) and continuing to throw a warning in (1)?
Of those three cases, (2) is an obvious 0.0.
For (3) the value 1.0 is sensible, but I would still throw a warning, since having no positives in an instance might hint at a problem in the dataset.
Similarly, I would return 0.0 for (1) and raise a warning.
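A minimal sketch of that convention (the name `f1_measure`, the warning texts, and the fallback to sklearn's `f1_score` are assumptions for illustration; csrank's actual implementation may differ):

```python
import warnings

import numpy as np
from sklearn.metrics import f1_score


def f1_measure(y_true, y_pred):
    """Sample-averaged F1 with the degenerate cases handled explicitly."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)

    scores = np.empty(len(y_true))
    for i, (true_row, pred_row) in enumerate(zip(y_true, y_pred)):
        if not true_row.any() and not pred_row.any():
            # Case (3): no positives at all -- every prediction is correct,
            # but an instance without positives may hint at a dataset problem.
            warnings.warn("instance has no positive labels at all")
            scores[i] = 1.0
        elif not true_row.any():
            # Case (1): predicted positives, but no true positives exist.
            warnings.warn("instance has no true positive labels")
            scores[i] = 0.0
        elif not pred_row.any():
            # Case (2): true positives exist, but nothing was predicted.
            scores[i] = 0.0
        else:
            # Well-defined case: defer to sklearn.
            scores[i] = f1_score(true_row.astype(int), pred_row.astype(int))
    return scores.mean()
```

With this, the three doctests above would return 0.0, 0.0, and 1.0 respectively, while (1) and (3) still emit a warning.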