How to interpret F-measure values?
I would like to know how to interpret a difference in F-measure values. I know that the F-measure is a balanced mean (the harmonic mean) of precision and recall, but I am asking about the practical meaning of a difference in F-measures.
For example, if a classifier C1 has an accuracy of 0.4 and another classifier C2 an accuracy of 0.8, then we can say that C2 has correctly classified twice as many test examples as C1. However, if a classifier C1 has an F-measure of 0.4 for a certain class and another classifier C2 an F-measure of 0.8, what can we state about the difference in performance of the two classifiers? Can we say that C2 has classified X more instances correctly than C1?
I cannot think of an intuitive meaning of the F-measure, because it's just a combined metric. What's more intuitive than the F-measure, of course, is precision and recall individually.
But with two separate values, we often cannot determine whether one algorithm is superior to another. For example, if one algorithm has higher precision but lower recall than the other, how can you tell which algorithm is better?
If you have a specific goal in mind like "precision is king; I don't care much about recall", then there's no problem: higher precision is better. But if you don't have such a strong preference, you will want a combined metric. That's the F-measure. By using it, you compare a blend of precision and recall.
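To make the trade-off concrete, here is a small sketch (the precision/recall numbers are made up for illustration) showing how the F1 score, the harmonic mean of precision and recall, combines the two into a single number that can be compared across classifiers:

```python
def f1(precision, recall):
    """F1 score: the harmonic mean of precision and recall,
    F1 = 2 * P * R / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical classifiers with different precision/recall trade-offs:
# A is very precise but misses many positives; B is more balanced.
f1_a = f1(0.9, 0.3)  # 0.45
f1_b = f1(0.6, 0.6)  # 0.60
```

Note that because F1 is a harmonic mean, it is pulled toward the smaller of the two values: classifier A's high precision cannot compensate for its low recall, so the more balanced classifier B scores higher.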
ROC curves are another common way to compare classifiers. You may find this article interesting, as it explains several measures including ROC curves: http://binf.gmu.edu/mmasso/ROC101.pdf