Inflated Effect Sizes, Underpowered Tests and the Severity Measure of Evidence
The severity score is particularly high for hypotheses that differ substantially from the null hypothesis when a significant result is obtained with an underpowered test. According to that measure, such hypotheses are very well supported by the evidence. However, it is now well documented that significant results from low-powered tests display inflated effect sizes: they systematically show departures from the null hypothesis H0 that are much greater than the true departures. This is problematic in research contexts where the difference between H0 and H1 is particularly small and where the sample size is also small. In this paper I argue that the severity score is an inadequate measure of evidence and that it should be rejected. The reason is that it is sensitive to the inflated effect sizes produced by underpowered significant tests: inflated effect sizes also inflate severity scores.
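The mechanism can be illustrated with a minimal numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: a one-sided z-test of H0: mu = 0 against mu > 0 with known sigma, a true effect of 0.2, a sample of 10, and the severity of the claim "mu > mu1" after a rejection computed as P(sample mean <= observed mean; mu = mu1), which is one common formulation of the severity score. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power(mu_true, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mu = 0 (one-sided) when the true mean is mu_true."""
    se = sigma / sqrt(n)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(z_crit - mu_true / se)


def mean_significant_estimate(mu_true, sigma, n, alpha=0.05):
    """Expected sample mean among significant results only.

    This is the mean of a normal distribution truncated from below
    at the rejection threshold.
    """
    se = sigma / sqrt(n)
    a = STD_NORMAL.inv_cdf(1 - alpha) - mu_true / se  # standardized threshold
    return mu_true + se * STD_NORMAL.pdf(a) / (1 - STD_NORMAL.cdf(a))


def severity(xbar_obs, mu1, sigma, n):
    """Assumed severity of the claim 'mu > mu1' given a rejection with mean xbar_obs."""
    se = sigma / sqrt(n)
    return STD_NORMAL.cdf((xbar_obs - mu1) / se)


# Illustrative (assumed) scenario: small true effect, small sample.
mu_true, sigma, n = 0.2, 1.0, 10

pw = power(mu_true, sigma, n)
xbar_sig = mean_significant_estimate(mu_true, sigma, n)
sev = severity(xbar_sig, mu1=0.4, sigma=sigma, n=n)

print(f"power: {pw:.2f}")
print(f"typical significant estimate: {xbar_sig:.2f} (true effect {mu_true})")
print(f"severity of 'mu > 0.4': {sev:.2f}")
```

Under these assumptions the power is about 0.16, the typical significant estimate is about 0.69 (more than three times the true effect of 0.2), and the claim "mu > 0.4" receives a severity of about 0.82 even though it is false. The inflation of the estimate carries over directly into the severity score.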