Towards Better Ways to Assess Predictive Computing in Medicine: On Reliability, Robustness, and Utility

Citation Data: Big Data Analysis and Artificial Intelligence for Medical Sciences, Pages 309-337

Publication Year: 2024


Book Chapter Description

Computational classification systems built using machine learning (ML) techniques are increasingly being evaluated and employed in medical settings for a number of purposes, including diagnosis, prognosis, and risk stratification. However, evaluation and validation practices that are commonly adopted when ML is applied in other disciplines are unlikely to be meaningfully applicable to medicine. In fact, otherwise technically sound systems have been found to perform poorly in real settings, a problem that has been termed the “last mile of implementation.” In this chapter, we focus on three main factors underlying the so-called last mile: the impact of observer variability on ground truth reliability; the meaningfulness and appropriateness of commonly adopted performance measures; and the issue of replicability in ML studies. We discuss these issues and delineate possible solutions and concepts to address them.

Bibliographic Details

DOI: 10.1002/9781119846567.ch14

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=85200282115&origin=inward; http://dx.doi.org/10.1002/9781119846567.ch14; https://onlinelibrary.wiley.com/doi/10.1002/9781119846567.ch14

AUTHOR(S)

Federico Cabitza; Andrea Campagner

PUBLISHER(S)

Wiley

TAG(S)

Medicine; Biochemistry, Genetics and Molecular Biology; Agricultural and Biological Sciences
