Interpretable and accurate medical data classification – a multi-objective genetic-fuzzy optimization approach

Citation data:

Expert Systems with Applications, ISSN: 0957-4174, Vol: 71, Page: 26-39

Publication Year:
2017
Usage 1843
Clicks 1543
Abstract Views 297
Link-outs 3
Captures 32
Readers 32
Mentions 96
Q&A Site Mentions 53
Blog Mentions 33
News Mentions 7
References 3
Social Media 1361
Shares, Likes & Comments 1351
Tweets 10
Citations 5
Citation Indexes 5
DOI:
10.1016/j.eswa.2016.11.017
Author(s):
Marian B. Gorzałczany; Filip Rudziński
Publisher(s):
Elsevier BV
Tags:
Engineering; Computer Science
Article description:
In medical decision support systems, both accuracy (i.e., the ability to adequately represent the decision-making processes) and transparency and interpretability (i.e., the ability to provide a domain user with a compact and understandable explanation and justification of the proposed decisions) play essential roles. This paper presents an approach for the automatic design of fuzzy rule-based classification systems (FRBCSs) from medical data using multi-objective evolutionary optimization algorithms (MOEOAs). Our approach generates, in a single run, a collection of solutions (medical FRBCSs) characterized by various levels of accuracy-interpretability trade-off. We propose a new complexity-related interpretability measure, and we address the semantics-related interpretability issue by means of an efficient implementation of the so-called strong fuzzy partitions of attribute domains. We also introduce a special-coding-free representation of the rule base and original genetic operators for its processing, and we implement our ideas in the context of the well-known Non-dominated Sorting Genetic Algorithm II (NSGA-II), one of the presently most advanced MOEOAs. An important part of the paper is devoted to a broad comparative analysis of our approach and as many as 26 alternative techniques arranged in 32 experimental set-ups and applied to three well-known benchmark medical data sets (Breast Cancer Wisconsin (Original), Pima Indians Diabetes, and Heart Disease (Cleveland)) available from the UCI repository of machine learning databases (http://archive.ics.uci.edu/ml). A number of performance measures useful in medical applications, including accuracy, sensitivity, and specificity, as well as several interpretability measures, are employed.
The results of such a broad comparative analysis demonstrate that our approach significantly outperforms the alternative methods in terms of the interpretability of the obtained FRBCSs while remaining either competitive or superior in terms of their accuracy. It is worth stressing that the overwhelming majority of the existing medical classification methods concentrate almost exclusively on the accuracy issues.
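
The accuracy-interpretability trade-off described above can be illustrated with a minimal sketch of Pareto (non-dominated) filtering, the basic idea underlying NSGA-II. This is not the authors' implementation: the two objectives (error rate and a generic complexity score, both minimized), the function names, and the candidate values are hypothetical and chosen only for illustration. The full NSGA-II additionally ranks solutions into successive fronts and uses a crowding-distance measure, which this sketch omits.

```python
from typing import List, Tuple

# Hypothetical objective pair: (error_rate, complexity). Both are minimized.
# The complexity score here is a stand-in, not the paper's interpretability measure.
Solution = Tuple[float, float]


def dominates(a: Solution, b: Solution) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions if other != s)
    ]


# Made-up candidate classifiers, for illustration only.
candidates = [
    (0.03, 0.40),  # most accurate, fairly complex
    (0.05, 0.10),  # balanced
    (0.04, 0.45),  # dominated by (0.03, 0.40)
    (0.10, 0.05),  # simplest, least accurate
    (0.06, 0.12),  # dominated by (0.05, 0.10)
]

front = pareto_front(candidates)
print(front)
```

Each solution left in `front` represents a different compromise: none can be made more accurate without becoming more complex, or simpler without losing accuracy. A domain user could then pick the compromise that suits the application.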