PlumX Metrics

Handling data irregularities in classification: Foundations, trends, and future challenges

Pattern Recognition, ISSN: 0031-3203, Vol: 81, Page: 674-693
2018
  • Citations: 172
  • Usage: 1
  • Captures: 140
  • Mentions: 1
  • Social Media: 0

Metrics Details

  • Citations
    172
    • Citation Indexes
      170
    • Policy Citations
      2
  • Usage
    1
  • Captures
    140
  • Mentions
    1
    • News Mentions
      1

Most Recent News

Data Quality Issues that Kill Your Machine Learning Models

Data Quality Chronicles: Navigating the complexity of imperfect data. This is a column series that focuses on data quality for data science. This constitutes the…

Article Description

Most of the traditional pattern classifiers assume their input data to be well-behaved in terms of similar underlying class distributions, balanced size of classes, the presence of a full set of observed features in all data instances, etc. Practical datasets, however, show up with various forms of irregularities that are, very often, sufficient to confuse a classifier, thus degrading its ability to learn from the data. In this article, we provide a bird’s eye view of such data irregularities, beginning with a taxonomy and characterization of various distribution-based and feature-based irregularities. Subsequently, we discuss the notable and recent approaches that have been taken to make the existing stand-alone as well as ensemble classifiers robust against such irregularities. We also discuss the interrelation and co-occurrences of the data irregularities including class imbalance, small disjuncts, class skew, missing features, and absent (non-existing or undefined) features. Finally, we uncover a number of interesting future research avenues that are equally contextual with respect to the regular as well as deep machine learning paradigms.
