A new feature subset selection using bottom-up clustering

Citation Data: Pattern Analysis and Applications, ISSN: 1433-7541, Vol: 21, Issue: 1, Page: 57-66

Publication Year: 2018

Citations: 9
Usage: 0
Captures: 19
Mentions: 0
Social Media: 0


Metrics Details

Citations: 9
- Citation Indexes: 9
Captures: 19
- Readers: 19

Article Description

Feature subset selection and/or dimensionality reduction is an essential preprocessing step before any data mining task, especially when the problem space contains too many features. In this paper, a clustering-based feature subset selection (CFSS) algorithm is proposed for identifying the more relevant features. At each level of agglomeration, it uses a similarity measure among features to merge the two most similar clusters of features. By gathering similar features into clusters and then selecting representative features from each cluster, it tries to remove some redundant features. To identify the representative features, a criterion based on mutual information is proposed. Since CFSS works in a filter manner when specifying the representatives, it is noticeably fast. As an advantage of hierarchical clustering, the number of clusters does not need to be determined in advance. In CFSS, the clustering process is repeated until all features are distributed among clusters. However, to spread the features over a reasonable number of clusters, a recently proposed approach is used to obtain a suitable level for cutting the clustering tree. To assess the performance of CFSS, we applied it to several UCI datasets and compared it with some popular feature selection methods. The experimental results demonstrate the efficiency and speed of the proposed method.
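The general idea in the abstract (merge the most similar feature clusters bottom-up, then keep one representative per cluster) can be illustrated with a minimal sketch. This is not the authors' CFSS implementation: the abstract does not specify the similarity measure, the cut-level method, or the exact mutual-information criterion. The sketch assumes discrete features, uses symmetric uncertainty as the feature similarity, average linkage between clusters, a fixed `min_similarity` threshold in place of the paper's cut-level approach, and picks as representative the feature with the highest mutual information with the class labels. All function and parameter names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def similarity(xs, ys):
    """Symmetric uncertainty in [0, 1]; an assumed stand-in for the paper's measure."""
    denom = entropy(xs) + entropy(ys)
    return 2 * mutual_information(xs, ys) / denom if denom > 0 else 0.0


def cluster_features(columns, min_similarity):
    """Bottom-up clustering of feature indices with average linkage.

    Merging stops once the best available merge falls below min_similarity
    (an assumed cut rule, not the one used in the paper).
    """
    clusters = [[j] for j in range(len(columns))]

    def linkage(a, b):
        # Average pairwise similarity between the members of two clusters.
        total = sum(similarity(columns[i], columns[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best, p, q = max(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if best < min_similarity:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters


def select_representatives(columns, labels, min_similarity=0.5):
    """Return one feature index per cluster: the one most informative about labels."""
    clusters = cluster_features(columns, min_similarity)
    return sorted(
        max(cluster, key=lambda j: mutual_information(columns[j], labels))
        for cluster in clusters
    )
```

Here `columns` is a list of feature columns (one sequence of discrete values per feature) and `labels` is the class label sequence. Because the representatives are chosen from feature statistics alone, without training a classifier, the sketch stays in the filter setting the abstract describes.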

Bibliographic Details

DOI: 10.1007/s10044-016-0565-8

URL ID: http://www.scopus.com/inward/record.url?partnerID=HzOxMe3b&scp=84975141194&origin=inward; http://dx.doi.org/10.1007/s10044-016-0565-8; http://link.springer.com/10.1007/s10044-016-0565-8; https://dx.doi.org/10.1007/s10044-016-0565-8; https://link.springer.com/article/10.1007/s10044-016-0565-8

AUTHOR(S)

Zeinab Dehghan; Eghbal G. Mansoori

PUBLISHER(S)

Springer Science and Business Media LLC

TAG(S)

Computer Science
