Group variable selection methods and their applications in analyses of genomic data
Page: 1-93
2009
Thesis / Dissertation Description
Variable selection methods are powerful tools for analyzing massive, high-dimensional data. In bioinformatics, they have often been applied to gene expression microarray data to reduce dimensionality and select important features. It is well known that genes participating in a common biological pathway or sharing a similar function can be very highly correlated. However, most available variable selection methods cannot handle such complicated interdependence in the data. We propose three new methods, following two different approaches, that select groups of variables in regression models. First, we propose two new selection algorithms, gLars and gRidge, which follow the forward selection procedure of LARS. These algorithms group and select variables at the same time and require no prior information on the group structure of the variables. The third method, SCAD_ℓ2, is a penalized regression method. Lasso, a popular regularization approach, uses an L1 penalty. Elastic net combines L1 and L2 penalties to incorporate group effects among the variables. However, both yield biased coefficient estimators, and this bias interferes with variable selection. Fan and Li (2001) proposed a non-concave penalty function called SCAD with many good properties, including unbiasedness. Our new method, SCAD_ℓ2, combines the SCAD and L2 penalties, so it favors group effects while retaining the good properties of SCAD. Simulations show that our proposed methods often outperform existing variable selection methods, including Lasso, LARS, SCAD and elastic net, in both reducing prediction error and preserving model sparsity, while yielding additional group information. We apply the proposed methods to gene expression microarray data and genetic variant SNP data. The group variable selection models are more appropriate than other existing methods for genomic data with complicated interdependent structures.
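To make the penalties named in the description concrete, the following is a minimal Python sketch, not the thesis implementation. The SCAD formula is the standard piecewise form from Fan and Li (2001) with the commonly used default a = 3.7. The function names, the parameter names `lam1` and `lam2`, and the way the L2 term is weighted in `scad_l2_penalty` are assumptions made for illustration; the description only states that SCAD_ℓ2 combines the SCAD and L2 penalties.

```python
def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty (Fan and Li, 2001) for one coefficient."""
    t = abs(theta)
    if t <= lam:
        # Near zero, SCAD behaves like the Lasso (L1) penalty.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region between the L1 part and the flat part.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Beyond a * lam the penalty is constant, so large coefficients
    # receive no additional shrinkage.
    return lam * lam * (a + 1) / 2


def lasso_penalty(beta, lam):
    """L1 penalty: lam * sum |beta_j|."""
    return lam * sum(abs(b) for b in beta)


def elastic_net_penalty(beta, lam1, lam2):
    """L1 plus squared-L2 penalty."""
    return lam1 * sum(abs(b) for b in beta) + lam2 * sum(b * b for b in beta)


def scad_l2_penalty(beta, lam1, lam2, a=3.7):
    """Assumed form of a combined penalty: SCAD on each coefficient
    plus a squared-L2 term (hypothetical weighting, for illustration)."""
    scad_part = sum(scad_penalty(b, lam1, a) for b in beta)
    return scad_part + lam2 * sum(b * b for b in beta)


if __name__ == "__main__":
    beta = [0.5, 2.0, 10.0]
    print("lasso      :", lasso_penalty(beta, 1.0))
    print("elastic net:", elastic_net_penalty(beta, 1.0, 0.1))
    print("SCAD + L2  :", scad_l2_penalty(beta, 1.0, 0.1))
```

The SCAD penalty stops growing once a coefficient exceeds a·λ, whereas the L1 penalty keeps growing linearly. This flat tail is the reason SCAD avoids the bias on large coefficients that the description attributes to Lasso and elastic net.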