Divide and recombine for large complex data: The subset likelihood modeling approach to recombination

Citation Data

Page: 1-79

Publication Year: 2015

Thesis / Dissertation Description

Divide and recombine (D&R) is a statistical framework for the analysis of large complex data. The data are divided into subsets. Numeric and visualization methods, collectively called analytic methods, are applied to each subset. For each analytic method, the outputs from applying the method to the subsets are recombined. Each analytic method therefore has an associated division method and recombination method. Here we study D&R methods for likelihood-based model fitting. We introduce a notion of likelihood analysis and modeling. We divide the data and fit a likelihood model on each subset. The fitted model is characterized by a set of parameters much smaller than the subset data size, but retains as much information as possible about the true subset likelihood. Analysis of subset likelihoods and their fitted models consists of visualizations on an appropriate scale and region. These visualizations allow the analyst to verify the choice and fit of the model. The fitted models are recombined across subsets to form a model of the all-data likelihood, which we maximize to obtain a likelihood modeling estimate (LME). We present simulation results comparing the performance of our method with the all-data maximum likelihood estimate (MLE) for the case of logistic regression.
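
The divide, fit, and recombine steps described above can be illustrated with a minimal sketch for logistic regression. The abstract does not specify the form of the subset likelihood model, so this sketch assumes a simple stand-in: each subset log-likelihood is modeled as a quadratic around the subset MLE, parameterized by the subset estimate and its observed information matrix. Under that assumption, maximizing the sum of the subset models reduces to an information-weighted average of the subset estimates. The function names (`fit_logistic`, `recombine`), the simulated data, and the choice of 10 subsets are all hypothetical and are not taken from the dissertation.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson MLE for logistic regression.

    Returns the estimate and the observed information matrix at the estimate.
    Together these define a quadratic model of the log-likelihood (an
    assumption of this sketch, not necessarily the dissertation's model).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        info = X.T @ (X * (p * (1.0 - p))[:, None])  # observed information
        grad = X.T @ (y - p)                          # score vector
        beta = beta + np.linalg.solve(info, grad)
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    info = X.T @ (X * (p * (1.0 - p))[:, None])
    return beta, info


def recombine(fits):
    """Maximize the sum of quadratic subset models.

    For quadratic models the maximizer is the information-weighted
    average of the subset estimates.
    """
    total_info = sum(info for _, info in fits)
    weighted = sum(info @ beta for beta, info in fits)
    return np.linalg.solve(total_info, weighted)


# Simulated data (hypothetical sizes and coefficients).
rng = np.random.default_rng(0)
n, n_subsets = 20000, 10
true_beta = np.array([-0.5, 1.0, -2.0])
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

# Divide: split rows into subsets and fit a model on each one.
subset_fits = [
    fit_logistic(Xk, yk)
    for Xk, yk in zip(np.array_split(X, n_subsets), np.array_split(y, n_subsets))
]

# Recombine: maximize the modeled all-data likelihood.
lme = recombine(subset_fits)

# Reference: the all-data MLE.
mle, _ = fit_logistic(X, y)

print("recombined estimate:", lme)
print("all-data MLE:       ", mle)
```

Each subset is summarized by one coefficient vector and one small matrix, far fewer numbers than the subset itself contains, which mirrors the abstract's requirement that the fitted model be much smaller than the subset data. A quadratic model is only adequate when each subset is large enough for its log-likelihood to be close to quadratic near its maximum.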

Bibliographic Details

REPOSITORY URL: https://docs.lib.purdue.edu/dissertations/AAI3719180

URL ID: https://docs.lib.purdue.edu/dissertations/AAI3719180; https://docs.lib.purdue.edu/cgi/viewcontent.cgi?article=16279&context=dissertations

AUTHOR(S)

Philip Gautier
