Hierarchical Bayesian Regression with Application in Spatial Modeling and Outlier Detection
 Publication Year:
 2018

 Repository URL:
 http://scholarworks.uark.edu/etd/2669
 Author(s):
 Tags:
 Bayesian Analysis; CAR Prior; MCMC; Outliers Detection; Reversible Jump MCMC; Spatial Data; Applied Mathematics; Statistics and Probability
thesis / dissertation description
This dissertation makes two contributions to the development of Bayesian hierarchical models. The first concerns spatial modeling. Spatial data observed on a set of areal units are common in scientific applications. The usual hierarchical approach to modeling such data is to introduce a spatial random effect with an autoregressive prior. However, the usual Markov chain Monte Carlo (MCMC) scheme for this framework requires the spatial effects to be sampled from their full conditional posteriors one by one, which results in poor mixing. More importantly, it makes the model computationally inefficient for datasets with a large number of units. In this dissertation, we propose a Bayesian approach that uses the spectral structure of the adjacency matrix to construct a low-rank expansion for modeling spatial dependence. We develop a computationally efficient estimation scheme that adaptively selects the basis functions most important for capturing the variation in the response. Through simulation studies, we validate the computational efficiency and predictive accuracy of our method. Finally, we present a real-world application of the proposed methodology to a massive plant abundance dataset from the Cape Floristic Region in South Africa.
The second contribution is a heavy-tailed hierarchical regression model for detecting outliers. We aim to build a linear model that allows for both small and large residual magnitudes through an observation-specific error distribution. The t-distribution is well suited to this purpose because its degrees of freedom (df) can be controlled parametrically to tune the heaviness of its tails: large df values represent observations in the normal range, and small ones represent potential outliers with high error magnitudes. In a hierarchical structure, the t-distribution can be written as a scale mixture of Gaussian distributions, so the standard MCMC algorithm for the Gaussian setting can still be used. Post-MCMC, the posterior mean of the degrees of freedom for any observation acts as a measure of the outlyingness of that observation. We implemented this method on a real dataset consisting of biometric records.
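The scale-mixture representation of the t-distribution mentioned in the description can be illustrated with a small simulation. This is a minimal sketch, not the dissertation's implementation: it assumes the textbook construction in which a latent scale lambda is drawn from a Gamma distribution with shape nu/2 and rate nu/2, and the error is then Gaussian with variance sigma^2 / lambda. The function name `sample_t_scale_mixture` and the settings (nu = 10, sigma = 1, fixed seed) are hypothetical choices made for illustration only.

```python
import random
import statistics


def sample_t_scale_mixture(nu, sigma=1.0, n=100_000, seed=0):
    """Draw n errors whose marginal law is Student-t with nu degrees of freedom.

    Assumed construction (standard, not taken from the dissertation):
        lam ~ Gamma(shape = nu / 2, rate = nu / 2)
        eps | lam ~ Normal(0, sigma^2 / lam)
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu.
        lam = rng.gammavariate(nu / 2.0, 2.0 / nu)
        # Conditional on lam the error is Gaussian, which is why Gaussian
        # MCMC updates remain usable inside the hierarchy.
        draws.append(rng.gauss(0.0, sigma / lam ** 0.5))
    return draws


nu = 10.0
draws = sample_t_scale_mixture(nu)

# For nu > 2 the t-distribution has variance nu / (nu - 2), here 1.25.
sample_var = statistics.pvariance(draws)
theoretical_var = nu / (nu - 2.0)

# Share of draws beyond 3 in absolute value; a standard normal would give
# roughly 0.0027, so a visibly larger share reflects the heavier tails.
tail_frac = sum(abs(e) > 3.0 for e in draws) / len(draws)

print(f"sample variance      : {sample_var:.3f}")
print(f"theoretical variance : {theoretical_var:.3f}")
print(f"share with |eps| > 3 : {tail_frac:.4f}")
```

Lowering `nu` in this sketch makes the tail share grow, which matches the description's reading of small df values as a signal of potential outliers.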