Type of Document: Dissertation
Author: Thompson, Warren Robert
Author's Email Address: thompson@stat.fsu.edu
URN: etd-08222009-111751
Title: Variable Selection Of Correlated Predictors In Logistic Regression: Investigating The Diet-Heart Hypothesis
Degree: Doctor of Philosophy
Department: Statistics, Department of
Advisory Committee (Advisor Name, Title)
- Daniel McGee, Committee Chair
- Debajyoti Sinha, Committee Member
- Fred Huffer, Committee Member
- Yiyuan She, Committee Member
- Isaac Eberstein, Outside Committee Member
Keywords
- Logistic Regression
- Bootstrap
- Lasso
- Ridge Regression
- Bayesian Model Averaging
- Diet-Heart Hypothesis
Date of Defense: 2009-08-10
Availability: unrestricted
Abstract
Variable selection is an important aspect of modeling. Its aim is to distinguish between the authentic variables which are important in predicting outcome, and the noise variables which possess little to no predictive value. In other words, the goal is to find the variables that (collectively) best explain and predict changes in the outcome variable. The variable selection problem is exacerbated when correlated variables are included in the covariate set. This dissertation examines the variable selection problem in the context of logistic regression. Specifically, we investigated the merits of the bootstrap, ridge regression, the lasso and Bayesian model averaging (BMA) as variable selection techniques when highly correlated predictors and a dichotomous outcome are considered.
This dissertation also contributes to the literature on the diet-heart hypothesis. The diet-heart hypothesis has been around since the early twentieth century. Since then, researchers have attempted to isolate the nutrients in diet that promote coronary heart disease (CHD). After a century of research, there is still no consensus. In our current research, we used some of the more recent statistical methodologies (mentioned above) to investigate the effect of twenty dietary variables on the incidence of coronary heart disease. Logistic regression models were generated for the data from the Honolulu Heart Program - a study of CHD incidence in men of Japanese descent.
Our results were largely method-specific. However, regardless of the method considered, there was strong evidence to suggest that alcohol consumption has a strong protective effect on the risk of coronary heart disease. Of the variables considered, dietary cholesterol and caffeine were the only variables that, at best, exhibited a moderately strong harmful association with CHD incidence. Further investigation that includes a broader array of food groups is recommended.
Files
Filename: Thompson_W_Dissertation_2009s.pdf
Size: 712.17 Kb
Approximate Download Time (Hours:Minutes:Seconds)
- 28.8 Modem: 00:03:17
- 56K Modem: 00:01:41
- ISDN (64 Kb): 00:01:29
- ISDN (128 Kb): 00:00:44
- Higher-speed Access: 00:00:03