
Title page for ETD etd-02162011-101818


Type of Document Dissertation
Author Li, Xiaoyun
URN etd-02162011-101818
Title Analysis of Multivariate Data with Random Cluster Size
Degree Doctor of Philosophy
Department Statistics, Department of
Advisory Committee
Advisor Name Title
Debajyoti Sinha Committee Chair
Dan McGee Committee Member
Stuart Lipsitz Committee Member
Yi Zhou University Representative
Keywords
  • Clustered data
  • Longitudinal data analysis
  • Informative missing
  • Categorical data analysis
  • Logistic regression
  • Bridge distribution
Date of Defense 2010-12-02
Availability unrestricted
Abstract
In this dissertation, we examine correlated binary data with present/absent components
or missing data that are related to the binary responses of interest.
Depending on the data structure, correlated binary data can be referred to as clustered
data if the sampling unit is a cluster of subjects, or as longitudinal
data when they involve repeated measurements of the same subject over time. We propose novel
models for these two data structures and illustrate them with real data applications.

In biomedical studies involving clustered binary responses, the
cluster size can vary because some components of the cluster can be absent.
When both the presence of a cluster component and the binary disease status of a present
component are treated as responses of interest, we propose a novel
two-stage random effects logistic regression framework. For ease
of interpretation of the regression effects, both the marginal
probability of presence/absence of a component and the
conditional probability of disease status of a present component
preserve approximate logistic regression forms. We present a
maximum likelihood method of estimation that can be implemented using standard
statistical software. We compare our models and the physical
interpretation of regression effects with competing methods from
the literature. We also present a simulation study to assess the
robustness of our procedure to misspecification of the random
effects distribution and to compare the finite-sample performance of the
estimates with existing methods. The methodology is illustrated by
analyzing a study of periodontal health status in a diabetic
Gullah population.
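
To make the two-stage data structure concrete, the following is a minimal simulation sketch, not the dissertation's fitted model. It assumes a single cluster-level random effect shared by the presence stage and the disease stage, and it draws that effect from a normal distribution purely for simplicity (the keywords above suggest the dissertation works with a bridge distribution instead). The function name, the coefficient values, the scaling factor gamma, and the choice of 28 potential components per cluster are all hypothetical.

```python
import math
import random


def expit(x):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_cluster(x, n_components, rng,
                     alpha=(0.5, -0.8), beta=(-1.0, 0.6),
                     sigma=1.0, gamma=1.0):
    """Simulate one cluster under a hypothetical two-stage model.

    Stage 1: each potential component is present with probability
             expit(alpha0 + alpha1 * x + b).
    Stage 2: each present component is diseased with probability
             expit(beta0 + beta1 * x + gamma * b).
    All parameter values are illustrative assumptions.
    """
    # Cluster-level random effect; the normal distribution is an assumption.
    b = rng.gauss(0.0, sigma)
    present, disease = [], []
    for _ in range(n_components):
        z = 1 if rng.random() < expit(alpha[0] + alpha[1] * x + b) else 0
        present.append(z)
        if z == 1:
            p_disease = expit(beta[0] + beta[1] * x + gamma * b)
            disease.append(1 if rng.random() < p_disease else 0)
        else:
            # Disease status is undefined for an absent component.
            disease.append(None)
    return present, disease


rng = random.Random(2010)
# 200 clusters with a binary cluster-level covariate x (hypothetical design).
clusters = [simulate_cluster(x=i % 2, n_components=28, rng=rng)
            for i in range(200)]
# The observed cluster size is random: the number of present components.
sizes = [sum(present) for present, _ in clusters]
```

Because the same random effect enters both stages, clusters with few present components also tend to differ in disease risk, which is the kind of dependence between cluster size and response that the framework above is meant to capture.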

We extend this model to longitudinal studies with a binary longitudinal response
and informative missing data. In longitudinal studies, when each subject is treated as a cluster, the cluster size is
the total number of observations for that subject.
When data are informatively missing, the cluster size of each subject can vary and is related to the binary
response of interest, and the missingness mechanism is itself of interest. This is a modified
version of the clustered binary data setting with present/absent components. We modify and adapt our proposed
two-stage random effects logistic regression model so that the marginal probabilities
of the binary response and the missing indicator, as well as the conditional probabilities of the binary response
and the missing indicator, preserve logistic regression forms. We present a Bayesian framework for this model
and illustrate it with an AIDS data example.

Files
  Filename                     Size
  Li_X_Dissertation_2011.pdf   616.03 Kb

  Approximate Download Time (Hours:Minutes:Seconds)
  28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  00:02:51     00:01:28    00:01:17       00:00:38        00:00:03
