Type of Document: Dissertation
Author: Ejaz, Masood
Author's Email Address: email@example.com
URN: etd-04182008-123550
Title: A Framework for Implementing Independent Component Analysis Algorithms
Degree: Doctor of Philosophy
Department: Electrical and Computer Engineering, Department of
Advisory Committee (Advisor Name, Title)
- Anke Meyer-Baese, Committee Co-Chair
- Simon Y. Foo, Committee Co-Chair
- Xiuwen Liu, Committee Member
Keywords
- Canonical Correlation Analysis
- Statistical Analysis
- Independent Component Analysis
- Extended Infomax
- Kernel ICA
Date of Defense: 2008-04-09
Availability: unrestricted
Abstract
Independent Component Analysis (ICA) is a statistical and computational technique for revealing hidden factors that underlie sets of random variables, measurements, or signals. ICA defines a generative model for the observed multivariate data, which is generally given as a large database of samples. In the model, the data samples are assumed to be linear or non-linear mixtures of some unknown latent variables (time-dependent or time-independent), and the mixing system is also unknown. The latent variables, if time-independent, are assumed to have a non-gaussian distribution. For variables that have a particular time structure, the non-gaussianity condition can be relaxed. The latent variables are also assumed to be mutually independent. These variables are called the independent components of the observed data and can be found, up to some degree of accuracy, using different algorithms based on ICA techniques.
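For the linear case, the generative model described above is commonly written as shown below. The symbols x, A, s, and W are notation assumed here for illustration only; the abstract itself does not fix any notation.

```latex
\mathbf{x} = \mathbf{A}\,\mathbf{s}, \qquad \hat{\mathbf{s}} = \mathbf{W}\,\mathbf{x}
```

Here x is the vector of observed data, A is the unknown mixing matrix, s holds the mutually independent latent variables, and W is the unmixing matrix that an ICA algorithm estimates. In this standard formulation the components can be recovered only up to their order and scale.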
Several algorithms, based on different approaches to ICA, are in wide use for all sorts of applications. These include, but are not limited to, the popular FastICA, FOBI (Fourth-Order Blind Identification) and JADE (Joint Approximate Diagonalization of Eigen-Matrices), Maximum Likelihood and Infomax, kernel-based algorithms, and SOBI (Second-Order Blind Identification). All of these algorithms except SOBI are used for time-independent data.
The main purpose of this research is to create a framework for using different ICA algorithms; in other words, to analyze the statistical properties of the data in order to estimate which ICA algorithm is best suited to, or will converge for, a specific type of data. The data to be analyzed can come from any application or source, although for our research we generated a large number of datasets with random mixtures of varying numbers of random variables that follow a number of different distributions. The idea is to build a system that takes the data and yields characteristics or specifications of the data that correlate maximally with a specific ICA algorithm or algorithms.
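A dataset of the kind described above, a random mixture of random variables drawn from different distributions, might be generated along the following lines. This is a minimal sketch, not the procedure used in the dissertation: the function name `make_mixed_dataset`, the three source distributions, the sample count, and the normally distributed mixing matrix are all assumptions chosen for illustration.

```python
import numpy as np

def make_mixed_dataset(n_samples=5000, seed=0):
    """Return (sources, mixing, observed) for one random linear mixture.

    Hypothetical helper: the distributions and sizes are illustrative
    assumptions, not taken from the dissertation.
    """
    rng = np.random.default_rng(seed)
    # Three independent, non-gaussian sources, one per row.
    sources = np.vstack([
        rng.laplace(size=n_samples),             # heavy-tailed
        rng.uniform(-1.0, 1.0, size=n_samples),  # light-tailed
        rng.exponential(size=n_samples),         # skewed
    ])
    # Remove the mean of each source; ICA is usually formulated
    # for zero-mean data.
    sources -= sources.mean(axis=1, keepdims=True)
    # Random square mixing matrix with standard normal entries.
    mixing = rng.normal(size=(3, 3))
    # Each observed channel is a linear combination of all sources.
    observed = mixing @ sources
    return sources, mixing, observed

sources, mixing, observed = make_mixed_dataset()
print(observed.shape)  # (3, 5000)
```

Varying the seed, the number of sources, and the distributions would give a family of datasets on which different ICA algorithms could be compared.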
Four different ICA algorithms were used for this research: FastICA, based on the optimization of the negentropy of the datasets; Infomax, based on the maximum likelihood of the datasets; Joint Approximate Diagonalization of Eigen-Matrices (JADE), based on the fourth-order cumulant tensor of the input data; and Kernel ICA, based on the optimization of the canonical correlation of the mapped values of the input datasets in the kernel space. We used hundreds of datasets to study the errors generated by each method and the correlation between the datasets and the methods. The results show that, for some specific parameters of the ICA algorithms, one can estimate with high probability the relationship between the statistics of a dataset and the approach to be used to find its independent components. These statistics, which are easy to employ, can predict with high accuracy the ICA method or methods to use for a specific dataset without running all of the ICA methods, which saves considerable time and processing resources and makes the researcher more efficient.
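The abstract does not name the dataset statistics that were used. As one example of a statistic that is cheap to compute and reflects how far a channel departs from a gaussian distribution, the sketch below estimates the excess kurtosis of each channel. The choice of kurtosis and the function name `excess_kurtosis` are assumptions made for illustration, not the dissertation's method.

```python
import numpy as np

def excess_kurtosis(data):
    """Excess kurtosis of each row of `data`.

    Illustrative statistic only (an assumption, not the dissertation's
    choice). A gaussian has excess kurtosis 0; positive values indicate
    heavier tails, negative values lighter tails.
    """
    centred = data - data.mean(axis=1, keepdims=True)
    variance = (centred ** 2).mean(axis=1)
    fourth_moment = (centred ** 4).mean(axis=1)
    return fourth_moment / variance ** 2 - 3.0

rng = np.random.default_rng(0)
channels = np.vstack([
    rng.normal(size=100_000),             # theoretical value 0
    rng.laplace(size=100_000),            # theoretical value 3
    rng.uniform(-1.0, 1.0, size=100_000), # theoretical value -1.2
])
print(excess_kurtosis(channels))
```

The printed estimates should lie near the theoretical values noted in the comments, with some sampling error, especially for the heavy-tailed Laplace channel.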
Filename: EjazMDissertation.pdf
Size: 2.81 Mb
Approximate Download Time (Hours:Minutes:Seconds)
- 28.8 Modem: 00:13:01
- 56K Modem: 00:06:41
- ISDN (64 Kb): 00:05:51
- ISDN (128 Kb): 00:02:55
- Higher-speed Access: 00:00:15