FSU ETD Logo

Title page for ETD etd-06292011-153951


Type of Document Dissertation
Author Yu, Haohai
Author's Email Address hy06@fsu.edu
URN etd-06292011-153951
Title Acceleration Methods for Bayesian Network Sampling
Degree Doctor of Philosophy
Department Computer Science, Department of
Advisory Committee
Advisor Name Title
Robert A. van Engelen Committee Chair
Michael V. Mascagni Committee Member
Xiuwen Liu Committee Member
Mark van Hoeij University Representative
Keywords
  • Bayesian Inference
  • Bayesian Network
  • Importance Sampling
  • GPGPU
Date of Defense 2011-06-22
Availability unrestricted
Abstract
Bayesian inference with Bayesian networks is a #P-complete problem in general. Exact Bayesian inference is feasible in practice only on small-scale Bayesian networks or networks that are dramatically simplified, such as with naive Bayes or other approximations. Stochastic sampling methods, in particular importance sampling, form one of the most prominent and efficient approximate inference techniques that gives answers in reasonable computing time. A critical problem for importance sampling is to choose an importance function to efficiently and accurately sample the posterior probability distribution given a set of observed variables (also referred to as evidence). Choosing an importance function, or designing a new function, faces two major hurdles: how to approach the posterior probability distribution accurately and how to reduce the generation of undesirable inconsistent samples caused by deterministic relations in the network (also known as the rejection problem). In my dissertation I propose Refractor Importance Sampling, consisting of a family of importance sampling algorithms to effectively break the lower bound on the error of the importance function to ensure it can approach the posterior probability distribution of a Bayesian network given evidence. The aim is to increase the convergence rate of the approximation and to reduce sampling variance. I also propose Posterior Subgraph Clustering to improve the sampling process by breaking a network into several small independent subgraphs. To address the rejection problem, I propose Zero Probability Backfilling and Compressed Vertex Tree methods to detect and store deterministic constrains that are the root cause of inconsistent samples. The rejection problem is NP-hard and I prove in my dissertation that even limiting the inconsistent samples to a certain degree with positive probability is too hard to be polynomial. I propose the k-test to measure the hardness of the rejection problem by scaling the clause density of a CNF constructed from a Bayesian network. In the final part of my dissertation, I design and implement importance sampling algorithms for GPU platforms to speed up sampling by exploiting fine-grain parallelism. GPUs are cost-effective computing devices. However, the randomness of memory-intensive operations by Bayesian network sampling causes difficulties to obtain computational speedups with these devices. In my dissertation I propose a new method, Parallel Irregular Wavefront Sampling, to improve memory access on GPU by leveraging the conditional independence relations between variables in a network. I improve this scheme further by using the posterior distribution as an oracle to reduce costly memory fetches in the GPU. The proposed methods are implemented and experimentally tested. The results show a significant contribution in efficiency and accuracy of Bayesian network sampling for Bayesian inference with real-world and synthetic benchmark Bayesian networks.

Files
  Filename       Size       Approximate Download Time (Hours:Minutes:Seconds) 
 
 28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access 
  Yu_H_Dissertation_2011.pdf 1.77 Mb 00:08:11 00:04:12 00:03:41 00:01:50 00:00:09

Browse All Available ETDs by ( Author | Department )

If you have more questions or technical problems, please Contact the FSU Digital Library Center.