
Mapping Bayesian Networks to
Boltzmann Machines
Petri Myllym?ki
Department of Computer Science
P.O.Box 26, FIN00014 University of Helsinki, Finland
Email: [email protected]
Pp. 269>=280 in Proceedings of Applied Decision Technologies 1995 (London, April 1995).
Abstract
We study the task of onding a maximal a posteriori (MAP) instantiation of Bayesian network variables, given a partial value assignment as an initial constraint. This problem is known to be NPhard, so we concentrate on a stochastic approximation algorithm, simulated annealing. This stochastic algorithm can be realized as a sequential process on the set of Bayesian network variables, where only one variable is allowed to change at a time. Consequently, the method can become impractically slow as the number of variables increases. We present a method for mapping a given Bayesian network to a massively parallel Bolztmann machine neural network architecture, in the sense that instead of using the normal sequential simulated annealing algorithm, we can use a massively parallel stochastic process on the Boltzmann machine architecture. The neural network updating process provably converges to a state which solves a given MAP task.
1 Introduction
Bayesian networks can be used for deoning a joint probability distribution on a set of discrete random variables by orst explicitly identifying conditional independencies between the variables and then determining a set of structurally simple local conditional probabilities. Given a a situation where some of the variables are clamped (permanently oxed) to some value, the Bayesian network representation can be used for computing the conditional probabilities for the values of the free variables, given the clampings, or alternatively it can be used for onding the maximal probability variablevalue combination among all the instantiations consistent with the clampings. In this study we are only concerned with the latter task, and we call it the maximal a posteriori instantiation problem, or shorter the MAP problem. As in diagnostic applications this kind of a system would provide the user the best explanation for a given set of symptoms, the solution is sometimes called the most probable explanation [24], and the task of obtaining the MAP solution is sometimes referred to as abductive inference [23].
It has been shown that the MAP problem for Bayesian networks belongs to the class of NPhard problems with respect to the size of the corresponding network [1, 28], and hence the MAP problem is probably intractable (in the worst case sense) for Bayesian networks in general. However, for singlyconnected Bayesian network structures (networks with at most one path between any two variables, disregarding the direction of the connecting arc), the MAP problem can be solved in polynomial time [24, 23]. Consequently, most existing systems for Bayesian reasoning transform orst a given multiconnected BN structure to a singlyconnected network, and use then the existing polynomialtime algorithms. This transformation can be done either explicitly by clustering several nodes together as in [16, 26] or [24, Ch.4.4.1], or implicitly by blocking multiply connected paths by conditioning a set of variables as in [24, Ch.4.4.2] (for a discussion of dioeerent transformation techniques, see [27]). Alternatively, clustering can also be done by introducing new latent variables
269