
# Network inference with hidden units

We derive learning rules for finding the connections between units in stochastic dynamical networks from the recorded history of a "visible" subset of the units. We consider two models. In both of them, the visible units are binary and stochastic. In one model the "hidden" units are continuous-valued, with sigmoidal activation functions, and in the other they are binary and stochastic like the visible ones. We derive exact learning rules for both cases. For the stochastic case, performing the exact calculation requires, in general, repeated summations over a number of configurations that grows exponentially with the size of the system and the data length, which is not feasible for large systems. We derive a mean field theory, based on a factorized ansatz for the distribution of hidden-unit states, which offers an attractive alternative for large systems. We present the results of some numerical calculations that illustrate key features of the two models and, for the stochastic case, the exact and approximate calculations.
Mathematics Subject Classification: Primary: 62M45, 82C20; Secondary: 62J02.
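
To make the setting concrete, the following is a minimal sketch of the kind of data the abstract describes: a network of binary stochastic units of which only a visible subset is recorded. The abstract does not state the update rule, so the synchronous dynamics used here, with states of ±1 and P(s_i(t+1) = +1) = 1 / (1 + exp(-2 h_i(t))), is an assumption borrowed from common kinetic Ising formulations. The function names, the network sizes, and the Gaussian couplings are likewise hypothetical. The last function simply counts the hidden-state histories that an exact summation would have to visit, which illustrates the exponential growth with system size and data length.

```python
import math
import random


def simulate(J, T, rng):
    """Run T synchronous updates of a stochastic binary (+1/-1) network.

    Assumed dynamics (not specified in the abstract):
        h_i(t) = sum_j J[i][j] * s_j(t)
        P(s_i(t+1) = +1) = 1 / (1 + exp(-2 * h_i(t)))
    Returns the list of T + 1 states, including the random initial state.
    """
    n = len(J)
    s = [rng.choice((-1, 1)) for _ in range(n)]
    history = [s]
    for _ in range(T):
        fields = [sum(J[i][j] * s[j] for j in range(n)) for i in range(n)]
        s = [1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * h)) else -1
             for h in fields]
        history.append(s)
    return history


def hidden_configurations(n_hidden, n_steps):
    """Count the binary hidden-state histories over n_steps time steps."""
    return 2 ** (n_hidden * n_steps)


rng = random.Random(0)
n_visible, n_hidden, T = 3, 2, 10
n = n_visible + n_hidden

# Hypothetical couplings: independent Gaussians with scale 1/sqrt(n).
J = [[rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)] for _ in range(n)]

history = simulate(J, T, rng)

# Only the first n_visible units are "recorded"; the rest stay hidden.
visible_history = [state[:n_visible] for state in history]

# Number of hidden histories an exact sum over hidden states would cover.
n_terms = hidden_configurations(n_hidden, len(history))
```

Even in this toy case, with two hidden units and eleven recorded time steps, `n_terms` is already 2^22. That growth is why a factorized approximation to the hidden-unit distribution becomes attractive for larger systems.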

