Mathematical Biosciences & Engineering, 2014, 11(1): 149-165. doi: 10.3934/mbe.2014.11.149

Network inference with hidden units

1. Department of Mathematics, Stockholm University, Kräftriket, S-106 91 Stockholm, Sweden
2. Nordita, Stockholm University and KTH, Roslagstullsbacken 23, S-106 91 Stockholm, Sweden

Received December 2012; revised May 2013; published September 2013.

We derive learning rules for finding the connections between units in stochastic dynamical networks from the recorded history of a "visible" subset of the units. We consider two models. In both of them, the visible units are binary and stochastic. In one model the "hidden" units are continuous-valued, with sigmoidal activation functions, and in the other they are binary and stochastic like the visible ones. We derive exact learning rules for both cases. For the stochastic case, performing the exact calculation requires, in general, repeated summations over a number of configurations that grows exponentially with the size of the system and the data length, which is not feasible for large systems. We derive a mean-field theory, based on a factorized ansatz for the distribution of hidden-unit states, which offers an attractive alternative for large systems. We present the results of some numerical calculations that illustrate key features of the two models and, for the stochastic case, the exact and approximate calculations.
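The stochastic setting described in the abstract can be illustrated with a small numerical sketch. Assuming a kinetic Ising model with parallel Glauber dynamics, where each unit takes s_i(t+1) = ±1 with P(s_i = +1) = σ(2 Σ_j J_ij s_j(t)), the code below simulates a network, records only a visible subset, and then fits couplings with a naive mean-field treatment of the hidden units (hidden spins replaced by their factorized means during a forward pass). This is an illustrative sketch under those assumptions, not the paper's exact learning rules; all function names and parameter choices are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(J, T):
    """Parallel Glauber dynamics: P(s_i(t+1) = +1) = sigmoid(2 * h_i(t))."""
    N = J.shape[0]
    s = np.ones(N)
    out = np.empty((T, N))
    for t in range(T):
        h = J @ s
        s = np.where(rng.random(N) < 1.0 / (1.0 + np.exp(-2.0 * h)), 1.0, -1.0)
        out[t] = s
    return out

def mean_field_learn(V, Nh, n_iter=300, lr=0.1):
    """Fit couplings from visible history V, handling hidden units with a
    factorized (naive mean-field) ansatz: hidden spins -> their means."""
    T, Nv = V.shape
    N = Nv + Nh
    J = 0.01 * rng.normal(size=(N, N))
    for _ in range(n_iter):
        # Forward pass: fill in hidden means under the factorized ansatz.
        S = np.empty((T, N))
        m = np.zeros(Nh)                    # hidden means, initialized at 0
        for t in range(T):
            S[t] = np.concatenate([V[t], m])
            m = np.tanh(J[Nv:] @ S[t])      # hidden means for the next step
        # Log-likelihood gradient for the visible units, with hidden means
        # plugged in: dL/dJ_ij ~ <[s_i(t+1) - tanh(h_i(t))] s_j(t)>.
        err = V[1:] - np.tanh(S[:-1] @ J[:Nv].T)
        J[:Nv] += lr * err.T @ S[:-1] / (T - 1)
    return J

Nv, Nh = 4, 2
N = Nv + Nh
J_true = rng.normal(0.0, 1.0 / np.sqrt(N), (N, N))
V = simulate(J_true, 3000)[:, :Nv]          # record only the visible units
J_est = mean_field_learn(V, Nh)
```

For brevity only the rows of J feeding the visible units are updated here; the full mean-field theory in the paper also derives self-consistent updates for the couplings into the hidden units.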
Citation: Joanna Tyrcha, John Hertz. Network inference with hidden units. Mathematical Biosciences & Engineering, 2014, 11(1): 149-165. doi: 10.3934/mbe.2014.11.149
References:
[1] D. Ackley, G. E. Hinton and T. J. Sejnowski, A learning algorithm for Boltzmann machines, Cogn. Sci., 9 (1985), 147.
[2] H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, AC-19 (1974), 716.
[3] D. Barber, "Bayesian Reasoning and Machine Learning," chapter 11, (2012).
[4] A. P. Dempster, N. M. Laird and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. Roy. Stat. Soc. B, 39 (1977), 1.
[5] B. Dunn and Y. Roudi, Learning and inference in a nonequilibrium Ising model with hidden nodes, Phys. Rev. E, 87 (2013). doi: 10.1103/PhysRevE.87.022127.
[6] R. J. Glauber, Time-dependent statistics of the Ising model, J. Math. Phys., 4 (1963), 294. doi: 10.1063/1.1703954.
[7] J. Hertz, Y. Roudi and J. Tyrcha, Ising models for inferring network structure from spike data, (2013), 527. doi: 10.1201/b14756-31.
[8] M. Mézard, G. Parisi and M. Virasoro, "Spin Glass Theory and Beyond," chapter 2, 9 (1987).
[9] B. A. Pearlmutter, Learning state space trajectories in recurrent neural networks, Neural Computation, 1 (1989), 263.
[10] P. Peretto, Collective properties of neural networks: A statistical physics approach, Biol. Cybern., 50 (1984), 51. doi: 10.1007/BF00317939.
[11] F. J. Pineda, Generalization of back-propagation to recurrent neural networks, Phys. Rev. Lett., 59 (1987), 2229. doi: 10.1103/PhysRevLett.59.2229.
[12] Y. Roudi and J. Hertz, Mean-field theory for nonequilibrium network reconstruction, Phys. Rev. Lett., 106 (2011). doi: 10.1103/PhysRevLett.106.048702.
[13] Y. Roudi, J. Tyrcha and J. Hertz, The Ising model for neural data: Model quality and approximate methods for extracting functional connectivity, Phys. Rev. E, 79 (2009). doi: 10.1103/PhysRevE.79.051915.
[14] D. E. Rumelhart, G. E. Hinton and R. J. Williams, Learning internal representations by error propagation, (1986).
[15] L. K. Saul, T. Jaakkola and M. I. Jordan, Mean field theory for sigmoid belief networks, J. Art. Intel. Res., 4 (1996), 61.
[16] E. Schneidman, M. J. Berry, R. Segev and W. Bialek, Weak pairwise correlations imply strongly correlated network states in a neural population, Nature, 440 (2006), 1007. doi: 10.1038/nature04701.
[17] G. E. Schwarz, Estimating the dimension of a model, Annals of Statistics, 6 (1978), 461. doi: 10.1214/aos/1176344136.
[18] R. Sundberg, Maximum likelihood theory for incomplete data from an exponential family, Scand. J. Statistics, 1 (1974), 49.
[19] D. Sherrington and S. Kirkpatrick, Solvable model of a spin-glass, Phys. Rev. Lett., 35 (1975), 1792. doi: 10.1103/PhysRevLett.35.1792.
[20] D. J. Thouless, P. W. Anderson and R. G. Palmer, Solution of "Solvable model of a spin glass," Philos. Mag., 35 (1977), 593.
[21] R. J. Williams and D. Zipser, A learning algorithm for continually running fully recurrent networks, Neural Comp., 1 (1989), 270. doi: 10.1162/neco.1989.1.2.270.
