Mathematical Biosciences & Engineering, 2014, 11(1): 149-165. doi: 10.3934/mbe.2014.11.149

Network inference with hidden units

Joanna Tyrcha and John Hertz

1. Department of Mathematics, Stockholm University, Kräftriket, S-106 91 Stockholm, Sweden
2. Nordita, Stockholm University and KTH, Roslagstullsbacken 23, S-106 91 Stockholm, Sweden

Received: December 2012. Revised: May 2013. Published: September 2013.

We derive learning rules for finding the connections between units in stochastic dynamical networks from the recorded history of a "visible" subset of the units. We consider two models. In both, the visible units are binary and stochastic. In one model the "hidden" units are continuous-valued, with sigmoidal activation functions, and in the other they are binary and stochastic like the visible ones. We derive exact learning rules for both cases. For the stochastic case, performing the exact calculation requires, in general, repeated summations over a number of configurations that grows exponentially with the size of the system and the data length, which is not feasible for large systems. We therefore derive a mean field theory, based on a factorized ansatz for the distribution of hidden-unit states, which offers an attractive alternative for large systems. We present the results of some numerical calculations that illustrate key features of the two models and, for the stochastic case, the exact and approximate calculations.
Citation: Joanna Tyrcha, John Hertz. Network inference with hidden units. Mathematical Biosciences & Engineering, 2014, 11 (1) : 149-165. doi: 10.3934/mbe.2014.11.149
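To make the setting concrete, here is a minimal sketch of the fully observed baseline, assuming the standard parallel-update Glauber dynamics of refs. [6] and [12]; it is not the hidden-unit algorithm derived in the paper, and the function names `simulate_kinetic_ising` and `learn_couplings` are illustrative only. When every unit is visible, the exact log-likelihood gradient dL/dJ_ij = sum_t [s_i(t+1) - tanh H_i(t)] s_j(t), with H_i(t) = sum_j J_ij s_j(t) + h_i, can be followed directly. With hidden units, the corresponding exact gradient involves the exponential sums over hidden configurations mentioned in the abstract, which is what the factorized mean-field ansatz is designed to avoid.

```python
import numpy as np

def simulate_kinetic_ising(J, h, T, rng):
    """Parallel-update Glauber dynamics for +/-1 spins:
    s_i(t+1) = +1 with probability (1 + tanh(H_i(t))) / 2,
    where H_i(t) = sum_j J_ij s_j(t) + h_i."""
    N = len(h)
    S = np.empty((T, N))
    S[0] = rng.choice([-1.0, 1.0], size=N)
    for t in range(T - 1):
        H = J @ S[t] + h
        S[t + 1] = np.where(rng.random(N) < 0.5 * (1.0 + np.tanh(H)), 1.0, -1.0)
    return S

def learn_couplings(S, n_iter=2000, lr=0.1):
    """Gradient ascent on the exact log-likelihood of the fully observed model.
    The gradient is a delta rule comparing each observed spin with its
    conditional mean:  dL/dJ_ij = sum_t [s_i(t+1) - tanh(H_i(t))] s_j(t)."""
    T, N = S.shape
    J = np.zeros((N, N))
    h = np.zeros(N)
    for _ in range(n_iter):
        H = S[:-1] @ J.T + h            # fields H_i(t) for t = 0 .. T-2
        err = S[1:] - np.tanh(H)        # prediction error per unit and time step
        J += lr * (err.T @ S[:-1]) / (T - 1)
        h += lr * err.mean(axis=0)
    return J, h

# Usage: reconstruct couplings from a long simulated spike/spin history.
rng = np.random.default_rng(0)
N = 10
J_true = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
S = simulate_kinetic_ising(J_true, np.zeros(N), T=20000, rng=rng)
J_est, _ = learn_couplings(S)
print("RMS coupling reconstruction error:", np.sqrt(np.mean((J_est - J_true) ** 2)))
```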
References:
[1] D. Ackley, G. E. Hinton and T. J. Sejnowski, A learning algorithm for Boltzmann machines, Cognitive Science, 9 (1985), 147.

[2] H. Akaike, A new look at the statistical model identification, IEEE Transactions on Automatic Control, AC-19 (1974), 716.

[3] D. Barber, "Bayesian Reasoning and Machine Learning," Cambridge University Press, 2012, chapter 11.

[4] A. P. Dempster, N. M. Laird and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion), J. Roy. Stat. Soc. B, 39 (1977), 1.

[5] B. Dunn and Y. Roudi, Learning and inference in a nonequilibrium Ising model with hidden nodes, Phys. Rev. E, 87 (2013), 022127. doi: 10.1103/PhysRevE.87.022127.

[6] R. J. Glauber, Time-dependent statistics of the Ising model, J. Math. Phys., 4 (1963), 294. doi: 10.1063/1.1703954.

[7] J. Hertz, Y. Roudi and J. Tyrcha, Ising models for inferring network structure from spike data, in "Principles of Neural Coding," CRC Press, 2013, 527. doi: 10.1201/b14756-31.

[8] M. Mézard, G. Parisi and M. Virasoro, "Spin Glass Theory and Beyond," World Scientific Lecture Notes in Physics, Vol. 9, World Scientific, 1987, chapter 2.

[9] B. A. Pearlmutter, Learning state space trajectories in recurrent neural networks, Neural Computation, 1 (1989), 263.

[10] P. Peretto, Collective properties of neural networks: A statistical physics approach, Biol. Cybern., 50 (1984), 51. doi: 10.1007/BF00317939.

[11] F. J. Pineda, Generalization of back-propagation to recurrent neural networks, Phys. Rev. Lett., 59 (1987), 2229. doi: 10.1103/PhysRevLett.59.2229.

[12] Y. Roudi and J. Hertz, Mean-field theory for nonequilibrium network reconstruction, Phys. Rev. Lett., 106 (2011), 048702. doi: 10.1103/PhysRevLett.106.048702.

[13] Y. Roudi, J. Tyrcha and J. Hertz, The Ising model for neural data: Model quality and approximate methods for extracting functional connectivity, Phys. Rev. E, 79 (2009), 051915. doi: 10.1103/PhysRevE.79.051915.

[14] D. E. Rumelhart, G. E. Hinton and R. J. Williams, Learning internal representations by error propagation, in "Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1," MIT Press, 1986.

[15] L. K. Saul, T. Jaakkola and M. I. Jordan, Mean field theory for sigmoid belief networks, J. Artif. Intell. Res., 4 (1996), 61.

[16] E. Schneidman, M. J. Berry, R. Segev and W. Bialek, Weak pairwise correlations imply strongly correlated network states in a neural population, Nature, 440 (2006), 1007. doi: 10.1038/nature04701.

[17] G. Schwarz, Estimating the dimension of a model, Annals of Statistics, 6 (1978), 461. doi: 10.1214/aos/1176344136.

[18] R. Sundberg, Maximum likelihood theory for incomplete data from an exponential family, Scand. J. Statist., 1 (1974), 49.

[19] D. Sherrington and S. Kirkpatrick, Solvable model of a spin-glass, Phys. Rev. Lett., 35 (1975), 1792. doi: 10.1103/PhysRevLett.35.1792.

[20] D. J. Thouless, P. W. Anderson and R. G. Palmer, Solution of "Solvable model of a spin glass," Philos. Mag., 35 (1977), 593.

[21] R. J. Williams and D. Zipser, A learning algorithm for continually running fully recurrent networks, Neural Comput., 1 (1989), 270. doi: 10.1162/neco.1989.1.2.270.
