
# Bayesian inference for latent chain graphs

In this article we consider Bayesian inference for partially observed Andersson-Madigan-Perlman (AMP) Gaussian chain graph (CG) models. Such models are of particular interest in applications such as biological networks and financial time series. The model itself features a variety of constraints which make both prior modeling and computational inference challenging. We develop a framework that addresses these challenges, using a sequential Monte Carlo (SMC) method for statistical inference. Our approach is illustrated on both simulated data and real case studies from university graduation rates and a pharmacokinetics study.

Mathematics Subject Classification: 62F15.


Figure 1.  Simulation results for the independent case: (a) effective sample size (ESS) in each SMC step; (b) plot of $\Omega[1,1]$ across particles; (c) acceptance rates in each SMC step; (d) distribution of the log(target) (i.e., log of $\pi(B,\Omega,(a_{ij})_{i<j}\mid y_{1:m},\alpha)$) at the end of the algorithm
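Panel (a) of Figure 1 tracks the ESS at each SMC step. As a minimal sketch, and assuming the usual definition $\mathrm{ESS} = (\sum_k w_k)^2 / \sum_k w_k^2$ over particle weights $w_k$ (the article's exact definition is not reproduced on this page), the quantity can be computed from unnormalised log-weights as follows. The function name `effective_sample_size` is hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Return (sum w)^2 / sum(w^2) for unnormalised log-weights.

    Assumption: this is the standard ESS used to monitor weight
    degeneracy in SMC; it is not taken from the article itself.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return total * total / sum(x * x for x in w)


# Equal weights: ESS equals the number of particles.
ess_equal = effective_sample_size([0.0] * 500)

# One dominant particle: ESS collapses towards 1.
ess_degenerate = effective_sample_size([0.0, -50.0, -50.0, -50.0])
```

An ESS close to the number of particles indicates well-balanced weights, while a value near 1 indicates that a single particle dominates.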

Figure 2.  Chain Graph Estimate presented in [12]

Figure 3.  Empirical Graph

Figure 4.  Posterior estimated chain graph using a Dirichlet prior with $\alpha = (0.39 , 0.25 , 0.36 , 0.05)$

Figure 5.  Posterior estimated chain graph using a Dirichlet prior with $\alpha = (1 , 1 , 1 , 1)$

Figure 6.  Posterior estimated chain graph using a Dirichlet prior with $\alpha = (1 , 3 , 3 , 3)$

Figure 7.  Chain graph with highest posterior probability

Table 1.  Posterior probability $\mathbb{P} (a_{ij} = 0 | y_{1:m},\alpha), 1\leq i < j \leq p$

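| Nodes | $2$ | $3$ | $4$ | $5$ | $6$ | $7$ | $8$ | $9$ | $10$ |
|---|---|---|---|---|---|---|---|---|---|
| $1$ | 0.882 | 0.892 | 0.920 | 0.932 | 0.920 | 0.870 | 0.874 | 0.908 | 0.934 |
| $2$ | | 0.902 | 0.906 | 0.828 | 0.898 | 0.934 | 0.784 | 0.804 | 0.906 |
| $3$ | | | 0.914 | 0.890 | 0.880 | 0.890 | 0.908 | 0.900 | 0.890 |
| $4$ | | | | 0.900 | 0.866 | 0.882 | 0.770 | 0.918 | 0.900 |
| $5$ | | | | | 0.788 | 0.952 | 0.912 | 0.830 | 0.918 |
| $6$ | | | | | | 0.806 | 0.932 | 0.906 | 0.908 |
| $7$ | | | | | | | 0.904 | 0.738 | 0.916 |
| $8$ | | | | | | | | 0.918 | 0.740 |
| $9$ | | | | | | | | | 0.952 |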

Table 2.  The adjacency matrix corresponding to chain graph in Figure 2

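| | strat | spend | salar | top10 | tstsc | rejr | pacc | apgra |
|---|---|---|---|---|---|---|---|---|
| strat | 0 | 1 | 1 | 2 | 0 | 0 | 0 | 0 |
| spend | 1 | 0 | 1 | 2 | 2 | 2 | 0 | 0 |
| salar | 1 | 1 | 0 | 0 | 2 | 2 | 2 | 2 |
| top10 | 3 | 3 | 0 | 0 | 1 | 0 | 1 | 0 |
| tstsc | 0 | 3 | 3 | 1 | 0 | 1 | 0 | 2 |
| rejr | 0 | 3 | 3 | 0 | 1 | 0 | 1 | 0 |
| pacc | 0 | 0 | 3 | 1 | 0 | 1 | 0 | 2 |
| apgra | 0 | 0 | 3 | 0 | 3 | 0 | 3 | 0 |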
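The entries of Table 2 appear to encode edge types. The sketch below assumes the following reading, which is inferred from comparing the matrix with the edge list for the SIN-selected graph in Table 4 and is not stated on this page: 0 means no edge, 1 an undirected edge, 2 a directed edge from the row node to the column node, and 3 a directed edge from the column node to the row node. The helper `decode_edges` is hypothetical.

```python
NODES = ["strat", "spend", "salar", "top10", "tstsc", "rejr", "pacc", "apgra"]

# Adjacency matrix copied from Table 2 (rows and columns in NODES order).
TABLE_2 = [
    [0, 1, 1, 2, 0, 0, 0, 0],
    [1, 0, 1, 2, 2, 2, 0, 0],
    [1, 1, 0, 0, 2, 2, 2, 2],
    [3, 3, 0, 0, 1, 0, 1, 0],
    [0, 3, 3, 1, 0, 1, 0, 2],
    [0, 3, 3, 0, 1, 0, 1, 0],
    [0, 0, 3, 1, 0, 1, 0, 2],
    [0, 0, 3, 0, 3, 0, 3, 0],
]


def decode_edges(nodes, matrix):
    """Split a coded adjacency matrix into undirected and directed edges.

    Assumed coding (inferred, not stated in the source):
    1 = undirected, 2 = row -> column, 3 = column -> row.
    Only the upper triangle is read, so each edge is listed once.
    """
    undirected, directed = [], []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            code = matrix[i][j]
            if code == 1:
                undirected.append((nodes[i], nodes[j]))
            elif code == 2:
                directed.append((nodes[i], nodes[j]))
            elif code == 3:
                directed.append((nodes[j], nodes[i]))
    return undirected, directed


undirected_edges, directed_edges = decode_edges(NODES, TABLE_2)
```

Under this assumed coding, the decoded graph has 17 edges in total, the same number as the edge list reported for the SIN-selected chain graph in Table 4.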

Table 3.  The adjacency matrix corresponding to the chain graph in Figure 3

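| | strat | spend | salar | top10 | tstsc | rejr | pacc | apgra |
|---|---|---|---|---|---|---|---|---|
| strat | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 2 |
| spend | 1 | 0 | 1 | 2 | 2 | 2 | 2 | 2 |
| salar | 3 | 1 | 0 | 1 | 2 | 2 | 3 | 2 |
| top10 | 3 | 3 | 1 | 0 | 2 | 1 | 0 | 2 |
| tstsc | 3 | 3 | 3 | 3 | 0 | 1 | 0 | 2 |
| rejr | 3 | 3 | 3 | 1 | 1 | 0 | 3 | 0 |
| pacc | 3 | 3 | 2 | 0 | 0 | 2 | 0 | 2 |
| apgra | 3 | 3 | 3 | 3 | 3 | 0 | 3 | 0 |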

Table 4.  Summaries of different chain graphs using package SEM

| Base chain graph: Edge | p-value | Base chain graph (continued): Edge | p-value | Chain graph selected by SIN: Edge | p-value |
|---|---|---|---|---|---|
| strat — spend | 1.630e-14 | pacc $\rightarrow$ salar | 1.137e-06 | strat — spend | 1.630e-14 |
| strat $\rightarrow$ salar | 1.382e-06 | pacc $\rightarrow$ rejr | 1.470e-03 | strat — salar | 1.082e-05 |
| spend — salar | 2.629e-11 | strat $\rightarrow$ apgra | 8.237e-02 | strat $\rightarrow$ top10 | 1.935e-09 |
| strat $\rightarrow$ top10 | 4.743e-07 | spend $\rightarrow$ apgra | 6.067e-02 | spend — salar | 7.156e-13 |
| spend $\rightarrow$ top10 | 2.822e-28 | salar $\rightarrow$ apgra | 4.794e-03 | spend $\rightarrow$ top10 | 5.979e-34 |
| top10 — salar | 8.931e-03 | top10 $\rightarrow$ apgra | 4.253e-01 | spend $\rightarrow$ tstsc | 3.995e-12 |
| strat $\rightarrow$ tstsc | 3.140e-03 | tstsc $\rightarrow$ apgra | 1.096e-10 | spend $\rightarrow$ rejr | 2.909e-03 |
| spend $\rightarrow$ tstsc | 6.634e-01 | pacc $\rightarrow$ apgra | 2.711e-03 | salar $\rightarrow$ tstsc | 2.350e-05 |
| salar $\rightarrow$ tstsc | 2.008e-04 | | | salar $\rightarrow$ rejr | 1.323e-03 |
| top10 $\rightarrow$ tstsc | 6.831e-19 | | | salar $\rightarrow$ pacc | 1.827e-14 |
| strat $\rightarrow$ rejr | 1.954e-01 | | | salar $\rightarrow$ apgra | 1.570e-02 |
| spend $\rightarrow$ rejr | 3.621e-03 | | | top10 — tstsc | 1.256e-09 |
| salar $\rightarrow$ rejr | 2.575e-04 | | | top10 — pacc | 5.020e-01 |
| top10 — rejr | 1.816e-04 | | | tstsc — rejr | 8.297e-03 |
| tstsc — rejr | 1.003e-02 | | | tstsc $\rightarrow$ apgra | 8.352e-19 |
| strat $\rightarrow$ pacc | 2.585e-02 | | | rejr — pacc | 5.617e-03 |
| spend $\rightarrow$ pacc | 4.109e-07 | | | pacc $\rightarrow$ apgra | 5.481e-03 |
| AIC | 67.887 | | | AIC | 80.838 |
| BIC | -13.319 | | | BIC | -24.919 |

| Chain graph selected by algorithm ($\alpha=(0.39 , 0.25 , 0.36 , 0.05)$): Edge | p-value | Chain graph selected by algorithm ($\alpha=(1,1,1,1)$): Edge | p-value | Chain graph selected by algorithm ($\alpha=(1,3,3,3)$): Edge | p-value |
|---|---|---|---|---|---|
| strat — spend | 1.630e-14 | spend $\rightarrow$ strat | 2.150e-52 | spend $\rightarrow$ strat | 1.536e-42 |
| strat $\rightarrow$ salar | 1.727e-06 | strat $\rightarrow$ salar | 9.127e-07 | salar $\rightarrow$ strat | 4.952e-06 |
| strat $\rightarrow$ top10 | 3.597e-09 | spend $\rightarrow$ salar | 2.484e-24 | strat $\rightarrow$ top10 | 1.636e-07 |
| spend $\rightarrow$ salar | 4.956e-33 | top10 $\rightarrow$ spend | 2.068e-25 | tstsc $\rightarrow$ strat | 8.237e-01 |
| spend $\rightarrow$ top10 | 1.086e-33 | salar $\rightarrow$ pacc | 7.304e-10 | salar $\rightarrow$ spend | 9.659e-11 |
| spend $\rightarrow$ tstsc | 4.059e-12 | apgra $\rightarrow$ salar | 3.751e-11 | spend $\rightarrow$ top10 | 3.881e-10 |
| salar $\rightarrow$ tstsc | 5.524e-05 | top10 — tstsc | 2.163e-14 | tstsc $\rightarrow$ spend | 2.886e-08 |
| salar $\rightarrow$ pacc | 1.012e-18 | top10 $\rightarrow$ rejr | 2.504e-03 | tstsc $\rightarrow$ salar | 3.654e-27 |
| salar $\rightarrow$ apgra | 3.201e-05 | tstsc $\rightarrow$ rejr | 2.847e-04 | salar $\rightarrow$ rejr | 8.844e-02 |
| top10 — tstsc | 2.295e-10 | tstsc $\rightarrow$ apgra | 1.428e-43 | salar $\rightarrow$ pacc | 1.062e-08 |
| top10 $\rightarrow$ rejr | 1.951e-03 | rejr $\rightarrow$ pacc | 1.480e-04 | salar $\rightarrow$ apgra | 1.036e-04 |
| tstsc $\rightarrow$ rejr | 2.348e-04 | apgra $\rightarrow$ pacc | 3.587e-03 | tstsc $\rightarrow$ top10 | 1.033e-20 |
| tstsc $\rightarrow$ apgra | 1.367e-18 | | | top10 $\rightarrow$ rejr | 9.160e-03 |
| rejr $\rightarrow$ pacc | 1.032e-03 | | | tstsc $\rightarrow$ rejr | 5.443e-03 |
| pacc — apgra | 5.315e-03 | | | tstsc $\rightarrow$ apgra | 2.644e-17 |
| | | | | rejr $\rightarrow$ pacc | 3.048e-04 |
| | | | | apgra $\rightarrow$ pacc | 5.759e-03 |
| AIC | 61.372 | AIC | 102.93 | AIC | 58.401 |
| BIC | -50.522 | BIC | -18.169 | BIC | -47.356 |

Table 5.  Tissues and cell types examined in the PK studies

| Compound | Compartment | Notation |
|---|---|---|
| TFV | Blood plasma | TFV$_{plasma}$ |
| TFV | Rectal biopsy tissue | TFV$_{tissue}$ |
| TFV | Rectal fluid | TFV$_{rectal}$ |
| TFVdp | Rectal biopsy tissue | TFVdp$_{tissue}$ |
| TFVdp | Total mononuclear cells in rectal tissue | Total$_{\text{MMC}}$ |
| TFVdp | CD4$^+$ lymphocytes from MMC | CD4$^+_{\text{MMC}}$ |
| TFVdp | CD4$^-$ lymphocytes from MMC | CD4$^-_{\text{MMC}}$ |

Table 6.  Summaries of two different chain graphs using package SEM

| Chain graph selected by algorithm ($\alpha=(1,1,1,1)$): Edge | p-value | Chain graph selected by algorithm ($\alpha=(1,3,3,3)$): Edge | p-value |
|---|---|---|---|
| CD4$^-_{\text{MMC}}$ $\rightarrow$ Total$_{\text{MMC}}$ | 9.881e-19 | CD4$^+_{\text{MMC}}$ $\rightarrow$ TFVdp$_{tissue}$ | 5.687e-01 |
| CD4$^-_{\text{MMC}}$ $\rightarrow$ CD4$^+_{\text{MMC}}$ | 4.091e-81 | CD4$^+_{\text{MMC}}$ $\rightarrow$ CD4$^-_{\text{MMC}}$ | 2.941e-46 |
| CD4$^-_{\text{MMC}}$ $\rightarrow$ TFV$_{rectal}$ | 1.496e-01 | CD4$^+_{\text{MMC}}$ $\rightarrow$ TFV$_{plasma}$ | 1.210e-02 |
| Total$_{\text{MMC}}$ — TFVdp$_{tissue}$ | 1.589e-02 | CD4$^+_{\text{MMC}}$ $\rightarrow$ Total$_{\text{MMC}}$ | 7.028e-10 |
| TFVdp$_{tissue}$ — CD4$^+_{\text{MMC}}$ | 1.812e-02 | CD4$^+_{\text{MMC}}$ $\rightarrow$ TFV$_{rectal}$ | 1.874e-03 |
| CD4$^+_{\text{MMC}}$ — TFV$_{rectal}$ | 8.583e-03 | TFVdp$_{tissue}$ — CD4$^-_{\text{MMC}}$ | 8.477e-02 |
| Total$_{\text{MMC}}$ $\rightarrow$ TFV$_{plasma}$ | 1.815e-03 | TFVdp$_{tissue}$ $\rightarrow$ TFV$_{rectal}$ | 2.991e-01 |
| CD4$^+_{\text{MMC}}$ $\rightarrow$ TFV$_{tissue}$ | 7.043e-01 | TFVdp$_{tissue}$ $\rightarrow$ TFV$_{plasma}$ | 2.584e-01 |
| | | TFVdp$_{tissue}$ $\rightarrow$ Total$_{\text{MMC}}$ | 1.162e-13 |
| | | CD4$^-_{\text{MMC}}$ $\rightarrow$ Total$_{\text{MMC}}$ | 2.352e-22 |
| | | TFV$_{plasma}$ $\rightarrow$ TFV$_{tissue}$ | 4.259e-01 |
| | | TFV$_{plasma}$ — Total$_{\text{MMC}}$ | 5.719e-02 |
| | | TFV$_{tissue}$ $\rightarrow$ TFV$_{rectal}$ | 7.550e-01 |
| | | Total$_{\text{MMC}}$ $\rightarrow$ TFV$_{rectal}$ | 2.337e-02 |
| AIC | Inf | AIC | 46.875 |
| BIC | Inf | BIC | -11.909 |

Table 7.  Summaries of modification indices for the model corresponding to the chain graph obtained under prior with $\alpha = (1,3,3,3)$

| 5 largest modification indices, A matrix (regression coefficients) | Value | 5 largest modification indices, P matrix (variances/covariances) | Value |
|---|---|---|---|
| TFV$_{rectal}$ $\rightarrow$ Total$_{\text{MMC}}$ | 3.281 | TFV$_{rectal}$ — Total$_{\text{MMC}}$ | 3.394 |
| TFV$_{rectal}$ $\rightarrow$ TFV$_{plasma}$ | 2.219 | TFV$_{rectal}$ — TFV$_{plasma}$ | 2.881 |
| CD4$^-_{\text{MMC}}$ $\rightarrow$ TFV$_{rectal}$ | 0.709 | TFV$_{rectal}$ — CD4$^-_{\text{MMC}}$ | 0.709 |
| TFV$_{rectal}$ $\rightarrow$ TFVdp$_{tissue}$ | 0.654 | TFV$_{rectal}$ — TFVdp$_{tissue}$ | 0.709 |
| TFV$_{rectal}$ $\rightarrow$ CD4$^-_{\text{MMC}}$ | 0.555 | TFV$_{tissue}$ — TFVdp$_{tissue}$ | 0.389 |
[1] K. Q. Abdool, K. S. S. Abdool and J. A. Frohlich, Effectiveness and safety of tenofovir gel, an antiretroviral microbicide, for the prevention of HIV infection in women, Science, 329 (2010), 1168-1174. doi: 10.1126/science.1193748.
[2] S. A. Andersson, D. Madigan and M. D. Perlman, Alternative Markov properties for chain graphs, Scand. J. Statist., 28 (2001), 33-85. doi: 10.1111/1467-9469.00224.
[3] P. A. Anton, R. D. Cranston, A. Kashuba, C. W. Hendrix, N. N. Bumpus, N. R. Harman, J. Elliott, L. Janocko, E. Khanukhova, R. Dennis, W. G. Cumberland, C. Ju, A. C. Dieguez, C. Mauck and I. McGowan, RMP-02/MTN-006: A phase 1 rectal safety, acceptability, pharmacokinetic, and pharmacodynamic study of tenofovir 1% gel compared with oral tenofovir disoproxil fumarate, AIDS Res Hum Retroviruses, 28 (2012), 1412-1421. doi: 10.1089/aid.2012.0262.
[4] J. M. Baeten, D. Donnell and P. Ndase, et al., Antiretroviral prophylaxis for HIV prevention in heterosexual men and women, N Engl J Med, 367 (2012), 399-410. doi: 10.1056/NEJMoa1108524.
[5] A. Beskos, A. Jasra, N. Kantas and A. Thiery, On the convergence of adaptive sequential Monte Carlo, Ann. Appl. Probab., 26 (2016), 1111-1146. doi: 10.1214/15-AAP1113.
[6] B. C. Boerebach, K. M. Lombarts, C. Keijzer, M. J. Heineman and O. A. Arah, The teacher, the physician and the person: How faculty's teaching performance influences their role modeling, PLoS One, 7 (2012), e32089. doi: 10.1371/journal.pone.0032089.
[7] K. Bollen, Structural Equation Models with Latent Variables, Wiley: New York, 1989. doi: 10.1002/9781118619179.
[8] C. M. Carvalho and M. West, Dynamic matrix-variate graphical models, Bayesian Anal., 2 (2007), 69-97. doi: 10.1214/07-BA204.
[9] H. Chun, X. Zhang and H. Zhao, Gene regulation network inference with joint sparse Gaussian graphical models, J. Comp. Graph. Statist., 24 (2015), 954-974. doi: 10.1080/10618600.2014.956876.
[10] P. Del Moral, A. Doucet and A. Jasra, Sequential Monte Carlo samplers, J. Roy. Statist. Soc. Ser. B, 68 (2006), 411-436. doi: 10.1111/j.1467-9868.2006.00553.x.
[11] A. Dobra, C. Hans, B. Jones, J. R. Nevins, G. Yao and M. West, Sparse graphical models for exploring gene expression data, J. Mult. Anal., 90 (2004), 196-212. doi: 10.1016/j.jmva.2004.02.009.
[12] M. Drton and M. Eichler, Maximum Likelihood Estimation in Gaussian Chain Graph Models under the Alternative Markov Property, Scand. J. Statist., 33 (2006), 247-257. doi: 10.1111/j.1467-9469.2006.00482.x.
[13] M. Drton and M. D. Perlman, A SINful approach to Gaussian graphical model selection, Journal of Statistical Planning and Inference, 138 (2008), 1179-1200. doi: 10.1016/j.jspi.2007.05.035.
[14] M. J. Druzdzel and C. Glymour, Causal inferences from databases: Why universities lose students, in Computation, Causation, and Discovery (eds. C. Glymour and G. F. Cooper), AAAI Press, Menlo Park, CA, (1999), 521–539.
[15] A. Jasra, D. A. Stephens, A. Doucet and T. Tsagaris, Inference for Lévy driven stochastic volatility models via adaptive sequential Monte Carlo, Scand. J. Statist., 38 (2011), 1-22. doi: 10.1111/j.1467-9469.2010.00723.x.
[16] G. Kanayama, H. G. Pope and J. I. Hudson, Associations of anabolic-androgenic steroid use with other behavioral disorders: an analysis using directed acyclic graphs, Psychol Med, 48 (2018), 2601-2608. doi: 10.1017/S0033291718000508.
[17] S. L. Lauritzen and T. S. Richardson, Chain graph models and their causal interpretations, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64 (2002), 321-348. doi: 10.1111/1467-9868.00340.
[18] S. L. Lauritzen and D. J. Spiegelhalter, Local computations with probabilities on graphical structures and their applications to expert systems (with discussion), J. R. Statist. Soc. B, 50 (1988), 157-224. doi: 10.1111/j.2517-6161.1988.tb01721.x.
[19] S. L. Lauritzen and N. Wermuth, Mixed Interaction Models, Institut for Elektroniske Systemer, Aalborg Universitetscenter, 1984.
[20] S. L. Lauritzen and N. Wermuth, Graphical models for association between variables, some of which are qualitative and some quantitative, Ann. Statist., 17 (1989), 31-57. doi: 10.1214/aos/1176347003.
[21] A. Lenkoski and A. Dobra, Computational aspects related to inference in Gaussian graphical models with the G-Wishart prior, Journal of Computational and Graphical Statistics, 20 (2011), 140-157. doi: 10.1198/jcgs.2010.08181.
[22] M. Levitz, M. D. Perlman and D. Madigan, Separation and completeness properties for AMP chain graph Markov models, Annals of Statistics, 29 (2001), 1751-1784. doi: 10.1214/aos/1015345961.
[23] C. McCarter and S. Kim, On sparse Gaussian chain graph models, Advances in Neural Information Processing Systems (NIPS), 2 (2014), 3212-3220.
[24] J. Pearl, A constraint propagation approach to probabilistic reasoning, in Uncertainty in Artificial Intelligence (eds. L. M. Kanal and J. Lemmer), North-Holland, Amsterdam, (1986), 357–370.
[25] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, The Morgan Kaufmann Series in Representation and Reasoning. Morgan Kaufmann, San Mateo, CA, 1988.
[26] J. M. Pena, Learning marginal AMP chain graphs under faithfulness, in European Workshop on Probabilistic Graphical Models (eds. Linda C. van der Gaag and Ad J. Feelders), Springer, (2014), 382–395.
[27] N. Richardson-Harman, C. W. Hendrix, N. N. Bumpus, C. Mauck, R. D. Cranston, K. Yang, J. Elliott, K. Tanner and I. McGowan, Correlation between compartmental tenofovir concentrations and an ex vivo rectal biopsy model of tissue infectibility in the RMP-02/MTN-006 phase 1 study, PLoS One, 9 (2014), e111507. doi: 10.1371/journal.pone.0111507.
[28] R. Silva, A MCMC approach for learning the structure of Gaussian acyclic directed mixed graphs, in Statistical Models for Data Analysis (eds. P. Giudici, S. Ingrassia and M. Vichi), Springer: New York, (2013), 343–351. doi: 10.1007/978-3-319-00032-9_39.
[29] R. Silva and Z. Ghahramani, The Hidden Life of Latent Variables: Bayesian learning with mixed graph models, J. Mach. Learn. Res., 10 (2009), 1187-1238.
[30] D. Sonntag and J. M. Pena, On expressiveness of the chain graph interpretations, International Journal of Approximate Reasoning, 68 (2016), 91-107. doi: 10.1016/j.ijar.2015.07.009.
[31] L. Tan, A. Jasra, M. De Iorio and T. Ebbels, Bayesian Inference for multiple Gaussian graphical models, Ann. Appl. Stat., 11 (2017), 2222-2251. doi: 10.1214/17-AOAS1076.
[32] H. Wang, Scaling It Up: Stochastic search structure learning in graphical models, Bayesian Anal., 10 (2015), 351-377. doi: 10.1214/14-BA916.
[33] H. Wang, C. Reeson and C. M. Carvalho, Dynamic financial index models: Modeling conditional dependencies via graphs, Bayesian Anal., 6 (2011), 639-663. doi: 10.1214/11-BA624.
[34] N. Wermuth, Linear recursive equations, covariance selection and path analysis, J. Am. Statist. Assoc., 75 (1980), 963-972. doi: 10.1080/01621459.1980.10477580.
[35] N. Wermuth and S. L. Lauritzen, On substantive research hypotheses, conditional independence graphs and graphical chain models (with discussion), J. Roy. Statist. Soc. Ser. B, 52 (1990), 21-72. doi: 10.1111/j.2517-6161.1990.tb01771.x.
[36] K. H. Yang, H. Hendrix, N. Bumpus and J. Elliott, et al., A multi-compartment single and multiple dose pharmacokinetic comparison of rectally applied tenofovir 1% gel and oral tenofovir disoproxil fumarate, PLoS One, 9 (2014), e106196. doi: 10.1371/journal.pone.0106196.
[37] Y. Zhou, A. M. Johansen and J. A. Aston, Towards Automatic Model Comparison: An Adaptive Sequential Monte Carlo Approach, J. Comp. Graph. Statist., 25 (2016), 701-726. doi: 10.1080/10618600.2015.1060885.
