
A Bayesian nonparametric test for conditional independence
Department of Mathematics, Imperial College London, UK
This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses Pólya tree priors on spaces of conditional probability densities, accounting for uncertainty in the form of the underlying distributions in a nonparametric way. The Bayesian perspective provides an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not employed in existing procedures of this type.
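As a rough illustration of the two ingredients named in the abstract, the sketch below computes the log marginal likelihood of univariate data under a finite-depth Pólya tree prior and converts a log Bayes factor into a posterior probability. It is not the article's conditional independence test. The function names `polya_tree_log_marginal` and `posterior_prob_h1`, the dyadic partition of [0, 1), the truncation `depth`, and the Beta parameters c·j² at level j are all assumptions chosen for illustration; the last follows a common convention in the Pólya tree literature.

```python
import math
from typing import Sequence


def log_beta(a: float, b: float) -> float:
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def polya_tree_log_marginal(x: Sequence[float], depth: int = 6, c: float = 1.0) -> float:
    """Log marginal density of x under a finite Polya tree prior on [0, 1).

    Assumptions (illustrative only): dyadic partitions, symmetric
    Beta(c * j**2, c * j**2) branching probabilities at level j,
    and a uniform density inside each cell at the deepest level.
    """
    n = len(x)
    total = 0.0
    for level in range(1, depth + 1):
        alpha = c * level ** 2
        n_cells = 2 ** level
        # Count how many observations fall in each cell at this level.
        counts = [0] * n_cells
        for xi in x:
            counts[min(int(xi * n_cells), n_cells - 1)] += 1
        # Each parent cell contributes one Beta-binomial term for its two children.
        for parent in range(n_cells // 2):
            n_left = counts[2 * parent]
            n_right = counts[2 * parent + 1]
            total += log_beta(alpha + n_left, alpha + n_right) - log_beta(alpha, alpha)
    # Convert cell probabilities to a density: each leaf cell has width 2**-depth.
    return total + n * depth * math.log(2.0)


def posterior_prob_h1(log_bf_10: float, prior_h1: float = 0.5) -> float:
    """Posterior probability of H1 given the log Bayes factor of H1 against H0."""
    log_odds = log_bf_10 + math.log(prior_h1) - math.log(1.0 - prior_h1)
    # Numerically stable logistic transform.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)
```

With equal prior weights, `posterior_prob_h1(b)` and `posterior_prob_h1(-b)` sum to one, so evidence for dependence and evidence for independence are reported on the same probability scale. This is the symmetry the abstract refers to.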






