# American Institute of Mathematical Sciences

June 2020, 2(2): 155-172. doi: 10.3934/fods.2020009

## A Bayesian nonparametric test for conditional independence

Onur Teymur and Sarah Filippi, Department of Mathematics, Imperial College London, UK

Published July 2020

Funding: Supported by EPSRC grant EP/R013519/1

This article introduces a Bayesian nonparametric method for quantifying the relative evidence in a dataset in favour of the dependence or independence of two variables conditional on a third. The approach uses Pólya tree priors on spaces of conditional probability densities, accounting for uncertainty in the form of the underlying distributions in a nonparametric way. The Bayesian perspective provides an inherently symmetric probability measure of conditional dependence or independence, a feature particularly advantageous in causal discovery and not employed in existing procedures of this type.
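Concretely, the test compares $H_0$ ($X \perp Y \mid Z$) against $H_1$ via the marginal likelihoods of the data under each hypothesis, and with equal prior mass on the two hypotheses the symmetric posterior probability $p(H_1 \mid W)$ follows directly. A minimal sketch of this final step, assuming the two log marginal likelihoods have already been computed (from conditional Pólya tree priors, in the paper's case); the function is an illustration, not the paper's implementation:

```python
import math

def posterior_prob_dependence(log_ml_h1, log_ml_h0, prior_h1=0.5):
    """Posterior probability of conditional dependence, p(H1 | W).

    Under H0 the conditional density factorises, p(x,y|z) = p(x|z) p(y|z);
    each log marginal likelihood integrates out the unknown densities
    against their priors. With prior odds prior_h1 / (1 - prior_h1), the
    posterior is a plain application of Bayes' theorem on the log scale.
    """
    log_odds = (log_ml_h1 - log_ml_h0) + math.log(prior_h1 / (1 - prior_h1))
    return 1.0 / (1.0 + math.exp(-log_odds))

# If the data are e^3 times more likely under H1, with equal prior weight:
posterior_prob_dependence(-100.0, -103.0)  # ≈ 0.9526
```

Because the output is a probability rather than a p-value, evidence for independence ($p(H_1|W)$ near 0) is reported as readily as evidence for dependence, which is the symmetry the abstract highlights.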

Citation: Onur Teymur, Sarah Filippi. A Bayesian nonparametric test for conditional independence. Foundations of Data Science, 2020, 2 (2) : 155-172. doi: 10.3934/fods.2020009
##### Figures:
Figure 1. Construction of a Pólya tree distribution on $\Omega = [0,1]$. From each set $C_\ast$, a particle of probability mass passes to the left with (random) probability $\theta_{\ast0}$ and to the right with probability $\theta_{\ast1} = 1-\theta_{\ast0}$, with all $\theta_\ast$ being independently Beta-distributed as described in the main text.
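The construction in this caption can be sketched by truncating the tree at a finite depth, which yields a piecewise-constant random density. The Beta parameters $\alpha_m = c\,m^2$ at level $m$ used below are the canonical choice and are an assumption here, since the exact specification is given in the paper's main text:

```python
import numpy as np

def sample_polya_tree_density(levels=8, c=1.0, rng=None):
    """Draw one random density on [0,1] from a truncated Pólya tree prior.

    At depth m each set splits into halves: mass goes left with probability
    theta ~ Beta(c*m**2, c*m**2) and right with 1 - theta, independently
    across sets. Truncating at `levels` gives 2**levels dyadic bins.
    """
    rng = np.random.default_rng(rng)
    mass = np.array([1.0])                     # mass of each set at current depth
    for m in range(1, levels + 1):
        a = c * m**2                           # assumed canonical alpha_m = c*m^2
        theta = rng.beta(a, a, size=mass.size) # left-branch probabilities
        # interleave (left, right) children in dyadic order
        mass = np.column_stack([mass * theta, mass * (1 - theta)]).ravel()
    return mass * 2**levels                    # bin masses -> density heights

density = sample_polya_tree_density(levels=6)
```

Since the bin masses always sum to one, the sampled step function integrates to one over $[0,1]$ regardless of the truncation depth.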
Figure 2. Pseudocode for the proposed Bayesian nonparametric test for conditional independence.
Figure 3. Application of the proposed Bayesian testing procedure to four synthetic datasets supported on $[0,1]^3$, chosen such that all combinations of unconditional and conditional dependence/independence are represented. The final column gives the ensemble of probabilities of conditional dependence $p(H_1|W)$ output by the test over 100 repetitions at varying values of data size $N$, with the blue line representing the median, and the dark and light shaded regions representing the (25, 75)-percentile and (5, 95)-percentile ranges respectively.
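The four dependence regimes described in this caption can be realised with simple generators. The ones below are hypothetical stand-ins, not the paper's actual models: a common-cause structure gives marginal dependence with conditional independence, while a collider gives the reverse:

```python
import numpy as np

def synthetic_datasets(n, rng=None):
    """Four illustrative datasets on [0,1]^3 covering all combinations of
    unconditional and conditional (in)dependence of X and Y given Z.
    Hypothetical generators for illustration only."""
    rng = np.random.default_rng(rng)
    u = lambda size: rng.uniform(size=size)
    # 1. Fully independent: X, Y, Z all uniform.
    d1 = np.column_stack([u(n), u(n), u(n)])
    # 2. Common cause Z -> X, Z -> Y: marginally dependent,
    #    conditionally independent given Z.
    z = u(n)
    x = np.clip(z + 0.1 * rng.normal(size=n), 0, 1)
    y = np.clip(z + 0.1 * rng.normal(size=n), 0, 1)
    d2 = np.column_stack([x, y, z])
    # 3. Collider X -> Z <- Y: marginally independent,
    #    conditionally dependent given Z.
    x, y = u(n), u(n)
    z = np.clip((x + y) / 2 + 0.05 * rng.normal(size=n), 0, 1)
    d3 = np.column_stack([x, y, z])
    # 4. Direct link X -> Y with an unrelated Z: dependent both
    #    marginally and conditionally.
    x, z = u(n), u(n)
    y = np.clip(x + 0.1 * rng.normal(size=n), 0, 1)
    d4 = np.column_stack([x, y, z])
    return d1, d2, d3, d4
```

A consistent test should report $p(H_1|W)$ near 1 for the third and fourth datasets and near 0 for the first two as $N$ grows, which is the pattern the figure's shaded percentile bands trace out.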
Figure 4. Marginal scatter plots from the CalCOFI Bottle dataset showing the pairwise relationships between $\texttt{Salnty}$, $\texttt{Oxy_µmol.Kg}$ and $\texttt{T_degC}$. The nonlinear nature of the dependences is immediately apparent.
Figure 5. Example pairwise dependence graphs output by the Bayesian conditional independence test for five variables from the CalCOFI dataset, conditional on $\texttt{T_degC}$, for four different sizes of subsample drawn from the complete dataset. The numbers associated with each edge are the posterior probabilities of conditional dependence $p(H_1|W^{(N)})$ and are given to two decimal places; where no edge is shown, this indicates $p(H_1|W^{(N)})<0.005$.
Figure 6. Box-plots giving the output posterior probability of conditional dependence $p(H_1|W^{(N)})$ for 100 repetitions of the Bayesian conditional independence test applied to randomly-drawn subsamples of various sizes $N$ from the CalCOFI dataset. The left-hand plot gives a representative example of a pair of variables conditionally dependent given $\texttt{T_degC}$, while the right-hand plot gives a representative conditionally independent pair.
Figure 7. Top left: Heat map of conditional marginal likelihood values for the three constituent models over $\Omega_X$, $\Omega_Y$ and $\Omega_{XY}$ for the second and third models of Figure 3. Top right: 'Slices' from this heatmap with $\rho = 0.5$. Bottom: Test outputs for 100 repetitions of the second and third models of Figure 3. Red plots fix $c = 1$ (output identical to Figure 3), while the blue plots use the optimising values $\hat{c}$ from the plot above.
