# American Institute of Mathematical Sciences

September  2019, 1(3): 293-306. doi: 10.3934/fods.2019013

## Modelling dynamic network evolution as a Pitman-Yor process

 Department of Mathematics, Imperial College London, 180 Queen's Gate, London – SW7 2AZ, United Kingdom

Published  August 2019

Dynamic interaction networks frequently arise in biology, communications technology and the social sciences, representing, for example, neuronal connectivity in the brain, internet connections between computers and human interactions within social networks. The evolution and strengthening of the links in such networks can be observed through sequences of connection events occurring between network nodes over time. In some of these applications, the identity and size of the network may be unknown a priori and may change over time. In this article, a model for the evolution of dynamic networks based on the Pitman-Yor process is proposed. This model explicitly admits power-laws in the number of connections on each edge, often present in real world networks, and, for careful choices of the parameters, power-laws for the degree distribution of the nodes. A novel empirical method for the estimation of the hyperparameters of the Pitman-Yor process is proposed, and some necessary corrections for uniform discrete base distributions are carefully addressed. The methodology is tested on synthetic data and in an anomaly detection study on the enterprise computer network of the Los Alamos National Laboratory, and successfully detects connections from a red-team penetration test.

Citation: Francesco Sanna Passino, Nicholas A. Heard. Modelling dynamic network evolution as a Pitman-Yor process. Foundations of Data Science, 2019, 1 (3) : 293-306. doi: 10.3934/fods.2019013
##### References:
 [1] D. Aldous, Exchangeability and related topics, in École d'Été de Probabilités de Saint-Flour XIII-1983, 1117 (1985), 1–198. doi: 10.1007/BFb0099421.  Google Scholar [2] A. L. Barabási, The origin of bursts and heavy tails in human dynamics, Nature, 435 (2005), 207-211.   Google Scholar [3] Y. Bengio, R. Ducharme, P. Vincent and C. Janvin, A neural probabilistic language model, Journal of Machine Learning Research, 3 (2003), 1137-1155.   Google Scholar [4] W. Buntine and M. Hutter, A Bayesian view of the Poisson-Dirichlet process, preprint, arXiv: 1007.0296. Google Scholar [5] C. Chen, L. Du and W. Buntine, Sampling table configurations for the hierarchical Poisson-Dirichlet process, in Machine Learning and Knowledge Discovery in Databases (eds. D. Gunopulos, T. Hofmann, D. Malerba and M. Vazirgiannis), Springer Berlin Heidelberg, 2011, 296–311. doi: 10.1007/978-3-642-23780-5_29.  Google Scholar [6] T. S. Ferguson, A Bayesian analysis of some nonparametric problems, The Annals of Statistics, 1 (1973), 209-230.  doi: 10.1214/aos/1176342360.  Google Scholar [7] R. A. Fisher, Statistical Methods for Research Workers, Fourteenth edition. Hafner Publishing Co., New York, 1973.  Google Scholar [8] A. Goldenberg, A. X. Zheng, S. E. Fienberg and E. M. Airoldi, A survey of statistical network models, Foundations and Trends in Machine Learning, 2 (2009), 129-233.   Google Scholar [9] S. Goldwater, T. L. Griffiths and M. Johnson, Interpolating between types and tokens by estimating power-law generators, in Proceedings of the 18th International Conference on Neural Information Processing Systems, MIT Press, 2005, 459–466. Google Scholar [10] N. A. Heard and P. Rubin-Delanchy, Network-wide anomaly detection via the Dirichlet process, in Proceedings of the IEEE workshop on Big Data Analytics for Cyber-security Computing, 2016. Google Scholar [11] N. A. Heard and P. Rubin-Delanchy, Choosing between methods of combining $p$-values, Biometrika, 105 (2018), 239-246.  doi: 10.1093/biomet/asx076.  Google Scholar [12] H. Ishwaran and L. F. James, Gibbs sampling methods for stick-breaking priors, Journal of the American Statistical Association, 96 (2001), 161-173.  doi: 10.1198/016214501750332758.  Google Scholar [13] D. Jurafsky, J. H. Martin, P. Norvig and S. Russell, Speech and Language Processing, Pearson Education, 2014. Google Scholar [14] A. D. Kent, Cybersecurity data sources for dynamic network research, in Dynamic Networks and Cyber-Security, World Scientific, 2016. Google Scholar [15] H. Lancaster, Statistical control of counting experiments, Biometrika, 39 (1952), 419-422.   Google Scholar [16] Y. Lv and C. X. Zhai, Positional language models for information retrieval, in Proceedings of the 32Nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2009,299–306. doi: 10.1145/1571941.1571994.  Google Scholar [17] C. Matias and V. Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79 (2017), 1119-1141.  doi: 10.1111/rssb.12200.  Google Scholar [18] T. Mikolov, M. Karafiát, L. Burget, J. Černocký and S. Khudanpur, Recurrent neural network based language model, in Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), International Speech Communication Association, 2010, 1045–1048. Google Scholar [19] M. E. J. Newman, Power laws, Pareto distributions and Zipf's law, Contemporary Physics, 46 (2005), 323-351.  doi: 10.1080/00107510500052444.  Google Scholar [20] K. Pearson, On a method of determining whether a sample of size $n$ supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random, Biometrika, 25 (1933), 379-410.   Google Scholar [21] P. O. Perry and P. J. Wolfe, Point process modelling for directed interaction networks, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75 (2013), 821-849.  doi: 10.1111/rssb.12013.  Google Scholar [22] J. Pitman, Combinatorial Stochastic Processes, Lecture Notes in Mathematics, 1875. Springer-Verlag, Berlin, 2006.  Google Scholar [23] J. Pitman and M. Yor, The two-parameter Poisson-Dirichlet distribution derived from a stable sub-ordinator, Annals of Probability, 25 (1997), 855-900.  doi: 10.1214/aop/1024404422.  Google Scholar [24] M. Price-Williams and N. A. Heard, Nonparametric self-exciting models for computer network traffic, Statistics and Computing, 2019, 1–12. doi: 10.1007/s11222-019-09875-z.  Google Scholar [25] R. Rosenfeld, A maximum entropy approach to adaptive statistical language modelling, Computer Speech & Language, 10 (1996), 187-228.  doi: 10.1006/csla.1996.0011.  Google Scholar [26] P. Rubin-Delanchy, N. A. Heard and D. J. Lawson, Meta analysis of mid-$p$-values: some new results based on the convex order, Journal of the American Statistical Association, 2018. doi: 10.1080/01621459.2018.1469994.  Google Scholar [27] B. W. Silverman, Density Estimation, London: Chapman and Hall, 1986.  Google Scholar [28] S. A. Stouffer, E. A. Suchman, L. C. DeVinney, S. A. Star and R. M. Williams, The American Soldier. Adjustment During Army Life, Princeton, New Jersey: Princeton University Press, 1949. Google Scholar [29] W. Y. Teh, A hierarchical Bayesian language model based on Pitman-Yor processes, in Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association of Computational Linguistics, 2006,985–992. doi: 10.3115/1220175.1220299.  Google Scholar [30] L. H. C. Tippett, The Methods of Statistics, 4th ed. John Wiley & Sons, Inc., New York, N. Y.; Williams & Norgate, Ltd., London, 1952.  Google Scholar [31] H. M. Wallach, S. T. Jensen, L. Dicker and K. A. Heller, An alternative prior process for nonparametric Bayesian clustering, in Proceedings of the Thireenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), 2010,892–899. Google Scholar

show all references

##### References:
 [1] D. Aldous, Exchangeability and related topics, in École d'Été de Probabilités de Saint-Flour XIII-1983, 1117 (1985), 1–198. doi: 10.1007/BFb0099421.  Google Scholar [2] A. L. Barabási, The origin of bursts and heavy tails in human dynamics, Nature, 435 (2005), 207-211.   Google Scholar [3] Y. Bengio, R. Ducharme, P. Vincent and C. Janvin, A neural probabilistic language model, Journal of Machine Learning Research, 3 (2003), 1137-1155.   Google Scholar [4] W. Buntine and M. Hutter, A Bayesian view of the Poisson-Dirichlet process, preprint, arXiv: 1007.0296. Google Scholar [5] C. Chen, L. Du and W. Buntine, Sampling table configurations for the hierarchical Poisson-Dirichlet process, in Machine Learning and Knowledge Discovery in Databases (eds. D. Gunopulos, T. Hofmann, D. Malerba and M. Vazirgiannis), Springer Berlin Heidelberg, 2011, 296–311. doi: 10.1007/978-3-642-23780-5_29.  Google Scholar [6] T. S. Ferguson, A Bayesian analysis of some nonparametric problems, The Annals of Statistics, 1 (1973), 209-230.  doi: 10.1214/aos/1176342360.  Google Scholar [7] R. A. Fisher, Statistical Methods for Research Workers, Fourteenth edition. Hafner Publishing Co., New York, 1973.  Google Scholar [8] A. Goldenberg, A. X. Zheng, S. E. Fienberg and E. M. Airoldi, A survey of statistical network models, Foundations and Trends in Machine Learning, 2 (2009), 129-233.   Google Scholar [9] S. Goldwater, T. L. Griffiths and M. Johnson, Interpolating between types and tokens by estimating power-law generators, in Proceedings of the 18th International Conference on Neural Information Processing Systems, MIT Press, 2005, 459–466. Google Scholar [10] N. A. Heard and P. Rubin-Delanchy, Network-wide anomaly detection via the Dirichlet process, in Proceedings of the IEEE workshop on Big Data Analytics for Cyber-security Computing, 2016. Google Scholar [11] N. A. Heard and P. Rubin-Delanchy, Choosing between methods of combining $p$-values, Biometrika, 105 (2018), 239-246.  doi: 10.1093/biomet/asx076.  Google Scholar [12] H. Ishwaran and L. F. James, Gibbs sampling methods for stick-breaking priors, Journal of the American Statistical Association, 96 (2001), 161-173.  doi: 10.1198/016214501750332758.  Google Scholar [13] D. Jurafsky, J. H. Martin, P. Norvig and S. Russell, Speech and Language Processing, Pearson Education, 2014. Google Scholar [14] A. D. Kent, Cybersecurity data sources for dynamic network research, in Dynamic Networks and Cyber-Security, World Scientific, 2016. Google Scholar [15] H. Lancaster, Statistical control of counting experiments, Biometrika, 39 (1952), 419-422.   Google Scholar [16] Y. Lv and C. X. Zhai, Positional language models for information retrieval, in Proceedings of the 32Nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2009,299–306. doi: 10.1145/1571941.1571994.  Google Scholar [17] C. Matias and V. Miele, Statistical clustering of temporal networks through a dynamic stochastic block model, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79 (2017), 1119-1141.  doi: 10.1111/rssb.12200.  Google Scholar [18] T. Mikolov, M. Karafiát, L. Burget, J. Černocký and S. Khudanpur, Recurrent neural network based language model, in Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), International Speech Communication Association, 2010, 1045–1048. Google Scholar [19] M. E. J. Newman, Power laws, Pareto distributions and Zipf's law, Contemporary Physics, 46 (2005), 323-351.  doi: 10.1080/00107510500052444.  Google Scholar [20] K. Pearson, On a method of determining whether a sample of size $n$ supposed to have been drawn from a parent population having a known probability integral has probably been drawn at random, Biometrika, 25 (1933), 379-410.   Google Scholar [21] P. O. Perry and P. J. Wolfe, Point process modelling for directed interaction networks, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 75 (2013), 821-849.  doi: 10.1111/rssb.12013.  Google Scholar [22] J. Pitman, Combinatorial Stochastic Processes, Lecture Notes in Mathematics, 1875. Springer-Verlag, Berlin, 2006.  Google Scholar [23] J. Pitman and M. Yor, The two-parameter Poisson-Dirichlet distribution derived from a stable sub-ordinator, Annals of Probability, 25 (1997), 855-900.  doi: 10.1214/aop/1024404422.  Google Scholar [24] M. Price-Williams and N. A. Heard, Nonparametric self-exciting models for computer network traffic, Statistics and Computing, 2019, 1–12. doi: 10.1007/s11222-019-09875-z.  Google Scholar [25] R. Rosenfeld, A maximum entropy approach to adaptive statistical language modelling, Computer Speech & Language, 10 (1996), 187-228.  doi: 10.1006/csla.1996.0011.  Google Scholar [26] P. Rubin-Delanchy, N. A. Heard and D. J. Lawson, Meta analysis of mid-$p$-values: some new results based on the convex order, Journal of the American Statistical Association, 2018. doi: 10.1080/01621459.2018.1469994.  Google Scholar [27] B. W. Silverman, Density Estimation, London: Chapman and Hall, 1986.  Google Scholar [28] S. A. Stouffer, E. A. Suchman, L. C. DeVinney, S. A. Star and R. M. Williams, The American Soldier. Adjustment During Army Life, Princeton, New Jersey: Princeton University Press, 1949. Google Scholar [29] W. Y. Teh, A hierarchical Bayesian language model based on Pitman-Yor processes, in Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association of Computational Linguistics, 2006,985–992. doi: 10.3115/1220175.1220299.  Google Scholar [30] L. H. C. Tippett, The Methods of Statistics, 4th ed. John Wiley & Sons, Inc., New York, N. Y.; Williams & Norgate, Ltd., London, 1952.  Google Scholar [31] H. M. Wallach, S. T. Jensen, L. Dicker and K. A. Heller, An alternative prior process for nonparametric Bayesian clustering, in Proceedings of the Thireenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), 2010,892–899. Google Scholar
Cartoon example of the Chinese Restaurant metaphor for the Pitman-Yor process, where $C_j$ represents the $j$th customer and $x_j^\star$ is the unique dish served at table $T_j$. The vector $\boldsymbol x_n = (x_1, \dots, x_n)$ denotes the observed sequence of dishes eaten by each customer. Customer $C_i$ being seated at table $T_j$ is denoted $C_i\to T_j$. Then, for example, the tenth customer will sit at $T_1$ with probability $(3-d)/(\alpha+9)$
Kernel density estimates of the parameter estimates from 10000 simulations, $|V| = {1000}$
Kernel density estimates of the parameter estimates from 10000 simulations, $|V| = {16230}$
Uniform Q-Q plot for the Pitman-Yor process fitted to six destination nodes
Plot of the corrected $K_n$ (a), $H_{1n}$ (b) and their ratio $H_{1n}/K_n$ (c) as a function of $n$ for the connections to the destination computer ${\mathtt C1438}$ , and averaged sample paths from $100$ samples from a Pitman-Yor process, obtained using different estimates of the parameters. The grey lines correspond to 10 realised trajectories of simulated Pitman-Yor processes, obtained using the corrected estimate (5)
Estimated Pitman-Yor parameters for 6 destination nodes
 Destination $N$ $\tilde K_n$ $\hat K_n$ $\tilde H_{1n}$ $\hat H_{1n}$ $\hat\alpha$ $\hat d$ $\mathtt C5716$ ${113987}$ ${3401}$ ${3816.418}$ $144$ $182.164$ $651.519$ $0.047$ $\mathtt U7$ ${138286}$ ${2700}$ ${2952.989}$ $350$ $419.819$ $302.093$ $0.142$ $\mathtt C2525$ ${204532}$ ${1555}$ ${1634.571}$ $97$ $107.272$ $183.319$ $0.066$ $\mathtt C1877$ ${342766}$ ${5095}$ ${6114.758}$ $226$ $329.390$ $866.520$ $0.054$ $\mathtt C395$ ${518058}$ ${5957}$ ${7422.437}$ $442$ $698.259$ $841.357$ $0.094$ $\mathtt C423$ ${2426512}$ ${2705}$ ${2958.988}$ $166$ $199.188$ $230.040$ $0.067$
 Destination $N$ $\tilde K_n$ $\hat K_n$ $\tilde H_{1n}$ $\hat H_{1n}$ $\hat\alpha$ $\hat d$ $\mathtt C5716$ ${113987}$ ${3401}$ ${3816.418}$ $144$ $182.164$ $651.519$ $0.047$ $\mathtt U7$ ${138286}$ ${2700}$ ${2952.989}$ $350$ $419.819$ $302.093$ $0.142$ $\mathtt C2525$ ${204532}$ ${1555}$ ${1634.571}$ $97$ $107.272$ $183.319$ $0.066$ $\mathtt C1877$ ${342766}$ ${5095}$ ${6114.758}$ $226$ $329.390$ $866.520$ $0.054$ $\mathtt C395$ ${518058}$ ${5957}$ ${7422.437}$ $442$ $698.259$ $841.357$ $0.094$ $\mathtt C423$ ${2426512}$ ${2705}$ ${2958.988}$ $166$ $199.188$ $230.040$ $0.067$
Anomaly rankings for the four red-team source computers
 Events $x_{n+1}\vert y_{n+1}$ only. Events $x_{n+1}\vert y_{n+1}$ only. Standard $p$-values $p_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $5$ $2$ $4$ $2$ $1$ $5$ Source ${\mathtt C18025}$ $138$ $75$ $78$ $151$ $74$ $105$ computer ${\mathtt C19932}$ $3831$ $8870$ $8877$ $3571$ $2754$ $3151$ ${\mathtt C22409}$ $3767$ $15773$ $15764$ $3450$ $6984$ $3756$ Events $y_{n+1}$ only. Event level combiner: Tippett. Mid $p$-values $q_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $6$ $5$ $5$ $6$ $5$ $5$ Source ${\mathtt C18025}$ $2806$ $1536$ $1674$ $142$ $96$ $107$ computer ${\mathtt C19932}$ $5407$ $8882$ $8914$ $3813$ $2264$ $3232$ ${\mathtt C22409}$ $12126$ $15808$ $15878$ $3803$ $6516$ $4196$ Event level combiner: Fisher. Event level combiner: Fisher. Standard $p$-values $p_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $3$ $5$ $5$ $5$ $2$ $5$ Source ${\mathtt C18025}$ $151$ $88$ $101$ $155$ $90$ $106$ computer ${\mathtt C19932}$ $6339$ $3818$ $4879$ $4937$ $3017$ $3996$ ${\mathtt C22409}$ $6120$ $14799$ $5379$ $4451$ $6695$ $5236$
 Events $x_{n+1}\vert y_{n+1}$ only. Events $x_{n+1}\vert y_{n+1}$ only. Standard $p$-values $p_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $5$ $2$ $4$ $2$ $1$ $5$ Source ${\mathtt C18025}$ $138$ $75$ $78$ $151$ $74$ $105$ computer ${\mathtt C19932}$ $3831$ $8870$ $8877$ $3571$ $2754$ $3151$ ${\mathtt C22409}$ $3767$ $15773$ $15764$ $3450$ $6984$ $3756$ Events $y_{n+1}$ only. Event level combiner: Tippett. Mid $p$-values $q_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $6$ $5$ $5$ $6$ $5$ $5$ Source ${\mathtt C18025}$ $2806$ $1536$ $1674$ $142$ $96$ $107$ computer ${\mathtt C19932}$ $5407$ $8882$ $8914$ $3813$ $2264$ $3232$ ${\mathtt C22409}$ $12126$ $15808$ $15878$ $3803$ $6516$ $4196$ Event level combiner: Fisher. Event level combiner: Fisher. Standard $p$-values $p_{n+1}$ Mid-$p$-values $q_{n+1}$ Edge level combiner: Node level combiner: Node level combiner: Tippett Fisher Pearson Stouffer Fisher Pearson Stouffer ${\mathtt C17693}$ $3$ $5$ $5$ $5$ $2$ $5$ Source ${\mathtt C18025}$ $151$ $88$ $101$ $155$ $90$ $106$ computer ${\mathtt C19932}$ $6339$ $3818$ $4879$ $4937$ $3017$ $3996$ ${\mathtt C22409}$ $6120$ $14799$ $5379$ $4451$ $6695$ $5236$
 [1] Mokhtar Bouloudene, Manar A. Alqudah, Fahd Jarad, Yassine Adjabi, Thabet Abdeljawad. Nonlinear singular $p$ -Laplacian boundary value problems in the frame of conformable derivative. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020442 [2] Bahaaeldin Abdalla, Thabet Abdeljawad. Oscillation criteria for kernel function dependent fractional dynamic equations. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020443 [3] Antoine Benoit. Weak well-posedness of hyperbolic boundary value problems in a strip: when instabilities do not reflect the geometry. Communications on Pure & Applied Analysis, 2020, 19 (12) : 5475-5486. doi: 10.3934/cpaa.2020248 [4] Chao Xing, Jiaojiao Pan, Hong Luo. Stability and dynamic transition of a toxin-producing phytoplankton-zooplankton model with additional food. Communications on Pure & Applied Analysis, , () : -. doi: 10.3934/cpaa.2020275 [5] Haixiang Yao, Ping Chen, Miao Zhang, Xun Li. Dynamic discrete-time portfolio selection for defined contribution pension funds with inflation risk. Journal of Industrial & Management Optimization, 2020  doi: 10.3934/jimo.2020166 [6] Mehdi Badsi. Collisional sheath solutions of a bi-species Vlasov-Poisson-Boltzmann boundary value problem. Kinetic & Related Models, , () : -. doi: 10.3934/krm.2020052 [7] Yichen Zhang, Meiqiang Feng. A coupled $p$-Laplacian elliptic system: Existence, uniqueness and asymptotic behavior. Electronic Research Archive, 2020, 28 (4) : 1419-1438. doi: 10.3934/era.2020075 [8] Lihong Zhang, Wenwen Hou, Bashir Ahmad, Guotao Wang. Radial symmetry for logarithmic Choquard equation involving a generalized tempered fractional $p$-Laplacian. Discrete & Continuous Dynamical Systems - S, 2020  doi: 10.3934/dcdss.2020445 [9] Lei Liu, Li Wu. Multiplicity of closed characteristics on $P$-symmetric compact convex hypersurfaces in $\mathbb{R}^{2n}$. Discrete & Continuous Dynamical Systems - A, 2020  doi: 10.3934/dcds.2020378

Impact Factor:

## Tools

Article outline

Figures and Tables