
Semi-supervised classification on graphs using explicit diffusion dynamics

  • * Corresponding author: Mauricio Barahona

Current address: Blue Brain Project, École polytechnique fédérale de Lausanne (EPFL), Campus Biotech, 1202 Geneva, Switzerland.

All authors acknowledge funding through EPSRC award EP/N014529/1 supporting the EPSRC Centre for Mathematics of Precision Healthcare at Imperial
  • Classification tasks based on feature vectors can be significantly improved by including within deep learning a graph that summarises pairwise relationships between the samples. Intuitively, the graph acts as a conduit to channel and bias the inference of class labels. Here, we study classification methods that consider the graph as the originator of an explicit graph diffusion. We show that appending graph diffusion to feature-based learning as an a posteriori refinement achieves state-of-the-art classification accuracy. This method, which we call Graph Diffusion Reclassification (GDR), uses overshooting events of a diffusive graph dynamics to reclassify individual nodes. The method uses intrinsic measures of node influence, which are distinct for each node, and allows the evaluation of the relationship and importance of features and graph for classification. We also present diff-GCN, a simple extension of Graph Convolutional Neural Network (GCN) architectures that leverages explicit diffusion dynamics, and allows the natural use of directed graphs. To showcase our methods, we use benchmark datasets of documents with associated citation data.

    Mathematics Subject Classification: Primary: 05C81, 05C85, 05C21, 68R10, 62M45; Secondary: 34B45, 60J60.
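The idea in the abstract of using the graph as the source of an explicit diffusion, applied a posteriori to the output of a prior classifier, can be illustrated with a small sketch. This is a generic diffusion-based relabelling and not the authors' GDR: the overshooting criterion and the node-specific influence measures mentioned in the abstract are not reproduced here. The function names, the combinatorial Laplacian, the explicit Euler integrator, the diffusion time `t` and the toy graph are all assumptions made for illustration.

```python
from itertools import combinations

# Illustrative only: generic diffusion relabelling, NOT the paper's GDR
# (no overshooting detection; all names and parameters are hypothetical).

def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected graph (nested lists)."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][j] -= 1.0
        L[j][i] -= 1.0
        L[i][i] += 1.0
        L[j][j] += 1.0
    return L

def diffuse(L, x, t=0.5, steps=100):
    """Integrate dx/dt = -L x up to time t with explicit Euler steps."""
    dt = t / steps
    n = len(x)
    for _ in range(steps):
        x = [x[i] - dt * sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x

def relabel(n, edges, prior, n_classes, t=0.5):
    """Diffuse one class-indicator vector per class, then assign each node
    to the class whose diffused indicator is largest at that node."""
    L = laplacian(n, edges)
    scores = []
    for c in range(n_classes):
        x0 = [1.0 if prior[i] == c else 0.0 for i in range(n)]
        scores.append(diffuse(L, x0, t))
    return [max(range(n_classes), key=lambda c: scores[c][i]) for i in range(n)]

# Toy graph: two 4-cliques {0,1,2,3} and {4,5,6,7} joined by the edge (3, 4).
edges = (list(combinations(range(4), 2))
         + list(combinations(range(4, 8), 2))
         + [(3, 4)])
# Hypothetical prior prediction: node 3 is assigned to the "wrong" clique.
prior = [0, 0, 0, 1, 1, 1, 1, 1]
new_labels = relabel(8, edges, prior, n_classes=2, t=0.5)
print(new_labels)
```

On this toy graph, node 3 has three neighbours with prior label 0 and one with label 1, so the diffused class-0 indicator dominates there and the node is moved back to the class of its clique. The choice of `t` matters: for very long times every indicator relaxes to a constant and the graph no longer discriminates between nodes.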

  • Table 1.  Statistics of datasets as reported in [35] and [30]

    Dataset      Nodes     Edges      Classes   Features
    Citeseer     3,327     4,732      6         3,703
    Cora         2,708     5,429      7         1,433
    Pubmed       19,717    44,338     3         500
    Wikipedia    20,525    215,056    12        100

    Table 2.  Percentage classification accuracy before and after application of relabelling by GDR for various classifiers. We present the improvement of GDR on the uniform prediction (which ignores features). We also consider four supervised classifiers (which learn from features without the graph): projection, RF, SVM and MLP. For RF, we use a maximum depth of $ 20 $; for SVM, we set $ C = 50 $; for MLP, we implement the same architecture as GCN ($ d_1 = 16 $-unit hidden layer, $ 0.5 $ dropout, $ 200 $ epochs, $ 0.01 $ learning rate, $ L^2 $ loss function). Finally, we compare with two semi-supervised graph classifiers: GCN [20] and Planetoid [35]. The numbers in brackets record the change in accuracy accomplished by applying GDR on the corresponding prior classifier. Boldface indicates the method with highest accuracy for each dataset

    Method              Citeseer        Cora            Pubmed          Wikipedia
    Uniform             7.7             13.0            18.0            28.7
    GDR (Uniform)       50.6 (+42.9)    71.8 (+58.8)    73.2 (+55.2)    31.4 (+2.7)
    Projection          61.8            59.0            72.0            32.5
    RF                  60.3            58.9            68.8            50.8
    SVM                 61.1            58.0            49.9            31.0
    MLP                 57.0            56.0            70.7            43.0
    GDR (Projection)    70.4 (+8.7)     79.7 (+20.7)    75.8 (+3.8)     36.9 (+4.4)
    GDR (RF)            70.5 (+10.2)    78.7 (+19.8)    72.2 (+3.2)     50.8 (+0.0)
    GDR (SVM)           70.3 (+9.2)     81.2 (+23.2)    52.4 (+2.5)     41.9 (+10.8)
    GDR (MLP)           69.7 (+12.7)    78.5 (+22.5)    75.5 (+4.8)     40.5 (-2.5)
    Planetoid           64.7            75.7            72.2            -
    GCN                 70.3            81.1            79.0            39.2
    GDR (GCN)           70.8 (+0.5)     82.2 (+1.1)     79.4 (+0.4)     39.5 (+0.3)

    Table 3.  Percentage classification accuracy of GCN and its extension diff-GCN, which has an explicit diffusion operator (16)

    Model       Citeseer    Cora    Pubmed    Wikipedia
    GCN         70.3        81.1    79.0      34.1
    diff-GCN    71.9        82.3    79.3      45.9
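For orientation, the contrast between a standard GCN layer and a layer with an explicit diffusion operator can be written side by side. The first line is the layer-wise propagation rule of GCN [20]. The second line is only an assumed form: operator (16) is not reproduced in this section, so the heat kernel $ e^{-tL} $, the Laplacian $ L $ and the diffusion time $ t $ are illustrative symbols rather than the authors' definition.

```latex
\begin{align}
\text{GCN [20]:} \quad
  & H^{(l+1)} = \sigma\!\left( \tilde{D}^{-1/2} \tilde{A}\, \tilde{D}^{-1/2} H^{(l)} W^{(l)} \right),
  \qquad \tilde{A} = A + I, \quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}, \\
\text{diffusion variant (assumed form):} \quad
  & H^{(l+1)} = \sigma\!\left( e^{-tL}\, H^{(l)} W^{(l)} \right).
\end{align}
```

Here $ H^{(l)} $ is the matrix of node representations at layer $ l $, $ W^{(l)} $ the trainable weights and $ \sigma $ a nonlinearity. Under this reading, the fixed normalised adjacency is replaced by a propagation operator whose reach is controlled by $ t $.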

    Table 4.  Accuracy of GDR using the undirected, directed, and reverse directed graphs of the Cora dataset

                        Undirected    Directed (fw)       Directed (bw)
    Method              $ A $         $ A_\text{dir} $    $ A_\text{dir}^T $
    GDR (Projection)    79.7          62.1                64.6
    GDR (RF)            78.7          58.0                57.6
    GDR (SVM)           81.2          63.6                62.1
    GDR (MLP)           78.5          57.3                56.4

    Table 5.  Accuracy of GCN and diff-GCN using the undirected, directed, reverse directed, and bidirectional (augmented) graphs of the Cora dataset. The highest accuracy is achieved by diff-GCN with the augmented graph (boldface)

                Undirected    Directed (fw)       Directed (bw)         Augmented (fw, bw)
    Method      $ A $         $ A_\text{dir} $    $ A_\text{dir}^T $    $ \begin{bmatrix} A_\text{dir} \, A_\text{dir}^T \end{bmatrix} $
    GCN         81.1          67.4                79.8                  79.9
    diff-GCN    82.3          80.3                81.7                  83.0
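The four graph variants compared in Tables 4 and 5 can be made concrete with a minimal sketch that builds them from a directed edge list. Several points are assumptions: that an edge `(i, j)` means "i cites j", that the undirected matrix $ A $ is obtained by symmetrising $ A_\text{dir} $, and that the augmented matrix is the horizontal concatenation of $ A_\text{dir} $ and $ A_\text{dir}^T $ (an n × 2n block row, as the bracket notation in Table 5 suggests). The helper names and the three-node example are hypothetical.

```python
# Illustrative sketch; helper names and the toy edge list are hypothetical.

def adjacency(n, edges):
    """Directed adjacency matrix A_dir with A[i][j] = 1 for an edge i -> j."""
    A = [[0] * n for _ in range(n)]
    for src, dst in edges:
        A[src][dst] = 1
    return A

def transpose(A):
    """Reverse every edge: A_dir^T."""
    return [list(col) for col in zip(*A)]

def symmetrise(A):
    """Undirected graph: keep an edge if it exists in either direction."""
    At = transpose(A)
    return [[max(a, b) for a, b in zip(row, row_t)] for row, row_t in zip(A, At)]

def augment(A):
    """Block row [A_dir, A_dir^T]: each node keeps forward and backward links
    in separate column blocks (assumed reading of the bracket notation)."""
    At = transpose(A)
    return [row + row_t for row, row_t in zip(A, At)]

edges = [(0, 1), (1, 2), (0, 2)]   # toy citations: 0 -> 1, 1 -> 2, 0 -> 2
A_dir = adjacency(3, edges)        # forward (fw)
A_bw = transpose(A_dir)            # backward (bw)
A_und = symmetrise(A_dir)          # undirected
A_aug = augment(A_dir)             # augmented (fw, bw)

for name, M in [("A_dir", A_dir), ("A_bw", A_bw), ("A_und", A_und), ("A_aug", A_aug)]:
    print(name, M)
```

Keeping the forward and backward blocks separate, rather than summing them as in the symmetrised matrix, preserves the direction of each link while still exposing both to the model.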
  • [1] A. Arnaudon, R. L. Peach and M. Barahona, Graph centrality is a question of scale, arXiv e-prints, arXiv: 1907.08624.
    [2] K. A. Bacik, M. T. Schaub, M. Beguerisse-Díaz, Y. N. Billeh and M. Barahona, Flow-Based Network Analysis of the Caenorhabditis elegans Connectome, PLoS Computational Biology, 12 (2016), e1005055, http://arXiv.org/abs/1511.00673. doi: 10.1371/journal.pcbi.1005055.
    [3] M. Beguerisse-Díaz, B. Vangelov and M. Barahona, Finding role communities in directed networks using Role-Based Similarity, Markov Stability and the Relaxed Minimum Spanning Tree, in 2013 IEEE Global Conference on Signal and Information Processing, 2013, 937–940, http://arXiv.org/abs/1309.1795.
    [4] M. Beguerisse-Díaz, G. Garduno-Hernández, B. Vangelov, S. N. Yaliraki and M. Barahona, Interest communities and flow roles in directed networks: The Twitter network of the UK riots, Journal of The Royal Society Interface, 11 (2014), 20140940, https://royalsocietypublishing.org/doi/abs/10.1098/rsif.2014.0940.
    [5] C. M. Bishop, Pattern Recognition and Machine Learning, New York: Springer, 2006. doi: 10.1007/978-0-387-45528-0.
    [6] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam and P. Vandergheynst, Geometric deep learning: Going beyond euclidean data, IEEE Signal Processing Magazine, 34 (2017), 18–42. doi: 10.1109/MSP.2017.2693418.
    [7] J. Bruna, W. Zaremba, A. Szlam and Y. Lecun, Spectral networks and locally connected networks on graphs, in International Conference on Learning Representations (ICLR2014), CBLS, April 2014, 2014, 1–14, http://arXiv.org/abs/1312.6203.
    [8] O. Chapelle and A. Zien, Semi-supervised classification by low density separation, in Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics (AISTATS 2005), 2005, 57–64.
    [9] J. Chen, J. Zhu and L. Song, Stochastic Training of Graph Convolutional Networks with Variance Reduction, arXiv e-prints, arXiv: 1710.10568, http://arXiv.org/abs/1710.10568.
    [10] F. Chung, Laplacians and the Cheeger inequality for directed graphs, Annals of Combinatorics, 9 (2005), 1–19. doi: 10.1007/s00026-005-0237-z.
    [11] R. R. Coifman and S. Lafon, Diffusion maps, Applied and Computational Harmonic Analysis, 21 (2006), 5–30. doi: 10.1016/j.acha.2006.04.006.
    [12] K. Cooper and M. Barahona, Role-based similarity in directed networks, arXiv e-prints, arXiv: 1012.2726, http://arXiv.org/abs/1012.2726.
    [13] M. Defferrard, X. Bresson and P. Vandergheynst, Convolutional neural networks on graphs with fast localized spectral filtering, in Advances in neural information processing systems, 2016, 3844–3852.
    [14] J.-C. Delvenne, S. N. Yaliraki and M. Barahona, Stability of graph communities across time scales, Proceedings of the National Academy of Sciences of the United States of America, 107 (2010), 12755–12760, http://arXiv.org/abs/0812.1811. doi: 10.1073/pnas.0903215107.
    [15] F. Fouss, A. Pirotte, J. Renders and M. Saerens, Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation, IEEE Transactions on Knowledge and Data Engineering, 19 (2007), 355–369.
    [16] H. Gao, Z. Wang and S. Ji, Large-scale learnable graph convolutional networks, in Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18, Association for Computing Machinery, New York, NY, USA, 2018, 1416–1424, http://arXiv.org/abs/1808.03965. doi: 10.1145/3219819.3219947.
    [17] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, MIT Press, 2016.
    [18] D. K. Hammond, P. Vandergheynst and R. Gribonval, Wavelets on graphs via spectral graph theory, Applied and Computational Harmonic Analysis, 30 (2011), 129–150. doi: 10.1016/j.acha.2010.04.005.
    [19] D. P. Kingma, S. Mohamed, D. J. Rezende and M. Welling, Semi-supervised learning with deep generative models, in Advances in Neural Information Processing Systems, 2014, 3581–3589.
    [20] T. N. Kipf and M. Welling, Semi-Supervised Classification with Graph Convolutional Networks, arXiv: 1609.02907v4, 1–14, http://arXiv.org/abs/1609.02907.
    [21] R. Lambiotte, J.-C. Delvenne and M. Barahona, Random walks, Markov processes and the multiscale modular organization of complex networks, IEEE Transactions on Network Science and Engineering, 1 (2014), 76–90, http://arXiv.org/abs/1502.04381, http://arXiv.org/abs/0812.1770. doi: 10.1109/TNSE.2015.2391998.
    [22] Y. LeCun, Y. Bengio and G. Hinton, Deep learning, Nature, 521 (2015), 436–444. doi: 10.1038/nature14539.
    [23] R. Levie, F. Monti, X. Bresson and M. M. Bronstein, CayleyNets: Graph convolutional neural networks with complex rational spectral filters, IEEE Transactions on Signal Processing, 67 (2019), 97–109. doi: 10.1109/TSP.2018.2879624.
    [24] Z. Liu and M. Barahona, Geometric multiscale community detection: Markov stability and vector partitioning, Journal of Complex Networks, 6 (2018), 157–172. doi: 10.1093/comnet/cnx028.
    [25] Z. Liu and M. Barahona, Graph-based data clustering via multiscale community detection, Applied Network Science, 5 (2020), 16pp, http://arXiv.org/abs/1909.04491.
    [26] Z. Liu, C. Chen, L. Li, J. Zhou, X. Li, L. Song and Y. Qi, GeniePath: Graph neural networks with adaptive receptive paths, AAAI Technical Track: Machine Learning, 33 (2019), http://arXiv.org/abs/1802.00910. doi: 10.1609/aaai.v33i01.33014424.
    [27] N. Masuda, M. A. Porter and R. Lambiotte, Random walks and diffusion on networks, Physics Reports, 716/717 (2017), 1–58. doi: 10.1016/j.physrep.2017.07.007.
    [28] L. Page, S. Brin, R. Motwani and T. Winograd, The PageRank Citation Ranking: Bringing Order to the Web, Technical Report 1999-66, Stanford InfoLab, 1999, http://ilpubs.stanford.edu:8090/422/.
    [29] B. Perozzi, R. Al-Rfou and S. Skiena, Deepwalk: Online learning of social representations, in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, Association for Computing Machinery, New York, NY, USA, 2014, 701–710. doi: 10.1145/2623330.2623732.
    [30] Y. Qian, P. Expert, T. Rieu, P. Panzarasa and M. Barahona, Quantifying the alignment of graph and features in deep learning, arXiv e-prints, arXiv: 1905.12921.
    [31] M. T. Schaub, J.-C. Delvenne, R. Lambiotte and M. Barahona, Multiscale dynamical embeddings of complex networks, Phys. Rev. E, 99 (2019), 062308. doi: 10.1103/PhysRevE.99.062308.
    [32] P. Sen, G. M. Namata, M. Bilgic, L. Getoor, B. Gallagher and T. Eliassi-Rad, Collective classification in network data, AI Magazine, 29 (2008), 93–106, http://www.cs.iit.edu/ ml/pdfs/sen-aimag08.pdf. doi: 10.1609/aimag.v29i3.2157.
    [33] P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò and Y. Bengio, Graph attention networks, Machine Learning, 3 (2018), 1–12, http://arXiv.org/abs/1710.10903.
    [34] J. Weston, F. Ratle, H. Mobahi and R. Collobert, Deep learning via semi-supervised embedding, ICML '08: Proceedings of the 25th International Conference on Machine Learning, 2008, 1168–1175. doi: 10.1145/1390156.1390303.
    [35] Z. Yang, W. W. Cohen and R. Salakhutdinov, Revisiting semi-supervised learning with graph embeddings, arXiv: 1603.08861v2, 48, http://arXiv.org/abs/1603.08861.
    [36] J. Zhang, X. Shi, J. Xie, H. Ma, I. King and D.-Y. Yeung, GaAN: Gated attention networks for learning on large and spatiotemporal graphs, arXiv e-prints, http://arXiv.org/abs/1803.07294.
    [37] X. Zhu, Z. Ghahramani and J. Lafferty, Semi-Supervised Learning Using Gaussian Fields and Harmonic Functions, in Proceedings of the Twentieth International Conference on Machine Learning, ICML '03, AAAI Press, 2003, 912–919.
    [38] C. Zhuang and Q. Ma, Dual graph convolutional networks for graph-based semi-supervised classification, in Proceedings of the 2018 World Wide Web Conference, Lyon, France, 2018, 499–508. doi: 10.1145/3178876.3186116.