A COMPARATIVE STUDY OF ROBUSTNESS MEASURES FOR CANCER SIGNALING NETWORKS

Abstract. Network robustness is the capability of a network to resist failures or attacks. Many robustness measures have been proposed to evaluate various types of networks, such as small-world and scale-free networks. However, the robustness of biological networks differs because their special structures are tied to their unique functionality. Cancer signaling networks (CSNs), which describe the information transfer of cancers at the molecular level, typically have robust, complex structures: information exchange in these networks does not depend on a few pathways. This results in a low cure rate, a high recurrence rate and, in particular, a short survival time caused by the continual damage done by the cancer. It is therefore meaningful to find a network metric that changes significantly when a single node is removed, and to correlate that metric with the survival probabilities of patients who underwent cancer chemotherapy. In this paper, we study the relationship between the robustness of 14 typical cancer signaling networks and the survivability of patients with the corresponding cancers. Several widely used robustness measures are included, and we find that natural connectivity, which captures the redundant cycles that serve the information exchange needs of cancer signaling networks, is negatively correlated with cancer patient survivability. Furthermore, the three nodes whose removal most affects natural connectivity are identified, and their degree, closeness centrality and betweenness centrality are analyzed. The results show that the nodes found are important, so we believe that natural connectivity can be of considerable help to cancer treatment.

1. Introduction. ... the more difficult the cancer is to cure. Moreover, even if the cancer is cured while the CSN has not collapsed, it is highly likely to recur. This is why cancers have a low cure rate and a high recurrence rate.
A robust CSN should lead to a short patient survival time, and a less robust one to a longer survival time. It is well known that cells turn cancerous from time to time, so a cancer need not be fatal at its first stage. However, once the immune system can no longer clear cancer cells in time, they continually strengthen themselves and weaken the body. Once a cancer is diagnosed, an effective method to control it is of great importance; otherwise, the cancer quickly progresses to a fatal stage. Put directly, the more robust a CSN is, the more difficult the cancer is to control. In that case the cancer quickly becomes fatal and the patient survives for only a short time; otherwise the patient survives longer.
Dylan Breitkreutz et al. [6] studied the relationship between the degree-entropy of CSNs and cancer patient survivability and found a strong negative correlation; they then analyzed the betweenness centrality of the nodes in CSNs and suggested that nodes with high betweenness may keep the cancer alive. Inspired by this, Kazuhiro Takemoto et al. [26] examined the relationship between the modular organization of CSNs and patient survivability, and reported that this relationship is more reliable than the previous one. Intuitively, robustness, which covers tolerance to both errors and attacks, should be even more useful in curing cancer.
Many useful robustness measures have been proposed [8,9,24,23,29,18,27,14]; different measures focus on different aspects and suit different networks, such as small-world and scale-free networks. A CSN, however, is a special kind of network in which proteins depend on genes and a protein can be regenerated as long as its gene works well, so the system is inherently robust and can repair itself dynamically. No matter how robust a CSN is, the goal is to destroy it effectively, so a robustness measure that properly evaluates the robustness of a CSN, and thereby indicates how to destroy it easily, is meaningful.
In this paper, the relationship between the robustness of CSNs and cancer patient survivability is studied. Robustness measures differ in how they quantify robustness and in which aspects of a network they focus on; that is, not every robustness measure reflects the essence of CSNs. We therefore include several widely used robustness measures, find that only one of them is useful, and use that measure to analyze what is important to CSNs.
The rest of this paper is organized as follows. Section 2 introduces background information about CSNs and patient survivability. Section 3 describes the robustness measures studied. The experiments and the conclusion are given in Sections 4 and 5, respectively.

2. Cancer signaling networks and patient survivability. Cancers are systemic diseases involving genes, proteins and ramifications, with complex interactions among them. Usually, a proto-oncogene is activated by radiation, toxic substances or viral infection. As activated proto-oncogenes accumulate, the proteins and ramifications synthesized from them turn the normal interaction system into a diseased one, and normal cells then turn into cancer cells. The immune system can clear these cancer cells and so keep us healthy; otherwise, the cancer cells proliferate without limit, develop into a tumor and spread throughout the body.
Current approaches to curing cancer aim to disable genes, proteins or ramifications in the CSNs, so a technical understanding of how the cell's information exchange cycle malfunctions during carcinogenesis, cancer progression and response to treatment is crucial for optimal drug development and proper drug administration. Each cell is a complex interaction system built from a huge number of molecules, and it is not yet fully understood. Nonetheless, some insights are beginning to be elucidated into how genes, proteins and ramifications interact with each other, through the reconstruction of regulation networks [13,2,22], and into how specific drugs interact with their molecular targets in the cell [20,28].
Signaling pathways and metabolic pathways are biochemical networks reconstructed from the interactions observed in cells. These biochemical networks carry essential information about the cell, including information about cancer and potential ways to cure it, and should therefore relate directly to cancer patient survivability.
Moderately detailed pathways have been worked out for a large number of cancer sites, and 14 of these typical cancer pathways are available from the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/). The KEGG cancer pathways were downloaded as KGML files from the KEGG PATHWAY database (http://www.genome.jp/kegg/pathway.html), and a mathematical graph representation of each pathway was generated with the KEGGgraph package from the Bioconductor Web site (http://www.bioconductor.org/). In the KEGGgraph package, the function parseKGML2Graph is used to generate two types of networks, called CSN-EG and CSN-GO, with the parameter settings genesOnly = FALSE, expandGenes = TRUE and genesOnly = TRUE, expandGenes = FALSE, respectively. Here we study the undirected and unweighted networks; their details are reported in Table 1. In addition to the cancer pathways from KEGG, 5-year survival statistics are collected from the Surveillance Epidemiology and End Results (SEER) Program database (http://seer.cancer.gov/), a resource for epidemiological data compiled by the National Cancer Institute as a service to researchers and physicians. With KEGG and SEER, the connection between cancer network structure and patient survivability can be investigated.
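As a rough illustration of the graph construction step, the following Python sketch builds an undirected, unweighted adjacency map from KGML text. It is not the KEGGgraph implementation used in this study: the function name kgml_to_undirected_graph, its genes_only flag and the tiny sample pathway are hypothetical, the flag only loosely mirrors genesOnly, and gene expansion (expandGenes) is not reproduced. It assumes that KGML stores nodes as entry elements (attributes id, type) and edges as relation elements (attributes entry1, entry2).

```python
import xml.etree.ElementTree as ET


def kgml_to_undirected_graph(kgml_text, genes_only=True):
    """Return {entry_id: set(neighbour_ids)} built from KGML text.

    Nodes are KGML entry ids and edges come from <relation> elements.
    With genes_only=True only entries of type "gene" are kept.
    """
    root = ET.fromstring(kgml_text)
    entry_type = {e.get("id"): e.get("type") for e in root.iter("entry")}
    nodes = {i for i, t in entry_type.items() if t == "gene" or not genes_only}
    adj = {n: set() for n in nodes}
    for rel in root.iter("relation"):
        a, b = rel.get("entry1"), rel.get("entry2")
        # Ignore direction and self-loops; duplicate relations collapse in the set.
        if a in adj and b in adj and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Hypothetical toy pathway, for illustration only (not a real KEGG record).
SAMPLE_KGML = """<pathway name="path:hsa00000" org="hsa" number="00000">
  <entry id="1" name="hsa:1" type="gene"/>
  <entry id="2" name="hsa:2" type="gene"/>
  <entry id="3" name="cpd:C00001" type="compound"/>
  <relation entry1="1" entry2="2" type="PPrel"/>
  <relation entry1="2" entry2="1" type="PPrel"/>
  <relation entry1="2" entry2="3" type="PCrel"/>
</pathway>"""

genes_only_graph = kgml_to_undirected_graph(SAMPLE_KGML, genes_only=True)
all_entries_graph = kgml_to_undirected_graph(SAMPLE_KGML, genes_only=False)
```

In the toy pathway, the genes-only graph keeps the two gene entries and their single undirected edge, while the all-entries graph additionally links entry 2 to the compound entry.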
3. Robustness measures. Early research on robustness measures was mainly based on connectivity and can be traced back to 1970 [15,3,17]. Subsequently, several analytical studies of network robustness from the viewpoint of random graph theory were proposed [8,9,24]. Considering the critical fraction of attacks and realistic cases, widely used robustness measures based on percolation theory were proposed [23,29,18]. Another notable class of robustness measures is based on the eigenvalues of network matrices [10,11,25]. Measures based on connectivity consider only the basic connectivity of networks and are of limited interest here, but the other three categories, which analyze the breakdown of the network, the isolation of network nodes and redundant cycles, appear useful. We therefore select six widely used measures, two from each category and each probing a different aspect of networks, and study them in detail: the robustness $R$ related to nodes proposed in [23], the robustness $R_l$ related to edges proposed in [29], the critical removal fraction under random attack $p_c^r$ and under targeted attack $p_c^t$ proposed in [8] and [9], the natural connectivity $\bar{\lambda}$ proposed in [27], and the algebraic connectivity $a(G)$ proposed in [14].
Robustness $R$ considers the size of the largest connected subgraph while the network suffers a high-degree-preference attack on its nodes. That is, the node with the largest degree is removed at each step, and after each removal the size of the largest connected subgraph is accumulated until the network has collapsed. The definition in [23] is

$$R = \frac{1}{N}\sum_{Q=1}^{N} s(Q),$$

where $N$ is the number of nodes in the network and $s(Q)$ is the fraction of nodes in the largest connected cluster after removing $Q = qN$ nodes. The normalization factor $1/N$ ensures that the robustness of networks of different sizes can be compared. The possible values of $R$ range between $1/N$ and 0.5; these limits correspond to a star network and a fully connected graph, respectively.

The definition of $R_l$ parallels that of $R$: it also accumulates the size of the largest connected subgraph under a sequential malicious attack, but the attack targets are edges, whose importance is evaluated by betweenness centrality. The definition in [29] is

$$R_l = \frac{1}{M}\sum_{P=1}^{M} s(P),$$

where $M$ is the total number of links and $s(P)$ is the fraction of nodes in the largest connected cluster after removing $P = pM$ edges. This measure captures the network response to any fraction of link removal. If a network is robust against link attack, its $R_l$ should be relatively large.

The critical fraction under random attack is denoted $p_c^r$; it measures the fraction of nodes that must be removed before the network breaks down when nodes are removed at random. According to [8], for any degree distribution $P(k)$, $p_c^r$ is calculated as

$$p_c^r = 1 - \frac{1}{\kappa_0 - 1},$$

where $\kappa_0 = \langle k^2 \rangle / \langle k \rangle$ and $k$ is the degree of a node in the original network before the attack. Accordingly, the larger the $p_c^r$ of a network, the more robust the network; and as long as the degree distribution and every nodal degree are unchanged, $\kappa_0$ does not change.
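The node-attack procedure behind $R$ can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the experiments: the graph is assumed to be stored as a dictionary mapping each node to the set of its neighbours, degrees are assumed to be recomputed after every removal, ties are broken by node order, and the helper names are hypothetical.

```python
def largest_component_size(adj, removed):
    """Size of the largest connected component, ignoring removed nodes."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


def robustness_R(adj):
    """R = (1/N) * sum over Q = 1..N of s(Q), highest-degree node removed first."""
    n = len(adj)
    removed, total = set(), 0.0
    for _ in range(n):
        # Current degree counts only neighbours that are still present.
        target = max(
            (u for u in adj if u not in removed),
            key=lambda u: sum(1 for v in adj[u] if v not in removed),
        )
        removed.add(target)
        total += largest_component_size(adj, removed) / n
    return total / n


# Two small test graphs: a 5-node star (hub 0) and the complete graph on 4 nodes.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
complete = {i: {j for j in range(4) if j != i} for i in range(4)}
r_star = robustness_R(star)
r_complete = robustness_R(complete)
```

For these toy graphs the sketch gives $(N-1)/N^2 = 0.16$ for the star and $(N-1)/(2N) = 0.375$ for the complete graph, which approach the limits $1/N$ and 0.5 as $N$ grows.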
The critical fraction under random attack is straightforward to compute, whereas the situation changes when the network faces a targeted attack.
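To illustrate how directly $p_c^r$ follows from the degree sequence, the sketch below assumes the closed form $p_c^r = 1 - 1/(\kappa_0 - 1)$ with $\kappa_0 = \langle k^2 \rangle / \langle k \rangle$ from [8]. The function name is hypothetical, and the formula is derived for large random networks, so its values for very small graphs are only indicative.

```python
def critical_fraction_random(degrees):
    """p_c^r = 1 - 1 / (kappa0 - 1), with kappa0 = <k^2> / <k>."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    if mean_k == 0:
        raise ValueError("the network has no edges")
    kappa0 = mean_k2 / mean_k
    if kappa0 <= 1:
        raise ValueError("kappa0 must exceed 1 for the formula to apply")
    return 1.0 - 1.0 / (kappa0 - 1.0)


# Degree sequences of a 5-node star and of the complete graph on 5 nodes.
p_star = critical_fraction_random([4, 1, 1, 1, 1])
p_complete = critical_fraction_random([4, 4, 4, 4, 4])
```

Only the first two moments of the degree sequence enter the calculation, which is why this measure is unchanged as long as the nodal degrees are unchanged.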
In [9], the critical fraction under targeted attack, $p_c^t$, is defined for a network suffering a high-degree-preference attack on its nodes. Because the node with the largest degree is removed at each step, the degree distribution changes substantially after every removal and must be recalculated. This is not too difficult when the degree distribution of the original network is given, but the question remains what to do when it is not.
In fact, following the definition in [9], $p_c^t$ can be calculated for a graph with any degree distribution by simulating the process: attack the node with the highest degree, check the state of the network, and repeat until the remaining network is disconnected; the fraction of nodes removed at that point is $p_c^t$. In this way, $p_c^t$ can be calculated even for a network whose degree distribution is unknown. As with $p_c^r$, the larger the value of $p_c^t$, the more robust the network.

Algebraic connectivity $a(G)$ is the second smallest eigenvalue of the Laplacian matrix. Fiedler [14] showed that the magnitude of the algebraic connectivity reflects how well connected the overall graph is, and Merris [19] gave a survey of the vast literature on algebraic connectivity.
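The simulation procedure for $p_c^t$ described above can be sketched as follows. The stopping rule is an assumption: "disconnected" is read here as "the surviving nodes no longer form a single connected component", and the returned value is the fraction of nodes removed up to that point. The graph representation (a dictionary of neighbour sets) and the function names are hypothetical.

```python
def is_connected(adj, removed):
    """True if the surviving nodes form exactly one connected component."""
    alive = [u for u in adj if u not in removed]
    if not alive:
        return False
    seen, stack = {alive[0]}, [alive[0]]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(alive)


def critical_fraction_targeted(adj):
    """Fraction of nodes removed, highest degree first, until disconnection."""
    n = len(adj)
    removed = set()
    while len(removed) < n and is_connected(adj, removed):
        # Degrees are recomputed on the surviving network before each attack.
        target = max(
            (u for u in adj if u not in removed),
            key=lambda u: sum(1 for v in adj[u] if v not in removed),
        )
        removed.add(target)
    return len(removed) / n


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
complete = {i: {j for j in range(4) if j != i} for i in range(4)}
pt_star = critical_fraction_targeted(star)
pt_complete = critical_fraction_targeted(complete)
```

On these toy graphs the star falls apart after a single removal (0.2), while the complete graph stays connected until every node is gone (1.0). A network that is already disconnected yields 0 under this stopping rule.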
Wu et al. [27] introduced the concept of natural connectivity, which characterizes the redundancy of alternative routes in a network by quantifying the weighted number of closed walks of all lengths. The natural connectivity can be regarded as an average eigenvalue that changes strictly monotonically with the addition or deletion of edges:

$$\bar{\lambda} = \ln\left(\frac{1}{N}\sum_{i=1}^{N} e^{\lambda_i}\right),$$

where $\lambda_i$ is the $i$th eigenvalue of the graph adjacency matrix. By this definition, and as proved in [13], given the number of vertices $N$, the empty graph has the minimum natural connectivity and the complete graph has the maximum natural connectivity. Thus the larger the value of $\bar{\lambda}$, the more robust the network.
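The link between natural connectivity and closed walks can be made concrete with a short Python sketch. Instead of calling an eigenvalue solver, it uses the identity $\sum_i e^{\lambda_i} = \mathrm{tr}(e^{A}) = \sum_k \mathrm{tr}(A^k)/k!$, where $\mathrm{tr}(A^k)$ is the number of closed walks of length $k$. This swapped-in series evaluation is meant only for small graphs; the function name, tolerance and term limit are assumptions, and for real CSNs a numerical eigenvalue routine would be the more practical choice.

```python
import math


def natural_connectivity(adj, tol=1e-12, max_terms=500):
    """ln( trace(exp(A)) / N ) via the power series sum_k trace(A^k) / k!."""
    nodes = list(adj)
    n = len(nodes)
    if n == 0:
        raise ValueError("the graph must have at least one node")
    index = {u: i for i, u in enumerate(nodes)}
    A = [[0.0] * n for _ in range(n)]
    for u in nodes:
        for v in adj[u]:
            A[index[u]][index[v]] = 1.0
    # term holds A^k / k!, starting from the identity matrix (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    trace_sum = float(n)
    for k in range(1, max_terms + 1):
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        trace_sum += sum(term[i][i] for i in range(n))
        # Stop once the whole matrix term is negligible relative to the sum.
        if sum(map(sum, term)) <= tol * trace_sum:
            break
    return math.log(trace_sum / n)


empty = {0: set(), 1: set(), 2: set()}
path = {0: {1}, 1: {0, 2}, 2: {1}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
values = [natural_connectivity(g) for g in (empty, path, triangle)]
```

The three values increase from the empty graph (exactly 0) through the path to the triangle, which matches the monotonic behaviour under edge addition stated above.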

4. Experiments. In this section, the relationships between the robustness of CSNs and the 5-year survival rate are reported first, and then the interesting CSN-GOs are analyzed in detail. The robustness of the CSNs, both CSN-EG and CSN-GO, is related to the 5-year survival rate through Pearson's correlation coefficient, which is defined as

$$\gamma = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}},$$

where $\bar{x}$ is the average of the vector $x$, $\bar{y}$ is the average of the vector $y$, and $n$ is their common length.
The value of this coefficient lies in the range [-1, 1]. A positive value means that the two tested quantities are positively correlated, and a negative value means that they are negatively correlated. A value of 0 means that they are uncorrelated.
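For reference, Pearson's correlation coefficient can be computed directly from its standard definition, as in the sketch below. The function name and the toy vectors are illustrative only and are not data from this study; in the experiments, one vector would hold a robustness measure for the 14 CSNs and the other the corresponding 5-year survival rates.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("the coefficient is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Toy vectors: perfectly increasing and perfectly decreasing relationships.
gamma_up = pearson([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
gamma_down = pearson([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
```

The two toy cases give the extreme values of the range, +1 and -1.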
4.1. CSNs related to 5-year survival rate. In this part, the robustness of the 14 CSNs, for both CSN-EG and CSN-GO, is calculated under each measure, and Pearson's correlation coefficient with the 5-year survival rate is reported in Table 2. The results show that robustness and the 5-year survival rate are not strongly correlated in the CSN-EG group, whereas in the CSN-GO group $p_c^r$ and $\bar{\lambda}$ are strongly correlated with the 5-year survival rate. This means that $p_c^r$ and $\bar{\lambda}$ capture some important structure of the CSNs. Considering their meaning further, $p_c^r$ assumes that all nodes in the network are equally likely to be attacked, which does not match real situations, so next we analyze only how $\bar{\lambda}$ captures the important part of CSN-GO.

4.2. Analysis on CSN-GO and natural connectivity. In this part, the degree-entropy $H$ used in [6] and the modularity $Q$ used in [26] are calculated; then the three nodes whose removal most affects the network, as measured by $\bar{\lambda}$, are selected, and their degree, closeness centrality [4,16] and betweenness centrality [5,12] are analyzed.

Table 3. Pearson's correlation coefficient $\gamma$ between the network parameters of CSN-GO and the 5-year survival rate.

Network parameter    $H$      $Q$      $\bar{\lambda}$
$\gamma$             -0.62    -0.21    -0.60

Table 3 shows that natural connectivity behaves in the same way as the degree-entropy $H$ reported in [6] and the modularity $Q$ reported in [26]: all three are negatively correlated with the 5-year survival rate. Natural connectivity takes redundant cycles into account, which matches the understanding that the more pathways a CSN has, the more difficult the cancer is to cure. Furthermore, unlike betweenness centrality, natural connectivity considers all cycles rather than only shortest paths, which suggests that cancers may pass their information through any available pathway.
Next, the three nodes that most affect natural connectivity are obtained through attack. For each attacked node, the more the natural connectivity is affected, the more important the node is. After obtaining the top three affected nodes, their degree, closeness centrality and betweenness centrality are analyzed. Table 4 and Table 5 give these values for the top three affected nodes, together with the average values over each network. From Table 4, we find that only a few of the closeness and betweenness values are lower than the network average. That is to say, the nodes found are important. We therefore consider natural connectivity suitable for measuring the robustness of CSNs, and we believe that this information can be of considerable help to cancer treatment. Further investigation will be carried out in cooperation with biologists.
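One way to realize the node ranking described above is sketched below: each node is removed in turn, natural connectivity is recomputed on the remaining graph, and nodes are sorted by the resulting drop. Several details are assumptions rather than the procedure used for Tables 4 and 5: nodes are attacked one at a time from the intact network, the drop is measured against the graph with one node fewer, and natural connectivity is evaluated through the series $\mathrm{tr}(e^{A}) = \sum_k \mathrm{tr}(A^k)/k!$, which is practical only for small graphs. All function names are hypothetical.

```python
import math


def natural_connectivity(adj, tol=1e-12, max_terms=500):
    """ln( trace(exp(A)) / N ), with trace(exp(A)) summed as a power series."""
    nodes = list(adj)
    n = len(nodes)
    index = {u: i for i, u in enumerate(nodes)}
    A = [[0.0] * n for _ in range(n)]
    for u in nodes:
        for v in adj[u]:
            A[index[u]][index[v]] = 1.0
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    trace_sum = float(n)
    for k in range(1, max_terms + 1):
        term = [[sum(term[i][m] * A[m][j] for m in range(n)) / k
                 for j in range(n)] for i in range(n)]
        trace_sum += sum(term[i][i] for i in range(n))
        if sum(map(sum, term)) <= tol * trace_sum:
            break
    return math.log(trace_sum / n)


def remove_node(adj, node):
    """Copy of the graph without the given node and its incident edges."""
    return {u: {v for v in nbrs if v != node}
            for u, nbrs in adj.items() if u != node}


def top_affected_nodes(adj, top=3):
    """Nodes ranked by the drop in natural connectivity their removal causes.

    Requires a graph with at least two nodes.
    """
    base = natural_connectivity(adj)
    drops = [(u, base - natural_connectivity(remove_node(adj, u))) for u in adj]
    drops.sort(key=lambda item: item[1], reverse=True)
    return drops[:top]


star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
ranking = top_affected_nodes(star, top=3)
```

In the 5-node star, removing the hub leaves only isolated nodes, so the hub is ranked first by a wide margin; the degree, closeness and betweenness of the selected nodes could then be compared with the network averages.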

5. Conclusion. In this paper, a comparative study of robustness measures for CSNs is presented, covering six robustness measures and two kinds of CSNs, namely CSN-EG and CSN-GO. CSN-EG and CSN-GO are two kinds of networks obtained from the same KGML files, with and without the proteins and ramifications included, respectively. Because CSNs can repair themselves dynamically, an attack on reproducible proteins may not be fatal to a CSN. CSN-EG includes a large number of proteins, so natural connectivity is affected and loses its effectiveness on these networks. The study finds that $p_c^r$ and $\bar{\lambda}$ are strongly negatively correlated with the 5-year survival rate among the CSN-GOs. Since $p_c^r$ assumes that all nodes in the network are equally likely to be attacked, which does not match the real-world situation, we consider only $\bar{\lambda}$ to be potentially suitable for measuring the robustness of CSNs.
Furthermore, the top three affected nodes measured by natural connectivity are obtained, and their degree, closeness centrality and betweenness centrality are given. The results show that the nodes found according to natural connectivity are important; that is, natural connectivity captures the essentials of CSN-GO. In addition, natural connectivity describes the redundancy of routes between vertices, which is directly related to the information exchange or communication of a CSN. By analyzing CSNs with the methods of complex network theory, we conclude that natural connectivity is suitable for measuring the robustness of CSNs, and we believe that this information can be of considerable help to cancer treatment. Further investigation will be carried out in cooperation with biologists.