Background: In biological systems, diseases are caused by small perturbations in a complex network of interactions between proteins. A perturbation typically affects only a small number of proteins directly, and these in turn disturb a larger part of the network. To counteract this, the cell launches a stress response, resulting in a complex pattern of variations in the cell. Identifying the key players involved in either spreading the perturbation or responding to it can provide important insights.
Results: We develop an algorithm, EpiTracer, which identifies the key proteins, or epicenters, from which a large number of changes in the protein-protein interaction (PPI) network ripple out. We propose a new centrality measure, ripple centrality, which measures how effectively a change at a particular node can ripple across the network. It does so by identifying the highest-activity paths specific to the condition of interest, which are obtained by mapping gene expression profiles onto the PPI network. We demonstrate the algorithm using an overexpression study and a knockdown study. In the overexpression study, the overexpressed gene (PARK2) was highlighted as the most important epicenter specific to the perturbation. The other top-ranked epicenters were involved in either supporting the activity of PARK2 or counteracting it. In addition, five of the identified epicenters showed no significant differential expression, indicating that our method can recover information that simple differential expression analysis cannot. In the second dataset (SP1 knockdown), alternative regulators of SP1 targets were highlighted as epicenters, and the knocked-down gene (SP1) was itself picked up as an epicenter specific to the control condition. Sensitivity analysis showed that the set of genes identified as epicenters remains largely unaffected by small changes in the network.
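The idea of ranking nodes by how well a change can ripple outward along highly active paths can be illustrated with a small sketch. This is not the EpiTracer implementation, and the abstract does not give the exact formula for ripple centrality. The sketch assumes that an edge becomes cheaper when both endpoints are highly expressed, that the highest-activity paths are the least-cost paths, and that a node's score is the product of its closeness to the nodes it reaches and the fraction of the network it reaches. The function names, the edge-cost formula, and the toy data are all hypothetical.

```python
import heapq
from math import sqrt


def edge_costs(edges, expression):
    """Build a directed, weighted graph from interactions and expression.

    Assumption (not from the paper): cost = 1 / sqrt(expr_u * expr_v),
    so edges between highly expressed genes are cheap to traverse.
    Expression values must be strictly positive.
    """
    graph = {node: [] for node in expression}
    for u, v in edges:
        cost = 1.0 / sqrt(expression[u] * expression[v])
        graph[u].append((v, cost))
    return graph


def shortest_costs(graph, source):
    """Dijkstra's algorithm: least-cost distance to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, cost in graph[u]:
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def ripple_score(graph, source):
    """Hypothetical score: closeness to reached nodes times fraction reached."""
    dist = shortest_costs(graph, source)
    reached = [d for node, d in dist.items() if node != source]
    if not reached:
        return 0.0
    closeness = len(reached) / sum(reached)
    reach_fraction = len(reached) / (len(graph) - 1)
    return closeness * reach_fraction


def rank_epicenters(edges, expression):
    """Return nodes sorted from highest to lowest score."""
    graph = edge_costs(edges, expression)
    scores = {node: ripple_score(graph, node) for node in graph}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented for illustration only).
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
expression = {"A": 4.0, "B": 4.0, "C": 1.0, "D": 1.0}

graph = edge_costs(edges, expression)
ranking = rank_epicenters(edges, expression)
print(ranking)
```

In this toy network, node A reaches every other node along cheap edges and therefore ranks first, while node D has no outgoing edges and scores zero. A condition-specific comparison would repeat the ranking with the expression profile of each condition and contrast the results.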
Conclusions: We present an algorithm, EpiTracer, to find epicenters in condition-specific biological networks, given the PPI network and gene expression levels. EpiTracer includes programs that extract the immediate influence zone of each epicenter and provide a summary of dysregulated genes, facilitating quick biological analysis. We demonstrate its efficacy on two datasets with differing characteristics, highlighting its general applicability. We also show that EpiTracer is not sensitive to minor changes in the network. The source code for EpiTracer is available on GitHub ( https://github.com/narmada26/EpiTracer ).
Keywords: Condition-specific network; Influential nodes; Network mining; Perturbation analysis; Ripple centrality.