Improving network inference algorithms using resampling methods

BMC Bioinformatics. 2018 Oct 12;19(1):376. doi: 10.1186/s12859-018-2402-0.

Abstract

Background: Relatively small changes to gene expression data dramatically affect the co-expression networks inferred from those data, which in turn can significantly alter the subsequent biological interpretation. This error propagation is an underappreciated problem that, while hinted at in the literature, has not yet been thoroughly explored. Resampling methods (e.g., bootstrap aggregation, the random subspace method) are hypothesized to reduce variability in network inference by minimizing outlier effects and distilling persistent associations in the data. However, the efficacy of this approach rests on the assumption that results from statistical theory generalize to biological network inference.
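To make the idea of bootstrap aggregation for network inference concrete, the following is a minimal sketch and not the authors' implementation. It assumes a simple thresholded Pearson correlation as the base inference method; the function names, the `threshold` of 0.8, and the `min_freq` consensus cutoff are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def correlation_network(expr, threshold=0.8):
    """Base inference (assumed here): edge where |Pearson r| >= threshold.

    expr is a samples x genes array; returns a boolean genes x genes adjacency.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) >= threshold
    np.fill_diagonal(adj, False)  # ignore self-correlations
    return adj

def bagged_network(expr, n_boot=100, threshold=0.8, min_freq=0.5, seed=0):
    """Bootstrap aggregation: resample samples with replacement, infer a
    network on each resample, and keep edges seen in >= min_freq of them."""
    rng = np.random.default_rng(seed)
    n_samples, n_genes = expr.shape
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        idx = rng.integers(0, n_samples, size=n_samples)  # bootstrap draw
        counts += correlation_network(expr[idx], threshold)
    freq = counts / n_boot  # per-edge selection frequency in [0, 1]
    return freq, freq >= min_freq
```

Edges driven by a few outlying samples tend to appear in only a fraction of the resamples, so their selection frequency stays low, whereas persistent associations are recovered in most resamples.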

Results: We evaluated the effect of bootstrap aggregation on networks inferred with commonly applied network inference methods in terms of stability (resilience to perturbations in the underlying expression data), a metric for accuracy, and functional enrichment of edge interactions.
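The abstract does not define how stability is computed. As one hedged illustration of "resilience to perturbations," the sketch below scores a method by the mean Jaccard overlap between the edges inferred from the full data and the edges inferred after randomly dropping a fraction of samples. The Jaccard measure, the `drop_frac` perturbation, and all function names are assumptions made for illustration, not the paper's metric.

```python
import numpy as np

def edge_set(adj):
    """Undirected edges of a boolean adjacency matrix as a set of (i, j), i < j."""
    i, j = np.nonzero(np.triu(adj, k=1))
    return set(zip(i.tolist(), j.tolist()))

def jaccard(a, b):
    """Jaccard similarity of two edge sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def stability(expr, infer, n_trials=20, drop_frac=0.1, seed=0):
    """Mean edge overlap between the full-data network and networks inferred
    after removing a random drop_frac of samples (an assumed perturbation).

    infer is any callable mapping a samples x genes array to a boolean adjacency.
    """
    rng = np.random.default_rng(seed)
    n_samples = expr.shape[0]
    keep = min(n_samples, max(3, int(round(n_samples * (1 - drop_frac)))))
    reference = edge_set(infer(expr))
    scores = []
    for _ in range(n_trials):
        idx = rng.choice(n_samples, size=keep, replace=False)
        scores.append(jaccard(reference, edge_set(infer(expr[idx]))))
    return float(np.mean(scores))
```

A score near 1 means the inferred edges barely change when samples are removed; comparing the score of a base method with that of its bootstrap-aggregated counterpart is one way to check whether aggregation helps.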

Conclusion: Bootstrap aggregation improves stability and, depending on the size of the input dataset, yields a marginal improvement in accuracy, as assessed by each method's ability to link genes in the same functional pathway.

Keywords: Aggregation; Bootstrapping; Gene regulatory network inference; Random subspace method; Resampling.

MeSH terms

  • Algorithms
  • Gene Expression / genetics*
  • Gene Regulatory Networks / genetics*
  • Humans