Background: Databases of literature-curated protein-protein interactions (PPIs) are often used to interpret high-throughput interactome mapping studies and estimate error rates. These databases combine interactions across thousands of published studies and experimental techniques. Because the tendency for two proteins to interact depends on the local conditions, this heterogeneity of conditions means that only a subset of database PPIs are interacting during any given experiment. A typical use of these databases as gold standards in interactome mapping projects, however, assumes that PPIs included in the database are indeed interacting under the experimental conditions of the study.
Results: Using raw data from 20 co-fractionation experiments and six published interactomes, we demonstrate that this assumption is often false, with up to 55% of purported gold standard interactions showing no evidence of interaction, on average. We identify a subset of CORUM database complexes that do show consistent evidence of interaction in co-fractionation studies, and we use this subset as gold standards to dramatically improve interactome mapping as judged by the number of predicted interactions at a given error rate.
Conclusions: We recommend using this CORUM subset as the gold standard set in future co-fractionation studies. More generally, we recommend using the subset of literature-curated PPIs that are specific to the experimental context whenever possible.
Keywords: Interactome; Literature curated database; Protein-protein interaction; Proteomics.