Numerous studies have demonstrated that the propensity of a protein to form amyloids or amorphous aggregates is encoded by its amino acid sequence. This has led to the emergence of several computational programs that predict amyloidogenicity from amino acid sequences. However, a growing number of studies indicate that accurate prediction of protein aggregation can only be achieved by also accounting for the overall structural context of the protein and the likelihood of transition between the initial state and the aggregate. Here, we describe a computational pipeline called TAPASS, which was designed to take both factors into account. The pipeline assigns each residue of a protein to either a structured region or an intrinsically disordered region (IDR). For this purpose, TAPASS uses either a set of state-of-the-art programs that predict IDRs, transmembrane regions and structured domains, or the artificial intelligence program AlphaFold. In the next step, this assignment is cross-referenced with amyloidogenicity predictions. As a result, TAPASS allows the detection of Exposed Amyloidogenic Regions (EARs), which are located within IDRs and carry high amyloidogenic potential. TAPASS can substantially improve the prediction of amyloids and can be used in proteome-wide analyses to discover new amyloid-forming proteins. Its results, combined with clinical data, can be used to create individual risk profiles for different amyloidoses, opening up new opportunities for personalised medicine. The architecture of the pipeline is designed to make it easy to add new individual predictors as they become available. TAPASS can be used through its web interface (https://bioinfo.crbm.cnrs.fr/index.php?route=tools&tool=32).
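The crossing step described above can be illustrated with a minimal sketch. This is not the TAPASS implementation: the function name `find_exposed_regions`, the 0-based inclusive region coordinates, and the rule that a region must lie entirely within disordered residues are all assumptions made for illustration, and the input data are a toy example rather than real predictor output.

```python
from typing import List, Tuple

# Hypothetical representation: a region is a 0-based, inclusive (start, end) pair.
Region = Tuple[int, int]


def find_exposed_regions(
    disordered: List[bool], amyloid_regions: List[Region]
) -> List[Region]:
    """Return the amyloidogenic regions that lie entirely within disordered residues.

    `disordered[i]` is True when residue i was assigned to an IDR.
    The "entirely within" criterion is an assumption of this sketch.
    """
    exposed = []
    for start, end in amyloid_regions:
        if start < 0 or end >= len(disordered) or start > end:
            raise ValueError(f"region ({start}, {end}) is outside the sequence")
        # Keep the region only if every residue it covers is disordered.
        if all(disordered[start:end + 1]):
            exposed.append((start, end))
    return exposed


# Toy 25-residue protein: residues 0-9 and 20-24 disordered, 10-19 structured.
disordered = [True] * 10 + [False] * 10 + [True] * 5
# Toy amyloidogenic regions, as if reported by a sequence-based predictor.
amyloid_regions = [(2, 7), (8, 12), (21, 24)]

ears = find_exposed_regions(disordered, amyloid_regions)
print(ears)  # [(2, 7), (21, 24)]
```

In this toy case the region (8, 12) is discarded because it extends into the structured segment, while the two regions fully contained in disordered segments are retained.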
Keywords: Aggregation; AlphaFold; Amyloid; Bioinformatics; Intrinsically Disordered Regions; Proteome-wide analysis.
Copyright © 2022 Elsevier Inc. All rights reserved.