DeepLigand: accurate prediction of MHC class I ligands using peptide embedding

Haoyang Zeng; David K Gifford

doi:10.1093/bioinformatics/btz330

Bioinformatics. 2019 Jul 15;35(14):i278-i283. doi: 10.1093/bioinformatics/btz330.

Authors

Haoyang Zeng ¹ ², David K Gifford ¹ ²

Affiliations

¹ Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.
² Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.

Abstract

Motivation: The computational modeling of peptide display by class I major histocompatibility complexes (MHCs) is essential for the design of peptide-based therapeutics. Existing computational methods for peptide display focus on modeling peptide-MHC binding affinity. However, such models cannot characterize the sequence features of the other cellular processes in the peptide display pathway that determine MHC ligand selection.

Results: We introduce a semi-supervised model, DeepLigand, that outperforms state-of-the-art models in MHC class I ligand prediction. DeepLigand combines a peptide language model and peptide binding affinity prediction to score MHC class I peptide presentation. The peptide language model characterizes sequence features that correspond to secondary factors in MHC ligand selection other than binding affinity. The peptide embedding is learned by pre-training on natural ligands, and can discriminate between ligands and non-ligands in the absence of binding affinity prediction. Although conventional affinity-based models fail to classify peptides with moderate affinities, DeepLigand discriminates ligands from non-ligands with consistently high accuracy.
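The idea of scoring presentation from both a peptide embedding and a predicted binding affinity can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the amino-acid composition vector stands in for a learned language-model embedding, the affinity value is a placeholder for the output of an affinity predictor, and the logistic combination, the weights and the function names (toy_embedding, presentation_score) are hypothetical rather than taken from the DeepLigand code.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_embedding(peptide: str) -> np.ndarray:
    """Hypothetical stand-in for a learned peptide embedding.

    Returns the amino-acid composition of the peptide (20 fractions).
    A real language model would produce a learned, context-aware vector.
    """
    counts = np.array([peptide.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / max(len(peptide), 1)


def presentation_score(embedding, affinity, w_emb, w_aff, bias):
    """Logistic score over the concatenated [embedding, affinity] features.

    The logistic combination is an assumption for this sketch, not the
    published model.
    """
    features = np.concatenate([embedding, [affinity]])
    weights = np.concatenate([w_emb, [w_aff]])
    logit = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))


# Illustrative, untrained weights; in practice these would be fit to
# labeled ligand / non-ligand data.
rng = np.random.default_rng(0)
w_emb = rng.normal(size=len(AMINO_ACIDS))

# The affinity value 0.8 is a made-up placeholder on a normalized scale.
score = presentation_score(
    toy_embedding("SIINFEKL"), affinity=0.8, w_emb=w_emb, w_aff=2.0, bias=-1.0
)
```

Because the embedding contributes its own term to the score, two peptides with the same moderate affinity can still receive different scores. That is the kind of separation the abstract attributes to the language-model component.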

Availability and implementation: We make DeepLigand available at https://github.com/gifford-lab/DeepLigand.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Histocompatibility Antigens Class I
Ligands
Peptides / analysis*
Protein Binding
Software

Substances

Histocompatibility Antigens Class I
Ligands
Peptides

Grants and funding

R01 CA218094/CA/NCI NIH HHS/United States