TM-search: An Efficient and Effective Tool for Protein Structure Database Search

J Chem Inf Model. 2024 Feb 12;64(3):1043-1049. doi: 10.1021/acs.jcim.3c01455. Epub 2024 Jan 25.

Abstract

The rapidly growing size of the Protein Data Bank challenges biologists to develop more scalable protein structure alignment tools for fast structure database search. Although many protein structure search algorithms and programs have been designed and implemented for this purpose, most require a large amount of computational time. We propose a novel protein structure search approach, TM-search, which is based on the pairwise structure alignment program TM-align and a new iterative clustering algorithm. Benchmark tests demonstrate that TM-search is 27 times faster than a full database search with TM-align while still identifying ∼90% of all high TM-score hits, which is 2-10 times more than other existing programs such as Foldseek, Dali, and PSI-BLAST.
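
The abstract does not spell out the iterative clustering algorithm, so the following is only a minimal sketch of the general idea behind cluster-based search, not the TM-search implementation. It assumes the database has already been grouped into clusters, each with one representative structure, and that the query is aligned to cluster members only when the representative scores above a cutoff. The function name `two_stage_search`, the cutoffs `rep_cutoff` and `hit_cutoff`, and the toy data are all hypothetical; in practice `score` would wrap a pairwise aligner such as TM-align.

```python
from typing import Callable, Dict, List, Tuple


def two_stage_search(
    query: str,
    clusters: Dict[str, List[str]],
    score: Callable[[str, str], float],
    rep_cutoff: float,
    hit_cutoff: float,
) -> Tuple[List[Tuple[str, float]], int]:
    """Search a pre-clustered database (illustrative assumption, not TM-search itself).

    clusters maps each representative to its other cluster members.
    Returns the hits sorted by descending score and the number of
    pairwise alignments that were actually computed.
    """
    hits: List[Tuple[str, float]] = []
    n_alignments = 0
    for representative, members in clusters.items():
        # Stage 1: align the query to the cluster representative only.
        rep_score = score(query, representative)
        n_alignments += 1
        if rep_score < rep_cutoff:
            continue  # skip the whole cluster without aligning its members
        if rep_score >= hit_cutoff:
            hits.append((representative, rep_score))
        # Stage 2: align the query to every member of a promising cluster.
        for member in members:
            member_score = score(query, member)
            n_alignments += 1
            if member_score >= hit_cutoff:
                hits.append((member, member_score))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits, n_alignments


# Hypothetical toy database: 2 clusters, 7 structures in total.
TOY_CLUSTERS = {"repA": ["a1", "a2"], "repB": ["b1", "b2", "b3"]}
# Hypothetical precomputed scores standing in for real TM-align runs.
TOY_SCORES = {"repA": 0.8, "a1": 0.9, "a2": 0.4,
              "repB": 0.2, "b1": 0.1, "b2": 0.2, "b3": 0.1}


def toy_score(query: str, target: str) -> float:
    """Look up a made-up score; a real version would call an aligner."""
    return TOY_SCORES[target]


if __name__ == "__main__":
    hits, n_alignments = two_stage_search(
        "query", TOY_CLUSTERS, toy_score, rep_cutoff=0.3, hit_cutoff=0.5
    )
    print(hits)          # hits found in the promising cluster
    print(n_alignments)  # alignments computed, versus 7 for a full scan
```

In this toy case only 4 of the 7 possible alignments are computed, because the second cluster is skipped after its representative scores below `rep_cutoff`. The same shortcut is what can cost recall: a true hit inside a skipped cluster would be missed, which is consistent with the abstract reporting that most, rather than all, high TM-score hits are recovered.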

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Databases, Protein
  • Proteins* / chemistry
  • Sequence Alignment
  • Software

Substances

  • Proteins