Small-molecule 3D structure prediction using open crystallography data

J Chem Inf Model. 2013 Dec 23;53(12):3127-30. doi: 10.1021/ci4005282. Epub 2013 Dec 10.

Abstract

Predicting the 3D structures of small molecules is a common problem in chemoinformatics. Even the best methods are inaccurate for complex molecules, and there is a large gap in accuracy between proprietary and free algorithms. Previous work presented COSMOS, a novel data-driven algorithm that draws on known structures from the Cambridge Structural Database and demonstrated performance competitive with proprietary algorithms. However, dependence on the Cambridge Structural Database prevented its widespread use. Here, we present an updated version of the COSMOS structure predictor, complete with a free structure library derived from open data sources. We demonstrate that COSMOS performs better than other freely available methods, with mean RMSDs of 1.16 Å and 1.68 Å for organic and metal-organic structures, respectively, and a mean prediction time of 60 ms per molecule. These RMSDs are 17% and 20% lower, respectively, than those of the free predictor provided by Open Babel, and COSMOS is 10 times faster. The ChemDB Web portal provides a COSMOS prediction Web server, as well as downloadable copies of the COSMOS executable and library of molecular substructures.
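For readers unfamiliar with the accuracy metric quoted in the abstract, the following is a minimal sketch of how a root-mean-square deviation (RMSD) between a predicted and a reference structure might be computed, and how a relative reduction such as "17%" is derived from two RMSD values. It is not the COSMOS implementation. The function names `rmsd` and `relative_reduction` are hypothetical, the coordinates are made up for illustration, and the sketch assumes the two structures list the same atoms in the same order and have already been superimposed; the abstract does not state the alignment protocol used.

```python
import math


def rmsd(predicted, reference):
    """RMSD in the input units (e.g. Å) between two coordinate lists.

    Assumption: both lists hold (x, y, z) tuples for the same atoms in the
    same order, and the structures are already superimposed.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        # Squared Euclidean distance between matched atoms.
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(predicted))


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Illustrative, made-up coordinates (not data from the paper).
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(f"RMSD: {rmsd(predicted, reference):.3f}")
print(f"Reduction: {relative_reduction(2.0, 1.5):.0%}")
```

In practice, structures are usually superimposed before the deviation is measured, so a real evaluation would add an alignment step ahead of `rmsd`.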

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Crystallography, X-Ray
  • Databases, Chemical
  • Databases, Factual
  • Heterocyclic Compounds / chemistry*
  • Molecular Conformation
  • Organometallic Compounds / chemistry*
  • Small Molecule Libraries / chemistry*
  • Software*

Substances

  • Heterocyclic Compounds
  • Organometallic Compounds
  • Small Molecule Libraries