FSees: Customized Enumeration of Chemical Subspaces with Limited Main Memory Consumption

J Chem Inf Model. 2016 Sep 26;56(9):1641-53. doi: 10.1021/acs.jcim.6b00117. Epub 2016 Sep 12.

Abstract

In the search for new marketable drugs, new ideas are required constantly. Particularly with regard to challenging targets and previously patented chemical space, designing novel molecules is crucial. This demands efficient and innovative computational tools to generate libraries of promising molecules. Here we present an efficient method to generate such libraries by systematically enumerating all molecules in a specific chemical space. This space is defined by a fragment space and a set of user-defined physicochemical properties (e.g., molecular weight, tPSA, number of H-bond donors and acceptors, or predicted logP). In order to enumerate a very large number of molecules, our algorithm uses file-based data structures instead of memory-based ones, thus overcoming the limitations of computer main memory. The resulting chemical library can be used as a starting point for computational lead-finding technologies, like similarity searching, pharmacophore mapping, docking, or virtual screening. We applied the algorithm in different scenarios, thus creating numerous target-specific libraries. Furthermore, we generated a fragment space from all approved drugs in DrugBank and enumerated it with lead-like constraints, thus generating 0.5 billion molecules in the molecular weight range 250-350.
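The idea of enumerating under property constraints while keeping the working set on disk can be illustrated with a minimal sketch. It is not the FSees implementation. It assumes a toy model in which a "molecule" is just a multiset of fragments and its molecular weight is the sum of the fragment weights. The fragment names, the weights, and the function `enumerate_to_file` are hypothetical, and the connection rules of a real fragment space are ignored. Each generation of partial combinations is streamed to a temporary file rather than held in a list, and a combination is dropped as soon as it exceeds the upper weight bound.

```python
import os
import tempfile

# Hypothetical toy fragments: name -> molecular weight (made-up values).
FRAGMENTS = {"A": 100.0, "B": 120.0, "C": 150.0}


def enumerate_to_file(fragments, mw_min, mw_max, out_path):
    """Write all fragment multisets with summed weight in [mw_min, mw_max].

    Assumption: weight is additive and positive, so any combination above
    mw_max can be pruned without losing valid results. Partial combinations
    are kept in per-level files, so memory use does not grow with the
    number of enumerated combinations.
    """
    names = sorted(fragments)
    workdir = tempfile.mkdtemp()
    frontier = os.path.join(workdir, "level_0.txt")

    # Level 0: every single fragment that does not already exceed mw_max.
    with open(frontier, "w") as fh:
        for i, name in enumerate(names):
            if fragments[name] <= mw_max:
                fh.write(f"{i}\t{fragments[name]}\t{name}\n")

    count = 0
    level = 0
    with open(out_path, "w") as out:
        while os.path.getsize(frontier) > 0:
            level += 1
            next_frontier = os.path.join(workdir, f"level_{level}.txt")
            with open(frontier) as cur, open(next_frontier, "w") as nxt:
                for line in cur:
                    last_idx, weight, combo = line.rstrip("\n").split("\t")
                    last_idx, weight = int(last_idx), float(weight)
                    # Emit combinations that satisfy the weight window.
                    if weight >= mw_min:
                        out.write(f"{combo}\t{weight}\n")
                        count += 1
                    # Extend only with fragments at or after the last index,
                    # so each multiset is generated exactly once.
                    for j in range(last_idx, len(names)):
                        new_weight = weight + fragments[names[j]]
                        if new_weight <= mw_max:
                            nxt.write(f"{j}\t{new_weight}\t{combo}-{names[j]}\n")
            os.remove(frontier)
            frontier = next_frontier

    os.remove(frontier)
    os.rmdir(workdir)
    return count
```

Calling `enumerate_to_file(FRAGMENTS, 250, 350, "library.txt")` writes one tab-separated line per accepted combination and returns how many were written. A real enumerator would also have to respect the fragment space's connection rules, detect duplicate molecules, and evaluate non-additive properties such as tPSA or predicted logP, none of which this sketch attempts.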

MeSH terms

  • Algorithms
  • Drug Discovery / methods*
  • Hydrogen Bonding
  • Informatics / methods*
  • Small Molecule Libraries / chemistry*
  • Small Molecule Libraries / pharmacology

Substances

  • Small Molecule Libraries