Ligand-based virtual screening (LBVS) approaches can be broadly divided into those relying on chemical similarity searches and those employing Quantitative Structure-Activity Relationship (QSAR) models. We compared the predictive power of these approaches using datasets of compounds tested against several G-Protein Coupled Receptors (GPCRs). The k-Nearest Neighbors (kNN) QSAR models were built independently for the known ligands of each GPCR target, with a fraction of the tested ligands for each target set aside as a validation set. The accuracies of the QSAR models in making active/inactive calls for compounds in both the training and validation sets were compared to those achieved by the Prediction of Activity Spectra for Substances (PASS) and the Similarity Ensemble Approach (SEA) tools, both of which are available online. Models developed with the kNN QSAR method showed the highest predictive power for almost all tested GPCR datasets. The PASS software, which incorporates multiple end-point-specific QSAR models, demonstrated moderate predictive power, while SEA, a chemical similarity-based approach, had the lowest predictive power. Our studies suggest that when a sufficient amount of data is available to develop and rigorously validate QSAR models, such models should be preferred over chemical similarity-based approaches as the virtual screening tool in ligand-based computational drug discovery.
Keywords: GPCRs; Model validation; PASS; QSAR modeling; SEA.
© 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.