Drug-induced liver injury (DILI) is one of the leading causes of the termination of drug development programs. Consequently, identifying the risk of DILI in humans for drug candidates during the early stages of development would greatly reduce the drug attrition rate in the pharmaceutical industry, but it would require the implementation of new research and development strategies. In this regard, several in silico models have been proposed as an alternative means of prioritizing drug candidates. Because the accuracy and utility of a predictive model rest largely on annotating a drug's potential to cause DILI in a reliable and consistent way, Food and Drug Administration-approved drug labeling was used as the basis for annotation. Of the 387 drugs annotated, 197 were used to develop a quantitative structure-activity relationship (QSAR) model, and the model was subsequently challenged with the remaining drugs, which served as an external validation set, yielding an overall prediction accuracy of 68.9%. The performance of the model was further assessed using 2 additional independent validation sets; together, the 3 validation data sets contained 483 unique drugs. We observed that the QSAR model's performance varied for drugs with different therapeutic uses; however, it achieved a better estimated accuracy (73.6%) as well as negative predictive value (77.0%) when focusing only on those therapeutic categories with high prediction confidence. Thus, the model's applicability domain was defined. Taken collectively, the developed QSAR model has potential utility for prioritizing compounds by their risk of DILI in humans, particularly for high-confidence therapeutic subgroups such as analgesics, antibacterial agents, and antihistamines.
Keywords: drug label; drug-induced liver injury; external validation; predictive model; quantitative structure-activity relationship; therapeutic categories.
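
The two performance measures quoted in the abstract, overall prediction accuracy and negative predictive value (NPV), can be illustrated with a minimal sketch. It assumes binary labels (1 = DILI positive, 0 = DILI negative) and uses the standard definitions: accuracy is the fraction of correct predictions, and NPV is the fraction of predicted negatives that are truly negative. The names `ConfusionCounts`, `count_outcomes`, `accuracy`, and `negative_predictive_value` are hypothetical, and the labels below are toy values chosen for illustration, not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of the four possible outcomes of a binary prediction."""
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    tn: int  # predicted negative, truly negative
    fn: int  # predicted negative, truly positive


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally outcomes for labels coded as 1 (positive) and 0 (negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all predictions that are correct."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return (c.tp + c.tn) / total


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Fraction of predicted negatives that are truly negative."""
    predicted_negative = c.tn + c.fn
    if predicted_negative == 0:
        raise ValueError("no negative predictions to evaluate")
    return c.tn / predicted_negative


# Toy labels for illustration only (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 0, 1, 0, 0, 1, 0, 1]
toy_counts = count_outcomes(toy_true, toy_pred)
print(f"accuracy = {accuracy(toy_counts):.3f}")
print(f"NPV      = {negative_predictive_value(toy_counts):.3f}")
```

A high NPV is the relevant measure when a model is used to prioritize candidates, because it indicates how far a "negative" prediction can be trusted.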