Drug recalls remain frequent despite the tremendous time and effort that pharmaceutical organizations and regulatory bodies such as the Food and Drug Administration (FDA) invest in ensuring that patients receive safe, effective, high-quality medicines. To best protect patients from poor-quality drugs, it is imperative to better understand the risk factors underlying formulation-based drug recalls. Greater knowledge of these factors can also help inform the future design and development of drugs. In this study, we used a text mining technique implemented in Python to parse the data and examine drug recalls with respect to administration route, dosage form, release mechanism, market type, pharmacologic class, and excipients. Observational analysis of the recalls revealed both high- and low-risk factors for formulation-based recalls. Higher risk, that is, an increased probability of a formulation-based recall, was associated with factors such as an extended release mechanism, capsule dosage form, oral route of administration, and a larger number of excipients, whereas lower risk of formulation-based recalls was associated with other factors, including the new drug application market type, immediate release mechanism, and solution dosage form. In addition, the factors did not act independently: we observed interactions among variables. For example, the release mechanism modified the effects of market type, administration route, and dosage form. This study will help inform the future design of quality drug products by pharmaceutical organizations and assist risk-based oversight by regulatory organizations, such as the FDA, to ensure patient safety.
Keywords: drug recalls; excipient; release mechanism; text mining.