This study investigates regression models for predicting drug solubility in polymers and API-polymer interactions in complex datasets. Four models are evaluated: Gaussian Process Regression (GPR), Support Vector Regression (SVR), Bayesian Ridge Regression (BRR), and Kernel Ridge Regression (KRR). Outliers were detected during preprocessing with the Z-score approach, which improved the accuracy and dependability of the analysis, and the Fireworks Algorithm (FWA) was used for hyper-parameter tuning. The GPR model performed best, achieving the lowest MSE and MAE for both drug solubility and gamma predictions, with R² scores of 0.9980 and 0.9950 on the training and test data, respectively. These results show that GPR generates reliable and precise predictions, making it a strong method for intricate regression tasks in pharmaceutical and other scientific fields. FWA, as the optimization method, also shows potential for improving the models' predictive ability by effectively exploring and exploiting the search space. Overall, the results emphasize the importance of choosing suitable regression models and optimization techniques to obtain dependable predictive analytics.
Keywords: Drug design; Drug solubility; Fireworks algorithm; Gaussian process regression; Hyper-parameter optimization.
© 2024. The Author(s).
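
As an illustration of two steps named in the abstract, the sketch below shows one way Z-score outlier detection and the reported error metrics (MSE, MAE, R²) could be computed. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `zscore_filter` and `regression_metrics`, the cutoff `threshold=3.0`, and the use of the population standard deviation are all hypothetical choices, since the abstract does not specify them.

```python
import statistics


def zscore_filter(values, threshold=3.0):
    """Split values into (kept, outliers) by absolute Z-score.

    Assumption: the threshold of 3.0 and the population standard
    deviation are illustrative; the abstract does not state either.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values identical: nothing can be flagged as an outlier.
        return list(values), []
    kept, outliers = [], []
    for v in values:
        z = (v - mean) / std
        if abs(z) > threshold:
            outliers.append(v)
        else:
            kept.append(v)
    return kept, outliers


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R² for paired true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = statistics.fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "mse": ss_res / n,
        "mae": sum(abs(e) for e in errors) / n,
        # R² is undefined when the targets have zero variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }
```

In a workflow like the one described, the filter would be applied before model fitting, and the metrics would be computed separately on the training and test splits for each of the four models.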