Objectives: This study introduces Smart Imitator (SI), a 2-phase reinforcement learning (RL) solution that enhances personalized treatment policies in healthcare by addressing the challenges posed by imperfect clinician data and complex environments.
Materials and methods: Smart Imitator's first phase uses adversarial cooperative imitation learning with a novel sample selection schema to categorize clinician policies from optimal to nonoptimal. The second phase creates a parameterized reward function to guide the learning of superior treatment policies through RL. Smart Imitator's effectiveness was validated on 2 datasets: a sepsis dataset with 19 711 patient trajectories and a diabetes dataset with 7234 trajectories.
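The 2-phase idea can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the selection schema or the form of the reward, so the outcome-based tiering rule, the linear weighted reward, and every name below (Trajectory, outcome_score, tier_trajectories, parameterized_reward) are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: NOT the Smart Imitator implementation.
# The tiering rule, the reward form, and all names are assumptions.
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Trajectory:
    patient_id: str
    outcome_score: float  # hypothetical: higher means a better clinical outcome


def tier_trajectories(trajs: List[Trajectory], n_tiers: int = 3) -> Dict[str, int]:
    """Phase 1 (assumed): rank trajectories by outcome and split them into
    tiers, where tier 0 is closest to optimal and the last tier is nonoptimal."""
    ranked = sorted(trajs, key=lambda t: t.outcome_score, reverse=True)
    tiers: Dict[str, int] = {}
    for rank, traj in enumerate(ranked):
        # Integer arithmetic spreads the ranks evenly over the tiers.
        tiers[traj.patient_id] = min(rank * n_tiers // len(ranked), n_tiers - 1)
    return tiers


def parameterized_reward(disc_scores: Sequence[float], weights: Sequence[float]) -> float:
    """Phase 2 (assumed): reward for one state-action pair as a weighted sum of
    per-tier discriminator scores; the weights are the tunable parameters."""
    if len(disc_scores) != len(weights):
        raise ValueError("need exactly one weight per tier score")
    return sum(w * d for w, d in zip(weights, disc_scores))


if __name__ == "__main__":
    demo = [Trajectory(f"p{i}", score) for i, score in enumerate([0.9, 0.2, 0.7, 0.4, 0.1, 0.8])]
    print(tier_trajectories(demo))
    # Reward resemblance to the top tier, penalize resemblance to the bottom tier.
    print(parameterized_reward([0.9, 0.5, 0.1], [1.0, 0.0, -1.0]))
```

In this sketch, a positive weight on the top tier and a negative weight on the bottom tier pushes an RL agent toward actions resembling the better-outcome clinician behavior; in practice such weights would be learned rather than fixed by hand.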
Results: Extensive quantitative and qualitative experiments showed that SI significantly outperformed state-of-the-art baselines on both datasets. For sepsis, SI reduced estimated mortality rates by 19.6% compared to the best baseline. For diabetes, SI reduced HbA1c-High rates by 12.2%. The learned policies aligned closely with successful clinical decisions and deviated strategically when necessary. These deviations were consistent with recent clinical findings, suggesting that they may lead to improved outcomes.
Discussion: Smart Imitator advances RL applications by addressing challenges such as imperfect data and environmental complexities, demonstrating effectiveness within the tested conditions of sepsis and diabetes. Further validation across diverse conditions and exploration of additional RL algorithms are needed to enhance precision and generalizability.
Conclusion: This study shows the potential of learning from clinician behaviors to advance personalized healthcare and improve treatment outcomes. Its methodology offers a robust approach for developing adaptive, personalized strategies in various complex and uncertain environments.
Keywords: adversarial imitation learning; clinical decision-making; health care AI; imitation learning (IL); reinforcement learning (RL).
© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association.