Analysis of Autonomous Penetration Testing Through Reinforcement Learning and Recommender Systems

Ariadna Claudia Moreno; Aldo Hernandez-Suarez; Gabriel Sanchez-Perez; Linda Karina Toscano-Medina; Hector Perez-Meana; Jose Portillo-Portillo; Jesus Olivares-Mercado; Luis Javier García Villalba

doi:10.3390/s25010211

Sensors (Basel). 2025 Jan 2;25(1):211. doi: 10.3390/s25010211.

Authors

Ariadna Claudia Moreno¹, Aldo Hernandez-Suarez¹, Gabriel Sanchez-Perez¹, Linda Karina Toscano-Medina¹, Hector Perez-Meana¹, Jose Portillo-Portillo¹, Jesus Olivares-Mercado¹, Luis Javier García Villalba²

Affiliations

¹ Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico.
² Group of Analysis, Security and Systems (GASS), Department of Software Engineering and Artificial Intelligence (DISIA), Faculty of Computer Science and Engineering, Office 431, Universidad Complutense de Madrid (UCM), Calle Profesor José García Santesmases, 9, Ciudad Universitaria, 28040 Madrid, Spain.

Abstract

Penetration testing (pentesting) is a crucial cybersecurity practice for identifying vulnerabilities in Information Technology (IT) systems: real malicious offensive behavior is simulated to uncover potential weaknesses and strengthen preventive controls. Because pentesting is complex, time-constrained, and demands specialized expertise, analysts commonly rely on analysis and exploitation tools. Although useful, these tools often introduce uncertainty into the findings, resulting in high false-positive rates. To make these tests more effective, Machine Learning (ML) has been integrated into them, showing significant potential for identifying anomalies across various security areas through detailed detection of underlying malicious patterns. However, pentesting environments are unpredictable and intricate, and analysts must invest extensive effort to understand, explore, and exploit them. To address these challenges, this study proposes a recommendation system based on a context-rich, vocabulary-aware transformer that processes questions about the target environment and returns responses drawn from the necessary pentest batteries, as evaluated by a Reinforcement Learning (RL) estimator. The RL component assesses optimal attack strategies based on previously learned data and dynamically explores additional attack vectors. The system achieved an F1 score and an Exact Match rate above 97.0%, demonstrating its accuracy and effectiveness in selecting relevant pentesting strategies.
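For readers unfamiliar with the two reported metrics, the following is a minimal Python sketch of how Exact Match and a token-level F1 score are commonly computed for question-answering outputs. It assumes a SQuAD-style normalization (lowercasing, stripping punctuation and English articles); the abstract does not state which variant the authors used, and the function names `normalize`, `exact_match`, and `token_f1` are illustrative, not taken from the paper.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; the paper may normalize differently.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

With made-up strings, `exact_match("The Nmap scan.", "nmap scan")` counts as a match after normalization, while `token_f1("run nmap scan", "nmap scan")` is penalized for the extra token. Corpus-level scores are typically the mean of these per-example values.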

Keywords: penetration testing; recommender systems; reinforcement learning.

Grants and funding

This research received no external funding.