Conducting penetration testing (pentesting) in cybersecurity is a crucial turning point for identifying vulnerabilities within the framework of Information Technology (IT), where real malicious offensive behavior is simulated to identify potential weaknesses and strengthen preventive controls. Given the complexity of the tests, time constraints, and the specialized level of expertise required for pentesting, analysis and exploitation tools are commonly used. Although useful, these tools often introduce uncertainty in findings, resulting in high rates of false positives. To enhance the effectiveness of these tests, Machine Learning (ML) has been integrated, showing significant potential for identifying anomalies across various security areas through detailed detection of underlying malicious patterns. However, pentesting environments are unpredictable and intricate, requiring analysts to make extensive efforts to understand, explore, and exploit them. This study considers these challenges, proposing a recommendation system based on a context-rich, vocabulary-aware transformer capable of processing questions related to the target environment and offering responses based on necessary pentest batteries evaluated by a Reinforcement Learning (RL) estimator. This RL component assesses optimal attack strategies based on previously learned data and dynamically explores additional attack vectors. The system achieved an F1 score and an Exact Match rate over 97.0%, demonstrating its accuracy and effectiveness in selecting relevant pentesting strategies.
Keywords: penetration testing; recommender systems; reinforcement learning.