Because of hallucinations in the underlying large language models (LLMs) or an unclear description of a task's ultimate goal, agents can become confused: they keep working after the task is already complete, which wastes resources. We propose a similarity-based timely task termination method for image-based intelligent agents. The method records the scenario state after each sub-task is completed and compares it with the scenario state of the fully completed task using a structural similarity method. The result is quantified and standardized into a structural similarity index, which is used to judge whether the task has been completed. In addition, we categorize agent types by their underlying model and create an image-based agent task dataset. In experiments on 20 task tests, image-based agents using this method showed an average reduction of 1.94 steps in the number of steps needed to complete a task, a [Formula: see text] reduction in time costs, and a [Formula: see text] reduction in token costs. The method effectively reduces the unnecessary actions that image-based agents take when they hallucinate while still ensuring that their tasks are completed, and it reduces the waste of resources such as time, tokens, and hardware. Our project is available on GitHub.
Keywords: Application; Intelligent agent; Large language model; Task termination; Timely.
© 2024. The Author(s).
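The termination check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it uses a simplified single-window (global) form of the structural similarity index rather than the usual sliding-window variant, it assumes grayscale images given as nested lists of pixel values in the range 0 to 255, and the names `global_ssim` and `should_terminate` as well as the 0.95 threshold are hypothetical choices made only for illustration.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]  # grayscale image as rows of pixel values


def _flatten(img: Image) -> List[float]:
    """Flatten a 2-D image into a single list of float pixel values."""
    return [float(p) for row in img for p in row]


def global_ssim(a: Image, b: Image, data_range: float = 255.0) -> float:
    """Simplified SSIM computed over the whole image as one window.

    Assumption: this global form stands in for the windowed SSIM that
    image libraries normally compute; it is only meant as a sketch.
    """
    x, y = _flatten(a), _flatten(b)
    if not x or len(x) != len(y):
        raise ValueError("images must be non-empty and the same size")

    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((p - mu_x) ** 2 for p in x) / n
    var_y = sum((q - mu_y) ** 2 for q in y) / n
    cov = sum((p - mu_x) * (q - mu_y) for p, q in zip(x, y)) / n

    # Standard SSIM stabilising constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def should_terminate(current: Image, target: Image,
                     threshold: float = 0.95) -> bool:
    """Return True when the current state is similar enough to the target.

    The threshold value is a hypothetical example, not taken from the paper.
    """
    return global_ssim(current, target) >= threshold


# Toy scenario states: a gradient "completed" screen and a blank screen.
target_state = [[(r * 8 + c) * 4 % 256 for c in range(8)] for r in range(8)]
blank_state = [[0 for _ in range(8)] for _ in range(8)]

done_after_step = should_terminate(target_state, target_state)
done_on_blank = should_terminate(blank_state, target_state)
```

In an agent loop, a check of this kind would run after each sub-task on a capture of the current scenario state, and the agent would stop issuing further actions once it returns true.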