A pipeline for estimating human attention toward objects with on-board cameras on the iCub humanoid robot

Front Robot AI. 2024 Oct 17;11:1346714. doi: 10.3389/frobt.2024.1346714. eCollection 2024.

Abstract

This research report introduces a learning system designed to detect the object that a human is gazing at, using solely visual feedback. By combining face detection, human attention prediction, and online object detection, the system enables the robot to perceive and interpret human gaze accurately, thereby facilitating the establishment of joint attention with human partners. Additionally, a novel dataset collected with the humanoid robot iCub is introduced, comprising more than 22,000 images from ten participants gazing at different annotated objects. This dataset serves as a benchmark for human gaze estimation in table-top human-robot interaction (HRI) contexts. In this work, we use it to assess the proposed pipeline's performance and examine each component's effectiveness. Furthermore, the developed system is deployed on the iCub to showcase its functionality. The results demonstrate the potential of the proposed approach as a first step toward enhancing social awareness and responsiveness in social robotics. This advancement can improve assistance and support in collaborative scenarios, promoting more efficient human-robot collaboration.
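To illustrate how the three components described in the abstract might be combined, the sketch below shows one plausible final association step: given a detected face position, a predicted gaze direction, and the bounding-box centers returned by an object detector, pick the object whose center best aligns with the gaze ray. This is a minimal geometric sketch, not the paper's learned pipeline; the function name `attended_object` and the 2D image-plane formulation are assumptions for illustration.

```python
import math

def attended_object(face_center, gaze_dir, objects):
    """Return the label of the object whose center best aligns with the gaze ray.

    face_center: (x, y) pixel position of the detected face (hypothetical input)
    gaze_dir:    (dx, dy) unit vector of the predicted gaze direction
    objects:     dict mapping label -> (x, y) bounding-box center from a detector
    """
    best_label, best_angle = None, math.inf
    for label, (ox, oy) in objects.items():
        # vector from the face to the candidate object
        vx, vy = ox - face_center[0], oy - face_center[1]
        norm = math.hypot(vx, vy)
        if norm == 0:
            continue
        # angular deviation between the gaze direction and the face->object vector
        cos_a = max(-1.0, min(1.0, (vx * gaze_dir[0] + vy * gaze_dir[1]) / norm))
        angle = math.acos(cos_a)
        if angle < best_angle:
            best_label, best_angle = label, angle
    return best_label

# Toy example: face at the image center, gaze pointing down-right
print(attended_object((320, 240), (0.707, 0.707),
                      {"mug": (500, 420), "ball": (100, 400)}))  # → mug
```

In a full system the three inputs would come from the learned face-detection, attention-prediction, and object-detection modules; the point here is only the association rule that ties their outputs together.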

Keywords: attention; computer vision; gaze estimation; humanoid robot; human–robot scenario; learning architecture.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This work received funding from the Italian National Institute for Insurance against Accidents at Work (INAIL) ergoCub Project and the project Fit for Medical Robotics (Fit4MedRob)–PNRR MUR Cod. PNC0000007–CUP: B53C22006960001.