Work in Progress: Enhancing Human-Robot Interaction Through a Speech and Command Recognition System for a Service Robot Using ROS Melodic

Academic Article in Scopus

abstract

  • This paper presents the development and evaluation of a Speech and Command Recognition system integrated into PiBot, an autonomous service robot developed at Tecnológico de Monterrey. The system runs on the Robot Operating System (ROS) Melodic framework on a Jetson TX2 embedded computer and enables natural language interaction through Automated Speech Recognition (ASR). The study focuses on the challenges and opportunities of implementing speech recognition in real-world environments, particularly on constrained hardware platforms. The system achieved a 25% Word Error Rate (WER) and 73% Command Accuracy, with performance varying across testing environments. The system had difficulty recognizing uncommon or non-Spanish words, highlighting the need for further model fine-tuning. A brief comparison with state-of-the-art models indicates room for improvement. This is a work in progress that aims to set the direction for future development. Future work will focus on fine-tuning the model using datasets with ground truth transcriptions to improve performance in diverse acoustic scenarios, with the ultimate goal of making the system more reliable in complex, noise-prone settings. © 2024 IEEE.
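
For readers unfamiliar with the two metrics reported in the abstract, the following is a minimal sketch of how they are commonly computed. It is not taken from the paper: the function names, the lowercase/whitespace tokenization, and the exact-match rule for commands are all assumptions made for illustration, and the authors' own evaluation procedure may differ. WER is taken here in its standard form, the word-level edit distance (substitutions, deletions, insertions) divided by the number of words in the reference transcription.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are compared case-insensitively after splitting
    on whitespace; the paper does not state its normalization rules.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def command_accuracy(expected: Sequence[str], recognized: Sequence[str]) -> float:
    """Fraction of utterances whose recognized command equals the expected one.

    Assumption: a command counts as correct only on an exact label match.
    """
    if len(expected) != len(recognized):
        raise ValueError("expected and recognized must have the same length")
    if not expected:
        raise ValueError("at least one command is required")
    correct = sum(e == r for e, r in zip(expected, recognized))
    return correct / len(expected)


if __name__ == "__main__":
    # Hypothetical utterance and command labels, for illustration only.
    print(word_error_rate("lleva la taza a la cocina", "lleva la casa a cocina"))
    print(command_accuracy(["go", "stop", "go"], ["go", "stop", "stop"]))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error rate relative to reference length rather than a bounded percentage.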

publication date

  • January 1, 2024