Visual Signal Recognition with ResNet50V2 for Autonomous ROV Navigation in Underwater Environments

Academic Article in Scopus

abstract

  • This study presents the design and evaluation of AquaSignalNet, a deep learning-based system for recognizing underwater visual commands to enable the autonomous navigation of a Remotely Operated Vehicle (ROV). The system is built on a ResNet50V2 architecture and trained on a custom dataset, UVSRD, comprising 33,800 labeled images across 12 gesture classes, including directional commands, speed values, and vertical motion instructions. The model was deployed on a Raspberry Pi 4 integrated with a TIVA C microcontroller for real-time motor control, a PID-based depth control loop, and an MPU9250 sensor for orientation tracking. Experiments were conducted in a controlled pool environment using printed signal cards to define two autonomous trajectories. On the first trajectory, the system achieved a 90% success rate, correctly interpreting a mixed sequence of turns, ascents, and speed changes. On the second, more complex trajectory, involving a rectangular inspection loop and multi-layer navigation, the system achieved an 85% success rate, with failures mainly due to misclassifications caused by lighting variability near the water surface. Unlike conventional approaches that rely on QR codes or artificial markers, AquaSignalNet employs markerless visual cues, offering a flexible alternative for underwater inspection, exploration, and logistical operations. The results demonstrate the system's viability for real-time gesture-based control. © 2025 by the authors.
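
The abstract mentions a PID-based depth control loop running alongside the classifier. As a minimal sketch only (the paper's actual gains, sample rate, and actuator interface are not given here; all names and values below are illustrative assumptions), a discrete PID depth controller could look like:

```python
class DepthPID:
    """Minimal discrete PID controller for depth hold (illustrative sketch,
    not the authors' implementation; gains are placeholder values)."""

    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint      # target depth in meters (assumed units)
        self.integral = 0.0           # accumulated error
        self.prev_error = None        # error from previous step

    def update(self, depth, dt):
        """Return a thruster correction given a depth reading and timestep."""
        error = self.setpoint - depth
        self.integral += error * dt
        # Skip the derivative term on the first sample (no previous error yet)
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In a deployment like the one described, the Raspberry Pi would feed each correction to the motor-control microcontroller over a serial link at a fixed loop rate.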

publication date

  • December 1, 2025