Bit-STED: A lightweight transformer for accurate agave counting with UAV imagery

abstract

  • This paper presents Bit-STED, a novel, simplified transformer encoder architecture for efficient agave plant detection and accurate counting in unmanned aerial vehicle (UAV) imagery. Addressing the need for accessible and cost-efficient agricultural monitoring, the approach automates a process that, when done manually, is time-consuming, labor-intensive, and prone to human error. The Bit-STED model features a lightweight transformer design that combines efficient feature extraction, model compression through quantization, and shape-aware object localization using circular bounding boxes, which match the roughly circular shape of agave rosettes. To complement the detection model, a novel counting algorithm was developed to accurately handle plants that span multiple image tiles. The experimental results demonstrated that the Bit-STED model outperformed the baseline models in both detection and plant-counting performance. Specifically, the Bit-STED nano model achieved F1 scores of 96.66% on a map with younger plants and 96.43% on a map with larger, highly overlapping plants. These scores surpassed those of state-of-the-art baselines such as YOLOv8 nano (F1 scores of 96.42% and 96.38%, respectively) and DETR (F1 scores of 93.03% and 85.61%, respectively). Furthermore, the Bit-STED nano model was significantly smaller, at less than one-eighth the size of the YOLOv8 nano model (1.4 MB versus 12.0 MB), had fewer trainable parameters (0.35M versus 3.01M), and had a lower average inference time (14.62 ms versus 18.28 ms). © 2025
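
The abstract does not describe how circular bounding boxes are compared or how plants spanning several tiles are counted once. As a rough illustration only, the sketch below assumes detections are circles `(x, y, r, score)` already expressed in whole-map coordinates, and removes duplicates with a greedy overlap test. The function names `circle_iou` and `count_plants` and the `iou_threshold` parameter are hypothetical and are not taken from the paper; the paper's actual counting algorithm may differ.

```python
import math


def circle_iou(c1, c2):
    """Intersection over union of two circles given as (x, y, r)."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)
    area1 = math.pi * r1 ** 2
    area2 = math.pi * r2 ** 2

    if d >= r1 + r2:
        # Circles do not overlap.
        inter = 0.0
    elif d <= abs(r1 - r2):
        # One circle lies entirely inside the other.
        inter = math.pi * min(r1, r2) ** 2
    else:
        # Partial overlap: area of the lens formed by two circular segments.
        cos1 = (d * d + r1 * r1 - r2 * r2) / (2 * d * r1)
        cos2 = (d * d + r2 * r2 - r1 * r1) / (2 * d * r2)
        # Clamp to guard against floating-point drift outside [-1, 1].
        cos1 = max(-1.0, min(1.0, cos1))
        cos2 = max(-1.0, min(1.0, cos2))
        tri = 0.5 * math.sqrt(
            max(0.0, (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
        )
        inter = r1 * r1 * math.acos(cos1) + r2 * r2 * math.acos(cos2) - tri

    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0


def count_plants(detections, iou_threshold=0.5):
    """Greedy de-duplication of (x, y, r, score) circles in map coordinates.

    Assumption: a plant seen in two overlapping tiles yields two circles
    that overlap strongly, so only the higher-scoring one is kept.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[3], reverse=True):
        if all(circle_iou(det[:3], k[:3]) < iou_threshold for k in kept):
            kept.append(det)
    return len(kept), kept


if __name__ == "__main__":
    # Two near-identical circles (same plant seen in two tiles) plus one distinct plant.
    dets = [(10.0, 10.0, 5.0, 0.9), (10.5, 10.0, 5.0, 0.8), (40.0, 40.0, 5.0, 0.7)]
    n, _ = count_plants(dets)
    print(n)
```

In this sketch the threshold trades off missed duplicates against merging genuinely distinct but heavily overlapping plants, so it would need tuning on maps like the second one described above.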

publication date

  • December 1, 2025