Journal of South China University of Technology (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (7): 1-. doi: 10.12141/j.issn.1000-565X.240591

• Electronics, Communication and Automatic Control •

Feeding Target Recognition Method and Implementation Based on SAM Optimization

ZHANG Qin, WENG Kaihang

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China

  • Publication date: 2025-07-25 Release date: 2025-02-28

Abstract:

Fast, accurate recognition of feeding targets is a prerequisite for robots that push feed intelligently. Balancing segmentation accuracy against runtime efficiency is key to an algorithm's overall performance and remains a central challenge for recognition methods. To address the mismatch between segmentation accuracy and runtime efficiency in existing methods for recognizing dairy cow feeding targets, this paper proposes RTFIS, a real-time feeding target instance segmentation method based on an optimized Segment Anything Model (SAM). Building on the SAM-det architecture, the method uses lightweight parameterizations of the image encoder and the object detector and introduces a parallelized design with buffer queues that balances the throughput of the individual modules, significantly improving inference speed. To improve segmentation accuracy, it uses an HQ-token to strengthen decoding in feature space, redesigns the mask decoder, and adopts a staged training strategy tailored to feeding targets. Experimental results show that the proposed method improves segmentation accuracy while maintaining segmentation efficiency. In dairy cow feeding target recognition, it achieves a segmentation accuracy of 98.7% for cows and 96.4% for feed at a running speed of 52.9 FPS, meeting the application requirements for feeding target recognition in complex environments.

Key words: segment anything model, cow feeding, target recognition
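
The buffer-queue parallelization mentioned in the abstract can be illustrated with a generic producer-consumer pipeline. The sketch below is not the authors' implementation: the stage functions `detect_boxes`, `encode_image`, and `decode_masks` are hypothetical placeholders, and the sequential ordering of the stages and the buffer size are assumptions made only for illustration. It shows how bounded FIFO queues let each module work on a different frame at the same time while keeping the frames in order.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the frame stream


def run_stage(fn, inbox, outbox):
    """Apply fn to each item taken from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))


def run_pipeline(frames, stages, buffer_size=4):
    """Run the stages concurrently, connected by bounded FIFO buffers."""
    # One queue before each stage plus one for the final results.
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage,
                         args=(fn, queues[i], queues[i + 1]), daemon=True)
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()

    def feed():
        # Blocks whenever the first buffer is full (back-pressure).
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(_STOP)

    feeder = threading.Thread(target=feed, daemon=True)
    feeder.start()

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    feeder.join()
    for worker in workers:
        worker.join()
    return results


# Hypothetical stand-ins for the three modules (assumptions, not real models).
def detect_boxes(frame_id):
    return {"frame": frame_id, "boxes": [(0, 0, 10, 10)]}


def encode_image(item):
    return {**item, "embedding": [float(item["frame"])]}


def decode_masks(item):
    return {**item, "masks": len(item["boxes"])}


results = run_pipeline(range(5), [detect_boxes, encode_image, decode_masks])
```

In a pipeline of this shape, steady-state throughput is limited by the slowest stage, which is why balancing the modules matters: the reported 52.9 FPS corresponds to roughly 1000 / 52.9 ≈ 18.9 ms per frame, so no single stage could average more than that. In Python, threads only yield a real speedup when the stages release the interpreter lock, as GPU inference calls typically do.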