Journal of South China University of Technology (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (7): 60-69. doi: 10.12141/j.issn.1000-565X.240591

• Electronics, Communication and Automatic Control •

Real-Time Feeding Target Recognition Method Based on SAM Optimization

ZHANG Qin, WENG Kaihang

  1. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2024-12-20  Online: 2025-07-25  Published: 2025-02-28
  • About the author: ZHANG Qin (b. 1964), female, Ph.D., professor; her research focuses on robotics and its applications. E-mail: zhangqin@scut.edu.cn
  • Supported by:
    the Natural Science Foundation of Hainan Province (324MS095) and the Construction Project of the Guangdong Provincial R&D Innovation Team for Key Generic Technologies of the Modern Agricultural Industry (2019KJ129)


Abstract:

Feeding-assistance robots are key equipment for modernizing animal husbandry, and rapid, accurate identification of feeding targets is essential for intelligent feed-pushing. Balancing segmentation accuracy and runtime efficiency is critical to the overall performance of a recognition algorithm and remains an important research topic. To address the mismatch between segmentation accuracy and processing efficiency in existing dairy cow feeding target recognition methods, this paper proposes RTFTR, a real-time feeding target recognition method based on an optimized Segment Anything Model (SAM). Built on the SAM-det architecture, RTFTR first lightweights the parameters of the image encoder and the object detector and introduces a parallelized buffer-queue design to balance the throughput of the individual modules and raise inference speed. It then employs a High-Quality (HQ) token to strengthen the decoding capacity of the feature space, redesigns the mask decoder, and applies stage-wise training tailored to feeding targets to improve segmentation accuracy. Experimental results show that the proposed method maintains inference speed while improving segmentation accuracy: in dairy cow feeding target recognition it achieves a segmentation accuracy of 98.7% for cows, 96.4% for feed, and 99.2% for the feed bunk, with an overall average accuracy of 98.1% and a processing speed of 52.9 frames/s, meeting the demand for high-accuracy, high-efficiency recognition of cow feeding targets in complex farm environments under the limited computational resources of robots.
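The buffer-queue parallelization mentioned in the abstract can be illustrated with a generic producer-consumer sketch: bounded queues sit between pipeline stages so that a fast stage is not stalled by a slow one. This is a minimal illustrative sketch only, not the paper's implementation; the `detect` and `segment` stubs stand in for the lightweight object detector and the SAM-style mask decoder, whose real interfaces are not given here.

```python
import queue
import threading

def run_stage(fn, q_in, q_out):
    """Pull items from q_in, apply fn, push results to q_out until a sentinel arrives."""
    while True:
        item = q_in.get()
        if item is None:          # sentinel: propagate shutdown downstream
            q_out.put(None)
            break
        q_out.put(fn(item))

# Stand-in stages (hypothetical): a real system would run a lightweight
# detector here and feed its boxes as prompts to a SAM-style mask decoder.
def detect(frame):
    return (frame, f"boxes_for_{frame}")

def segment(det):
    frame, boxes = det
    return (frame, f"masks_from_{boxes}")

def run_pipeline(frames, maxsize=4):
    # Bounded buffers decouple stage speeds: each stage blocks only when
    # its output buffer is full, so both stages stay busy in parallel.
    q_frames = queue.Queue(maxsize)
    q_dets = queue.Queue(maxsize)
    q_masks = queue.Queue(maxsize)
    workers = [
        threading.Thread(target=run_stage, args=(detect, q_frames, q_dets)),
        threading.Thread(target=run_stage, args=(segment, q_dets, q_masks)),
    ]
    for w in workers:
        w.start()
    for f in frames:
        q_frames.put(f)
    q_frames.put(None)            # signal end of stream
    results = []
    while (item := q_masks.get()) is not None:
        results.append(item)
    for w in workers:
        w.join()
    return results

print(run_pipeline([0, 1, 2]))
```

With one worker per stage and FIFO queues, frame order is preserved end to end; the `maxsize` bound caps memory while letting detection of frame *k*+1 overlap with segmentation of frame *k*.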

Key words: feeding-assistance robot, Segment Anything Model, cow feeding, target recognition, segmentation accuracy

CLC number: