Journal of South China University of Technology (Natural Science Edition) ›› 2024, Vol. 52 ›› Issue (7): 19-28. doi: 10.12141/j.issn.1000-565X.230608

• Electronics, Communication and Automatic Control •

Parametric Human Body Mesh Reconstruction Based on Global Consistency Network

BAO Wenxia (鲍文霞)1, TIAN Ruzhen (田如震)1, WANG Nian (王年)1, CHEN Hemu (陈和木)2, YANG Xianjun (杨先军)3

  1. School of Electronic and Information Engineering, Anhui University, Hefei 230601, Anhui, China
  2. The First Affiliated Hospital of Anhui Medical University, Hefei 230022, Anhui, China
  3. Hefei Institute of Physical Sciences, Chinese Academy of Sciences, Hefei 230031, Anhui, China
  • Received: 2023-09-28 Published: 2024-07-25 Online: 2024-01-31
  • Contact: WANG Nian (born 1966), male, professor, doctoral supervisor; research interests include pattern recognition, computer vision, image processing, and intelligent information processing. E-mail: bwxia@ahu.edu.cn
  • About author: BAO Wenxia (born 1980), female, professor, doctoral supervisor; research interests include machine learning, image and video processing, and pattern recognition.
  • Supported by:
    the National Key Research and Development Program of China (2020YFF0303803); the Key Research and Development Program of Anhui Province (2022k07020006); the Natural Science Research Funding Project of Anhui Universities (KJ2021ZD0004)

Abstract:

Human body mesh reconstruction (HMR) is widely used in human-computer interaction, virtual/augmented reality, and related fields. To further improve the accuracy of pose and shape estimation in image-based human body mesh reconstruction, this study proposes GloCoNet, a global-consistency deep convolutional neural network based on hybrid inverse kinematics for parametric human body mesh reconstruction. To strengthen the network's global consistency and its modeling of long-range dependencies, a Global Consistency Booster (GCB) module is built on top of the feature extraction network. The module improves the model's ability to perceive and represent global information and lets the model adaptively reweight feature maps across channels and spatial positions. A multi-head self-attention (MHSA) mechanism is then introduced to capture global long-range dependencies and to model global context, which enriches the diversity of the feature subspaces. In addition, the network uses hybrid inverse kinematics to bridge the gap between human body mesh estimation and 3D joint estimation, improving the accuracy of 3D pose and shape estimation. Experiments on the public Human3.6M dataset show that GloCoNet achieves a mean per-joint position error (MPJPE) of 51.3 mm, significantly outperforming previous mainstream methods.

Key words: human body mesh reconstruction, global consistency, hybrid inverse kinematics, human body parameter
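
The abstract reports accuracy as mean per-joint position error (MPJPE), i.e. the Euclidean distance between predicted and ground-truth 3D joints averaged over all joints. A minimal sketch of that metric follows; it is not the authors' evaluation code. The function name `mpjpe` and the toy coordinates are illustrative assumptions, and any alignment step an evaluation protocol may apply beforehand (for example, subtracting a root joint) is omitted.

```python
import math
from typing import Sequence

Joint = Sequence[float]  # one 3D joint position (x, y, z), here in millimetres


def mpjpe(pred: Sequence[Joint], gt: Sequence[Joint]) -> float:
    """Mean per-joint position error: average Euclidean distance over joints."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Toy three-joint skeleton with made-up coordinates (mm), for illustration only.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 0.0), (100.0, 200.0, 12.0)]

error_mm = mpjpe(pred, gt)
print(f"MPJPE = {error_mm:.2f} mm")  # per-joint errors are 5, 0 and 12 mm -> 5.67 mm
```

On a dataset, this per-pose value would typically be averaged again over all evaluated frames to give a single figure such as the 51.3 mm quoted above.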

CLC Number: