Journal of South China University of Technology (Natural Science Edition) ›› 2011, Vol. 39 ›› Issue (7): 88-94. doi: 10.3969/j.issn.1000-565X.2011.07.015

• Electronics, Communication and Automatic Control •

Lipreading Based on Lip Gray Energy Image

Liang Ya-ling  Du Ming-hui   

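  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2010-12-09 Revised: 2011-04-10 Online: 2011-07-25 Published: 2011-06-03
  • Contact: Liang Ya-ling (born 1977), female, Ph.D. candidate and lecturer, working mainly on image processing and image coding. E-mail: ylliang@scut.edu.cn
  • About author: Liang Ya-ling (born 1977), female, Ph.D. candidate and lecturer, working mainly on image processing and image coding.
  • Supported by:

    the NSFC-Guangdong Province Joint Natural Science Fund (U0735004)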

Abstract:

To address lip feature extraction in visual-only lipreading systems, this paper proposes a feature extraction method based on the lip gray energy image (LGEI). The method projects the image sequence of a character or word onto a single 2D gray energy image, which unifies the dimension of the input data and preserves most of the motion information in the sequence. To reduce the dependence of template matching on the choice of template, the LGEI template built from a single training sample is extended to multiple training samples. A lip location method based on the lip center is also proposed. Experimental results show that, with the same feature dimension per single frame, the proposed method achieves a considerably higher recognition rate and a markedly shorter computation time than conventional methods that extract features from each frame of the sequence separately; that using two training samples improves the recognition rate by 11.29% on average over using one; and that accurate lip location improves the recognition rate by more than 2%, with the highest recognition rate of the system reaching 90.63%.

Key words: lipreading, lip gray energy image, gait energy image, feature extraction
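
The projection described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the LGEI formula, so everything here is an assumption: the energy image is taken as the pixel-wise mean of aligned grayscale lip frames (by analogy with the gait energy image named in the key words), a multi-sample template is taken as the mean of per-sample energy images, and matching uses squared Euclidean distance. The function names `lip_gray_energy_image`, `build_template` and `classify` are hypothetical and are not taken from the paper.

```python
def lip_gray_energy_image(frames):
    """Collapse a sequence of grayscale lip frames into one 2D image.

    ASSUMPTION: pixel-wise mean over time, as in a gait energy image;
    the paper's exact definition is not stated in the abstract.
    `frames` is a list of equally sized 2D lists of gray levels.
    """
    if not frames:
        raise ValueError("need at least one frame")
    count = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(frame[r][c] for frame in frames) / count
             for c in range(width)]
            for r in range(height)]


def build_template(sequences):
    """Average the energy images of several training sequences of one word.

    ASSUMPTION: a plain mean is one simple way to go from a single
    training sample to multiple training samples.
    """
    # Each energy image is itself a 2D list, so the same averaging applies.
    return lip_gray_energy_image([lip_gray_energy_image(s) for s in sequences])


def squared_distance(image_a, image_b):
    """Squared Euclidean distance between two equally sized 2D images."""
    return sum((a - b) ** 2
               for row_a, row_b in zip(image_a, image_b)
               for a, b in zip(row_a, row_b))


def classify(frames, templates):
    """Return the label whose template is nearest to the query energy image.

    ASSUMPTION: nearest-template matching with Euclidean distance.
    `templates` maps a word label to its template image.
    """
    query = lip_gray_energy_image(frames)
    return min(templates, key=lambda label: squared_distance(query, templates[label]))
```

Because the time axis is averaged away, sequences of different lengths all map to an image of the same height and width, which is the dimension-unifying property the abstract mentions. The sketch assumes the frames are already cropped and aligned around the lip region; the lip location step is not modeled here.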