Journal of South China University of Technology (Natural Science Edition) ›› 2012, Vol. 40 ›› Issue (3): 106-111.

• Electronics, Communication & Automatic Control •

Speaker Recognition Algorithm for Abnormal Speech Based on Abnormal Feature Weighting

He Jun, Li Yan-xiong, He Qian-hua, Li Wei

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2011-08-14 Revised: 2011-10-11 Online: 2012-03-25 Published: 2012-02-01
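  • Contact: Li Yan-xiong (born 1980), male, Ph.D., lecturer; main research interests: signal processing and pattern recognition. E-mail: eeyxli@scut.edu.cn E-mail: hejun_723@126.com
  • About author: He Jun (born 1978), male, Ph.D. candidate; main research interest: speech signal processing.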
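  • Supported by:

    National Natural Science Foundation of China (60972132, 61101160); Guangdong Natural Science Foundation Team Project (9351064101000003); Guangdong Natural Science Foundation Doctoral Research Start-up Project (10451064101004651); Fundamental Research Funds for the Central Universities, South China University of Technology (2011ZM0029)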

Abstract:

Commonly used weighting algorithms track the feature variation of abnormal speech poorly. To address this, a speaker recognition algorithm for abnormal speech based on abnormal feature weighting is proposed. First, a feature template of normal speech is built by computing the probability distribution of each order of MFCC features over a large number of normal speech samples. Then, the K-L distance and the Euclidean distance between the features of a test utterance and the normal speech template are used to measure the degree of variation of the speech and to determine the K-L and Euclidean weighting factors. Finally, the weighting factors are applied to the MFCC features of the test speech, and the weighted features are fed into a Gaussian mixture model for speaker recognition on abnormal speech. Experimental results show that the overall recognition rates of the proposed algorithms with K-L weighting and Euclidean weighting are 46.61% and 42.25%, respectively, whereas the algorithm that weights each feature order by its contribution to speaker recognition and the unweighted algorithm achieve only 39.68% and 36.36%, respectively.
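
The weighting step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how a distance is mapped to a weighting factor, so everything below is an assumption made for illustration only: each MFCC order is modelled as a univariate Gaussian, the Euclidean distance is taken between per-order means, and a larger distance yields a smaller weight through the hypothetical mapping `1 / (1 + distance)`. The function names (`gaussian_stats`, `kl_gaussian`, `variation_weights`, `apply_weights`) are likewise hypothetical and are not taken from the paper.

```python
import math


def gaussian_stats(frames):
    """Per-order mean and variance over a list of feature frames.

    Assumption: each MFCC order is summarised by a univariate Gaussian.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [
        # Floor the variance so the K-L formula never divides by zero.
        max(sum((f[d] - means[d]) ** 2 for f in frames) / n, 1e-8)
        for d in range(dims)
    ]
    return means, variances


def kl_gaussian(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) between two univariate Gaussians."""
    return 0.5 * (
        math.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )


def variation_weights(test_frames, template, metric="kl"):
    """Per-order weights from the distance to the normal-speech template.

    Assumption: orders that deviate more from the template get smaller
    weights; the paper's actual mapping is not given in the abstract.
    """
    t_mu, t_var = template
    s_mu, s_var = gaussian_stats(test_frames)
    dims = len(t_mu)
    if metric == "kl":
        dist = [kl_gaussian(s_mu[d], s_var[d], t_mu[d], t_var[d]) for d in range(dims)]
    elif metric == "euclidean":
        # In one dimension the Euclidean distance is the absolute difference.
        dist = [abs(s_mu[d] - t_mu[d]) for d in range(dims)]
    else:
        raise ValueError("metric must be 'kl' or 'euclidean'")
    raw = [1.0 / (1.0 + x) for x in dist]
    scale = dims / sum(raw)  # normalise so the weights average to 1
    return [r * scale for r in raw]


def apply_weights(frames, weights):
    """Scale every frame element-wise by the per-order weights."""
    return [[w * x for w, x in zip(weights, f)] for f in frames]


# Toy data: two "MFCC orders"; the second order is shifted in the test speech.
normal_frames = [[0.0, 1.0], [2.0, 3.0], [1.0, 2.0], [1.0, 2.0]]
test_frames = [[0.0, 5.0], [2.0, 7.0], [1.0, 6.0], [1.0, 6.0]]

template = gaussian_stats(normal_frames)
weights = variation_weights(test_frames, template, metric="kl")
weighted_frames = apply_weights(test_frames, weights)
```

In this toy example the second order, which deviates from the template, receives a smaller weight than the first. The weighted frames would then be scored by per-speaker Gaussian mixture models, a step that is omitted from this sketch.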

Key words: abnormal speech, speaker recognition, abnormal feature weighting, K-L distance, weighting factor