Journal of South China University of Technology (Natural Science Edition) ›› 2011, Vol. 39 ›› Issue (10): 7-12. doi: 10.3969/j.issn.1000-565X.2011.10.002

• Electronics, Communication and Automatic Control •

Playback Attack Detection Based on Channel Pattern Noise

Wang Zhi-feng  He Qian-hua  Zhang Xue-yuan  Luo Hai-yu  Su Zhuo-sheng   

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2011-04-12 Revised: 2011-06-03 Online: 2011-10-25 Published: 2011-09-01
  • Contact and about the author: Wang Zhi-feng (born 1985), male, Ph.D. candidate, working mainly on digital audio forensics, speaker recognition, information security, and pattern recognition. E-mail: wang.zf01@mail.scut.edu.cn
  • Supported by:

    National Natural Science Foundation of China (60972132); Natural Science Foundation of Guangdong Province, Team Project (9351064101000003)

Abstract:

Recent high-quality, low-cost recording and playback devices offer such high fidelity that even human listeners can hardly distinguish playback recordings from authentic speech, which makes playback attacks a serious threat to the security of speaker recognition systems. To address this problem, this paper proposes a playback attack detection method based on channel pattern noise. The method exploits the channel difference between authentic speech and playback recordings: the recording and playback devices in different channels introduce different channel pattern noise, which is extracted using a denoising filter and statistical frame analysis. From the channel pattern noise, 6th-order Legendre coefficients and six statistical features are extracted, and a support vector machine is trained on them to build a channel noise model that decides whether the input speech is a playback attack. Experimental results show that, with the playback detection module added, the equal error rate of the speaker recognition system on data containing playback attacks decreases by about 30%.

Key words: speaker recognition, playback attack detection, channel pattern noise, statistical frame, Legendre polynomial
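
The feature-extraction stage outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the denoising filter, the curve the Legendre coefficients are computed from, or which six statistics are used. The sketch assumes a moving-average smoother as the denoiser, the mean log-magnitude spectrum of the residual as the curve, Legendre coefficients for degrees 0 to 6, and minimum, maximum, mean, standard deviation, skewness, and kurtosis as the statistics. All function names are hypothetical.

```python
import cmath
import math
import random


def denoise(signal, width=5):
    """Moving-average smoother; a stand-in for the denoising filter (assumption)."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def noise_residual(signal, width=5):
    """Signal minus its denoised version: a rough proxy for channel pattern noise."""
    return [s - d for s, d in zip(signal, denoise(signal, width))]


def mean_log_spectrum(residual, frame_len=64):
    """Average log-magnitude spectrum over non-overlapping frames (naive DFT).

    Assumes the residual holds at least one full frame.
    """
    bins = frame_len // 2
    acc = [0.0] * bins
    n_frames = len(residual) // frame_len
    for f in range(n_frames):
        frame = residual[f * frame_len:(f + 1) * frame_len]
        for k in range(bins):
            coef = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                       for n, x in enumerate(frame))
            acc[k] += math.log(abs(coef) + 1e-12)
    return [a / n_frames for a in acc]


def legendre_values(x, degree):
    """P_0(x)..P_degree(x) via Bonnet's recurrence."""
    vals = [1.0, x]
    for n in range(1, degree):
        vals.append(((2 * n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[:degree + 1]


def legendre_coefficients(curve, degree=6):
    """Project a uniformly sampled curve onto P_0..P_degree on [-1, 1] (midpoint rule)."""
    n = len(curve)
    sums = [0.0] * (degree + 1)
    for i, y in enumerate(curve):
        x = -1.0 + (2 * i + 1) / n  # midpoint of the i-th cell
        for k, p in enumerate(legendre_values(x, degree)):
            sums[k] += y * p
    dx = 2.0 / n
    return [(2 * k + 1) / 2.0 * s * dx for k, s in enumerate(sums)]


def six_statistics(values):
    """Min, max, mean, std, skewness, kurtosis (an assumed choice of six statistics)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(var)
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std > 0 else 0.0
    kurt = sum((v - mean) ** 4 for v in values) / (n * var ** 2) if var > 0 else 0.0
    return [min(values), max(values), mean, std, skew, kurt]


def extract_features(signal):
    """Concatenate 7 Legendre coefficients and 6 statistics into one feature vector."""
    residual = noise_residual(signal)
    curve = mean_log_spectrum(residual)
    return legendre_coefficients(curve, degree=6) + six_statistics(residual)


# Synthetic test signal: a sine wave plus a small amount of Gaussian noise.
random.seed(0)
signal = [math.sin(2 * math.pi * 0.01 * t) + 0.05 * random.gauss(0, 1)
          for t in range(1024)]
features = extract_features(signal)
print(len(features))  # 13
```

In a full detector, feature vectors like this one, computed for authentic and playback utterances, would be used to train a support vector machine, for instance with scikit-learn's `sklearn.svm.SVC`.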
