Journal of South China University of Technology (Natural Science Edition), 2009, Vol. 37, Issue 9: 47-51.

• Electronics, Communication & Automatic Control •

A Modified BIC Algorithm of Speaker Change Detection

Yang Ji-chen  He Qian-hua  Pan Wei-qiang  Xu Yi-jun  Li Yan-xiong   

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
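  • Received: 2009-03-05  Revised: 2009-04-22  Online: 2009-09-25  Published: 2009-09-25
  • Contact: Yang Ji-chen (杨继臣, born 1980), male, Ph.D. candidate, mainly engaged in research on speech signal processing. E-mail: NisonYoung@yahoo.cn
  • About author: Yang Ji-chen (杨继臣, born 1980), male, Ph.D. candidate, mainly engaged in research on speech signal processing.
  • Supported by:

    the National Natural Science Foundation of China (60602014, 60972132)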

Abstract:

To overcome the high computational cost and low detection precision of the Bayesian information criterion (BIC) algorithm in speaker change detection, a modified BIC algorithm is proposed. The algorithm restricts the maximum length of the first data window within the analysis window to reduce the computational cost, and increases the effective length (the detectability) of the second data window within the analysis window to improve detection precision. Moreover, it searches for potential speaker change points only within the newly added range, which keeps the computation from growing when no speaker change occurs for a long time. Experimental results show that, compared with the conventional BIC algorithm, the proposed algorithm reduces the bias error range from 0.10-0.80 to 0.03-0.20 and saves about 75% of the computation time for a 40-s analysis window.

Key words: speaker detection, modified Bayesian information criterion, detection precision, detectability, data window
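
For orientation, the conventional BIC test that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration of the textbook ΔBIC criterion with full-covariance Gaussian models, not the authors' modified algorithm: the abstract does not give the details of the window-length restrictions, so none are implemented here. The function names `delta_bic` and `find_change`, the penalty weight `lam`, and the `margin` parameter are assumptions introduced for this sketch.

```python
import numpy as np


def delta_bic(x, t, lam=1.0):
    """Delta-BIC for splitting the frames x (shape N x d) at index t.

    A positive value favours modelling x[:t] and x[t:] with two
    separate Gaussians, i.e. a speaker change at t.
    """
    n, d = x.shape

    def logdet(seg):
        # Log-determinant of the maximum-likelihood covariance estimate.
        cov = np.atleast_2d(np.cov(seg, rowvar=False, bias=True))
        return np.linalg.slogdet(cov)[1]

    # Model-complexity penalty: d mean terms plus d(d+1)/2 covariance terms.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * logdet(x) - t * logdet(x[:t]) - (n - t) * logdet(x[t:]))
    return gain - penalty


def find_change(x, margin=20, lam=1.0):
    """Return the split index with the largest positive Delta-BIC, else None.

    `margin` (an assumed parameter) keeps both segments long enough
    for a stable covariance estimate.
    """
    candidates = range(margin, x.shape[0] - margin)
    scores = [delta_bic(x, t, lam) for t in candidates]
    if not scores:
        return None
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

Because every candidate split re-estimates covariances over the whole analysis window, the cost of this baseline grows as the window grows, which is the behaviour the abstract says the proposed restrictions are meant to curb.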