Journal of South China University of Technology (Natural Science Edition) ›› 2015, Vol. 43 ›› Issue (1): 21-27, 33. doi: 10.3969/j.issn.1000-565X.2015.01.004

• Electronics, Communication and Automatic Control •

A Clustering Method for Multiple Speaker Roles

Li Wei, He Qian-hua, Li Yan-xiong

  1. School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2014-09-15 Revised: 2014-11-21 Online: 2015-01-25 Published: 2014-12-01
  • Contact: Li Wei (born 1979), female, Ph.D. candidate, mainly engaged in research on speech signal processing. E-mail: livay_21@163.com
  • Supported by:

    the National Natural Science Foundation of China (61101160); the Pearl River Science and Technology Nova Program of Guangzhou (2013J2200070); a Key Project of the Fundamental Research Funds for the Central Universities, South China University of Technology (2013ZZ0053)

Abstract:

To determine the number of speaker roles in meeting speech and the speech belonging to each role, a clustering method for multiple speaker roles is proposed. First, features for speaker role clustering are defined. Second, geodesic distance is used to measure the similarity between features. Then, a clustering method is proposed in which the intra-class distance controls inter-class merging. Finally, the method is tested on four different types of meeting speech. The results indicate that, for meeting speech segmented either manually or automatically, geodesic distance outperforms traditional distance measures when the same clustering method is used, and the proposed method outperforms traditional hierarchical clustering when the same distance measure is used.
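The abstract does not specify the role features, how the geodesic distance is constructed, or the exact merge rule, so the following is only a rough sketch under stated assumptions rather than the authors' algorithm. It assumes that geodesic distance is approximated by shortest paths over a k-nearest-neighbour graph (as in Isomap), that clusters are compared by average linkage, and that a merge is rejected once the merged cluster's mean pairwise distance exceeds a hypothetical threshold `max_intra`. The function names, the toy 2-D points, and the parameter values are all illustrative.

```python
import math
from itertools import combinations


def geodesic_distances(points, k=4):
    """Approximate geodesic distances (assumption: shortest paths on a k-NN graph)."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # Connect point i to its k nearest neighbours with symmetric edges.
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = euclid[i][j]
    # Floyd-Warshall all-pairs shortest paths over the neighbourhood graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


def intra_class_distance(cluster, dist):
    """Mean pairwise distance inside one cluster (0 for a singleton)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def cluster_roles(dist, max_intra):
    """Agglomerative clustering; the intra-class distance gates each merge (assumed rule)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average-linkage inter-class distance between clusters a and b.
            total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
            link = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or link < best[0]:
                best = (link, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        # Stop when even the closest pair would form too loose a cluster.
        if intra_class_distance(merged, dist) > max_intra:
            break
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)] + [merged]
    return clusters


# Toy feature vectors: two well-separated groups standing in for two roles.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
roles = cluster_roles(geodesic_distances(points, k=4), max_intra=2.0)
print(len(roles))  # 2 clusters for this toy data
```

Because merging stops on the intra-class criterion, the number of clusters left at termination serves as the estimated number of roles, so it does not have to be fixed in advance.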

Key words: speaker role, feature distance measure, role clustering, geodesic distance, unsupervised clustering

CLC Number: