Electronics, Communication and Automatic Control

A Clustering Method for Multiple Speaker Roles

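  • School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
Li Wei (李威, born 1979), female, Ph.D. candidate; her research focuses on speech signal processing.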

Received date: 2014-09-15

Revised date: 2014-11-21

Online published: 2014-12-01

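Funding

National Natural Science Foundation of China (61101160); Guangzhou Pearl River Science and Technology Nova Program (2013J2200070); Key Project of the Fundamental Research Funds for the Central Universities, South China University of Technology (2013ZZ0053)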

A Clustering Method for Multiple Speaker Roles

  • School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China

Received date: 2014-09-15

  Revised date: 2014-11-21

  Online published: 2014-12-01

Supported by

Supported by the National Natural Science Foundation of China (61101160)

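Abstract

To determine the number of speaker roles in meeting speech and the speech belonging to each role, a multi-speaker role clustering method is proposed. Features for speaker role clustering are defined first, and geodesic distance is then used to measure the similarity between features. On this basis, a clustering method is proposed that uses intra-class distance to control inter-class merging. Finally, the method is tested on four different types of meeting speech. The results show that, when speaker role clustering is applied to both manually and automatically segmented meeting speech, geodesic distance outperforms traditional distance under the same clustering method, and the proposed method outperforms traditional hierarchical clustering under the same distance measure.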

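Cite this article

Li Wei, He Qianhua, Li Yanxiong. A clustering method for multiple speaker roles[J]. Journal of South China University of Technology (Natural Science Edition), 2015, 43(1): 21-27, 33. DOI: 10.3969/j.issn.1000-565X.2015.01.004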

Abstract

To find the number of speaker roles and each role's speech in meeting recordings, a clustering method for multiple speaker roles is proposed. First, features for speaker role clustering are defined. Second, geodesic distance is used to measure the similarity between features. Then, intra-class distance is used to control inter-class merging, which yields the clustering method. Finally, four different types of meeting speech corpora are used to validate the effectiveness of the proposed method. The results indicate that, for meeting speech obtained by both manual and automatic segmentation, clustering with geodesic distance outperforms clustering with traditional distance when the same clustering algorithm is used, and that the proposed method outperforms traditional hierarchical clustering when the same distance measure is used.
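The abstract names two ingredients, a geodesic distance between features and a merge rule governed by intra-class distance, but gives neither the role features nor the exact criterion. The following is only a minimal sketch under stated assumptions: geodesic distance is approximated by shortest paths over a k-nearest-neighbour graph, clusters are merged by average linkage, and a merge is accepted only while the mean intra-class distance of the merged cluster stays below a threshold. The function names, the parameters `k` and `max_intra`, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def geodesic_distances(points, k):
    """Approximate geodesic distances as shortest paths over a k-NN graph."""
    n = len(points)
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # Link point i to its k nearest neighbours with Euclidean edge weights.
        neighbours = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )[:k]
        for j in neighbours:
            graph[i][j] = graph[j][i] = math.dist(points[i], points[j])
    # Floyd-Warshall all-pairs shortest paths; unreachable pairs stay infinite.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


def mean_intra_distance(cluster, dist):
    """Mean pairwise distance inside one cluster (0.0 for a singleton)."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(dist[i][j] for i, j in pairs) / len(pairs)


def average_linkage(a, b, dist):
    """Mean distance between members of cluster a and members of cluster b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def cluster_roles(dist, max_intra):
    """Agglomerative clustering; stop when a merge would exceed max_intra.

    Assumed rule (not from the paper): the closest pair of clusters is merged
    only if the merged cluster's mean intra-class distance is <= max_intra.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], dist),
        )
        merged = sorted(clusters[a] + clusters[b])
        if mean_intra_distance(merged, dist) > max_intra:
            break
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


# Toy feature vectors standing in for role features: two well-separated groups.
features = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
dist = geodesic_distances(features, k=3)
roles = cluster_roles(dist, max_intra=2.0)
print(len(roles), roles)
```

In this sketch the number of roles is not fixed in advance; it is the number of clusters left when no further merge passes the intra-class check, which mirrors the stated goal of finding the role count along with the grouping.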
