Two-Stage Decision-Based Detection of Non-Lexical Audio Events in Spontaneous Vocalization

He Qian-hua, Li Yan-xiong, Li Tao, Zhang Hong, Yang Ji-chen

doi:10.3969/j.issn.1000-565X.2011.02.004

Journal of South China University of Technology (Natural Science), 2011, Vol. 39, Issue 2: 20-25, 31

Electronics, Communication & Automation Technology

School of Electronic and Information Engineering, South China University of Technology, Guangzhou 51064, Guangdong, China

He Qian-hua (born 1965), male, professor and doctoral supervisor; his research covers speech and audio signal processing and embedded systems.

Received date: 2010-04-16

Revised date: 2010-06-17

Online published: 2011-01-02

Supported by

National Natural Science Foundation of China (60972132); Natural Science Foundation of Guangdong Province (10451064101004651, 9351064101000003)

Abstract

To make effective use of non-lexical audio events in analyzing the semantics of conversational speech, the characteristic differences among audio events that frequently occur in spontaneous vocalization are analyzed, and a two-stage decision-based method for detecting non-lexical audio events in spontaneous vocalization is proposed. In this method, the characteristics of audio events are used to construct signal segments of audio events: a threshold decision detects longer applause (the first-stage decision), and statistical models detect the other audio events (the second-stage decision). Experimental results show that, for three non-lexical audio events (filled pause, laughter and applause), the proposed method achieves an average precision, recall rate and F1-measure of 87.3%, 93.8% and 90.4%, respectively. Compared with the existing method, the proposed method increases the average F1-measure by 7.5% and helps to determine the boundaries of non-lexical audio events with higher accuracy.
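As a quick sanity check on the figures in the abstract, the F1-measure can be computed as the harmonic mean of precision and recall. The sketch below is only illustrative: the function name `f1_measure` is hypothetical, and it assumes the standard F1 definition, which the abstract does not spell out. The reported values are averages over three event types, so the F1 of the averaged precision and recall need not match the averaged F1 exactly; here the two agree to one decimal place.

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected correctly
    return 2 * precision * recall / (precision + recall)


# Average precision and recall rate reported in the abstract.
avg_precision = 0.873
avg_recall = 0.938

# Assumption: standard F1 definition applied to the averaged figures.
f1 = f1_measure(avg_precision, avg_recall)
print(f"F1 from averaged precision and recall: {f1:.1%}")  # prints 90.4%
```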

Key words: non-lexical events; threshold decision; statistical model detection; spontaneous speech; speech processing

Cite this article

He Qian-hua, Li Yan-xiong, Li Tao, Zhang Hong, Yang Ji-chen. Two-Stage Decision-Based Detection of Non-Lexical Audio Events in Spontaneous Vocalization[J]. Journal of South China University of Technology (Natural Science), 2011, 39(2): 20-25, 31. DOI: 10.3969/j.issn.1000-565X.2011.02.004
