Journal of South China University of Technology (Natural Science Edition), 2011, Vol. 39, Issue (2): 26-31. doi: 10.3969/j.issn.1000-565X.2011.02.005

• Electronics, Communication & Automatic Control •

Mel-Warping-Based TELPC Algorithm for Estimating Spectral Envelope

Li Ye (李烨), Yang Jia-wei (杨家玮), Liang Yan-xia (梁彦霞)

  1. State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an 710071, Shaanxi, China
  • Received: 2010-04-14 Revised: 2010-08-16 Online: 2011-02-25 Published: 2011-01-02
  • Contact: Li Ye, E-mail: yli@pcn.xidian.edu.cn
  • About author: Li Ye (born 1982), male, Ph.D. candidate, mainly engaged in research on speech coding and speech signal processing
  • Supported by:

    National Science Fund for Distinguished Young Scholars of China (60725105); National Natural Science Foundation of China (61072068)

Abstract:

To perceptually enhance the spectral envelope extracted by the TELPC algorithm, a new spectral envelope estimation algorithm that exploits the properties of human auditory perception is proposed. The algorithm first extracts the spectral envelope with a real-time true envelope (TE) estimator, then perceptually enhances it via Mel-warping, and finally performs linear predictive coding (LPC) analysis on the warped envelope. To further improve the performance of the LPC analysis, the amplitude of the warped envelope is also compressed. Mel-warping is implemented in two ways: by a Fourier transform pair and by linear interpolation of the spectral envelope. Objective tests indicate that, in the low-frequency band, the spectral envelope extracted by the proposed algorithm has smaller log-spectral distortion and spectral flatness than the envelope extracted by the original TELPC algorithm, and that amplitude compression improves performance further.
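The processing chain described in the abstract (warp, compress, then linear prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the Mel formula (the common 2595·log10(1 + f/700) form), the power-law compression with a hypothetical exponent `gamma`, and all function names are assumptions made for illustration. The true envelope estimator itself is not shown; the sketch takes a non-negative amplitude envelope sampled uniformly from 0 Hz to the Nyquist frequency as given, and covers only the linear-interpolation variant of Mel-warping.

```python
import math

def hz_to_mel(f_hz):
    # Common Mel formula; assumed here, the paper may use another variant.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_warp(envelope, sample_rate):
    """Resample an envelope from a uniform Hz grid (0..fs/2) onto a grid
    that is uniform in Mel, using linear interpolation between bins."""
    n = len(envelope)
    nyquist = sample_rate / 2.0
    mel_max = hz_to_mel(nyquist)
    warped = []
    for k in range(n):
        f = mel_to_hz(mel_max * k / (n - 1))   # Hz at the k-th Mel-uniform point
        pos = f / nyquist * (n - 1)            # fractional index on the Hz grid
        i = min(int(pos), n - 2)
        frac = pos - i
        warped.append((1.0 - frac) * envelope[i] + frac * envelope[i + 1])
    return warped

def autocorr_from_envelope(envelope, order):
    """Autocorrelation lags 0..order as the inverse DFT of the squared
    envelope, treated as one half of a symmetric power spectrum."""
    n = len(envelope)
    power = [a * a for a in envelope]
    full_len = 2 * (n - 1)
    lags = []
    for m in range(order + 1):
        acc = power[0] + power[-1] * (-1) ** m
        for k in range(1, n - 1):
            acc += 2.0 * power[k] * math.cos(math.pi * k * m / (n - 1))
        lags.append(acc / full_len)
    return lags

def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns ([1, a1, ..., ap], error)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err

def mel_warped_lpc(envelope, sample_rate, order, gamma=0.5):
    """Warp, compress (power law, assumed form), then fit an all-pole model."""
    warped = mel_warp(envelope, sample_rate)
    compressed = [v ** gamma for v in warped]
    r = autocorr_from_envelope(compressed, order)
    return levinson_durbin(r, order)
```

Setting `gamma=1.0` disables the compression step, which makes it easy to compare the fitted coefficients with and without compression on the same envelope.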

Key words: speech analysis, envelope detectors, cepstral analysis, linear predictive coding, Mel-warping
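
The two objective measures named in the abstract, log-spectral distortion and spectral flatness, can be computed along the following lines. The definitions used here (RMS difference of the dB spectra, and the ratio of geometric to arithmetic mean of the power spectrum) are the commonly used ones and are assumed rather than taken from the paper; the helper `low_band` and its cutoff parameter are hypothetical and only illustrate restricting a measure to the low-frequency band.

```python
import math

def log_spectral_distortion(env_a, env_b):
    """RMS difference, in dB, between two positive amplitude envelopes."""
    sq = [(20.0 * math.log10(a / b)) ** 2 for a, b in zip(env_a, env_b)]
    return math.sqrt(sum(sq) / len(sq))

def spectral_flatness(power):
    """Geometric mean over arithmetic mean of a positive power spectrum."""
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))

def low_band(values, sample_rate, cutoff_hz):
    """Keep the bins from 0 Hz up to cutoff_hz, for values sampled
    uniformly from 0 Hz to the Nyquist frequency."""
    n = len(values)
    last = int(cutoff_hz / (sample_rate / 2.0) * (n - 1))
    return values[: last + 1]
```

A flatness of 1 corresponds to a perfectly flat spectrum, and values closer to 0 indicate a more peaked one.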