Sinusoidal Models of Audio Signals for Variable-Bit-Rate Compression and Speed/Pitch Adjustment (in English)

Journal of South China University of Technology (Natural Science Edition), 2003, Vol. 31, Issue 7: 22-27.

Al-Moussawy Raed, Yin Junxun, Song Shaopeng

College of Electronic and Information Eng., South China Univ. of Tech., Guangzhou 510640, China

Published: 2003-07-20; Posted online: 2022-09-26
Corresponding author: Al-Moussawy Raed (born 1972), male, Ph.D. candidate. Research interests: audio coding and signal processing.
Funding: National Natural Science Foundation of China (69820007); Natural Science Foundation of Guangdong Province (011611)

Compact Sinusoidal Representations of Audio for Scalable Compression and Time/Pitch-Scale Modifications


Abstract

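Abstract (Chinese version): A sines + noise (SN) model for variable-bit-rate audio coding is proposed, together with a fundamental enhancement to the sinusoidal model. Time-domain overlap-add sinusoidal analysis is performed on the audio signal at three successive scales (large, medium, and small), using a matching pursuits algorithm weighted by a psychoacoustic model. The residual left after sinusoidal analysis-synthesis at the large scale is passed to the smaller scales for analysis, so that each signal feature is modeled at a suitable resolution. This algorithm effectively resolves the pre-echo inherent in the sinusoidal model and improves the quality of the reconstructed audio. The model is suitable for variable-bit-rate, high-fidelity audio compression and for adjusting speed and pitch.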

Abstract: This paper presents a signal model for scalable perceptual audio coding based on a Sines + Noise (SN) representation. Its main contribution is a fundamental enhancement to the sinusoidal modeling component: overlap-add sinusoidal modeling is carried out at three successive time scales (large, medium, and small). At each scale, the modeling is done in an analysis-by-synthesis overlap-add manner using psychoacoustically based matching pursuits. The sinusoidal modeling residual at the first (large) scale is passed to the two smaller scales, so that different signal features are modeled at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model. With the same number of sinusoids, the new scheme gives better perceptual audio quality than our previous work.
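The cascade described in the abstract, in which the residual of a large-scale sinusoidal analysis is handed to smaller scales, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frame lengths, the number of sinusoids per frame, the Hann window, the 50% hop, and the function names `mp_overlap_add` and `multiresolution_model` are all hypothetical, and a plain energy-based matching pursuit stands in for the psychoacoustically weighted one used in the paper. The noise part of the SN model is not modeled here; it would operate on the final residual.

```python
import numpy as np

def mp_overlap_add(x, frame_len, n_sines):
    """One scale of greedy matching pursuit with Hann-windowed sinusoidal atoms.

    Frames overlap by 50%. Each atom is fitted to the *current* residual
    (analysis-by-synthesis), so energy already explained is never modeled twice.
    Returns (model, residual) with model + residual == x.
    """
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    window = np.hanning(frame_len)
    t = np.arange(frame_len)
    hop = frame_len // 2
    for start in range(0, len(residual) - frame_len + 1, hop):
        seg = residual[start:start + frame_len]  # view: in-place updates reach residual
        for _ in range(n_sines):
            # Pick the FFT bin (excluding DC and Nyquist) that correlates best
            # with the windowed residual.
            spectrum = np.abs(np.fft.rfft(seg * window))
            k = 1 + int(np.argmax(spectrum[1:frame_len // 2]))
            phase = 2 * np.pi * k * t / frame_len
            atoms = np.column_stack((window * np.cos(phase),
                                     window * np.sin(phase)))
            # Least-squares amplitudes of the cosine/sine pair (amplitude and phase).
            coef, *_ = np.linalg.lstsq(atoms, seg, rcond=None)
            seg -= atoms @ coef
    return x - residual, residual

def multiresolution_model(x, frame_lens=(2048, 512, 128), n_sines=(8, 4, 2)):
    """Large -> medium -> small: each scale models the previous scale's residual."""
    residual = np.asarray(x, dtype=float)
    layers = []
    for frame_len, count in zip(frame_lens, n_sines):
        layer, residual = mp_overlap_add(residual, frame_len, count)
        layers.append(layer)
    return layers, residual

# Usage: two steady tones plus a click, a typical pre-echo trigger.
n = np.arange(8192)
signal = np.sin(2 * np.pi * 440 * n / 16000) + 0.5 * np.sin(2 * np.pi * 1234 * n / 16000)
signal[5000] += 3.0
layers, residual = multiresolution_model(signal)
reconstruction = sum(layers)
```

Because every step subtracts a least-squares projection from the residual, the residual energy cannot grow from one scale to the next. In this sketch the long frames capture the steady tones, while the short frames are left to deal with localized events such as the click.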

Key words: multiresolution sinusoidal modeling, parametric audio coding, low-rate audio coding, signal modifications.

Al-Moussawy Raed, Yin Junxun, Song Shaopeng. Compact Sinusoidal Representations of Audio for Scalable Compression and Time/Pitch-Scale Modifications[J]. Journal of South China University of Technology (Natural Science Edition), 2003, 31(7): 22-27.