Electronics, Communication & Automation Technology

Speech Bandwidth Extension Based on Flatten-CNN

  • YANG Jun-Mei,
  • LEI Yang,
  • CHEN Xi-Kun
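  • School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China
YANG Jun-Mei (born 1979), female, Ph.D., associate professor, mainly engaged in research on intelligent information processing technology.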

Received date: 2021-03-29

  Revised date: 2021-09-29

  Online published: 2021-10-08

Supported by

Supported by the National Natural Science Foundation of China (61871188, 61801133)

Abstract

Existing deep learning-based speech bandwidth extension algorithms have several disadvantages. Time-domain algorithms do not extract speech features accurately enough, and their training data volume is too large. Frequency-domain algorithms pay little attention to the association between frames when extracting log power spectrum features, have an odd number of points on the frequency axis, which makes it inconvenient to deepen the network, and ignore time-domain information. Time-frequency two-domain algorithms use relatively complicated models. To solve these problems, this paper proposes a speech bandwidth extension algorithm based on Flatten-CNN. Firstly, to make full use of speech features and reduce the amount of data, the algorithm operates in the frequency domain. Secondly, an improved encoder is proposed to make use of the information along the time axis of the log power spectrum; feature extraction along both axes of the log power spectrum is realized by introducing tile layers. Thirdly, to allow a deeper network, the last point on the frequency axis is removed during data processing and a zero is added when restoring, which ensures that the number of points on the frequency axis is even. Finally, to utilize the time-domain information of the speech signal, a time-domain loss is introduced into the loss function. The effectiveness of the algorithm was verified on the TIMIT and VCTK data sets. The experimental results show that, compared with current mainstream algorithms, the new algorithm improves the quality of the high-bandwidth speech and yields a better listening effect.
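
The frequency-axis step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the FFT size of 512, the helper names (drop_last_bin, restore_last_bin, halvings), and the dummy feature values are assumptions made only for illustration. It shows why an odd bin count blocks repeated stride-2 downsampling, and how dropping the last bin and padding a zero on restoration keeps the shapes consistent.

```python
def drop_last_bin(frames):
    """Remove the last frequency bin from every frame (odd -> even bin count)."""
    return [frame[:-1] for frame in frames]


def restore_last_bin(frames, pad_value=0.0):
    """Append one padding value per frame to recover the original bin count."""
    return [list(frame) + [pad_value] for frame in frames]


def halvings(n):
    """Count how many times n can be halved exactly (stride-2 downsampling steps)."""
    count = 0
    while n > 0 and n % 2 == 0:
        n //= 2
        count += 1
    return count


n_fft = 512              # assumed FFT size; not stated in the abstract
n_bins = n_fft // 2 + 1  # a one-sided spectrum has 257 bins, an odd number

# Dummy "log power spectrum" features: 4 frames x 257 bins (illustrative values).
frames = [[float(k) for k in range(n_bins)] for _ in range(4)]

trimmed = drop_last_bin(frames)       # what the network would consume
restored = restore_last_bin(trimmed)  # shape recovered after the network

print(len(frames[0]), halvings(len(frames[0])))    # 257 0
print(len(trimmed[0]), halvings(len(trimmed[0])))  # 256 8
print(len(restored[0]), restored[0][-1])           # 257 0.0
```

Under these assumptions, the trimmed axis of 256 points can be halved eight times, whereas the original 257 points cannot be halved even once, which is the motivation the abstract gives for removing the last point.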

Cite this article

YANG Jun-Mei, LEI Yang, CHEN Xi-Kun. Speech Bandwidth Extension Based on Flatten-CNN[J]. Journal of South China University of Technology (Natural Science), 2021, 49(11): 87-94. DOI: 10.12141/j.issn.1000-565X.210173
