Electronics, Communication & Automation Technology

Music Source Separation Method Based on Unet Combining SE and BiSRU

  • School of Microelectronics, Tianjin University, Tianjin 300072, China
ZHANG Ruifeng (born 1974), male, Ph.D., associate professor, mainly engaged in research on machine vision and audio processing. E-mail: zhangruifeng@tju.edu.cn

Received date: 2020-09-30

  Revised date: 2020-12-29

  Online published: 2021-01-11

Supported by

Supported by the National Natural Science Foundation of China (61471263) and the Natural Science Foundation of Tianjin (16JCZDJC31100)

Abstract

Music source separation is one of the most important research topics in music information retrieval. Traditional music source separation methods suffer from dependence on assumptions, limited model complexity, and poor representation ability. Time-domain end-to-end deep learning models address these problems, but they take a long time to train and their separation performance still needs to be improved. Therefore, to further improve the representation ability and computational efficiency of time-domain end-to-end separation models, this study proposes an end-to-end network, Unet-SE-BiSRU, based on Demucs, currently the best-performing model for time-domain separation. An attention mechanism is introduced into the generalized encoding and decoding layers, where a squeeze-and-excitation (SE) block extracts features selectively according to the type of audio to be separated. To counter the exploding or vanishing gradients that may occur during training, group normalization is added after the one-dimensional convolution. The bidirectional long short-term memory network is replaced with a bidirectional simple recurrent unit (BiSRU), which improves the parallelism of learning and reduces the number of model parameters. The experimental results show that the improved network raises the signal-to-noise ratio by 0.34 dB, which is, to the best of our knowledge, the best result among time-domain end-to-end methods, and reduces the training time by 3/5.
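
As a rough illustration of the squeeze-and-excitation idea mentioned in the abstract, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes the common SE formulation (global average over time, two small fully connected layers, sigmoid gates that rescale each channel). The function name `se_block`, the toy weights, and the hidden size of 1 are all hypothetical choices made for this example.

```python
import math

def se_block(x, w1, b1, w2, b2):
    """Channel gating in the squeeze-and-excitation style (illustrative only).

    x      : list of C channels, each a list of T samples
    w1, b1 : weights/biases of the reducing layer (hidden x C, hidden)
    w2, b2 : weights/biases of the expanding layer (C x hidden, C)
    Returns (rescaled feature map, per-channel gates).
    """
    # Squeeze: summarize each channel by its average over time.
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation, step 1: fully connected layer followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z)) + b)
         for row, b in zip(w1, b1)]
    # Excitation, step 2: fully connected layer followed by sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, h)) + b)))
             for row, b in zip(w2, b2)]
    # Scale: multiply every sample of a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, x)]
    return scaled, gates

# Toy input: 2 channels, 2 time steps (hypothetical values).
x = [[1.0, 3.0], [2.0, 2.0]]
w1, b1 = [[0.5, -0.5]], [0.0]          # reduce 2 channels to 1 hidden unit
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]   # expand back to 2 channel gates
scaled, gates = se_block(x, w1, b1, w2, b2)
print(gates)
print(scaled)
```

In a trained network the weights would be learned, so the gates would differ per channel and let the model emphasize the features most relevant to the source being separated.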

Cite this article

ZHANG Ruifeng, BAI Jintong, GUAN Xin, et al. Music Source Separation Method Based on Unet Combining SE and BiSRU[J]. Journal of South China University of Technology (Natural Science), 2021, 49(11): 106-115,134. DOI: 10.12141/j.issn.1000-565X.200593
