Enhancing speech corrupted by nonstationary noise using nonnegative matrix factorization with multiple constraints

ZOU Yue-xian; LIU Shi-han; WANG Di-song

Cite this article:	ZOU Yue-xian, LIU Shi-han, WANG Di-song. Enhancing speech corrupted by nonstationary noise using nonnegative matrix factorization with multiple constraints[J]. Control Theory & Applications, 2017, 34(6): 761-768.

Received: 2016-08-11; Revised: 2017-03-24

DOI: 10.7641/CTA.2017.60600

2017,34(6):761-768

Keywords: speech enhancement; low-rank constraint; sparsity constraint; nonnegative matrix factorization; nonstationary noise

Funding: National Natural Science Foundation of China; other

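Author	Affiliation	E-mail
ZOU Yue-xian^*	Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University	cynthiazou@qq.com
LIU Shi-han	Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University
WANG Di-song	Modern Signal and Data Processing Laboratory, School of Information Engineering, Peking University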

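Abstract (translated from the Chinese)

Speech enhancement in low signal-to-noise ratio (SNR), nonstationary-noise environments remains an open and challenging task. To improve the performance of conventional speech enhancement algorithms based on nonnegative matrix factorization (NMF), and taking into account both the time-frequency sparsity of speech signals and the low-rank property of nonstationary noise signals, this paper proposes a multi-constraint NMF speech enhancement algorithm (MC–NMFSE). In the training stage, a clean-speech training set and a noise training set are used to build a speech dictionary and a noise dictionary, respectively. In the enhancement stage, a sparsity constraint on the speech component and a low-rank constraint on the noise signal are added to the NMF objective function, so that MC–NMFSE obtains a better representation of the speech component from the noisy speech and thereby improves the enhancement result. Experiments under a large number of nonstationary noise conditions and SNR conditions show that, compared with conventional NMF-based speech enhancement, MC–NMFSE achieves lower speech distortion and better suppression of nonstationary noise.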

Abstract (English)

The enhancement of speech corrupted by nonstationary noise under low signal-to-noise ratio (SNR) conditions remains an open and challenging task. To improve traditional nonnegative matrix factorization (NMF) based speech enhancement, a method termed multi-constraint NMF speech enhancement (MC–NMFSE) is developed, which jointly exploits the sparsity of speech in the time-frequency domain and the low-rank property of nonstationary noise. In the training stage, the speech and noise dictionaries are constructed from speech and noise training sets, respectively. In the speech enhancement stage, the data matrix of the noisy speech is factorized into two nonnegative sub-matrices under sparsity and low-rank constraints, which yields a better representation of the speech components in speech corrupted by nonstationary noise. Extensive experiments under different nonstationary noise conditions and different SNRs have been carried out to compare MC–NMFSE with the traditional NMF speech enhancement method (NMF–SpEnM). Experimental results demonstrate that MC–NMFSE has lower speech distortion and a better capability to suppress nonstationary noise.
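
The two-stage procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' MC–NMFSE implementation: the abstract does not give the objective function, so the code assumes a Euclidean cost with multiplicative updates and an L1 penalty on the speech activations, i.e. it minimizes 0.5*||V - [W_s W_n] H||_F^2 + lambda*sum(H_s) with the dictionaries held fixed. The low-rank constraint on the noise is not reproduced, because its form is not stated here. The function names `nmf` and `enhance` and the parameters `rank`, `sparsity` and `n_iter` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def nmf(V, rank, n_iter=200, seed=0):
    """Training stage (assumed form): plain Euclidean NMF, V ~ W @ H."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    # Normalize dictionary columns and rescale activations to compensate.
    norms = np.linalg.norm(W, axis=0) + EPS
    return W / norms, H * norms[:, None]


def enhance(V_noisy, W_speech, W_noise, sparsity=0.1, n_iter=200, seed=0):
    """Enhancement stage (assumed form): dictionaries fixed, only the
    activations are updated, with an L1 penalty on the speech rows."""
    W = np.hstack([W_speech, W_noise])
    k_s = W_speech.shape[1]
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_noisy.shape[1])) + EPS
    # The penalty applies to speech activations only; noise rows get zero.
    penalty = np.zeros((W.shape[1], 1))
    penalty[:k_s] = sparsity
    WtV = W.T @ V_noisy
    WtW = W.T @ W
    for _ in range(n_iter):
        H *= WtV / (WtW @ H + penalty + EPS)
    S = W_speech @ H[:k_s]   # estimated speech magnitude
    N = W_noise @ H[k_s:]    # estimated noise magnitude
    mask = S / (S + N + EPS)  # Wiener-like soft mask
    return mask * V_noisy, mask
```

In use, `V_noisy` would be a nonnegative magnitude spectrogram (frequency bins by frames), `W_speech` and `W_noise` would come from `nmf` applied to clean-speech and noise training spectrograms, and the masked magnitude would be combined with the noisy phase before the inverse transform.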