|
[1]
|
M. A. Abd-El-Fattah and M. I. Dessouky, Speech deconvolution as an inverse problem, International Journal of Speech Technology, 14 (2011), 273-284.
|
|
[2]
|
J. B. Allen and L. R. Rabiner, A unified approach to short-time fourier analysis and synthesis, Proceedings of the IEEE, 65 (2005), 1558-1564.
doi: 10.1109/PROC.1977.10770.
|
|
[3]
|
S. Arridge, P. Maass, O. Öktem and C. B. Schönlieb, Solving inverse problems using data-driven models, Acta Numerica, 28 (2019), 1-174.
doi: 10.1017/S0962492919000059.
|
|
[4]
|
M. Benning and M. Burger, Modern regularization methods for inverse problems, Acta Numerica, 27 (2018), 1–111.
|
|
[5]
|
S. F. Boll, Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans Acoust Speech Signal Process, ASSP-27 (1979), 113-120.
doi: 10.1109/TASSP.1979.1163209.
|
|
[6]
|
D. de Oliveira, T. Peer and T. Gerkmann, Efficient transformer-based speech enhancement using long frames and stft magnitudes, Interspeech 2022, ISCA, (2022), 2948-2952.
doi: 10.21437/Interspeech.2022-10781.
|
|
[7]
|
A. Farina, Simultaneous Measurement of Impulse Response and Distortion with a Swept-Sine Technique, Audio engineering society convention 108. Audio Engineering Society, 2000.
|
|
[8]
|
P. Gonzalez, Z.-H. Tan, J. Østergaard, J. Jensen, T. S. Alstrøm and T. May, Investigating the design space of diffusion models for speech enhancement, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32 (2024), 4486-4500.
doi: 10.1109/TASLP.2024.3473319.
|
|
[9]
|
P. J. Goulart and Y. Chen, Clarabel: An interior-point solver for conic programs with quadratic objectives, (2024), https://arXiv.org/abs/2405.12762.
|
|
[10]
|
A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen, R. Prenger, S. Satheesh, S. Sengupta, A. Coates and A. Y. Ng, Deep speech: Scaling up end-to-end speech recognition, (2014), https://arXiv.org/abs/1412.5567.
|
|
[11]
|
R. C. Heyser, Acoustical measurements by time delay spectrometry, Journal of the Audio Engineering Society, 15 (1967), 370-382.
|
|
[12]
|
Y. Hu, Y. Liu, S. Lv, M. Xing, S. Zhang, Y. Fu, J. Wu, B. Zhang and L. Xie, Dccrn: Deep complex convolution recurrent network for phase-aware speech enhancement, Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech, 2020 (2020), 2472-2476.
|
|
[13]
|
F. Jacobsen and P. M. Juhl, Fundamentals of General Linear Acoustics, John Wiley & Sons, 2013.
|
|
[14]
|
W. J. Klippel, Active reduction of nonlinear loudspeaker distortion, Proceedings of Active 99: the International Symposium on Active Control of Sound and Vibration, 1, 2, 1135-1146.
|
|
[15]
|
J. S. Lim and A. V. Oppenheim, Enhancement and bandwidth compression of noisy speech, Proceedings of the IEEE.
|
|
[16]
|
H. Liu, X. Liu, Q. Kong, Q. Tian, Y. Zhao, D. L. Wang, C. Huang and Y. Wang, Voicefixer: A unified framework for high-fidelity speech restoration, Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech, 2022 (2022), 4232-4236.
|
|
[17]
|
M. Ludvigsen, E. Karvonen, M. Juvonen and S. Siltanen, Helsinki speech challenge 2024 – competition and open dataset, Applied Mathematics for Modern Challenges, 6 (2025), 24-44.
|
|
[18]
|
S. Müller, Measuring transfer-functions and impulse responses, Handbook of Signal Processing in Acoustics, (2008), 65-85.
doi: 10.1007/978-0-387-30441-0_5.
|
|
[19]
|
K. Prawda, S. J. Schlecht and V. Välimäki, Time variance in measured room impulse responses, Proceedings of the 10th Convention of the European Acoustics Association Forum Acusticum, (2023), 1-8.
|
|
[20]
|
J. G. Proakis, Digital Signal Processing: Principles Algorithms and Applications, Pearson Education India, 2001.
|
|
[21]
|
M. Rajan, Convergence analysis of a regularized approximation for solving fredholm integral equations of the first kind, Journal of Mathematical Analysis and Applications, 279 (2003), 522-530, https://www.sciencedirect.com/science/article/pii/S0022247X03000271.
doi: 10.1016/S0022-247X(03)00027-1.
|
|
[22]
|
J. Richter, S. Welker, J. M. Lemercier, B. Lay and T. Gerkmann, Speech enhancement and dereverberation with diffusion-based generative models, IEEE/ACM Transactions on Audio Speech and Language Processing, 31 (2023), 2351-2364.
doi: 10.1109/TASLP.2023.3285241.
|
|
[23]
|
M. R. Schroeder, New method of measuring reverberation time, Journal of the Acoustical Society of America, 37 (1965), 409-412.
doi: 10.1121/1.1909343.
|
|
[24]
|
Silero Team, Silero vad: Pre-trained enterprise-grade voice activity detector, https://github.com/snakers4/silero-vad/tree/v4.0, (2024).
|
|
[25]
|
N. Upadhyay and A. Karmakar, Speech enhancement using spectral subtraction-type algorithms: A comparison and simulation study, Procedia Computer Science, 54 (2015), 574-584.
|