Authors :
D. R. Sandeep; K. Jahnavi; B. L. Gagana; B. S. Devika; A. V. S. S. S. Vyshnavi; A. M. Gowri
Volume/Issue :
Volume 11 - 2026, Issue 5 - May
Google Scholar :
https://tinyurl.com/bdh8zxmt
Scribd :
https://tinyurl.com/3r9watm7
DOI :
https://doi.org/10.38124/ijisrt/26May223
Abstract :
Speech signals captured in real time are often corrupted by background noise, which makes the speech hard to
hear and understand clearly. Such noise can come from sources like traffic, machinery, or other environmental
disturbances, and it makes speech processing tasks more difficult. Speech enhancement techniques try to remove the
noise while preserving the important parts of the original speech signal.
This paper proposes a method that combines classical signal processing techniques with subspace-based noise
reduction methods. The system processes speech signals frame by frame, decomposes the covariance matrix, and projects
the signal onto a subspace to reduce noise. Filtering and smoothing operations are also applied to improve sound quality.
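The frame-by-frame subspace projection described above can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. The following are assumptions made only for illustration: the noise is white with a known variance `noise_var`, each frame is embedded into sliding windows of length `order`, the per-eigenvalue gain is `(lambda - noise_var) / lambda`, and frames are recombined with a Hann-weighted overlap-add. The function names `subspace_denoise_frame` and `enhance` are hypothetical, and the additional filtering and smoothing stages are not shown.

```python
import numpy as np


def subspace_denoise_frame(frame, noise_var, order=16):
    """Denoise one frame by attenuating directions dominated by noise.

    Assumes white noise with known variance (an illustrative assumption).
    """
    n = len(frame)
    rows = n - order + 1
    # Embed the frame: each row is a sliding window of `order` samples.
    X = np.stack([frame[i:i + order] for i in range(rows)])
    # Sample covariance matrix of the windowed vectors.
    R = X.T @ X / rows
    # Eigendecomposition of the symmetric covariance matrix.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Estimated clean-signal eigenvalues; directions at or below the
    # noise floor receive zero gain and are removed.
    clean_vals = np.maximum(eigvals - noise_var, 0.0)
    gains = clean_vals / np.maximum(eigvals, 1e-12)
    # Linear estimator H = V diag(g) V^T (symmetric).
    H = eigvecs @ np.diag(gains) @ eigvecs.T
    Y = X @ H
    # Average the overlapping windows back into a one-dimensional frame.
    out = np.zeros(n)
    counts = np.zeros(n)
    for i in range(rows):
        out[i:i + order] += Y[i]
        counts[i:i + order] += 1
    return out / counts


def enhance(signal, noise_var, frame_len=256, hop=128, order=16):
    """Process a signal frame by frame and recombine with overlap-add."""
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len + 1)[:-1]  # periodic Hann window
    out = np.zeros(len(signal))
    norm = np.zeros(len(signal))
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        cleaned = subspace_denoise_frame(frame, noise_var, order)
        out[start:start + frame_len] += window * cleaned
        norm[start:start + frame_len] += window
    # Samples not covered by any full frame stay at zero.
    return out / np.maximum(norm, 1e-8)
```

In practice the noise variance would have to be estimated, for example from speech-free segments, and the frame length, hop, and embedding order chosen here are arbitrary starting values rather than settings taken from the paper.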
The system is evaluated using Signal-to-Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ),
Short-Time Objective Intelligibility (STOI), and Mean Opinion Score (MOS). MATLAB was used to verify these
metrics, and the results obtained with the proposed method show a significant improvement in speech quality and
intelligibility in noisy conditions.
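Of the metrics listed, SNR is the simplest to state in code. The sketch below is a hedged Python illustration, not the authors' MATLAB code; it assumes a clean reference signal is available and is time-aligned with the signal being scored. The function name `snr_db` is hypothetical. PESQ and STOI require dedicated implementations, and MOS comes from listening tests, so none of those are shown.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of `estimate` measured against the reference `clean`.

    Everything in `estimate` that differs from `clean` counts as noise.
    """
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if signal_power == 0:
        raise ValueError("reference signal has zero power")
    if noise_power == 0:
        return math.inf  # estimate matches the reference exactly
    return 10.0 * math.log10(signal_power / noise_power)
```

An SNR improvement can then be reported as `snr_db(clean, enhanced) - snr_db(clean, noisy)`, where `enhanced` and `noisy` are the processed and unprocessed versions of the same utterance.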
Keywords :
Signal-to-Noise Ratio (SNR), Perceptual Evaluation of Speech Quality (PESQ), Short-Time Objective Intelligibility (STOI), Mean Opinion Score (MOS), Signal Subspace Method, Signal Enhancement.