Authors :
Bi Tra Jean Claude YOUAN; N’tcho Assoukpou Jean GNAMELE; Digrais Moïse MAMBE
Volume/Issue :
Volume 9 - 2024, Issue 11 - November
Google Scholar :
https://tinyurl.com/n4dt9yn6
Scribd :
https://tinyurl.com/5eep5cc2
DOI :
https://doi.org/10.38124/ijisrt/IJISRT24NOV1194
Abstract :
In the work presented in this article, we highlight the value of choosing low-frequency ultrasound for the computation of Mel cepstral coefficients (MFCCs), combined with a specific restructuring of the study data. These coefficients were used as descriptors for the classification of sound samples produced by chainsaws in forest environments, with the aim of combating the destruction of Ivorian fauna and flora. Three restructuring methods were compared: the Time Domain Channel Fusion method, the Cepstral Domain Channel Fusion method and the One Channel method. To do so, we first computed the MFCCs over different frequency bands within the acoustic band [170 Hz - 22000 Hz]. The selected bandwidths range from 1 kHz to 21 kHz, increasing by 2 kHz at each new computation phase. Low-frequency ultrasound produced better classification rates than the other acoustic bands. The best rate, 98.40%, was obtained for the 3 kHz bandwidth over the acoustic band [21170 Hz - 24170 Hz] combined with the Time Domain Channel Fusion method. A study of the ultrasonic bands derived from the central frequencies of the octave bands was then carried out. A comparison of the sample classification rates led to selecting the band [11313 Hz - 22627 Hz], derived from the central frequency of the 16 kHz octave band, as the best ultrasonic band for the computation of the MFCCs.
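As an illustration of the band-limited MFCC computation and KNN classification summarised above, the following minimal Python sketch assumes the librosa and scikit-learn libraries. The 16 kHz octave-band edges (fc/sqrt(2) ≈ 11313 Hz and fc·sqrt(2) ≈ 22627 Hz) follow the values quoted in the abstract; the file names, feature averaging and choice of k are hypothetical and do not reproduce the authors' exact pipeline.

    # Minimal sketch: MFCCs restricted to an ultrasonic band, then KNN classification.
    import numpy as np
    import librosa
    from sklearn.neighbors import KNeighborsClassifier

    FC = 16_000                                    # 16 kHz octave-band central frequency
    FMIN, FMAX = FC / np.sqrt(2), FC * np.sqrt(2)  # band edges ~11313 Hz and ~22627 Hz

    def band_limited_mfcc(path, sr=48_000, n_mfcc=13):
        """Average the MFCCs computed only on the [FMIN, FMAX] band of one recording."""
        y, sr = librosa.load(path, sr=sr)          # sr must exceed 2*FMAX (Nyquist)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                    fmin=FMIN, fmax=FMAX)  # restrict the Mel filter bank
        return mfcc.mean(axis=1)                   # one feature vector per sample

    # Hypothetical file lists: chainsaw recordings vs. forest background noise.
    chainsaw_files = ["chainsaw_01.wav", "chainsaw_02.wav"]
    forest_files   = ["forest_01.wav", "forest_02.wav"]

    X = np.array([band_limited_mfcc(f) for f in chainsaw_files + forest_files])
    y = np.array([1] * len(chainsaw_files) + [0] * len(forest_files))

    knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)  # KNN, as in the keywords
    print(knn.predict(X))                                # sanity check on the training set

The fmin/fmax arguments simply restrict the Mel filter bank to the chosen band; the Time Domain Channel Fusion step described in the abstract, which combines the recording channels before feature extraction, is not shown here.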
Keywords :
Ultrasound, Octave Band, KNN, Data Structure.