UZBEK COMMANDS RECOGNITION BY PROCESSING THE SPECTROGRAM IMAGE

Musayev M M; Khujayorov I Sh; Abdullaeva M I; Ochilov M M


This paper describes the most common image-based approaches to speech
recognition: a convolutional neural network (CNN), and the two-dimensional discrete cosine
transform (2D-DCT) combined with the machine learning classifiers KNN, SVM and RF. These
algorithms are evaluated for their applicability to the Uzbek language, and a comparative
analysis of their accuracy and recognition rate is presented. Command words of the Uzbek
language were chosen for the experiments. According to the results, both methods achieve
high recognition accuracy: 92% (CNN) and 90% (2D-DCT+Zigzag+SVM). The combinations
2D-DCT+Zigzag+KNN and 2D-DCT+Zigzag+RF, with average recognition accuracy of 86% and 85%,
respectively, were also considered.
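
As an illustration of the 2D-DCT+Zigzag feature extraction named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the spectrogram is available as a small 2D list of magnitudes, uses an orthonormal DCT-II, and reads the coefficients in the JPEG-style zigzag order. The function names and the `n_coeffs` parameter are hypothetical; the resulting vector is what a classifier such as SVM, KNN or RF would receive.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a sequence of numbers."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def zigzag(matrix):
    """Read a matrix along anti-diagonals, alternating direction."""
    h, w = len(matrix), len(matrix[0])
    out = []
    for s in range(h + w - 1):
        diag = [(i, s - i) for i in range(h) if 0 <= s - i < w]
        if s % 2 == 0:
            diag.reverse()  # even diagonals run from bottom-left to top-right
        out.extend(matrix[i][j] for i, j in diag)
    return out


def dct_zigzag_features(spectrogram, n_coeffs):
    """Keep the first n_coeffs low-frequency coefficients (assumed cut-off)."""
    return zigzag(dct_2d(spectrogram))[:n_coeffs]


if __name__ == "__main__":
    toy = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 3.0, 4.0, 5.0],
           [3.0, 4.0, 5.0, 6.0],
           [4.0, 5.0, 6.0, 7.0]]
    print(dct_zigzag_features(toy, 6))
```

The zigzag scan places the low-frequency coefficients, which carry most of the energy of a smooth image, at the start of the vector, so truncating to the first few values gives a compact descriptor.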

Journal: Technical science and innovation
Issue: 2022, No. 2(12)

Date created in the UzSCI system: 07-09-2022

Publication date: 30-08-2022

Primary language: English

Pages: 55-61

Keywords

Spectrogram image

feature extraction

speech classification

No. Author's full name Position Organization

1 Musayev M.M. teacher TUIT

2 Khujayorov I.S. teacher Samarkand branch of TUIT

3 Abdullaeva M.I. teacher TUIT

4 Ochilov M.M. teacher TUIT
