This paper describes the most common image-approach algorithms for speech recognition:
a convolutional neural network (CNN), and the two-dimensional discrete cosine transform
(2D-DCT) combined with the machine learning classifiers KNN, SVM and RF. These algorithms
are evaluated for their applicability to the Uzbek language, and a comparative analysis of
their accuracy and recognition rate is given. Command words of the Uzbek language were
chosen for the experiments. According to the results, both methods achieve high recognition
accuracy: 92% (CNN) and 90% (2D-DCT+Zigzag+SVM). The combinations 2D-DCT+Zigzag+KNN and
2D-DCT+Zigzag+RF, with average recognition accuracies of 86% and 85%, respectively, were
also considered.
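
The 2D-DCT+Zigzag feature step named in the abstract could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the number of retained coefficients, and the random array standing in for a spectrogram image are all hypothetical. The resulting feature vector is what a classifier such as KNN, SVM or RF would then receive.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return c


def dct2(image):
    """2D DCT-II: apply the 1D transform along rows and columns."""
    rows, cols = image.shape
    return dct_matrix(rows) @ image @ dct_matrix(cols).T


def zigzag_indices(rows, cols):
    """List (row, col) pairs in JPEG-style zigzag order.

    Cells are grouped by anti-diagonal (row + col); the traversal
    direction alternates from one diagonal to the next.
    """
    cells = ((r, c) for r in range(rows) for c in range(cols))
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def dct_zigzag_features(image, n_coeffs):
    """Keep the first n_coeffs low-frequency DCT coefficients."""
    coeffs = dct2(np.asarray(image, dtype=float))
    order = zigzag_indices(*coeffs.shape)[:n_coeffs]
    return np.array([coeffs[r, c] for r, c in order])


if __name__ == "__main__":
    # Hypothetical stand-in for a spectrogram image of one command word.
    rng = np.random.default_rng(0)
    spectrogram = rng.random((64, 64))
    features = dct_zigzag_features(spectrogram, n_coeffs=32)
    print(features.shape)
```

The zigzag scan orders coefficients from low to high spatial frequency, so truncating it keeps the coarse structure of the image in a short, fixed-length vector.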

  • Date of creation in the UzSCI system: 07-09-2022
  • Read count: 145
  • Date of publication: 30-08-2022
  • Main Language: English
  • Pages: 55-61

No.  Author name      Position  Name of organisation
1    Musayev M.M.     teacher   TUIT
2    Khujayorov I.S.  teacher   Samarkand branch of TUIT
3    Abdullaeva M.I.  teacher   TUIT
4    Ochilov M.M.     teacher   TUIT