This paper describes the most common image-based approaches to speech recognition:
a convolutional neural network (CNN), and the two-dimensional discrete cosine transform
(2D-DCT) combined with machine learning classifiers, namely k-nearest neighbors (KNN),
support vector machine (SVM) and random forest (RF). The algorithms are evaluated for
their applicability to the Uzbek language and compared in terms of accuracy and
recognition rate. Command words of the Uzbek language were chosen for the experiments.
The results show that both methods achieve high recognition accuracy: 92% for the CNN
and 90% for 2D-DCT+Zigzag+SVM. The combinations 2D-DCT+Zigzag+KNN and 2D-DCT+Zigzag+RF
were also considered, with average recognition accuracies of 86% and 85%, respectively.
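The 2D-DCT+Zigzag feature extraction named above can be illustrated with a minimal sketch. It assumes the input is a small spectrogram-like matrix of numbers and uses a naive orthonormal DCT-II with a JPEG-style zigzag scan. The function names and the `n_coeffs` parameter are hypothetical, and the paper's actual spectrogram settings, matrix size, and number of retained coefficients are not stated here. The resulting feature vector would then be passed to a classifier such as KNN, SVM or RF, which is not shown.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (naive O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(matrix):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in matrix]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    # cols holds transformed columns; transpose back to row-major layout.
    return [list(r) for r in zip(*cols)]


def zigzag(matrix):
    """Read a matrix along anti-diagonals in zigzag order (low frequencies first)."""
    h, w = len(matrix), len(matrix[0])
    out = []
    for s in range(h + w - 1):
        idx = [(i, s - i) for i in range(h) if 0 <= s - i < w]
        if s % 2 == 0:
            idx.reverse()  # even diagonals run bottom-left to top-right
        out.extend(matrix[i][j] for i, j in idx)
    return out


def dct_zigzag_features(spectrogram, n_coeffs):
    """Keep the first n_coeffs zigzag-ordered DCT coefficients (hypothetical helper)."""
    return zigzag(dct_2d(spectrogram))[:n_coeffs]
```

Keeping only the leading zigzag coefficients retains the low-frequency part of the transform, so every word is represented by a fixed-length vector regardless of how many coefficients the full matrix has.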
No. | Author name | Position | Organization |
---|---|---|---|
1 | Musayev M.M. | teacher | TUIT |
2 | Khujayorov I.S. | teacher | Samarkand branch of TUIT |
3 | Abdullaeva M.I. | teacher | TUIT |
4 | Ochilov M.M. | teacher | TUIT |