  • Views: 46
  • Publication date: 02-08-2024
  • Main language: Uzbek
  • Pages: 59-67
Ўзбек

Maqolada matn tilini aniqlashning sun’iy intellekt algoritmlariga asoslangan yondashuvlari muhokama qilinadi. Matn tilini aniqlash sun’iy intellektning sinflashtirish masalasi bo‘lganligi sababli, maqolada mashinali o‘qitish va neyron tarmoq modellarining bir nechta sinflashtirish algoritmlari imkoniyatlari ko‘rib o‘tiladi. Ishda o‘zbek, ingliz, rus, qoraqalpoq tillarini aniqlovchi model uchun o‘quv ma’lumotlar to‘plamini shakllantirish masalasi ko‘riladi. Shuningdek, matn tilini aniqlashda foydalanilgan modellarning aniqlik ko‘rsatkichlari bo‘yicha qiyosiy tahlil amalga oshiriladi.

English

The article discusses approaches to identifying the language of a text based on artificial intelligence algorithms. Since text language identification is a classification problem in artificial intelligence, the article examines the capabilities of several classification algorithms drawn from machine learning and neural network models. The study addresses the construction of a training dataset for a model that identifies the Uzbek, English, Russian, and Karakalpak languages. In addition, the models used for text language identification are compared by their accuracy metrics.

Русский

В статье рассматриваются подходы к определению языка текста на основе алгоритмов искусственного интеллекта. Поскольку определение языка текста является задачей классификации в искусственном интеллекте, в статье рассматриваются возможности нескольких алгоритмов классификации на основе моделей машинного обучения и нейронных сетей. Рассмотрен вопрос формирования обучающего набора данных для модели, определяющей узбекский, английский, русский и каракалпакский языки. Также проводится сравнительный анализ показателей точности моделей, используемых для определения языка текста.
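
The abstract frames language identification as a classification task. As a rough illustration of that framing only, the sketch below trains a character-trigram multinomial Naive Bayes classifier with add-one smoothing using the Python standard library. It is not one of the models evaluated in the article: the class name `NgramLanguageIdentifier`, the helper `char_ngrams`, and the toy sentences are all assumptions made up for this example, and Karakalpak is left out of the toy data.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLanguageIdentifier:
    """Hypothetical multinomial Naive Bayes over character n-grams."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = defaultdict(Counter)  # language -> n-gram counts
        self.doc_counts = Counter()               # language -> number of samples
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, samples):
        """samples: iterable of (text, language_label) pairs."""
        for text, lang in samples:
            grams = char_ngrams(text, self.n)
            self.ngram_counts[lang].update(grams)
            self.doc_counts[lang] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        """Return the label with the highest smoothed log-probability."""
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_lang, best_score = None, -math.inf
        for lang, counts in self.ngram_counts.items():
            total = sum(counts.values())
            # Log prior plus add-one smoothed log-likelihood of each n-gram.
            score = math.log(self.doc_counts[lang] / total_docs)
            for gram in grams:
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_lang, best_score = lang, score
        return best_lang


# Toy training data invented for this sketch (not the article's dataset).
TRAIN = [
    ("the weather is nice today", "en"),
    ("this book is very interesting", "en"),
    ("we are going to school", "en"),
    ("bugun havo juda yaxshi", "uz"),
    ("bu kitob juda qiziqarli", "uz"),
    ("biz maktabga boramiz", "uz"),
    ("сегодня хорошая погода", "ru"),
    ("эта книга очень интересная", "ru"),
    ("мы идем в школу", "ru"),
]

model = NgramLanguageIdentifier(n=3).fit(TRAIN)

for sentence in ["this is a good book", "kitob juda yaxshi", "это хорошая книга"]:
    print(sentence, "->", model.predict(sentence))
```

Character n-grams are used here because they need no tokenizer and capture script and spelling patterns directly. A realistic system would need far more data per language, and closely related languages written in the same script would be harder to separate than this toy example suggests.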

No. | Author | Position | Organization
1 | Xujayarov I.S. | Head of Department | Tashkent University of Information Technologies, Samarkand branch
2 | Ochilov M. | Associate Professor | TATU
3 | Xolmatov O. | Doctoral student | TATU
4 | Jurayev D. | Doctoral student | TATU