RULES FOR IDENTIFYING QUANTITIES IN UZBEK-LANGUAGE TEXT USING THE NER APPROACH

This article presents a comparative analysis of the fundamentals and existing methods of rule-based named entity recognition (NER) and outlines the advantages of the rule-based approach. In particular, it discusses the task of extracting quantity expressions from text using NER. Rule-based NER offers a versatile and customizable approach for identifying and extracting quantities such as measurements, percentages, and monetary units. By developing linguistic rules and keywords, domain experts can adapt the system to the specific characteristics of their field, which supports accuracy and consistency in quantity identification. As a result, the article proposes several rules for extracting quantities from Uzbek-language text using rule-based NER.
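The keyword-plus-rule idea described in the abstract can be illustrated with a minimal Python sketch. The article's actual rules are not reproduced on this page, so everything below is an assumption made for illustration: the unit keyword lists, the small set of Uzbek case suffixes, the label names (`MEASURE`, `PERCENT`, `MONEY`), and the helper `extract_quantities` are all hypothetical. The sketch matches a number followed by a known unit keyword and assigns a label based on the keyword.

```python
import re
from typing import List, NamedTuple

# Hypothetical keyword lists, chosen only for illustration (not the article's rule set).
UNIT_KEYWORDS = {
    "MEASURE": ["kilogramm", "kg", "gramm", "kilometr", "km", "metr", "litr"],
    "PERCENT": ["foiz", "%"],
    "MONEY": ["so'm", "dollar", "yevro"],
}
# A few Uzbek case suffixes that may attach to the unit word (assumed, not exhaustive).
CASE_SUFFIXES = ["ning", "dan", "ga", "ni", "da"]

# Map typographic apostrophes to ASCII so that "so‘m" and "so’m" match "so'm".
APOSTROPHES = str.maketrans({"\u2018": "'", "\u2019": "'", "\u02bb": "'", "`": "'"})

UNIT_TO_LABEL = {unit: label for label, units in UNIT_KEYWORDS.items() for unit in units}

# Longest keywords first, so a longer unit is tried before a shorter one.
_units = sorted(UNIT_TO_LABEL, key=len, reverse=True)
QUANTITY_RE = re.compile(
    r"(?<![\w.,])(?P<value>\d+(?:[.,]\d+)?)\s*"                 # number, optional decimal part
    r"(?P<unit>" + "|".join(map(re.escape, _units)) + r")"      # unit keyword
    r"(?:" + "|".join(CASE_SUFFIXES) + r")?(?!\w)",             # optional suffix, then a boundary
    re.IGNORECASE,
)


class Quantity(NamedTuple):
    text: str     # matched span (after apostrophe normalization)
    value: float  # numeric value, decimal comma converted to a point
    unit: str     # unit keyword in lower case
    label: str    # quantity type taken from UNIT_KEYWORDS


def extract_quantities(text: str) -> List[Quantity]:
    """Return every number + unit-keyword pair found in the text."""
    normalized = text.translate(APOSTROPHES)
    results = []
    for m in QUANTITY_RE.finditer(normalized):
        unit = m.group("unit").lower()
        value = float(m.group("value").replace(",", "."))
        results.append(Quantity(m.group(0), value, unit, UNIT_TO_LABEL[unit]))
    return results


if __name__ == "__main__":
    sample = "Do'konda 2,5 kg olma 15000 so\u2018mga sotildi, narx 10 foizga oshdi."
    for quantity in extract_quantities(sample):
        print(quantity)
```

This sketch deliberately leaves out cases a fuller rule set would need to cover, such as numbers written as words, thousands grouped with spaces, and ranges. Adapting it to a specific domain would mostly mean editing the keyword lists, which is the kind of customization the abstract attributes to rule-based NER.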

Journal: Ilg’or texnologiyalar va tabiiy fanlar xalqaro jurnali
Issue: Vol 2(4), 2023
DOI: 10.24412/2181-144X-2023-2-23-32

Date added to UzSCI: 06-04-2024

Publication date: 23-06-2023

Article language: Uzbek

Pages: 23

Keywords

unit

rule

NER

keyword

Russian

This article reviews the fundamentals of rule-based NER and the methods built on it, and notes the advantages of rule-based NER. In particular, it discusses extracting quantities from text with NER. Rule-based NER offers a universal and configurable approach for identifying and extracting quantities such as measurements, percentages, and currency. By developing linguistic rules and keywords, specialists in a field can adapt the system to that field's specifics and ensure accurate and consistent quantity identification. The article concludes by proposing several rules for extracting quantities from Uzbek-language text using rule-based NER.

Keywords

rule

NER

keyword

unit of measurement

English

This article presents a comparative analysis of the fundamentals and existing methods of rule-based Named Entity Recognition (NER) and describes the advantages of rule-based NER. In particular, it discusses the problem of extracting quantities from text using NER. Rule-based NER offers a versatile and customizable approach for identifying and extracting quantities such as measurements, percentages, and currencies. By developing linguistic rules and keywords, domain experts can tailor the system to the specifics of their field, which supports accuracy and consistency in quantity identification. The article concludes by proposing several rules for extracting quantities from Uzbek-language text using rule-based NER.

Keywords

unit

rule

NER

keyword

No.	Author name	Position	Organization
1	Kenjaev X.B.	assistant	Muhammad al-Xorazmiy nomidagi Nukus filiali
2	Toliev X.I.	doctoral student	Muhammad al-Xorazmiy nomidagi TATU
