17 February 2023
Primary language: Uzbek

USING THE SPACY MODULE IN NATURAL LANGUAGE PROCESSING (NLP)


ABSTRACT

This article reviews the architecture and tools of spaCy, a module written in the Python programming language, for working with text in natural language processing (NLP), one of the main branches of computational linguistics. A natural-language text consists of individual units (symbols) and can be divided into several interrelated parts that belong to different linguistic levels. Accordingly, the article presents methods for tokenizing text with the spaCy library and for using the lemma, POS, tag, dep, shape, alpha and stop attributes produced by the pipeline.
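The attributes listed in the abstract can be read from each token after a text has passed through a spaCy pipeline. The following is a minimal sketch, not the article's own code: the helper names `load_pipeline` and `token_table`, the sample sentence, and the choice of the `en_core_web_sm` model are assumptions made for illustration. If that model is not installed, the sketch falls back to a blank English pipeline, in which case only the tokenizer-level attributes (text, shape, alpha, stop) are filled in and lemma, POS, tag and dep may be empty.

```python
import spacy


def load_pipeline(model_name: str = "en_core_web_sm"):
    """Load a trained pipeline, or a blank English tokenizer if it is missing."""
    try:
        return spacy.load(model_name)
    except OSError:
        # The trained model is not installed: only tokenizer-level
        # attributes (text, shape, alpha, stop) will be populated.
        return spacy.blank("en")


def token_table(doc):
    """Collect the attributes named in the abstract for every token."""
    return [
        {
            "text": token.text,
            "lemma": token.lemma_,   # base form of the word
            "pos": token.pos_,       # coarse part-of-speech tag
            "tag": token.tag_,       # fine-grained part-of-speech tag
            "dep": token.dep_,       # syntactic dependency label
            "shape": token.shape_,   # orthographic shape, e.g. "Xxxxx"
            "alpha": token.is_alpha, # True if the token is alphabetic
            "stop": token.is_stop,   # True if the token is a stop word
        }
        for token in doc
    ]


nlp = load_pipeline()
doc = nlp("Apple is looking at buying a startup for $1 billion.")
rows = token_table(doc)
for row in rows:
    print(row)
```

With a trained model, every field is populated by the corresponding pipeline component (tagger, parser, lemmatizer). Tokenization itself happens before any of these components run, which is why the blank pipeline can still report shape, alpha and stop.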


Tags

# part-of-speech # token # lemmatization # tokenization # Python # natural language processing # NLP # spaCy # parser # pipeline architecture


