Abstract
Part-of-speech (POS) tagging assigns a grammatical category, such as "Noun" or "Verb", to every word in a text corpus. It is widely used in applications such as sentiment analysis and machine translation, as well as other linguistic and computational tasks. For Pashto, however, the language's distinctive features and limited resources make POS tagging considerably harder. This study addresses Pashto POS tagging by applying six widely used machine-learning and deep-learning techniques. The evaluation uses a curated, annotated dataset of Pashto text, compiled from diverse sources and labeled with POS tags, which provides a reliable basis for performance analysis. Experimental results show that these methods capture the grammatical patterns of Pashto text effectively. Among the tested algorithms, K-Nearest Neighbor (KNN) and Decision Tree achieved the highest accuracies, 94.19% and 94.34%, respectively. Random Forest and Support Vector Machine (SVM) were also competitive, each exceeding 90% accuracy. Multi-Layer Perceptron (MLP), evaluated with activation functions such as ReLU and Tanh, reached 87.25%, while Naïve Bayes, tested in variants such as Multinomial NB and Gaussian NB, reached 83.33%. These results highlight the potential of machine learning techniques for overcoming the challenges of Pashto POS tagging.
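To make the setup described in the abstract concrete, the sketch below frames POS tagging as per-token classification and scores it with token-level accuracy. It is an illustration only, not the authors' implementation: the feature set in `token_features`, the feature-overlap similarity used in `knn_predict`, and the three-sentence English placeholder corpus are all assumptions introduced here, and the abstract does not confirm that accuracy was computed per token.

```python
from collections import Counter


def token_features(sentence, i):
    """Hypothetical features for the token at position i: the word itself,
    its last two characters, and the neighbouring words."""
    word = sentence[i]
    return {
        "word=" + word,
        "suffix=" + word[-2:],
        "prev=" + (sentence[i - 1] if i > 0 else "<s>"),
        "next=" + (sentence[i + 1] if i < len(sentence) - 1 else "</s>"),
    }


def knn_predict(train, features, k=3):
    """Majority vote among the k training tokens that share the most features.
    Counting shared features is a simplification of a real distance metric."""
    ranked = sorted(train, key=lambda item: len(item[0] & features), reverse=True)
    votes = Counter(tag for _, tag in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(gold, predicted):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy English placeholder data (assumption; NOT taken from the Pashto dataset).
train_sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("bird", "NOUN"), ("sings", "VERB")],
]
test_sentence = [("the", "DET"), ("cat", "NOUN"), ("runs", "VERB")]

# Turn every training token into a (feature set, tag) pair.
train = []
for tagged in train_sentences:
    words = [w for w, _ in tagged]
    for i, (_, tag) in enumerate(tagged):
        train.append((token_features(words, i), tag))

# Tag the held-out sentence and score it against the gold tags.
test_words = [w for w, _ in test_sentence]
gold = [t for _, t in test_sentence]
predicted = [
    knn_predict(train, token_features(test_words, i)) for i in range(len(test_words))
]

print(predicted)
print(accuracy(gold, predicted))
```

On a corpus this small the score says nothing about real performance; the point is only the shape of the pipeline (features per token, a classifier, and an accuracy figure of the kind reported above). Any of the six classifiers named in the abstract could stand in for `knn_predict`.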
Data Availability Statement
Data will be made available on request.
Funding
This work received no funding.
Conflicts of Interest
The authors declare no conflicts of interest.
Ethical Approval and Consent to Participate
Not applicable.
Cite This Article
APA Style
Khan, A. A., Khan, W., Khan, M. A., Khan, K., Khan, F. M., Rahman, A. U., Bilal, H., & Monirul, I. M. (2024). Comparison of Machine Learning and Deep Learning Models for Part-of-Speech Tagging. ICCK Transactions on Advanced Computing and Systems, 1(2), 106–116. https://doi.org/10.62762/TACS.2024.493945
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and Permissions

Copyright © 2024 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.