Enhanced Dynamic Label Allocation for Mathematical Formula Named Entity Recognition in Learning Path Recommendations
Abstract
Named entity recognition (NER) is an essential task in natural language processing. Mathematical formulas usually contain many technical terms, units of measure, and other domain-specific knowledge, and integrating this information into a knowledge graph can significantly enhance the graph's semantic expressiveness. Identifying the named entities in formulas makes it possible to extract the key concepts, entities, and relationships among them, which establishes a basis for constructing the knowledge graph and makes it easier to interpret and analyse in practical applications. Furthermore, the structured knowledge derived from this process can facilitate personalized learning path recommendations by mapping identified entities to educational resources and prerequisite relationships. To address the insufficient ability of existing models to recognize mathematical formula entities, a mathematical formula NER method with enhanced dynamic label allocation is proposed. A mathematical formula entity recognition model, named BERT-formula, is constructed from BERT (Bidirectional Encoder Representations from Transformers), BiLSTM (Bidirectional Long Short-Term Memory), and Transformer components. The feature representation of deep semantic information is enhanced at the model input by concatenating extra sequences with the original vector representation. Entity label prediction is treated as a one-to-many linear assignment problem, and an auction algorithm is introduced to obtain the allocation with the smallest cost. Experiments show that, on the mathematical formula dataset, the model achieves an accuracy of 98.8% and an F1 score of 98.8%, improvements of 1.51 and 1.05 percentage points, respectively, over BERT-BiLSTM-CRF. These results indicate that the approach performs well at identifying mathematical formula entities.
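The abstract does not spell out how the auction step works, so the following is only a minimal sketch of a generic forward auction (in the style of Bertsekas) for a small cost matrix, not the authors' implementation. The function names `auction_assign` and `one_to_many_assign`, the `capacity` parameter, and the idea of modelling one-to-many allocation by duplicating each label column are all assumptions made for illustration.

```python
import numpy as np


def auction_assign(cost, eps=1e-3):
    """Forward auction: give each row a distinct column at near-minimal total cost.

    Rows bid for columns; prices start at zero and only rise. The total cost is
    within n_rows * eps of the optimum (exactly optimal for integer costs when
    eps < 1 / n_rows).
    """
    cost = np.asarray(cost, dtype=float)
    n_rows, n_cols = cost.shape
    if n_rows > n_cols:
        raise ValueError("need at least as many columns as rows")
    benefit = -cost                      # minimising cost = maximising benefit
    prices = np.zeros(n_cols)
    owner = np.full(n_cols, -1)          # owner[j]: row currently holding column j
    assigned = np.full(n_rows, -1)       # assigned[i]: column held by row i
    unassigned = list(range(n_rows))
    while unassigned:
        i = unassigned.pop()
        values = benefit[i] - prices     # net value of each column for row i
        j = int(np.argmax(values))
        best = values[j]
        second = np.partition(values, -2)[-2] if n_cols > 1 else best
        prices[j] += best - second + eps  # raise the price by the bid increment
        previous = owner[j]
        if previous != -1:               # outbid row goes back into the queue
            assigned[previous] = -1
            unassigned.append(previous)
        owner[j] = i
        assigned[i] = j
    return assigned


def one_to_many_assign(cost, capacity, eps=1e-3):
    """Hypothetical one-to-many variant: each label may absorb up to `capacity` rows.

    Each label column is duplicated `capacity` times, the one-to-one auction is
    run on the expanded matrix, and slot indices are mapped back to labels.
    """
    expanded = np.repeat(np.asarray(cost, dtype=float), capacity, axis=1)
    return auction_assign(expanded, eps) // capacity


# Illustrative costs only: 4 predictions (rows) competing for 2 gold labels (columns).
labels = one_to_many_assign([[1, 9], [2, 8], [9, 1], [8, 3]], capacity=2)
print(labels.tolist())  # [0, 0, 1, 1]
```

Duplicating columns keeps the sketch short because the one-to-one routine is reused unchanged; a production system might instead track remaining capacity per label, which avoids enlarging the matrix.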
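Keywords
named entity recognition (NER), mathematics, bidirectional encoder representations from transformer (BERT), deep learning, auction algorithm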
Cite This Article
TY - JOUR
AU - Liu, Hongchen
AU - Zhang, Qingchuan
PY - 2025
DA - 2025/05/20
TI - Enhanced Dynamic Label Allocation for Mathematical Formula Named Entity Recognition in Learning Path Recommendations
JO - Frontiers in Educational Innovation and Research
T2 - Frontiers in Educational Innovation and Research
JF - Frontiers in Educational Innovation and Research
VL - 1
IS - 1
SP - 10
EP - 21
DO - 10.62762/FEIR.2024.416675
UR - https://www.icck.org/article/abs/FEIR.2024.416675
KW - named entity recognition (NER)
KW - mathematics
KW - bidirectional encoder representations from transformer (BERT)
KW - deep learning
KW - auction algorithm
SN - 3068-5664
PB - Institute of Central Computation and Knowledge
LA - English
ER -
@article{Liu2025Enhanced,
author = {Hongchen Liu and Qingchuan Zhang},
title = {Enhanced Dynamic Label Allocation for Mathematical Formula Named Entity Recognition in Learning Path Recommendations},
journal = {Frontiers in Educational Innovation and Research},
year = {2025},
volume = {1},
number = {1},
pages = {10-21},
doi = {10.62762/FEIR.2024.416675},
url = {https://www.icck.org/article/abs/FEIR.2024.416675},
keywords = {named entity recognition (NER), mathematics, bidirectional encoder representations from transformer (BERT), deep learning, auction algorithm},
issn = {3068-5664},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and Permissions
Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.