CT-DETR and ReID-Guided Multi-Target Tracking Algorithm in Complex Scenes

Ming Gao; Shixin Yang

doi:10.62762/TETAI.2024.240529

Article Information

Published in ICCK Transactions on Emerging Topics in Artificial Intelligence

Volume 1, Issue 1, 2024

Pages 44-57

Cited by 3 (Crossref), 4 (Scopus)

Abstract

In the era of rapid technological advancement, the demand for sophisticated Multi-Object Tracking (MOT) systems in applications such as intelligent surveillance and autonomous navigation has become increasingly critical. However, existing models often struggle with accuracy and efficiency in densely populated or dynamically complex environments. Addressing these challenges, we introduce a novel deep learning-based MOT model that incorporates the latest CT-DETR detection technology and an advanced ReID module for improved pedestrian tracking. Experimental results demonstrate the model's superior performance in accurately identifying and tracking multiple targets across varied scenarios, significantly outperforming existing benchmarks. This research not only marks a significant leap forward in the field of video surveillance technology but also lays a foundational framework for future advancements in intelligent system applications, underscoring the importance of innovation in deep learning methodologies for real-world challenges.

Keywords

multi-object tracking, deep learning, CT-DETR, pedestrian re-identification, intelligent surveillance systems

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cited By (3)

Guoliang Yang, Dali Weng, Zhiteng Li, Yonggan Wu. Tomato Ripeness Detection Model Based on Improved RT-DETR Lightweight Model. Agronomy, 2026, 16(9).
Marjan Kia. Attention-guided deep learning for effective customer loyalty management and multi-criteria decision analysis. Iran Journal of Computer Science, 2025, 8(1).
Marjan Kia, Soroush Sadeghi, Homayoun Safarpour, Mohammadreza Kamsari, Saeid Jafarzadeh Ghoushchi, Ramin Ranjbarzadeh. Innovative fusion of VGG16, MobileNet, EfficientNet, AlexNet, and ResNet50 for MRI-based brain tumor identification. Iran Journal of Computer Science, 2025, 8(1).

* Citation data provided by Crossref Cited-by.

Cite This Article

APA Style

Gao, M., & Yang, S. (2024). CT-DETR and ReID-Guided Multi-Target Tracking Algorithm in Complex Scenes. ICCK Transactions on Emerging Topics in Artificial Intelligence, 1(1), 44-57. https://doi.org/10.62762/TETAI.2024.240529

Export Citation

RIS Format

Compatible with EndNote, Zotero, Mendeley, and other reference managers

TY  - JOUR
AU  - Gao, Ming
AU  - Yang, Shixin
PY  - 2024
DA  - 2024/05/29
TI  - CT-DETR and ReID-Guided Multi-Target Tracking Algorithm in Complex Scenes
JO  - ICCK Transactions on Emerging Topics in Artificial Intelligence
T2  - ICCK Transactions on Emerging Topics in Artificial Intelligence
JF  - ICCK Transactions on Emerging Topics in Artificial Intelligence
VL  - 1
IS  - 1
SP  - 44
EP  - 57
DO  - 10.62762/TETAI.2024.240529
UR  - https://www.icck.org/article/abs/TETAI.2024.240529
KW  - multi-object tracking
KW  - deep learning
KW  - CT-DETR
KW  - pedestrian re-identification
KW  - intelligent surveillance systems
AB  - In the era of rapid technological advancement, the demand for sophisticated Multi-Object Tracking (MOT) systems in applications such as intelligent surveillance and autonomous navigation has become increasingly critical. However, existing models often struggle with accuracy and efficiency in densely populated or dynamically complex environments. Addressing these challenges, we introduce a novel deep learning-based MOT model that incorporates the latest CT-DETR detection technology and an advanced ReID module for improved pedestrian tracking. Experimental results demonstrate the model's superior performance in accurately identifying and tracking multiple targets across varied scenarios, significantly outperforming existing benchmarks. This research not only marks a significant leap forward in the field of video surveillance technology but also lays a foundational framework for future advancements in intelligent system applications, underscoring the importance of innovation in deep learning methodologies for real-world challenges.
SN  - 3068-6652
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  -

BibTeX Format

Compatible with LaTeX, BibTeX, and other reference managers

@article{Gao2024CTDETR,
  author = {Ming Gao and Shixin Yang},
  title = {CT-DETR and ReID-Guided Multi-Target Tracking Algorithm in Complex Scenes},
  journal = {ICCK Transactions on Emerging Topics in Artificial Intelligence},
  year = {2024},
  volume = {1},
  number = {1},
  pages = {44-57},
  doi = {10.62762/TETAI.2024.240529},
  url = {https://www.icck.org/article/abs/TETAI.2024.240529},
  abstract = {In the era of rapid technological advancement, the demand for sophisticated Multi-Object Tracking (MOT) systems in applications such as intelligent surveillance and autonomous navigation has become increasingly critical.~However, existing models often struggle with accuracy and efficiency in densely populated or dynamically complex environments. Addressing these challenges, we introduce a novel deep learning-based MOT model that incorporates the latest CT-DETR detection technology and an advanced ReID module for improved pedestrian tracking. Experimental results demonstrate the model's superior performance in accurately identifying and tracking multiple targets across varied scenarios, significantly outperforming existing benchmarks.~This research not only marks a significant leap forward in the field of video surveillance technology but also lays a foundational framework for future advancements in intelligent system applications, underscoring the importance of innovation in deep learning methodologies for real-world challenges.},
  keywords = {multi-object tracking, deep learning, CT-DETR, pedestrian re-identification, intelligent surveillance systems},
  issn = {3068-6652},
  publisher = {Institute of Central Computation and Knowledge}
}

Article Metrics

Citations: 3 (Crossref), 4 (Scopus)

Views: 8335

PDF Downloads: 732

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Copyright © 2024 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

ICCK Transactions on Emerging Topics in Artificial Intelligence

ISSN: 3068-6652 (Online)

Preserved at Portico
