A Decentralised Multi-Agent DRL-based Approach for Pedestrian and Vehicle Traffic Signals Controlling Systems Optimisation
Research Article  ·  Published: 21 March 2026
ICCK Transactions on Mobile and Wireless Intelligence
Volume 2, Issue 1, 2026: 31-43

Mohammed Anis Oukebdane 1
1 Department of Electronics and Communication Engineering, Yildiz Technical University, Istanbul, Turkey
Corresponding Author: Mohammed Anis Oukebdane, [email protected]

Abstract

Urban traffic congestion is a major issue that negatively affects mobility efficiency, environmental sustainability and road safety. Many recent traffic signal control methods are based on deep reinforcement learning (DRL) and have yielded promising results. However, they have focused primarily on vehicle flow and have not accounted for pedestrian dynamics, owing to the inherent difficulty of accurately sensing all pedestrians. Against these limitations, recent advances in sixth-generation (6G) localisation technology offer new opportunities for precise, low-latency tracking of pedestrians at signalised intersections, enabling improved control of pedestrian movements in urban areas. The model proposed in this paper, DRL-based pedestrian-vehicle traffic signal management (DRL-PVTSM), addresses this need with a decentralised multi-agent DRL approach that jointly optimises vehicle and pedestrian movements at each intersection using independent agents, each controlled by a deep Q-network (DQN). The agents receive a pressure-based reward that optimises vehicle and pedestrian queue densities, augmented with a safety penalty driven by the pressure of pedestrians waiting for the lights to change. The DRL-PVTSM framework is designed for scalability, robustness and real-time applicability to large multi-intersection urban traffic networks. Extensive SUMO simulations on grid and random network topologies demonstrate that DRL-PVTSM yields statistically significant reductions in pedestrian waiting time and vehicle travel delay, together with improved congestion mitigation and intersection-level safety indicators, confirming that decentralised DRL combined with future 6G localisation offers a viable approach to jointly optimising pedestrian and vehicle traffic signal systems.
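The abstract describes a pressure-based reward with a pedestrian-waiting safety penalty, but does not give the exact formulation. The sketch below illustrates the general shape of such a reward for a single intersection agent; all function names, argument names and weights (`wait_threshold`, `safety_weight`) are hypothetical illustrations, not the paper's definitions.

```python
def pressure_reward(veh_in, veh_out, ped_in, ped_out, ped_wait_times,
                    wait_threshold=30.0, safety_weight=0.5):
    """Sketch of a pressure-style reward for one intersection agent.

    veh_in / veh_out: vehicle queue counts on incoming / outgoing lanes.
    ped_in / ped_out: pedestrian counts waiting at vs. cleared from crossings.
    ped_wait_times:   seconds each waiting pedestrian has already waited.
    All names and weights are illustrative, not the paper's exact reward.
    """
    # Max-pressure-style terms: imbalance between incoming and outgoing queues.
    vehicle_pressure = sum(veh_in) - sum(veh_out)
    pedestrian_pressure = sum(ped_in) - sum(ped_out)

    # Safety penalty grows with pedestrians waiting beyond a threshold.
    safety_penalty = safety_weight * sum(
        max(0.0, t - wait_threshold) for t in ped_wait_times
    )

    # The agent should reduce pressure, so the reward is its negative,
    # minus the pedestrian-safety penalty.
    return -(vehicle_pressure + pedestrian_pressure) - safety_penalty
```

In a DQN setting, each decentralised agent would receive this scalar after every signal-phase decision, so minimising queue imbalance and long pedestrian waits are learned jointly.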

Keywords

scalable adaptive traffic signal control; decentralized multi-agent deep reinforcement learning; pedestrian-vehicle coordination; traffic signal optimization; 6G

Data Availability Statement

Data will be made available on request.

Funding

This work received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

AI Use Statement

The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style
Oukebdane, M. A. (2026). A Decentralised Multi-Agent DRL-based Approach for Pedestrian and Vehicle Traffic Signals Controlling Systems Optimisation. ICCK Transactions on Mobile and Wireless Intelligence, 2(1), 31–43. https://doi.org/10.62762/TMWI.2025.878487
Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Institute of Central Computation and Knowledge (ICCK) or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ISSN: 3069-0692 (Online)