A Decentralised Multi-Agent DRL-based Approach for Pedestrian and Vehicle Traffic Signals Controlling Systems Optimisation
Abstract
Urban traffic congestion is a major issue that negatively affects mobility efficiency, environmental sustainability and road safety. Many recent traffic signal control methods are based on deep reinforcement learning (DRL) and have reported positive results. However, they have focused primarily on vehicle flow and have not taken pedestrian dynamics into account, owing to the inherent difficulty of accurately sensing all pedestrians. Recent advances in sixth-generation (6G) localisation technology are expected to ease this limitation by enabling precise, low-latency tracking of pedestrians at signalized intersections, allowing for improved control of pedestrian movements in urban areas. The model proposed in this paper, named DRL-based pedestrian-vehicle traffic signal management (DRL-PVTSM), addresses this need with a decentralized multi-agent DRL approach that jointly optimizes vehicle and pedestrian movements, with each intersection controlled by an independent agent implemented as a deep Q-network (DQN). The agents receive a pressure-based reward that accounts for vehicle and pedestrian queue densities, together with a safety penalty based on the pressure from pedestrians waiting for the lights to change. The DRL-PVTSM framework is designed for scalability, robustness and real-time applicability in large multi-intersection urban traffic networks. Extensive simulations performed in SUMO on multiple network topologies, with grid and random layouts, show that DRL-PVTSM provides statistically significant improvements in pedestrian waiting time, vehicle travel delay, congestion mitigation and intersection-level safety indicators, indicating that decentralized DRL combined with future 6G localisation is a viable method for jointly optimizing pedestrian and vehicle traffic signal systems.
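The abstract does not give the reward formula, so the following is only a minimal sketch of what a pressure-based reward with a pedestrian safety penalty could look like for a single intersection agent. The observation layout (`IntersectionObservation`), the weights `ped_weight` and `safety_weight`, and the waiting threshold `max_wait_s` are all assumptions introduced for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class IntersectionObservation:
    """Queue counts seen by one intersection agent (hypothetical layout)."""
    incoming_vehicles: Sequence[int]     # vehicles queued on each incoming lane
    outgoing_vehicles: Sequence[int]     # vehicles on each outgoing lane
    waiting_pedestrians: Sequence[int]   # pedestrians waiting at each crossing
    pedestrian_wait_s: Sequence[float]   # longest current wait per crossing (s)


def vehicle_pressure(obs: IntersectionObservation) -> int:
    # Classic pressure: vehicles wanting to enter minus vehicles downstream.
    return sum(obs.incoming_vehicles) - sum(obs.outgoing_vehicles)


def pedestrian_pressure(obs: IntersectionObservation) -> int:
    # Assumption: pedestrian pressure is the total number of waiting pedestrians.
    return sum(obs.waiting_pedestrians)


def safety_penalty(obs: IntersectionObservation, max_wait_s: float = 60.0) -> int:
    # Assumption: pedestrians at crossings whose wait exceeds the threshold
    # are penalised, since long waits make risky crossings more likely.
    return sum(
        count
        for count, wait in zip(obs.waiting_pedestrians, obs.pedestrian_wait_s)
        if wait > max_wait_s
    )


def reward(
    obs: IntersectionObservation,
    ped_weight: float = 1.0,      # assumed trade-off weight
    safety_weight: float = 2.0,   # assumed trade-off weight
    max_wait_s: float = 60.0,     # assumed waiting threshold
) -> float:
    # Negative cost: lower pressure and fewer long-waiting pedestrians
    # give a higher (less negative) reward.
    cost = (
        abs(vehicle_pressure(obs))
        + ped_weight * pedestrian_pressure(obs)
        + safety_weight * safety_penalty(obs, max_wait_s)
    )
    return -cost
```

In a decentralized setting, each agent would evaluate such a function on its own intersection only, which is what keeps the per-agent computation independent of network size.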
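Keywords
scalable adaptive traffic signal control, decentralized multi-agent deep reinforcement learning, pedestrian-vehicle coordination, traffic signal optimization, 6G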
Cite This Article
TY  - JOUR
AU  - Oukebdane, Mohammed Anis
PY  - 2026
DA  - 2026/03/21
TI  - A Decentralised Multi-Agent DRL-based Approach for Pedestrian and Vehicle Traffic Signals Controlling Systems Optimisation
JO  - ICCK Transactions on Mobile and Wireless Intelligence
T2  - ICCK Transactions on Mobile and Wireless Intelligence
JF  - ICCK Transactions on Mobile and Wireless Intelligence
VL  - 2
IS  - 1
SP  - 31
EP  - 43
DO  - 10.62762/TMWI.2025.878487
UR  - https://www.icck.org/article/abs/TMWI.2025.878487
KW  - scalable adaptive traffic signal control
KW  - decentralized multi-agent deep reinforcement learning
KW  - pedestrian-vehicle coordination
KW  - traffic signal optimization
KW  - 6G
SN  - 3069-0692
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  -
@article{Oukebdane2026A,
author = {Mohammed Anis Oukebdane},
title = {A Decentralised Multi-Agent DRL-based Approach for Pedestrian and Vehicle Traffic Signals Controlling Systems Optimisation},
journal = {ICCK Transactions on Mobile and Wireless Intelligence},
year = {2026},
volume = {2},
number = {1},
pages = {31-43},
doi = {10.62762/TMWI.2025.878487},
url = {https://www.icck.org/article/abs/TMWI.2025.878487},
keywords = {scalable adaptive traffic signal control, decentralized multi-agent deep reinforcement learning, pedestrian-vehicle coordination, traffic signal optimization, 6G},
issn = {3069-0692},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.