Privacy-Preserving Federated Learning for IoT Botnet Detection: A Federated Averaging Approach
Research Article  ·  Published: 20 May 2025
ICCK Transactions on Machine Intelligence
Volume 1, Issue 1, 2025: 6-16
Research Article Free to Read

1 University of Colorado Boulder, Boulder, CO 80309, United States
Corresponding Author: Praveen Kumar Myakala, [email protected]

Abstract

Traditional centralized machine learning approaches to IoT botnet detection pose significant privacy risks because they require transmitting sensitive device data to a central server. This study presents a privacy-preserving Federated Learning (FL) approach that uses Federated Averaging (FedAvg) to detect prevalent botnet attacks, such as Mirai and Gafgyt, while keeping raw data on local IoT devices. Using the N-BaIoT dataset, which contains real-world benign and malicious traffic, we evaluate both IID and non-IID data distributions to assess the effects of decentralized training. Our approach achieves 97.5% accuracy in the IID setting and 95.2% in highly skewed non-IID scenarios, closely matching centralized learning performance while preserving privacy. Additionally, two communication optimization techniques, Top-20% gradient sparsification and 8-bit quantization, reduce communication overhead by up to 80%. Our convergence analysis further shows that FedAvg remains effective under non-IID conditions, supporting its robustness for real-world deployments. These results indicate that FL provides a scalable, privacy-preserving solution for securing IoT networks against botnet threats.
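
The three mechanisms named in the abstract (FedAvg aggregation, Top-20% sparsification, and 8-bit quantization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the flat-list representation of model parameters, and the symmetric quantization scheme are all assumptions chosen for readability, and the sketch does not model how sparse indices are encoded on the wire.

```python
from typing import List, Sequence, Tuple


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def top_k_sparsify(update: Sequence[float], fraction: float = 0.2) -> List[float]:
    """Keep the largest-magnitude `fraction` of entries and zero the rest."""
    k = max(1, int(round(len(update) * fraction)))
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    kept = set(order[:k])
    return [u if i in kept else 0.0 for i, u in enumerate(update)]


def quantize_8bit(update: Sequence[float]) -> Tuple[List[int], float]:
    """Map floats to signed integers in [-127, 127] plus one shared scale.

    Assumption: uniform symmetric quantization; the paper may use another scheme.
    """
    max_abs = max(abs(u) for u in update)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    return [round(u / scale) for u in update], scale


def dequantize(q: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float values on the server side."""
    return [v * scale for v in q]
```

In this sketch a client would call `top_k_sparsify` and then `quantize_8bit` on its local update before upload, and the server would call `dequantize` on each received update before passing the results to `fedavg`. Weighting by sample count matters most in the non-IID case, where clients hold very different amounts of data.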

Keywords

federated learning, federated averaging (FedAvg), privacy-preserving machine learning, IoT Security, botnet detection, edge AI

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style
Myakala, P. K., Kamatala, S., & Bura, C. (2025). Privacy-Preserving Federated Learning for IoT Botnet Detection: A Federated Averaging Approach. ICCK Transactions on Machine Intelligence, 1(1), 6–16. https://doi.org/10.62762/TMI.2025.796490
BibTeX Format
Compatible with LaTeX, BibTeX, and other reference managers
@article{Myakala2025PrivacyPre,
  author = {Praveen Kumar Myakala and Srikanth Kamatala and Chiranjeevi Bura},
  title = {Privacy-Preserving Federated Learning for IoT Botnet Detection: A Federated Averaging Approach},
  journal = {ICCK Transactions on Machine Intelligence},
  year = {2025},
  volume = {1},
  number = {1},
  pages = {6-16},
  doi = {10.62762/TMI.2025.796490},
  url = {https://www.icck.org/article/abs/TMI.2025.796490},
  abstract = {Traditional centralized machine learning approaches for IoT botnet detection pose significant privacy risks, as they require transmitting sensitive device data to a central server. This study presents a privacy-preserving Federated Learning (FL) approach that employs Federated Averaging (FedAvg) to detect prevalent botnet attacks, such as Mirai and Gafgyt, while ensuring that raw data remain on local IoT devices. Using the N-BaIoT dataset, which contains real-world benign and malicious traffic, we evaluated both the IID and non-IID data distributions to assess the effects of decentralized training. Our approach achieved 97.5\% accuracy in IID and 95.2\% in highly skewed non-IID scenarios, closely matching centralized learning performance while preserving privacy. Additionally, communication optimization techniques—Top-20\% gradient sparsification and 8-bit quantization—reduce communication overhead by up to 80\%, significantly enhancing the efficiency. Our convergence analysis further shows that FedAvg remains effective under non-IID conditions, thereby demonstrating its robustness for real-world deployments. These results demonstrate that FL provides a scalable and privacy-preserving solution for securing IoT networks against botnet threats.},
  keywords = {federated learning, federated averaging (FedAvg), privacy-preserving machine learning, IoT Security, botnet detection, edge AI},
  issn = {3068-7403},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Institute of Central Computation and Knowledge (ICCK) or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ISSN: 3068-7403 (Online)
Preserved at Portico