Journal of Computing Intelligence, Volume 1, Issue 1, 2025: 3-8

Open Access | Research Article | 03 August 2025
AUDD: Audio Deepfake Detection Using Paralinguistic Feature Extraction Techniques
1 Balochistan University of Information Technology, Engineering and Management Sciences, Baleli, Quetta 87300, Pakistan
2 Benazir Bhutto Shaheed University Lyari, Karachi 75660, Sindh, Pakistan
* Corresponding Author: Raja Vavekanand, [email protected]
Received: 30 November 2024, Accepted: 27 June 2025, Published: 03 August 2025  
Abstract
This work investigates the effectiveness of incorporating paralinguistic feature extraction in audio deepfake detection models. The proposed model extracts paralinguistic features from audio clips and represents them as 1024-dimensional vector embeddings. These embeddings are then used as input to a logistic regression model, which performs binary classification to distinguish real from deepfake audio samples. The model is evaluated on the ASVspoof2019 dataset, which comprises both genuine and spoofed audio clips, using Equal Error Rate (EER) and accuracy so that its performance can be compared with state-of-the-art methods. The proposed model achieves an EER of 3.04% and an accuracy of 97.9%, indicating that paralinguistic feature extraction is a promising approach for audio deepfake detection. These results suggest that incorporating paralinguistic features can improve the accuracy and reliability of audio deepfake detection systems and that the approach merits further research.
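
The classification and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are synthetic Gaussian stand-ins rather than features extracted from ASVspoof2019 audio, the names `make_embeddings`, `train_logistic_regression`, `predict_scores`, `accuracy` and `equal_error_rate` are hypothetical, and the learning rate, epoch count and class shift are arbitrary assumptions. Only the 1024-dimensional input, the logistic regression classifier and the two metrics come from the abstract.

```python
import math
import random

DIM = 1024  # embedding size stated in the abstract


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_embeddings(n_per_class, rng, shift=0.3):
    """Synthetic stand-ins for embeddings (assumption, not real data).

    Label 0 = real, label 1 = deepfake; classes differ by a mean shift.
    """
    X, y = [], []
    for label in (0, 1):
        mean = shift if label == 1 else -shift
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 1.0) for _ in range(DIM)])
            y.append(label)
    return X, y


def train_logistic_regression(X, y, lr=0.1, epochs=5):
    """Fit weights and bias with plain batch gradient descent."""
    w, b, n = [0.0] * DIM, 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * DIM, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_scores(X, w, b):
    """Return the predicted probability of 'deepfake' per embedding."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    hits = sum((s >= threshold) == bool(y) for s, y in zip(scores, labels))
    return hits / len(labels)


def equal_error_rate(scores, labels):
    """EER: error rate at the threshold where FPR and FNR are closest."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_gap, eer = float("inf"), 1.0
    for t in sorted(set(scores)) + [float("inf")]:
        # A sample is flagged as deepfake when its score is >= t.
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fpr, fnr = fp / n_neg, fn / n_pos
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2
    return eer


rng = random.Random(0)
X_train, y_train = make_embeddings(40, rng)
X_test, y_test = make_embeddings(20, rng)
w, b = train_logistic_regression(X_train, y_train)
test_scores = predict_scores(X_test, w, b)
print(f"accuracy={accuracy(test_scores, y_test):.3f}")
print(f"EER={equal_error_rate(test_scores, y_test):.3f}")
```

Accuracy depends on a single decision threshold, whereas EER summarizes the trade-off between false alarms and misses across all thresholds, which is presumably why the two are reported together. The pure-Python training loop is kept only so the sketch has no dependencies; a library implementation such as scikit-learn's `LogisticRegression` would be the more usual choice in practice.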

Keywords
audio
deepfake
paralinguistic
deep learning

Data Availability Statement
Data will be made available on request.

Funding
This work received no funding.

Conflicts of Interest
The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate
Not applicable.

Cite This Article
APA Style
Ahmed, Z., Khan, G. S. A., & Vavekanand, R. (2025). AUDD: Audio Deepfake Detection Using Paralinguistic Feature Extraction Techniques. Journal of Computing Intelligence, 1(1), 3–8. https://doi.org/10.62762/JCI.2024.667518

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
CC BY Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.