AI vs. AI
Detecting Audio Deepfakes with Artificial Intelligence
Audio deepfakes pose an increasing threat to businesses and society. Fraunhofer AISEC researchers have shown: humans struggle to identify AI-generated voices, but specialised AI models excel. Findings from an experiment with around 500 participants can refine detection systems and training initiatives.
A suspicious call in a familiar voice can wreak havoc on a business. Deepfake attacks are skyrocketing: according to the Identity Fraud Report 2025 (in German) by the Entrust Cybersecurity Institute, one deepfake attempt occurred worldwide every five minutes in 2024. Signicat, a leading European provider of digital identity solutions, reported in a study (in German) a staggering 2,137% rise in such attacks on banks, insurers, and payment firms across Europe over three years.
To compare human and AI detection of manipulated voices, the Fraunhofer Institute for Applied and Integrated Security (AISEC) conducted an experiment with 472 participants. After analysing nearly 15,000 audio files, the result was clear: humans detected 80% of audio deepfakes, while specialised AI models achieved 95% accuracy. The study revealed differences across age groups and language proficiency, but not educational background: older individuals were more likely to be deceived than younger ones, and native speakers outperformed non-native speakers. Notably, IT professionals were no better at detecting deepfakes than non-experts.
»These insights can aid in developing effective cybersecurity training programmes and improving detection algorithms,« says Dr Nicolas Müller, a researcher in the Cognitive Security Technologies department at Fraunhofer AISEC.
Hone Your Detection Skills: Play »Spot the Deepfake« Online
To train users in identifying audio deepfakes, the research team developed the online game »Spot the Deepfake«. It is part of the free, Germany-hosted »Deepfake Total« platform, designed to raise public awareness and provide training in audio deepfake detection. Users listen to audio samples there and decide: real or fake? An integrated evaluation shows how well they performed.
On the »Deepfake Total« platform, Dr Müller is training an AI model to detect manipulated voice recordings. It employs an enhanced version of the publicly available MLAAD (Multi-Language Audio Anti-Spoofing Dataset), which combines publicly sourced and custom-created original and deepfake audio files. A high detection rate depends not on the volume of data, but on a balanced combination of samples to avoid biases, such as the model performing better with male voices than female ones or being misled by irrelevant factors like accents or volume.
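The balancing idea described above can be illustrated with a minimal sketch: downsampling each group (e.g. each label/gender combination) to the size of the smallest group so no attribute dominates training. The field names and data layout here are purely illustrative assumptions, not the actual MLAAD schema or the Fraunhofer AISEC training pipeline.

```python
import random
from collections import defaultdict

def balance_samples(samples, keys=("label", "gender"), seed=0):
    """Downsample so every (label, gender) group is equally represented.

    `samples` is a list of dicts describing audio clips; the field
    names are hypothetical, chosen only for this illustration.
    """
    groups = defaultdict(list)
    for s in samples:
        groups[tuple(s[k] for k in keys)].append(s)
    n = min(len(g) for g in groups.values())  # size of the smallest group
    rng = random.Random(seed)
    balanced = []
    for g in groups.values():
        balanced.extend(rng.sample(g, n))  # keep n random samples per group
    return balanced

# Toy example: an imbalanced set with more real male voices than anything else.
toy = (
    [{"label": "real", "gender": "m"}] * 6
    + [{"label": "real", "gender": "f"}] * 3
    + [{"label": "fake", "gender": "m"}] * 4
    + [{"label": "fake", "gender": "f"}] * 3
)
balanced = balance_samples(toy)
# Every (label, gender) combination now appears exactly 3 times.
```

In practice, balancing a speech dataset also has to control for confounders the article mentions, such as accent and recording volume, not just speaker gender and the real/fake label.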
Dr Müller explains: »While AI-based detection is improving, we can only effectively counter the emerging deepfake era through a combination of technology, public awareness, and training across society.«
This press release is based on the article »On the Line: Fake!« from Fraunhofer Magazine 2/2025 (pages 32-34).