Adversarial audio generated by AI can trick voice authentication

TL;DR:

  • AI-generated audio poses a threat to voice authentication systems.
  • Off-the-shelf AI tools can clone human voices, making it easier to deceive authentication software.
  • Researchers have achieved up to a 99% success rate in subverting voice authentication security.
  • The technique involves creating “adversarial” samples to trick authentication systems.
  • Voice authentication relies on unique physical and social characteristics of individuals’ voices.
  • AI-generated audio can mimic human voices but contains distinctive artifacts that can be detected.
  • The researchers’ technique aims to remove these artifacts while preserving the overall sound.
  • The success rate of the attacks highlights the need for improved voice authentication mechanisms.
  • Companies developing voice authentication software must continue to enhance security measures.

Main AI News:

Adversarial audio, generated by artificial intelligence (AI), has the potential to deceive voice authentication systems, posing a significant challenge for developers in the field. With the advent of off-the-shelf AI tools capable of replicating human voices, the need to incorporate an additional layer of security has become apparent. This additional security measure aims to discern whether an audio sample originates from a human or has been generated by a machine.

Voice authentication plays a crucial role in numerous sectors, including call centers, banks, and government agencies. However, these systems are increasingly susceptible to AI-based attacks, in which machine learning is used to mimic a person’s voice and authenticate as them. In fact, researchers from the University of Waterloo in Canada have claimed a 99 percent success rate in subverting such security measures under the right circumstances.

The computer scientists devised a technique to deceive voice authentication systems, as outlined in their paper presented at the 44th IEEE Symposium on Security and Privacy. By manipulating AI-generated speech recordings into “adversarial” samples, they produced audio that reliably slipped past the voice authentication checks of the systems they tested.

The fundamental principle behind voice authentication lies in the unique nature of each individual’s voice. Physical attributes such as the size and shape of the vocal tract and larynx, combined with social factors like accent, contribute to the distinctiveness of a person’s voice. Authentication systems capture these subtle nuances in voiceprints. While AI-generated audio can convincingly imitate human voices, the algorithms that produce it leave behind distinctive artifacts that experts and detection software can identify as machine-made. The researchers’ technique aims to eliminate these tell-tale features while preserving the overall sound.
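To make the idea concrete, voiceprint matching can be pictured as reducing each utterance to a fixed-length embedding and accepting a login attempt only when its embedding is close enough to the enrolled one. The sketch below is a minimal illustration of that comparison, assuming a crude spectral feature in place of the learned speaker embeddings real systems use; the names (embed, verify), the threshold, and the synthetic signals are all illustrative, not taken from any deployed product or from the paper.

```python
import numpy as np

def embed(audio: np.ndarray, n_bands: int = 64) -> np.ndarray:
    """Toy 'voiceprint': average spectral magnitude in coarse frequency bands.
    Real systems use learned speaker embeddings (e.g. x-vectors), not this."""
    spectrum = np.abs(np.fft.rfft(audio))
    bands = np.array([b.mean() for b in np.array_split(spectrum, n_bands)])
    return bands / (np.linalg.norm(bands) + 1e-9)

def verify(enrolled: np.ndarray, attempt: np.ndarray, threshold: float = 0.8) -> bool:
    """Accept the attempt if its voiceprint is close enough to the enrolled one."""
    score = float(np.dot(embed(enrolled), embed(attempt)))  # cosine similarity
    return score >= threshold

# Synthetic tones stand in for recorded utterances in this toy demo.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000, endpoint=False)
enrolled = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
genuine = enrolled + 0.05 * rng.standard_normal(t.size)   # same "voice", noisy channel
impostor = np.sin(2 * np.pi * 523 * t) + 0.5 * np.sin(2 * np.pi * 1046 * t)

print(verify(enrolled, genuine))   # expected: True
print(verify(enrolled, impostor))  # expected: False
```

In practice the threshold trades off false accepts against false rejects, which is exactly the margin an attacker with a convincing clone tries to exploit.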

To enhance their understanding of what renders speech genuinely human-like, the researchers trained their system using samples of utterances from 107 different speakers. They then crafted multiple adversarial samples to test their algorithm’s efficacy in fooling authentication systems, achieving a success rate of 72 percent. Against less robust systems, their success rate rose to an impressive 99 percent after six attempts.
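The paper’s attack operates in a black-box setting against real systems, and its details go well beyond anything shown here. Purely as an illustration of the underlying idea, iteratively perturbing machine-generated audio until a detector no longer flags it while the dominant content is preserved, here is a toy sketch against an equally toy “artifact detector” (high-frequency energy fraction). Every name in it (artifact_score, highpass, remove_artifacts) is hypothetical, and none of it reflects the researchers’ actual method.

```python
import numpy as np

SR = 16000          # assumed sample rate
CUTOFF_HZ = 4000    # toy boundary between "voiced" content and the "artifact" band

def artifact_score(audio: np.ndarray) -> float:
    """Toy stand-in for a deepfake detector: the fraction of spectral energy
    above CUTOFF_HZ. Real detectors are learned classifiers, not this."""
    power = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(audio.size, d=1.0 / SR)
    return float(power[freqs >= CUTOFF_HZ].sum() / (power.sum() + 1e-12))

def highpass(audio: np.ndarray) -> np.ndarray:
    """Component of the signal above CUTOFF_HZ (ideal FFT-domain filter)."""
    spec = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(audio.size, d=1.0 / SR)
    spec[freqs < CUTOFF_HZ] = 0.0
    return np.fft.irfft(spec, n=audio.size)

def remove_artifacts(audio: np.ndarray, threshold: float = 0.05,
                     alpha: float = 0.1, max_iters: int = 200) -> np.ndarray:
    """Iteratively perturb the audio until the toy detector stops flagging it.
    The gradient of the high-band energy w.r.t. the waveform is proportional
    to its high-pass component, so each step is a small gradient-style update."""
    x = audio.copy()
    for _ in range(max_iters):
        if artifact_score(x) < threshold:
            break
        x = x - alpha * highpass(x)
    return x

# Synthetic demo: a low tone stands in for the cloned voice, high-frequency
# hiss stands in for machine-generation artifacts.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, SR, endpoint=False)
voiced = np.sin(2 * np.pi * 220 * t)
hiss = 0.2 * np.diff(rng.standard_normal(t.size + 1))
cloned = voiced + hiss

adversarial = remove_artifacts(cloned)
print(artifact_score(cloned), artifact_score(adversarial))            # drops below the threshold
print(np.linalg.norm(adversarial - cloned) / np.linalg.norm(cloned))  # modest relative change
```

In this toy setup the perturbation mostly cancels the synthetic hiss while leaving the low-frequency tone intact, which loosely mirrors the stated goal of stripping tell-tale artifacts without changing how the audio sounds overall.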

However, it is premature to declare voice authentication software obsolete. For instance, the researchers achieved a success rate of only ten percent in a four-second attack against Amazon Connect, software used by cloud contact centers. Furthermore, authentication software continues to evolve to counter such attempts.

Perpetrators seeking to carry out these attacks must have access to the targeted individual’s voice and enough technical expertise to generate their own adversarial audio samples, particularly when targeting more secure systems. In a noteworthy example, a Vice reporter claimed to have gained access to his own bank account using AI trained on his voice. Although the barrier to entry remains relatively high, the researchers emphasized that companies developing voice authentication software must keep strengthening their security measures.

“The success rates of our attacks are concerning,” the researchers wrote, noting that the results were achieved in a black-box setting and under realistic threat assumptions. These findings highlight significant vulnerabilities in voice authentication systems and the urgent need for more reliable mechanisms in the field.

Conclusion:

The rise of AI-generated adversarial audio poses a significant challenge for the voice authentication market. The ease with which off-the-shelf AI tools can clone human voices and deceive authentication software raises concerns about system vulnerabilities. The researchers’ technique demonstrates how effectively “adversarial” samples can trick authentication systems, emphasizing the need for an extra layer of security. Companies in the market must prioritize the ongoing development and improvement of voice authentication software to address these emerging threats and deliver more reliable mechanisms in the future.

Source