Title: Synthetic Speech Attacks Against Voice Assistants and Defenses

Date/Time: Monday, December 2, 2024, 2:00-4:00 pm EST

Location (in-person): Coda C0908 Home Park

Zoom Link: https://gatech.zoom.us/j/96401837809?pwd=YMH05b3kK4CpT73a9uNn4GCg6BHI7c.1&from=addon

 

Zhengxian He

Ph.D. Candidate in Computer Science

School of Cybersecurity and Privacy

Georgia Institute of Technology

 

Committee:

Dr. Mustaque Ahamad (Advisor), School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Alexandra Boldyreva, School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Saman Zonouz, School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Frank Li, School of Cybersecurity and Privacy, Georgia Institute of Technology

Dr. Ashish Kundu, Cisco Research

 

Abstract:

Voice assistants have become prevalent in both home and enterprise environments, offering natural and convenient ways to interact with computing devices. However, their growing adoption has also introduced new security vulnerabilities. This dissertation investigates three critical security aspects of voice assistant systems. First, we demonstrate that attackers can synthesize malicious voice assistant commands from limited, unrelated speech samples to bypass currently available speaker verification mechanisms; in fact, they can achieve high success rates even with a lightweight unit-selection speech synthesis technique. Second, we show how malicious commands directed at voice assistants can be used to set up covert channels for data exfiltration from nearby compromised computers. We explore high-frequency modulation to make the data transfer unnoticeable to humans, and we characterize the achievable data rates and reliability of such a channel under various conditions. Third, we develop a novel liveness detection method based on harmonic distortion analysis, which leverages physical characteristics of audio reproduction systems to effectively distinguish between live and synthetic commands while maintaining computational efficiency. Through empirical evaluation, we demonstrate the feasibility of synthetic command attacks and the effectiveness of our proposed defense mechanism. Our findings highlight significant security challenges in current voice assistant systems and provide practical approaches for enhancing their security. This research contributes both to the understanding of voice assistant vulnerabilities and to the development of countermeasures against attacks that could exploit such vulnerabilities.