Safeguarding Voices via Adversarial Examples: Defense and Way Forward in the Era of GenAI
Creation date: Jan. 22, 2024

Description
Recent advancements in generative AI are bringing paradigm shifts to society. With contemporary AI-based voice synthesizers, it is now practical to produce speech that vividly mimics a specific person. While these technologies are designed to improve lives, they also pose significant risks of misuse, potentially harming voice actors' livelihoods and enabling financial scams. In recognition of such threats, existing strategies focus primarily on detecting synthetic speech. Complementary to these defenses, we propose AntiFake, a proactive approach that hinders unauthorized speech synthesis. AntiFake works by adding minor perturbations to speech samples, so that an attacker's synthesis attempts yield audio that does not sound like the target speaker. To attain an optimal balance among sample quality, protection strength, and system usability, we propose adversarial optimization over this three-way trade-off, guided by minimal user input. With this work, we take an initial step toward actively protecting our voices, and highlight the ongoing need for robust and sustainable defenses in this evolving landscape.
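To make the mechanism concrete, the following is a minimal, hypothetical sketch of the kind of adversarial optimization the abstract describes: perturbing a signal within a small budget so its "speaker embedding" drifts away from the original, while the budget stands in for the sample-quality constraint. The linear `embed` function, the loss, and all parameters here are toy assumptions for illustration; AntiFake's actual objective, encoder models, and optimization procedure are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(x, W):
    """Toy 'speaker embedding': a fixed linear projection, L2-normalized.
    (Real systems use deep speaker encoders; this is an assumption.)"""
    e = W @ x
    return e / np.linalg.norm(e)

def antifake_perturb(x, W, eps=0.1, steps=50, lr=0.01):
    """Sign-gradient descent on cosine similarity to the original embedding,
    with the perturbation clipped to an L-infinity budget eps.
    The budget eps is the stand-in for the quality/protection trade-off."""
    target = embed(x, W)          # embedding the attacker would try to clone
    delta = np.zeros_like(x)
    for _ in range(steps):
        e = W @ (x + delta)
        n = np.linalg.norm(e)
        # Analytic gradient of target . (We/|We|) for the toy linear embed:
        # d/dx = W^T (target/n - (target.e) e / n^3)
        g = W.T @ (target / n - (target @ e) * e / n**3)
        delta -= lr * np.sign(g)              # push similarity down
        delta = np.clip(delta, -eps, eps)     # keep perturbation small
    return x + delta

W = rng.normal(size=(8, 64))      # toy embedding matrix (assumption)
x = rng.normal(size=64)           # stand-in for a speech sample
x_adv = antifake_perturb(x, W)
sim = float(embed(x, W) @ embed(x_adv, W))
print(f"cosine similarity after perturbation: {sim:.3f}")
```

The design point the sketch illustrates is the three-way tension in the abstract: a larger `eps` degrades the synthesizer's target more (protection strength) at the cost of audible distortion (sample quality), and the choice of budget is exactly where user input would guide the optimization.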