Detecting Deepfake Audio by Modeling the Human Acoustic Tract

Source

This is interesting research : In this paper, we develop a new mechanism for detecting audio deepfakes using techniques from the field of articulatory phonetics. Specifically, we apply fluid dynamics to estimate the arrangement of the human vocal tract during speech generation and show that deepfakes often model impossible or highly-unlikely anatomical arrangements. When parameterized to achieve 99.9% precision, our detection mechanism achieves a recall of 99.5%, correctly identifying all but one deepfake sample in our dataset. From an article by two of the researchers: The first step in differentiating speech produced by humans from speech generated by deepfakes is [...]