More advanced and realistic audio manipulation is now possible with the development of new software such as Adobe VoCo and Lyrebird. With the ability to alter speech or even add new words, these tools are intended to make audio editing easier in cases where a speaker stumbles over a word or misspeaks. However, they bring with them various ethical concerns and could contribute to rising distrust of the media.
Adobe’s anticipated VoCo software works by breaking a sample of recorded speech down into phonemes, the smallest discernible units of speech, according to TechCrunch. Those phonemes are used to build a speech model that can then alter the original recording.
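Adobe has not published VoCo’s internals, but the first step TechCrunch describes, mapping words to their phonemes, can be sketched in a few lines of Python. The pronunciation entries below are illustrative stand-ins, not Adobe’s data, and a real system would also align each phoneme with the recorded audio.

```python
# A minimal, hypothetical sketch of the phoneme-breakdown step:
# map each word of a transcript to ARPAbet-style phoneme symbols.
# The dictionary here is illustrative; a real system would use a
# full pronunciation lexicon and align phonemes to the waveform.

PRONUNCIATIONS = {
    "edit": ["EH", "D", "IH", "T"],
    "the":  ["DH", "AH"],
    "tape": ["T", "EY", "P"],
}

def to_phonemes(transcript: str) -> list[str]:
    """Flatten a transcript into its phoneme sequence."""
    phonemes = []
    for word in transcript.lower().split():
        phonemes.extend(PRONUNCIATIONS.get(word, ["<unk>"]))
    return phonemes

print(to_phonemes("edit the tape"))
# ['EH', 'D', 'IH', 'T', 'DH', 'AH', 'T', 'EY', 'P']
```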
The technology relies on deep learning, an approach to machine learning that uses layered neural networks, loosely modeled on the brain, to break a task down and recreate it in a way a computer can handle.
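The “layers” in such a network are essentially stacked transformations, each feeding its output to the next. A bare-bones sketch of a two-layer forward pass, with random stand-in weights rather than a trained model, might look like this:

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w, b):
    # One neural-network layer: a linear transform plus a nonlinearity.
    return np.tanh(x @ w + b)

# Two stacked layers: 16 input features -> 8 hidden units -> 2 outputs.
w1, b1 = rng.normal(size=(16, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

x = rng.normal(size=(1, 16))     # one example with 16 features
hidden = layer(x, w1, b1)        # first layer breaks the input down
output = layer(hidden, w2, b2)   # second layer builds the final output
print(output.shape)              # (1, 2)
```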
Asst. Computer Science Prof. Vicente Ordóñez said this technology uses a form of deep learning called “generative adversarial networks” which allows for the generation of new content, in this case the generation of words that had not been spoken.
“So you show your software, or model, or algorithm a bunch of examples of what you’re trying to generate and the model will learn to generate new samples following, more or less, the distribution of the samples you showed the algorithm for images, text and audio,” Ordóñez said.
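As a rough sketch of the adversarial setup Ordóñez describes, the following toy PyTorch loop trains a generator to mimic a simple one-dimensional distribution while a discriminator learns to tell real samples from generated ones. Real systems like VoCo or Lyrebird operate on audio and are far more elaborate; this only illustrates the two-network principle.

```python
import torch
import torch.nn as nn

# Toy GAN sketch: the generator learns to mimic the distribution of
# "real" data (here, Gaussian noise centered at 4.0), while the
# discriminator learns to tell real samples from generated ones.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(2000):
    real = torch.randn(64, 1) + 4.0   # the examples "shown to the algorithm"
    fake = G(torch.randn(64, 8))      # the samples the model generates

    # Discriminator step: label real data 1, generated data 0.
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + \
             loss_fn(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: try to make the discriminator output 1 on fakes.
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(G(torch.randn(256, 8)).mean().item())  # should drift toward 4.0
```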
Engineering and Society Assoc. Prof. Rosalyn Berne said the release of this new technology is intended as an incremental improvement on existing tools, despite its potential negative implications.
“These technologies are moving so rapidly that I think it’s going to be a little hard for human beings to keep up with the changes,” Berne said. “Because we’ve adopted these things over evolution of a million years, and now they’re being tampered with pretty directly.”
These technologies may also impact the way we form and define relationships, Berne said.
“There’s a great film, ‘Her’, that I use in my class, that sort of began to play with that notion of what will it mean when we begin to confuse relationships with people with systems,” Berne said. “It’s foreshadowing a question about what is a relationship, what do we expect from other people when we are now having to interpret meaning from systems that sounds a whole lot like what a person sounds like.”
Not only may these technologies alter perceptions of what a relationship entails, but they also have the potential to be used to imitate others, especially those in the public eye.
This technology would allow a user “to generate the audio that resembles a particular person you’re targeting,” Ordóñez said. “This is more problematic for public figures where you can gather a lot of audio from public speeches so you could conceivably do that.”
The manipulation of recorded speeches also has the potential to contribute to the rise of fake news.
“It’s very relevant these days with the fake news,” Ordóñez said. “You will be able to generate not just text that is fake, but potentially the video or the voice will sound like a public figure.”
Berne said she believes it will be impossible to ignore programs like VoCo or any prevalent technology, similar to how students can no longer go without a computer.
One response to this technology is teaching students to better identify fake news as part of college courses, Berne said. However, a more proactive way to address its implications would be to integrate ethical safeguards into the software itself.
“You could conceive of having some type of code or signature that accompanies audio or video manipulations that shows that it’s been digitally modified after some point in time,” Ordóñez said.
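No such standard exists yet, so any implementation is speculative. As one hypothetical sketch, an editing tool could attach a keyed digest and a timestamp to a modified file; a production scheme would more likely use public-key signatures so anyone could verify the tag without holding the secret key.

```python
import hashlib
import hmac
import json
import time

# Hypothetical sketch: tag an edited audio file with a keyed digest and a
# timestamp recording that it was modified. HMAC from the standard library
# stands in for the public-key signature a real scheme would use.

SECRET_KEY = b"editor-tool-secret"   # stand-in for a real signing key

def sign_edit(audio_bytes: bytes) -> dict:
    """Produce a manifest recording that the audio was modified, and when."""
    manifest = {
        "modified_at": int(time.time()),
        "sha256": hashlib.sha256(audio_bytes).hexdigest(),
    }
    payload = json.dumps(manifest, sort_keys=True).encode()
    manifest["tag"] = hmac.new(SECRET_KEY, payload, "sha256").hexdigest()
    return manifest

def verify_edit(audio_bytes: bytes, manifest: dict) -> bool:
    """Check both the tag and that the hash matches the file we have."""
    claimed = {k: v for k, v in manifest.items() if k != "tag"}
    payload = json.dumps(claimed, sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, payload, "sha256").hexdigest()
    return (hmac.compare_digest(expected, manifest["tag"])
            and claimed["sha256"] == hashlib.sha256(audio_bytes).hexdigest())

audio = b"\x00\x01fake-waveform-bytes"
m = sign_edit(audio)
print(verify_edit(audio, m))          # True
print(verify_edit(audio + b"!", m))   # False: file changed after signing
```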
Even when engineers have the technical capability, designing in elements to prevent misuse may not be among their primary goals, Berne said.
According to Berne, these concerns are more likely to be addressed post-development, at the policy level. Often this happens after the technology has hit the market, when questions concerning expectations, consumer needs and consumer safety come to the forefront. Like Berne, Ordóñez said ethical concerns may be better addressed outside the realm of engineering.
“Certainly software engineers have the technology, but for addressing some of the other ethical questions, you have to look into the humanities as well,” Ordóñez said. “I think it will require a lot of interdisciplinary work, with other people working on computer security perhaps, working on digital content identification, but I think in general it even comes down to doing better journalism.”
Ordóñez said that readers must also be more wary of what they see, hear or read on the internet.
According to Berne, the excitement of new technology can often be a distraction.
“Most of us are just becoming skeptical, but we’re also enamored with a lot of this technology,” Berne said. “A lot of people get excited about the next iPhone, and the next app and what it can do. As consumers who have a propensity for wanting to be entertained, I think we are going to be more entertained than concerned until it gets to the point where we actually realize what has happened.”
Nevertheless, people must be discerning when consuming media, Berne said.
“Be a critical consumer, of everything,” Berne said. “A lot of us are very careful now in choosing what we eat, what we put in our bodies. Maybe we just have to be really careful about what we put in our minds, and that’s a matter of intention.”