Illness took her voice away. AI created a replica that she carries in her phone

PROVIDENCE, R.I. (AP) — The voice Alexis “Lexi” Bogan had before last summer was exuberant.

She loved belting out Taylor Swift and Zach Bryan ballads in the car. She laughed all the time — even while rounding up misbehaving toddlers or debating politics with friends over a backyard fire pit. In high school she was a soprano in the choir.

Then that voice disappeared.

Doctors removed a life-threatening tumor at the back of her brain in August. When the breathing tube came out a month later, Bogan had trouble swallowing and struggled to say “hello” to her parents. Months of rehabilitation helped her recovery, but her speech is still impaired. Friends, strangers and her own family members struggle to understand what she is trying to tell them.

In April, the 21-year-old got her old voice back. Not the real one, but a voice clone generated by artificial intelligence that she can summon from a phone app. Trained on a 15-second time capsule of her teenage voice – taken from a cooking demonstration video she recorded for a high school project – her synthetic but remarkably realistic-sounding AI voice can now say almost anything she wants.

She types a few words or sentences into her phone and the app immediately reads them out loud.

“Hello, can I please have a grande espresso with brown sugar and oat milk,” Bogan’s AI voice said as she held the phone out the window of her car at a Starbucks drive-thru.

Experts have warned that rapidly improving AI voice cloning technology could amplify phone fraud, disrupt democratic elections and violate the dignity of people – living or dead – who have never consented to having their voices simulated to say things they never said.

It has been used to produce deepfake robocalls for New Hampshire voters impersonating President Joe Biden. In Maryland, authorities recently accused a high school athletic director of using AI to generate a fake audio clip of the school’s principal making racist comments.

But Bogan and a team of doctors from the Lifespan hospital group in Rhode Island believe they have found an application that justifies the risks. Bogan is one of the first people – the only one with her condition – to recreate a lost voice with OpenAI’s new Voice Engine. Some other AI providers, such as startup ElevenLabs, have tested similar technology for people with speech impediments and loss – including one lawyer who is now using her voice clone in court.

“We hope that Lexi will be a pioneer as the technology continues to develop,” said Dr. Rohaid Ali, a neurosurgery resident at Brown University Medical School and Rhode Island Hospital. Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.

“We must be aware of the risks, but we cannot forget the patient and the social importance,” said Dr. Fatima Mirza, another resident working on the pilot. “We can help give Lexi her true voice back and she can speak in terms that are most true to herself.”

Mirza and Ali, who are married, caught the attention of ChatGPT maker OpenAI for their previous research project at Lifespan using the AI chatbot to simplify medical consent forms for patients. Earlier this year, the San Francisco company looked for promising medical applications for its new AI voice generator.

Bogan was still slowly recovering from surgery. The illness started last summer with headaches, blurred vision and facial drooping, alerting doctors at Hasbro Children’s Hospital in Providence. They discovered a vascular tumor the size of a golf ball pressing on her brain stem and entangled with blood vessels and cranial nerves.

“It was a struggle to control the bleeding and get the tumor out,” said pediatric neurosurgeon Dr. Konstantina Svokos.

The 10-hour surgery combined with the location and severity of the tumor damaged Bogan’s tongue muscles and vocal cords, hindering her ability to eat and talk, Svokos said.

“It’s almost like a part of my identity was taken away when I lost my voice,” Bogan said.

The feeding tube came out this year. Speech therapy continues, allowing her to speak intelligibly in a quiet room, but without any signs she will regain the full clarity of her natural voice.

“At some point I started to forget what I sounded like,” Bogan said. “I’ve gotten so used to the way I sound now.”

Whenever the phone rang at the family’s home in the Providence suburb of North Smithfield, she passed it to her mother to take her calls. She felt like she was bothering her friends when they went to a loud restaurant. Her father, who has hearing loss, had difficulty hearing her.

Back at the hospital, doctors were looking for a pilot patient to experiment with OpenAI’s technology.

“The first person that came to Dr. Svokos’ mind was Lexi,” Ali said. “We contacted Lexi to see if she would be interested, not knowing what her response would be. She wanted to try it out and see how it would work.”

Bogan had to go back a few years to find a suitable recording of her voice to “train” the AI system on the way she spoke. It was a video in which she explained how to make a pasta salad.

Her doctors deliberately fed the AI system only a 15-second clip. Cooking sounds made other parts of the video imperfect. It was also all that OpenAI needed: an improvement over previous technology that required much longer samples.

They also knew that getting something useful out of 15 seconds could be crucial for future patients who have no trace of their voice on the Internet. A short voicemail left for a family member may be sufficient.

When they tested it for the first time, everyone was amazed by the quality of the voice clone. Occasional glitches – a mispronounced word, a missing intonation – were usually unnoticeable. In April, doctors equipped Bogan with a custom-made phone app that only she can use.

“Every time I hear her voice I get so emotional,” her mother, Pamela Bogan, said through tears.

“I’m excited to be able to have that sound again,” Lexi Bogan added, saying it helped “boost my confidence to a level where it was before all this happened.”

She now uses the app about 40 times a day, sending feedback that she hopes will help future patients. One of her first experiments was to talk to the children at the kindergarten where she works as a teaching assistant. She typed “ha ha ha ha,” expecting a robotic response. To her surprise, it sounded like her old laugh.

She used it at Target and Marshall’s to ask where she could find stuff. It helped her reconnect with her father. And it has made it easier for her to order fast food.

Bogan’s doctors have begun cloning the voices of other willing Rhode Island patients and hope to bring the technology to hospitals around the world. OpenAI said it is being cautious in expanding the use of Voice Engine, which is not yet publicly available.

A number of smaller AI startups are already selling voice cloning services to entertainment studios or making them more widely available. Most voice generation providers say they prohibit impersonation or abuse, but they vary in how they enforce their terms of use.

“We want to ensure that everyone whose voice is used in the service provides ongoing consent,” said Jeff Harris, OpenAI’s head of product. “We want to make sure it is not used in political contexts. That is why we have chosen to be very limited in who we give the technology to.”

Harris said OpenAI’s next step is to develop a secure “voice authentication” tool so users can replicate only their own voice. That could be “limiting for a patient like Lexi, who suddenly lost her ability to speak,” he said. “So we think we need to have high-trust relationships, especially with medical providers, to give a little bit more unfettered access to the technology.”

Bogan has impressed her doctors with her focus on thinking about how the technology could help others with similar or more severe speech impediments.

“Part of what she’s done throughout this process is thinking about ways to adapt and change this,” Mirza said. “She has been a great inspiration to us.”

While for now she has to fiddle with her phone to get the voice engine to talk, Bogan envisions an AI voice engine that improves on older voice restoration solutions – like the robotic-sounding electrolarynx or a voice prosthesis – by merging with the human body or translating words in real time.

She is less sure of what will happen as she gets older and her AI voice continues to sound like it did when she was a teenager. Perhaps the technology could “age” her AI voice, she said.

For now, “even though I don’t have my voice completely back, I have something to help me find my voice again,” she said.

___

The Associated Press and OpenAI have a licensing and technology agreement that gives OpenAI access to some of AP’s text archives.
