Due to the treatment of a tumor, this woman was no longer able to speak. AI gave her her voice back

Before she underwent life-saving and life-changing surgery, the voice of young American Alexis ‘Lexi’ Bogan was exuberant.

She loved belting out Taylor Swift and Zach Bryan ballads in the car. She was always smiling, even when rounding up misbehaving toddlers or debating politics with friends. At school she was a soprano in the choir.

Then, overnight, that voice was gone.

In August last year, doctors removed a tumor at the back of her brain. When the breathing tube came out a month later, Bogan had trouble swallowing and struggled to say “hello” to her parents.

Months of rehabilitation helped her recovery, but her speech is still impaired. Friends, strangers and her own family members struggle to understand what she is trying to tell them.

In April, the 21-year-old got her old voice back. Not the real one, but a voice clone generated by artificial intelligence (AI) technology from ChatGPT maker OpenAI that she can summon from a phone app.

Fatigue over AI deepfakes

Trained on a 15-second time capsule of her teenage voice – taken from a cooking demonstration video she recorded for a school project – her synthetic but remarkably realistic-sounding AI voice can now say almost anything she wants.

She types a few words or sentences into her phone and the app immediately reads it out loud.

“Hello, can I please have a large espresso shaken with brown sugar and oat milk,” Bogan’s AI voice said as she held the phone out the window of her car at a Starbucks drive-thru.

Experts have warned that rapidly improving AI voice cloning technology could amplify phone fraud, disrupt democratic elections and violate the dignity of people – living or dead – who never consented to having their voices simulated to say things they never said.

It has been used to produce deepfake robocalls to New Hampshire voters impersonating US President Joe Biden.

In the US state of Maryland, authorities recently accused a high school athletic director of using AI to generate a fake audio clip of the school principal making racist comments.

But Bogan and a team of doctors from the Lifespan hospital group in Rhode Island believe they have found an application that justifies the risks.

Alexis Bogan types an answer to a journalist’s question with an app that approximates her lost voice. -Josh Reynolds/AP

Recreating lost voices

Bogan is one of the first people – and the only one with her condition – who has been able to recreate a lost voice with OpenAI’s new Voice Engine.

Some other AI providers, such as startup ElevenLabs, have tested similar technology for people with speech impediments and speech loss, including one lawyer who now uses her voice clone in court.

“We hope Lexi will be a pioneer as the technology develops,” says Dr. Rohaid Ali, a neurosurgery resident at Brown University Medical School and Rhode Island Hospital.

Millions of people with debilitating strokes, throat cancer or neurodegenerative diseases could benefit, he said.

“We must be aware of the risks, but we cannot forget the patient and the social importance,” said Dr. Fatima Mirza, another resident working on the pilot. “We can help Lexi regain her true voice and speak in terms that are most true to herself.”

Mirza and Ali, who are married, caught the attention of ChatGPT maker OpenAI for their previous research project at Lifespan using the AI chatbot to simplify medical consent forms for patients.

Earlier this year, the San Francisco company looked for promising medical applications for its new AI voice generator.

Slow recovery

Bogan was still slowly recovering from surgery.

The illness started last summer with headaches, blurred vision and facial drooping, alerting doctors at Hasbro Children’s Hospital in Providence.

They discovered a vascular tumor the size of a golf ball pressing on her brain stem and entangled with blood vessels and cranial nerves.

“It was a struggle to control the bleeding and get the tumor out,” said pediatric neurosurgeon Dr. Konstantina Svokos.

The location and severity of the tumor, combined with the complexity of the 10-hour surgery, damaged Bogan’s control of her tongue muscles and vocal cords, hindering her ability to eat and talk, Svokos said.

“It’s almost like a part of my identity was taken away when I lost my voice,” Bogan said.

The feeding tube came out this year. Speech therapy continues, allowing her to speak intelligibly in a quiet room, but without any signs she will regain the full clarity of her natural voice.

“At some point I started to forget what I sounded like,” Bogan said. “I’ve gotten so used to the way I sound now.”

‘Train’ AI how to speak

Whenever the phone rang at the family’s home in the Providence suburb of North Smithfield, she passed it to her mother to take her calls.

She felt like she was bothering her friends when they went to a loud restaurant. Her father, who has hearing loss, had difficulty hearing her.

Back at the hospital, doctors were looking for a pilot patient to experiment with OpenAI’s technology.

“The first person that came to Dr. Svokos’ mind was Lexi,” Ali said. “We contacted Lexi to see if she would be interested, not knowing what her reaction would be. She was eager to try it out and see how it would work.”

Bogan had to go back a few years to find a suitable recording of her voice to ‘train’ the AI system on the way she spoke. It was a video in which she explained how to make a pasta salad.

Her doctors deliberately fed the AI system only a 15-second clip. Cooking sounds made other parts of the video unusable. It was also all that OpenAI needed: an improvement over previous technology that required much longer samples.

They also knew that getting something useful out of 15 seconds could be crucial for future patients who have no trace of their voice on the Internet. A short voicemail left for a family member may be sufficient.

‘Every time I hear her voice I get so emotional’

When they tested it for the first time, everyone was amazed by the quality of the voice clone. Occasional problems – a mispronounced word, a missing intonation – were usually unnoticeable.

In April, doctors equipped Bogan with a custom-made phone app that only she can use.

“Every time I hear her voice I get so emotional,” her mother, Pamela Bogan, said through tears.

“I’m excited to be able to have that sound again,” Lexi Bogan added, saying it helped “kind of boost my confidence to where it was before all this happened.”

She now uses the app about 40 times a day, sending feedback that she hopes will help future patients.

One of her first experiments was to talk to the children at the kindergarten where she works as a teaching assistant.

She typed “ha ha ha ha” expecting a robotic response. To her surprise, it sounded like her old laugh.

She used it at Target and Marshall’s to ask where she could find things. It helped her reconnect with her father. And it has made it easier for her to order fast food.

Bogan’s doctors have begun cloning the voices of other willing Rhode Island patients and hope to bring the technology to hospitals around the world.

OpenAI said it is being cautious in expanding its use of Voice Engine, which is not yet publicly available.

A number of smaller AI startups are already selling voice cloning services to entertainment studios or making them more widely available.

Most voice generation providers say they prohibit impersonation or abuse, but they vary in how they enforce their terms of use.

Alexis Bogan (center) and her mother Pamela Bogan (right) react to hearing a recreation of her lost voice via a prompt typed by Dr. Fatima Mirza (left). -Josh Reynolds/AP

Wider access to AI voice cloning

“We want to ensure that everyone whose voice is used in the service provides ongoing consent,” said Jeff Harris, OpenAI’s head of product.

“We want to make sure it’s not used in political contexts. That’s why we’ve chosen to be very limited in who we give the technology to.”

Harris said OpenAI’s next step is to develop a secure “voice authentication” tool so users can replicate only their own voice. That could be “limiting for a patient like Lexi, who suddenly lost her ability to speak,” he said.

“So we think we need to have high-trust relationships, especially with medical providers, to give a little bit more unfettered access to the technology.”

Bogan has impressed her doctors with her focus on thinking about how the technology could help others with similar or more severe speech impediments.

“Part of what she’s done throughout this process is thinking about ways to adapt and change this,” Mirza said. “She has been a great inspiration to us.”

While for now she has to fiddle with her phone to get the voice engine to talk, Bogan envisions an AI voice engine that improves on older speech restoration remedies – like the robotic-sounding electrolarynx or a voice prosthesis – by fusing with the human body or translating words in real time.

She is less sure of what will happen as she gets older and her AI voice continues to sound like it did when she was a teenager. Perhaps the technology could “age” her AI voice, she said.

For now, “even though I don’t have my voice completely back, I have something to help me find my voice again,” she said.
