Phillipa Rewaj, Rebecca Devon and Shuna Colville from the Euan MacDonald Centre for MND Research, University of Edinburgh, help us celebrate Global MND Awareness Day. This year’s theme is ‘voice’ and here the researchers provide us with an update on their pioneering ‘voicebanking’ project, which is part-funded by the MND Association.
Ring ring….ring ring….
“Hello?”
“Hi there, it’s me.”
“Oh hello dear, how nice to hear from you!”
Sound familiar? How many of your friends or family could you recognise from a few words of their voice? Two, five, ten or more?
It may never have occurred to you before, but our voices are as unique as our face shape, our walk and even our eyes. A person’s voice is an essential component of his or her identity.
This is part of the reason why it can be so distressing for some people living with MND—or indeed any neurodegenerative condition—if they start to lose their voice. For some, the early symptoms of altered speech or dysarthria (dis-arth-ree-ah) can lead rapidly to an inability to move the muscles in the mouth and tongue that are required for speech.
It is only when speech becomes difficult that it becomes clear just how valuable it is as a communication tool, and how much we take it for granted. Even asking for a cup of tea can become a time-consuming effort, let alone holding a conversation.
Alternative communication
It is in these situations that people often rely on “Augmentative and Alternative Communication” (AAC) to get their message across. AAC can range from simple strategies, such as gestures or writing with pen and paper, through to voice output communication aids (VOCAs) that generate synthetic speech from text entered on a keyboard or via an eye-tracking system.
However, many people complain that the synthetic voices pre-installed in these devices don’t represent the user’s identity. The British physicist Stephen Hawking, probably the most famous AAC user and person with MND alive today, is now so closely associated with his American voice that it has become part of his identity, and he has declined offers to update it. But this is not the case for everyone. While the quality of synthetic voices available on many AAC devices has improved greatly in the last few years, users are often limited to a choice of only a few voices, of which only a couple might be British, let alone representative of their own accent.
Voice banking: speaking with your own voice
But what if it were possible to speak in your own voice through an AAC device? This idea sparked a research project at the Euan MacDonald Centre for MND Research in Scotland. Clinical researchers teamed up with speech and language therapists and speech scientists at the University of Edinburgh’s Centre for Speech Technology Research to try to deliver personalised synthetic voices for people living with MND to use in AAC devices. The voicebanking project was born, and its development has been part-funded by the MND Association.
Ideally, we record a person’s voice soon after diagnosis, before speech has become affected. People are asked to read aloud around 400 sentences (which takes about an hour) whilst being recorded in our purpose-built sound-proof room in the Anne Rowling Regenerative Neurology Clinic in Edinburgh. The sentences have been chosen to capture all the speech sounds of English in all their different possible combinations. While 400 sentences is the ideal number, we can create a synthetic voice from as few as 100 sentences if people aren’t able to manage the full 400. This voice recording is then “banked” and stored, ready to be used to create a synthetic voice for a communication aid if, and when, that person needs one. Using software developed by speech scientists, all the parameters of that unique voice can be automatically analysed and synthetically reproduced in a process called “voice cloning”.
This is where “donor” voices come into play. During the voice cloning process, the synthetically reproduced parameters of a patient’s voice are combined with those of healthy donor voices. Features of donor voices of the same age, sex and regional accent as the patient are pooled together to form an “average voice model” (AVM), which acts as a base on which to generate the synthetic voice. It’s a bit like going to the paint-mixing counter in a DIY shop: you start with a 5 litre tin of light gloss base paint and mix in your personal colour of choice (Sumptuous Plum, for example…). It is the use of these donor voices that means we only need a short recording from the patient, as the bulk of the speech data has already been collected in the donor AVM or “base paint”.
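For readers curious about what this “mixing” might look like in practice, below is a deliberately simplified sketch in Python. It is not the Centre’s actual software (the real system adapts the statistical parameters of a speech-synthesis model, not a short list of numbers), and the numbers and the blend_voice function are hypothetical stand-ins, used only to illustrate how donor voices can be averaged into a base model and then nudged towards an individual’s voice.

```python
import numpy as np

# Hypothetical "voice parameters". In the real system these would be the
# statistical parameters of a speech-synthesis model, not three numbers.
donor_voices = [
    np.array([0.80, 1.10, 0.95]),  # donor 1: same age, sex and accent as the patient
    np.array([0.78, 1.05, 1.00]),  # donor 2
    np.array([0.83, 1.12, 0.92]),  # donor 3
]

# The "average voice model" (AVM): the tin of base paint.
avm = np.mean(donor_voices, axis=0)

# A short recording from the patient gives an estimate of their own
# parameters: the personal colour to be mixed in.
patient = np.array([0.90, 1.20, 0.88])

def blend_voice(avm, patient, patient_weight=0.7):
    """Blend the donor average with the patient's own parameters.

    patient_weight controls how much of the personal voice is mixed in:
    closer to 1.0 keeps more of the patient, closer to 0.0 leans on the AVM.
    """
    return patient_weight * patient + (1.0 - patient_weight) * avm

personal_voice = blend_voice(avm, patient)
print("AVM (base paint): ", np.round(avm, 3))
print("Personalised voice:", np.round(personal_voice, 3))
```

Because the donor AVM already carries most of the speech data, only a relatively small contribution from the patient’s short recording is needed to personalise the result.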
What about people who are already beginning to lose their voice?
The really clever bit happens if a person comes in to record his or her voice once there is already mild to moderate dysarthria. It is possible for us to “repair” the voice in the synthesis process using more of the donor AVM to patch the damaged elements of the voice (adding more of the “base paint” to the personal colour). To date, we have recorded the voices of around 600 healthy individuals – old and young, male and female, and with a glorious range of regional accents. We are always trying to expand our bank of voices, as the bigger the pool of donors, the closer we can get to replicating the original voice of the person living with MND.
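Continuing the hypothetical sketch above, “repairing” a voice corresponds to leaning more heavily on the healthy donor average – in other words, lowering the patient’s share of the blend so that the AVM fills in the damaged elements:

```python
# More "base paint": a lower patient_weight lets the healthy donor average
# stand in for the parts of the voice affected by dysarthria.
repaired_voice = blend_voice(avm, patient, patient_weight=0.4)
```

Again, this is only an illustration; the weighting in the real voice-cloning software is far more sophisticated than a single number.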
Developing a personalised synthetic voice app
Recently, we asked a small number of people living with MND to pilot a communication aid app we are developing for the iPad, into which we can install a personalised synthetic voice. Feedback on the intelligibility of the synthesised voice, and on its similarity to the patient’s own, was good, although there is still room for improvement. Our long-term goal, once we have refined and tested the app further, is to make it widely available as a communication aid, helping people living with MND retain their self-identity and dignity by keeping their own voice.