Transcript of Tomorrow Will Be Heard (S2E3), the podcast that deciphers the new uses of audio in our daily lives
Voice research will profoundly change our relationship to objects. One example is Vocal’iz, a virtual vocal coach imagined by MGEN that helps you analyze and work on your voice, your vocal tone, your breath control and even your public speaking every day via your smartphone.
Mélusine Harlé, Director of Prevention at MGEN: “It’s an app that you can download to your phone. You only have to register and then you can instantly take a test to discover the quality and tone of your voice. Is your voice quiet? Are you a soprano or an alto? Once you have taken the test, the app suggests exercises tailored to the result. Typically, if you have a tired voice, Vocal’iz will tell you: ‘Try some breathing exercises today.’ If your voice is feeling really good one day, Vocal’iz might suggest singing exercises or something much more powerful that will allow you to have a lot more fun during the day.”
Initially designed for teachers, the application is open to everyone and is based on research carried out at Ircam and developed by Ircam Amplify.
Mélusine Harlé: “Vocal health is important for MGEN, which is why we developed this application with Ircam Amplify. We said to ourselves, ‘No one has taken on this subject in a way that is both playful and educational.’ Naturally, we went to Ircam Amplify to see how they could help us, with Ircam’s research, to build a technological foundation that would allow us to realize our dream for this type of preventative care.”
Technology at the service of speech and health
Frederic Amadu, CTO of Ircam Amplify: “Alongside software functions developed in the Ircam laboratory, a signal analysis algorithm provides data on, for example, the frequency at which we speak, that is, our pitch, or overall tonality. Frequency is the easiest parameter to understand of those we examine. We also analyze other vocal parameters such as vocal power: is the user whispering or speaking too loudly? Our system also counts the number of syllables spoken in a given time frame, which makes it possible to tell whether a user is speaking quickly or slowly.
By looking at all these parameters, the main analysis we can provide is to decide whether the speaker’s diction and delivery are calm and understandable, and thus effective. Our technology provides raw analysis parameters, then we set thresholds in order to define whether the user’s speech is, for example, too high-pitched or too low-pitched, or too fast. Speech therapists helped to decide on those thresholds. We worked together with MGEN and speech therapists to define the rules and exercises used in Vocal’iz. The objective of the coaching is for a user to improve their speech scores by repeating an exercise several times, following the advice that the application gives based on the results of a given exercise.”
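To make the threshold idea concrete, here is a minimal sketch in Python of how raw analysis parameters (pitch, loudness, speaking rate) might be compared against coaching thresholds. The parameter names, threshold values, and messages are illustrative assumptions for this transcript, not the actual rules defined by Ircam Amplify, MGEN, or their speech therapists.

```python
# Illustrative sketch only: hypothetical thresholds, not Vocal'iz's real rules.
from dataclasses import dataclass

@dataclass
class SpeechMeasures:
    mean_pitch_hz: float          # average fundamental frequency of the recording
    level_db: float               # overall vocal power (loudness)
    syllables_per_second: float   # speaking rate over the analysis window

def assess_speech(m: SpeechMeasures) -> list[str]:
    """Compare raw parameters against assumed comfort thresholds and return
    coaching hints, mirroring the pitch, loudness, and speed checks described
    in the interview."""
    hints = []
    if m.mean_pitch_hz > 300:
        hints.append("Pitch is quite high; try relaxing and lowering your voice.")
    elif m.mean_pitch_hz < 90:
        hints.append("Pitch is quite low; try adding a little more energy.")
    if m.level_db < 50:
        hints.append("You are almost whispering; project a little more.")
    elif m.level_db > 75:
        hints.append("You are speaking loudly; ease off to protect your voice.")
    if m.syllables_per_second > 6:
        hints.append("You are speaking fast; slow down and mark your pauses.")
    return hints or ["Calm, intelligible delivery; keep it up."]

# Example: a fast, high-pitched reading of a text passage
print(assess_speech(SpeechMeasures(mean_pitch_hz=320, level_db=62, syllables_per_second=7.2)))
```

In the real application, as Frederic Amadu explains, those thresholds were set with speech therapists, and the resulting advice feeds the exercises the user repeats to improve their scores.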
Vocal’iz not only analyzes your voice, it also helps improve your public speaking.
Mélusine Harlé: “There is a whole series of exercises on prosody that let you work, in particular, on tone, rhythm, and pauses, which are very important. For example, there is a series of exercises built around great classics of French literature, such as the tirade from Cyrano de Bergerac that we all know. With Vocal’iz, we combine pleasure and vocal training.”
Corinne Loie, Prevention Officer at MGEN and an opera singer, was among the speech therapists who worked on the project: “Why take care of your voice? Because it will help you know yourself better. Most of the time, it also helps improve relationships and social interactions, and increases confidence and comfort in carrying out professional tasks. These improvements are MGEN’s goal as an occupational risk prevention organization. Most of the time, understanding our own voices helps us to bring ourselves into the world.”
Today, technology makes it possible to better ourselves through our interactions with others. Tomorrow, research offers the possibility of improving our interactions with objects.
Nathalie Birocheau, CEO of Ircam Amplify:
“It’s a huge field of research and there are different areas of application. There is the analysis, synthesis, and cloning or transformation of voices in real time. These uses often arise from the arts world, which is Ircam’s primary domain. The research then gradually finds use cases in other sectors. Ircam Amplify’s objective is to apply these technological foundations to industrial use cases, broadly related to interfaces with voice assistants, robots, and connected objects.
This is a very important field of possibility for us, especially since there will be 8 billion voice assistants in circulation by 2023 and it is estimated that roughly 30% of web browsing is already done without a screen. These vocal interactions have to be high-quality, otherwise we won’t want to use these technologies. But above all, the technology has to work. Devices have to understand when there are several speakers or how to interpret our speech differently based on the way we express ourselves. We know that between people, the delivery and the way communication is perceived are often more important than the content.
Today, objects do not yet analyze vocal prosody, the way in which we articulate a sentence, and therefore they cannot tell whether we are sad or in a hurry, whether there are children in the room, or how old the speaker is. If the user is an elderly person, a voice assistant may have to speak slower, more calmly and louder, and make different adjustments if the user is a child. Devices must learn to absorb all this information and then adapt their output appropriately. In a car there are a lot of sounds; in the middle of a storm, rain is beating on the windows, so the car’s voice assistant should speak louder. For now, these abilities are not yet integrated into the technologies that communicate with us as human beings.”
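As a thought experiment only, the kind of adaptation described here could be sketched as a simple rule table: a detected listener profile and ambient noise level drive the assistant’s speaking rate and volume. Every category, decibel figure, and function name below is a hypothetical illustration, not a description of any existing Ircam Amplify system.

```python
# Illustrative sketch of adaptive output: assumed categories and values only.
from dataclasses import dataclass

@dataclass
class OutputSettings:
    speaking_rate: float   # 1.0 = normal speed
    volume_gain_db: float  # boost applied to the synthesized voice

def adapt_output(listener_age_group: str, ambient_noise_db: float) -> OutputSettings:
    """Pick output settings from a detected speaker profile and noise level."""
    settings = OutputSettings(speaking_rate=1.0, volume_gain_db=0.0)
    if listener_age_group == "elderly":
        # Speak slower, more calmly, and louder for an elderly listener.
        settings.speaking_rate = 0.85
        settings.volume_gain_db += 3.0
    elif listener_age_group == "child":
        # Slightly slower, simpler delivery for a child.
        settings.speaking_rate = 0.9
    if ambient_noise_db > 70:
        # Rain on the windows, road noise: raise the voice to stay intelligible.
        settings.volume_gain_db += 6.0
    return settings

# Example: an elderly passenger in a noisy car during a storm
print(adapt_output("elderly", ambient_noise_db=78))
```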