Speak Like a Native: NVIDIA Parlays Win in Voice Challenge
From NVIDIA:
A team at NVIDIA, including Akshit Arora and Rafael Valle, won the LIMMITS ’24 challenge, advancing personalized voice interfaces for more than a billion native speakers in India. The AI model they developed can recreate a speaker’s voice in English and six Indian languages, with the correct accents, from only a three-second speech sample.
Existing voice interfaces often fail to capture the accent of the target language or the nuances of the speaker’s voice. The challenge judged entries on the naturalness of the synthesized speech and its similarity to the speaker’s voice. Further development of the technology promises personalized, realistic conversations and experiences that break down language barriers.
The members of the winning team, Arora, Valle, Kim, and Badlani, each have a personal motivation for developing this technology. Arora, a native Punjabi speaker, aims to bridge the language barrier with his wife’s family, who speak Tamil. Valle, originally from Brazil, faces a similar challenge with his wife’s family, who speak Gujarati. Badlani was inspired by his experience living in states with seven distinct languages.
The team’s winning code base was already focused on Indic languages, and development accelerated once the team learned of the 2024 challenge only 15 days before the deadline. Fortunately, a member of the team had been working on a suitable AI model, and the sprint ultimately secured the victory. The resulting P-Flow model will be part of the NVIDIA Riva framework, enabling anyone to deploy the technology.