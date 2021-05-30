



Voice is becoming one of the main ways people interact with devices, but voice technology remains largely closed to African languages, accents, and speech patterns. A telling example: the world’s most popular voice assistants, Siri, Alexa, and Google Assistant, still don’t support any African language, even though more than 1,000 languages are spoken on the continent.

Common Voice, a crowdsourcing project launched by the Mozilla Foundation in 2017, is working to change that. It invites speakers of African languages to donate audio to a free, public dataset that researchers and developers can use to train voice-enabled apps, products, and services.

Common Voice

The idea was to diversify voice technology and democratize the space through open source initiatives, Chenai Chair, the Mozilla Foundation’s special advisor on Africa innovation, told Quartz.

To date, Common Voice has recorded over 9,000 hours of voice in 90 languages around the world, including three African languages: Luganda (Uganda), Kabyle (Algeria), and Kinyarwanda (Rwanda). This week, with the help of a $3.4 million investment from four organizations, Mozilla announced the project’s expansion to Swahili, an East African language spoken by an estimated 100 million people.

By making it easy to donate voice data in Swahili, Common Voice will help East Africans play a direct role in creating technologies that serve their communities, Chair said.

According to Mozilla, one of the project’s main goals is to assess the potential for developing speech recognition for the languages of underserved communities. Because the data is open source, local innovators can use it to develop products and services for marginalized communities, the organization adds.

The Kinyarwanda dataset, which has 1,800 hours of voice, is already being used by a startup called Digital Umuganda, which has developed Mbaza, an AI chatbot with voice-to-text and text-to-voice capabilities that provides Covid-19 information in the language. Common Voice’s Kinyarwanda project is especially important now because of the country’s push to digitize its public services, says project community leader Remy Muhire.

Africa is not the main market for Apple, Amazon and Google

Africa is not the main market for the tech companies behind popular voice assistants, such as Apple, Amazon, and Google. Another challenge is that much of the voice data used to train machine learning algorithms is held by a small number of large companies, which makes it difficult for others to develop high-quality speech recognition technology. Research shows that, as a result, African languages are being left behind in speech recognition innovation.

A spokeswoman for Amazon, the developer of Alexa, which supports at least eight languages, told Quartz that the company’s vision is for Alexa to be everywhere its customers are. She declined to comment on whether the company is doing anything to include African languages.

Apple, the developer of Siri, which also supports at least eight languages, didn’t respond to requests for comment. Google Assistant supports nearly 30 languages, none of them African, and the company declined to comment.

Chair does not believe that large international companies have the right to dictate how languages are used in technology. With Common Voice, she says, technologists can take the data and build models and technologies that work for their communities.

Guardianship must rest with the real people who speak these languages, she says.

