Africa, home to more than 2,000 languages, faces a major challenge in the AI era: most of its indigenous languages are left out of digital tools because little written data exists for them. To address this, a collaborative project, AfricaNext Voices, has created the largest known dataset of African languages to date. Researchers recorded 9,000 hours of speech across 18 languages, capturing everyday conversations about farming, health, and education to build an inclusive resource. Funded by the Gates Foundation, the open-access database aims to power AI tools that translate, transcribe, and respond in local languages. Early examples, such as farming apps in Setswana and startups such as Lelapa AI, point to the real-world benefits. Advocates say the effort safeguards not just communication but also Africa’s culture, knowledge, and imagination in the digital future.
BBC



