Inclusion of African Languages into AI: What’s at Stake

Inclusion of African Languages into AI: What’s at Stake - Nelson Otieno

The inclusion of African indigenous languages in AI models is needed to safeguard Africans’ rights. While African languages proved resilient under European colonialism, the contemporary moment presents new challenges.

Without deliberate efforts to integrate indigenous languages into AI models, African cultural schema and ways of writing face a greater risk of marginalization. This is why initiatives by Orange and other tech giants that plan to train AI models in African languages are a welcome development.

The 2024 announcement shows that AI companies are paying attention to investment in Africa. The first phase of this collaboration will focus on incorporating key regional languages, primarily those spoken in West Africa. Even so, these commercial ventures raise other concerns, especially when one thinks about the checkered history of industrialization.

Lessons from the Past

I seek to draw attention to how global historical experiences are interconnected, providing insights as we navigate the present and look toward the future. Comparisons between industrialization and digitalization can offer cautionary lessons for training AI models in African languages.

Like industrialization, the development of AI involves groundbreaking technical change. However, almost all of the changes brought by industrialization were carried out in imperial languages. When industrial technologies were introduced to colonized spaces, local languages were displaced.

Industrialization saw traditional crafts abandoned in favor of modern machinery, which delivered growth that favored metropoles. As Marx and Engels documented, this shift came at a steep cost because its rewards were unevenly distributed.

What is at Stake?

Keeping the uneven experience of industrialization in mind, the critical question is how to ensure that its inequalities are not repeated. These inequalities can be social, political, and economic as well as linguistic. And so, how can we avoid replicating the displacement of African languages? How might African languages be integrated into the data systems used for AI training and development? How can we create markets for African languages while recognizing that market forces can also displace them?

Mainstreaming African Languages

As we seek to answer these questions and look ahead to the incorporation of African languages into AI, several reflections are worth considering.

First, attention must be given to the full range of African languages. There may be markets for more widely spoken languages like Wolof and Pulaar. But if commercial viability is the only consideration, then there is a risk of creating an internal linguistic hierarchy that replicates displacement.

Second, it would be helpful for developers, namely Orange, Meta, and OpenAI, to account for the linguistic variation within African languages. Dialects bring subtle nuances that can differ between regions, communities, and generations.

Third, AI developers can consider the standards set forth by African regional frameworks that emphasize the importance of mainstreaming languages into technological development.

The lessons from colonial industrialization are stark. Regions excluded from early technological transformations often faced decades or even centuries of economic and social disadvantage. To avoid repeating this pattern with AI, we must act decisively during these formative stages of development to deeply integrate African languages and linguistic patterns into AI architectures and training data.

Afrika Techno Policy
