
Inclusion of African Languages into AI: What’s at Stake - Nelson Otieno

The inclusion of African indigenous languages in AI models is needed to safeguard Africans’ rights. While African languages proved resilient during European colonialism, the contemporary moment presents new challenges.

Without deliberate efforts to integrate indigenous languages into AI models, African cultural schema and ways of writing face a greater risk of marginalization. This is why initiatives by Orange and other tech giants that plan to train AI models in African languages are a welcome development.

The 2024 announcement shows that AI companies recognize the importance of investing in Africa. The first phase of this collaboration will focus on incorporating key regional languages, primarily those spoken in West Africa. Even so, these commercial ventures raise other concerns, especially when one thinks about the checkered history of industrialization.




Lessons from the Past

I seek to draw attention to how global historical experiences are interconnected, providing insights as we navigate the present and look toward the future. Comparisons between industrialization and digitalization can offer cautionary lessons for training AI models in African languages.

Like industrialization, the development of AI involves groundbreaking technical change. During industrialization, however, almost all of these changes were carried out in imperial languages. When industrial technologies were introduced to colonized spaces, local languages were displaced.

Industrialization saw the abandonment of traditional crafts in favor of modern machinery, which delivered growth that favored metropoles. However, this shift came at a steep cost, as Marx and Engels documented, because the rewards were unevenly distributed.

 

What is at Stake?

Keeping the uneven experience of industrialization in mind, the critical question is how to ensure that its inequalities are not repeated. These inequalities can be social, political, economic, and linguistic. And so, how can we avoid replicating the displacement of African languages? How might African languages be integrated into the data systems used for AI training and development? How can we create markets for African languages while recognizing that market forces can also be forces that displace?

 

Mainstreaming African Languages

As we seek to answer these questions and look ahead to the incorporation of African languages into AI, several reflections are worth considering.

First, attention must be given to the full range of African languages. There may be markets for more widely spoken languages like Wolof and Pulaar. But if commercial viability is the only consideration, then there is a risk of creating an internal linguistic hierarchy that replicates displacement.

Second, it would be helpful for developers such as Orange, Meta, and OpenAI to account for the linguistic variations within African languages. Dialects bring subtle nuances that can differ between regions, communities, and generations.

Third, AI developers can consider the standards set forth by African regional frameworks that emphasize the importance of mainstreaming languages into technological development. 

The lessons from colonial industrialization are stark. Regions excluded from early technological transformations often faced decades or even centuries of economic and social disadvantage. To avoid repeating this pattern with AI, we must act decisively during these formative stages of development to deeply integrate African languages and linguistic patterns into AI architectures and training data.
