
AI interest and use have surged as next-gen capabilities disrupt industries, redefine work, and reshape the future at an unprecedented pace. But while generative AI and other advanced models are poised to deliver enormous benefits for companies, many struggle to understand how to use them—let alone implement them successfully. Independent AI consultants are an innovative solution—but demand for their niche expertise outweighs talent supply and availability.
As large language models (LLMs) gain broader adoption, their inconsistent performance across languages has become a significant concern. To address the disparity in LLM performance between English and African languages, a major non-profit global health organization enlisted Ghamut Corporation, a boutique AI consulting firm led by Dr. Mohammad Ghassemi (a former BCG strategy consultant with deep experience in artificial intelligence and data systems), along with Dr. Tuka Alhanai, a renowned AI innovator named a Young Global Leader by the World Economic Forum.
Identifying the Need
Among AI experts, it’s well known that LLMs perform near human levels in English but fall well short in most other languages. AI capabilities are improving every day, to the point that the gap between top-of-the-line LLMs and humans is fading, yet those advances are occurring almost exclusively in English. Performance drops on African languages compared to high-resource languages, and we lack the benchmark data needed to fully understand how LLMs perform on key tasks like reasoning, reading comprehension, and domain knowledge.
When readily available linguistic resources—such as dictionaries, extensive bodies of written works, and annotated datasets—are scarce for a particular language, it is considered low-resource. And without that data to train conversational models, LLM performance is reduced. More than 2,000 native African languages are considered to be low-resource, and 31 of those languages have over 10 million speakers—an undeniably sizeable market. So, in their current form, LLMs are least effective for the very people who could gain the most from AI capabilities bridging language gaps in education, healthcare, and digital services.
This was a top priority for the global nonprofit, whose core mission is to support communities where non-English languages are predominantly spoken. The organization’s goal is to provide critical resources—such as contraceptives, vaccines, and other tools that empower people to live healthier lives. But to do this effectively, it needed a way to communicate clearly and meaningfully with these populations.
From Insight to Action
The team began by translating two gold-standard reasoning and knowledge benchmarks into 11 African languages that are frequently spoken, geographically diverse, and have limited AI training resources.
By translating, validating, and fine-tuning these benchmarks, they were able to create a dataset of more than a million human-translated words in 11 underserved languages spoken by more than 230 million people. They then used the translated benchmarks to measure how well state-of-the-art “frontier” models (including Anthropic’s Claude 3, Google’s Gemini, and OpenAI’s GPT-4 models) perform in these languages and to identify targeted strategies to improve them.
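As a rough illustration of what measuring a per-language performance gap can look like, the sketch below scores multiple-choice answers by language and compares each language against a reference language. It is a minimal sketch under stated assumptions, not the team's evaluation code: the record fields (`language`, `predicted`, `answer`), the function names, and the placeholder language labels are all hypothetical, and the sample records are made-up inputs rather than real results.

```python
from collections import defaultdict


def accuracy_by_language(records):
    """Return {language: fraction of correct answers}.

    Each record is a dict with hypothetical keys:
    'language', 'predicted' (model's choice), 'answer' (gold choice).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        lang = record["language"]
        total[lang] += 1
        if record["predicted"] == record["answer"]:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def gap_to_reference(scores, reference):
    """Return {language: reference accuracy minus that language's accuracy}."""
    ref_score = scores[reference]
    return {
        lang: ref_score - score
        for lang, score in scores.items()
        if lang != reference
    }


if __name__ == "__main__":
    # Made-up placeholder records, for illustration only (not real results).
    sample = [
        {"language": "reference", "predicted": "A", "answer": "A"},
        {"language": "reference", "predicted": "B", "answer": "B"},
        {"language": "target", "predicted": "A", "answer": "A"},
        {"language": "target", "predicted": "C", "answer": "B"},
    ]
    scores = accuracy_by_language(sample)
    print(scores)
    print(gap_to_reference(scores, reference="reference"))
```

Reporting the gap relative to a reference language, rather than raw accuracy alone, keeps the comparison meaningful when benchmarks differ in difficulty.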
With the performance gap effectively assessed, the team could shift their efforts toward closing it.
Next, they developed a method for retraining LLMs that improves accuracy in limited-data contexts generally, and for African languages specifically. Their approach combines small, high-quality datasets from a targeted African language with larger, lower-quality datasets from related languages, enabling LLMs to extract broad knowledge from the larger data while developing domain-specific reasoning from the higher-quality sources.
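The article does not say how the two kinds of data are combined. One common way to blend a small, high-quality set with a larger, lower-quality one is weighted sampling, where the small set is drawn from more often than its size alone would suggest. The sketch below shows that general idea only; it is an assumption for illustration, not the team's method, and the function name, the `hq_fraction` parameter, and the toy example strings are hypothetical.

```python
import random


def mix_datasets(high_quality, low_quality, hq_fraction=0.5,
                 n_samples=1000, seed=0):
    """Build a training mix by weighted sampling with replacement.

    hq_fraction is the expected share of examples drawn from the small
    high-quality set, so that set is oversampled relative to its size.
    """
    if not high_quality or not low_quality:
        raise ValueError("both datasets must be non-empty")
    if not 0.0 <= hq_fraction <= 1.0:
        raise ValueError("hq_fraction must be between 0 and 1")

    rng = random.Random(seed)  # seeded so the mix is reproducible
    mixed = []
    for _ in range(n_samples):
        source = high_quality if rng.random() < hq_fraction else low_quality
        mixed.append(rng.choice(source))
    return mixed


if __name__ == "__main__":
    # Toy stand-ins for real training examples.
    target_language = [f"hq-{i}" for i in range(20)]       # small, curated
    related_languages = [f"lq-{i}" for i in range(2000)]   # large, noisier
    mix = mix_datasets(target_language, related_languages, hq_fraction=0.3)
    share = sum(x.startswith("hq-") for x in mix) / len(mix)
    print(f"high-quality share of the mix: {share:.2f}")
```

In a scheme like this, the mixing ratio is a tuning knob: too little of the high-quality data and the curated signal is diluted, too much and the same few examples are repeated heavily.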
While their focus for this project was on African languages, their approach offers a scalable solution for improving AI performance in any domain with limited data.
The Impact of Innovation
“This project holds a special place in my heart because it will help us improve the performance of LLMs for over 230 million people… That’s a lot of people!” Dr. Ghassemi shared.
“And also very importantly, people can take the code and continue to build on it and extend it to create more public good,” he continued, citing the publicly available resources released once the team completed the project—which will help others build more inclusive and effective AI models across diverse languages and low-data environments.
The team’s work earned recognition at the conference of the Association for the Advancement of Artificial Intelligence (AAAI), one of the most prestigious in the field. Selected for an invited talk, an honor extended to only the top 5% of AI research globally, the project was acknowledged not only for addressing a meaningful problem, but also for introducing innovative AI methods to tackle it.
Source Tomorrow’s Most In-Demand Skills Today
This initiative demonstrates how the right expertise, strategy, and execution can unlock AI’s full potential. Whether you’re exploring opportunities, scaling a solution, or navigating increasingly complex AI challenges, an AI consultant or team can help you move the needle.