Project Gecko, a new Microsoft Research initiative, will bring AI to underserved cultures.
Generative AI has grown fast, powering tools used in offices, homes, and classrooms around the world. But experts warn that these systems often fail communities whose languages and cultures are poorly represented online. Poor performance in local languages has slowed adoption in regions where digital access is already uneven.
A new Microsoft Research initiative, Project Gecko, aims to close these gaps. Led by teams in Africa, India, and the United States, the project focuses on building cost-effective AI systems that work for people outside the world’s major data corridors. The goal is simple but ambitious: create AI that speaks local languages, understands cultural context, and works across text, speech, and video.

At the center is the MultiModal Critical Thinking Agent, or MMCTAgent, a system that analyzes speech, images, and video to deliver grounded, context-rich answers. MMCTAgent is now available through Azure AI Foundry Labs, with open-source code on GitHub. While the announcement highlights Microsoft’s wider push for equitable AI, the deeper signal for experts is that multimodal, domain-specific models may be the future of scaling AI beyond high-resource markets.
Agriculture as the First Test Bed
The team chose agriculture as its starting point. In countries like Kenya and India, farming remains a major economic driver, but linguistic diversity often makes digital tools hard to use. Farmers switch between multiple languages and rely heavily on oral and visual guidance. Because most global AI models were trained mainly on English-language data, they often produce incomplete or inaccurate responses in these settings.
Project Gecko builds on Digital Green’s FarmerChat platform, which hosts more than 10,000 community-generated agricultural videos in over 40 languages. With Gecko, a farmer can now ask a question aloud in Kikuyu or Kalenjin and receive an answer through text, audio, or a video clip that jumps directly to the relevant moment.
To support this, engineers built new speech recognition and text-to-speech tools for languages such as Swahili, Dholuo, Maa, and Somali, using a 3,000-hour Kenyan speech dataset. Small language models help the system run on low-cost devices common in rural areas.
Field studies in Kenya and India show that the system is more accurate, more trusted, and easier to use than generic AI systems. For researchers, the early results offer a key lesson: locally grounded data, not scale alone, may be the most powerful driver of AI impact.
Project Gecko will expand next into health and education. A multilingual playbook for developers is also underway, signaling a shift toward AI systems built from the ground up for the global majority.