Case Study

Egra-AI

12 August 2025

University of Cape Town, Western Sydney University, Binding Constraints Lab, Neurabuild, Funda Wande, Orla Humphries, María José Ogando Portela

The EGRA-AI project uses a reading assessment tool with automatic speech recognition (ASR) to evaluate early grade reading in South African languages. Following initial testing in Sepedi, the tool was adapted, with our support, for isiXhosa to refine the approach and deepen insight into cross-language scalability.

Why this is important
Large-scale data on foundational reading skills is essential for identifying learning gaps, monitoring progress, and evaluating interventions to improve early reading. Traditional assessments are time-consuming to administer, so AI-driven automation at scale could broaden access to, and use of, reading assessments in African languages.

Key Learnings

  1. One round of data collection is not enough; flexibility is key - It is hard to predict in advance where your training data will have gaps, for example which words or letters students will mispronounce. Build in flexibility with pilot rounds before full data collection.
  2. Train with in-domain child speech - The EGRA-AI project fine-tuned its models using speech from EGRA-AI tasks only, which gave high accuracy on individual items (letters or words) but lower accuracy when scoring full tests. Broader in-domain child speech (for example, from other reading sub-tasks) that reflects real-world complexity could help.
  3. Human expertise is essential for high-quality training data - Good training data relies on consistent annotation, but early reading is complex and annotators do not always agree. Having a range of experts work with AI to co-develop tasks and refine guidelines could improve inter-rater reliability.
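Inter-rater reliability, mentioned in the third learning above, is commonly quantified with Cohen's kappa, which corrects raw agreement between two annotators for agreement expected by chance. The sketch below is illustrative only; the labels and data are hypothetical and not from the EGRA-AI project.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the chance agreement implied by each rater's
    label frequencies.
    """
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical example: two annotators scoring 8 read-aloud items
# as correct (1) or incorrect (0).
a = [1, 1, 1, 1, 0, 0, 1, 0]
b = [1, 1, 1, 0, 0, 0, 1, 1]
print(round(cohens_kappa(a, b), 3))  # → 0.467 (moderate agreement)
```

Values near 1 indicate strong agreement; values near 0 mean the annotators agree no more often than chance, a signal that the annotation guidelines need refining.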

The models are available on request from Western Sydney University under a free perpetual license. To request access, contact ben@neurabuild.com

Grant we provided: $77,979
