
Visual Maths Benchmark

06 November 2025

AI-for-Education.org's data science team.


AI models can answer complex mathematics tests, but how well do they handle visual maths, which is key to learning in the early grades? We built the Visual Maths Benchmark to find out whether models can answer early-grade visual maths questions.

Teaching foundational numeracy follows a “Concrete, Pictorial, Abstract” methodology. Students first use physical objects known as manipulatives, such as counters, blocks, beads, or cubes, to represent mathematical ideas. Once children are comfortable with concrete materials, they transition to drawings, diagrams, and visual models. The final stage involves using mathematical symbols and numbers without the need for physical or visual aids. Images are therefore an important part of early grade maths.

To test this, we built on the fantastic work of the team behind the Math-Vision (Math-V) dataset. We selected a subset of their dataset, the 237 questions focusing on early grades, to see how models perform on visual maths questions.

We found that AI models struggled primarily with the visual interpretation of the images, and less with the mathematical calculations needed to answer the questions. So we dug deeper into this challenge in the Visual Reasoning Benchmark.

