Google and OpenAI Earn Top Honors For Major Progress In Mathematical Skills



Alphabet’s Google and OpenAI revealed that their AI models won gold medals at a global math competition, a major step forward in mathematical reasoning as the industry races to build AI systems that approach human intelligence.

This marks the first time AI models have reached the gold-medal threshold at the International Mathematical Olympiad (IMO), a renowned contest for high school students.

Both companies’ models solved five of the six problems, reaching that score with general-purpose “reasoning” models that worked through the mathematics in natural language, unlike earlier AI methods built specifically for math.

Google DeepMind partnered with the IMO to have its model’s answers graded and certified by the contest’s judges, while OpenAI did not officially enter. Instead, the company announced on Saturday that its model had achieved gold-medal results on this year’s problems, citing grades from three external IMO medalists.

According to Junehyuk Jung, a Brown University professor and visiting researcher at DeepMind, this milestone suggests AI may be less than a year away from helping mathematicians tackle unsolved research problems at the edges of the field.

“I think solving hard reasoning problems in natural language will open the door for real collaboration between AI and mathematicians,” Jung told Reuters.

OpenAI’s breakthrough came through a new experimental model that scaled up “test-time compute.” The approach let the model “think” for longer and used parallel computing power to run many reasoning paths at once, said Noam Brown, a researcher at OpenAI. He declined to reveal the exact cost but described it as “very expensive.”
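Brown’s description, many reasoning paths run in parallel with the best result kept, is essentially a best-of-N sampling strategy. The toy Python sketch below illustrates the shape of that idea only; the generator and verifier functions are invented stand-ins for illustration, not OpenAI’s actual system:

```python
# A minimal sketch of best-of-N test-time scaling: sample many candidate
# reasoning paths, score each with a verifier, and keep the best one.
# Everything here is a toy stand-in; a real system would call an LLM
# for generation and a learned reward model for scoring.

def generate_reasoning_path(problem, seed):
    """Toy 'model': each sample lands near the true answer with a
    seed-dependent error, mimicking noisy independent reasoning attempts."""
    return problem["answer"] + (seed % 5) - 2

def score_path(problem, candidate):
    """Toy 'verifier': rates a candidate; higher is better. A real verifier
    would not see the true answer; this one does, purely for illustration."""
    return -abs(candidate - problem["answer"])

def best_of_n(problem, n):
    """Spend more test-time compute by sampling n independent paths
    (run in parallel in a real system) and returning the best-scoring one."""
    candidates = [generate_reasoning_path(problem, seed) for seed in range(n)]
    return max(candidates, key=lambda c: score_path(problem, c))

problem = {"statement": "toy problem", "answer": 42}
print(best_of_n(problem, 1))   # one sample: off the mark (prints 40)
print(best_of_n(problem, 16))  # more compute: the best sample is exact (prints 42)
```

The point of the sketch is the trade-off it makes visible: accuracy improves not by changing the model but by spending more compute per question at inference time.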

For OpenAI, the result signals that AI can develop strong reasoning skills that could extend into areas beyond math. Google researchers share this view, believing AI could also tackle problems in fields like physics, said Jung, who himself won an IMO gold medal in 2003.

At the 66th IMO on Australia’s Sunshine Coast, 67 of the 630 student contestants — about 11% — achieved gold-medal scores.

Last year, Google’s DeepMind earned a silver medal using AI models designed for math. This year, it succeeded with a general-purpose model called Gemini Deep Think, first previewed at its developer conference in May. Unlike earlier AI models that relied on formal languages and long calculations, Google’s system solved the problems in natural language within the official 4.5-hour contest limit, the company explained in a blog post.

OpenAI also built an experimental reasoning model for the IMO, researcher Alexander Wei wrote on X, but said the company would not release a model with that level of math capability for several months.

This year was the first time the IMO worked directly with some AI labs, after years of unofficial testing by such companies. Judges verified and certified Google’s results, and the organisers asked participating companies to hold their announcements until July 28.

“We respected the IMO Board’s request that all AI labs share results only after experts had verified them and the student contestants had received their recognition,” Google DeepMind CEO Demis Hassabis said on X Monday.

OpenAI, which released its results Saturday and was first to claim gold-medal status, told Reuters it had permission from an IMO board member to do so after the closing ceremony.

On Monday, the competition allowed participating companies to release results, IMO board president Gregor Dolinar confirmed.
