AI models are starting to crack high-level math problems

Over the weekend, Neel Somani, a software engineer, former quant researcher, and startup founder, was testing the math skills of OpenAI’s new model when he made an unexpected discovery. After pasting the problem into ChatGPT and letting it think for 15 minutes, he came back to a full solution. He evaluated the proof and formalized it with a tool from Harmonic, and it all checked out.

“I was curious to establish a baseline for when LLMs are effectively able to solve open math problems compared to where they struggle,” Somani said. The surprise was that, using the latest model, the frontier started to push forward a bit.

ChatGPT’s chain of thought is even more impressive, rattling off mathematical results like Legendre’s formula, Bertrand’s postulate, and the Star of David theorem. Eventually, the model found a MathOverflow post from 2013, where Harvard mathematician Noam Elkies had given an elegant solution to a similar problem. But ChatGPT’s final proof differed from Elkies’ work in important ways, and gave a more complete solution to a version of the problem posed by legendary mathematician Paul Erdős, whose vast collection of unsolved problems has become a proving ground for AI.

For anyone skeptical of machine intelligence, it’s a surprising result, and it’s not the only one. AI tools have become ubiquitous in mathematics, from formalization-oriented LLMs like Harmonic’s Aristotle to literature review tools like OpenAI’s deep research. But since the release of GPT 5.2, which Somani describes as “anecdotally more skilled at mathematical reasoning than previous iterations,” the sheer volume of solved problems has become difficult to ignore, raising new questions about large language models’ ability to push the frontiers of human knowledge.

Somani was looking at the Erdős problems, a set of more than 1,000 conjectures by the Hungarian mathematician that are maintained online. The problems, which vary significantly in both subject matter and difficulty, have become a tempting target for AI-driven mathematics. The first batch of autonomous solutions came in November from a Gemini-powered model called AlphaEvolve, but more recently, Somani and others have found GPT 5.2 to be remarkably adept with high-level math.

Since Christmas, 15 problems have been moved from “open” to “solved” on the Erdős website, and 11 of the solutions have specifically credited AI models as involved in the process.

The revered mathematician Terence Tao takes a more nuanced look at the progress on his GitHub page, counting eight Erdős problems on which AI models made meaningful autonomous progress, with six other cases where progress was made by locating and building on previous research. It’s a long way from AI systems being able to do math without human intervention, but it’s clear that there’s an important role for large models to play.

On Mastodon, Tao conjectured that the scalable nature of AI systems makes them “better suited for being systematically applied to the ‘long tail’ of obscure Erdős problems, many of which actually have straightforward solutions.”

“As such, many of these easier Erdős problems are now more likely to be solved by purely AI-based methods than by human or hybrid means,” Tao continued.

Another driving force is a recent shift toward formalization, a labor-intensive task that makes mathematical reasoning easier to verify and extend. Formalization doesn’t require the use of AI or even computers, but a new crop of automated tools has made the process far easier. The open-source “proof assistant” Lean, which was developed at Microsoft Research in 2013, has become widely used within the field as a means of formalizing proofs, and AI tools like Harmonic’s Aristotle promise to automate much of the work of formalization.

For Harmonic founder Tudor Achim, the sudden jump in solved Erdős problems is less important than the fact that the world’s greatest mathematicians are starting to take these tools seriously. “I care more about the fact that math and computer science professors are using [AI tools],” Achim said. “These people have reputations to protect, so when they’re saying they use Aristotle or they use ChatGPT, that’s real evidence.”
