While computers have long surpassed humans in computational speed, the pinnacle of formal mathematics has remained a human stronghold. Google DeepMind has recently made significant strides towards changing this with two AI systems, AlphaProof and AlphaGeometry 2. Together, these systems tackled problems from the International Mathematical Olympiad (IMO), a prestigious global math competition for high school students. Each year the Olympiad sets six highly complex problems spanning areas such as algebra, geometry, and number theory, and a gold medal marks a contestant as one of the world’s top young mathematicians.
DeepMind’s AI systems did not quite reach gold-medal level, but they achieved a notable score of 28 out of 42, which would have earned a silver medal in the competition. The score was determined by Prof Timothy Gowers, a renowned mathematician and former IMO gold medalist, and it highlights the potential of AI in this domain. The results were all-or-nothing: each problem is worth seven points, and the systems either solved a problem perfectly or made no headway on it, so the 28 points correspond to four fully solved problems. Notably, human contestants are limited to nine hours in total, whereas the AI had no such constraint and took anywhere from mere seconds to several days per problem.
The two systems worked in different ways. AlphaProof, which handled three of the problems, combined a large language model, akin to those used in chatbots, with a reinforcement learning technique similar to the one DeepMind used for the game of Go. This approach relied on “formal mathematics,” where a proof is written as a program that a proof checker accepts only if every step is valid. AlphaProof could therefore generate candidate proofs in a formal mathematical language and have them verified mechanically, though the process was not always quick.
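To give a sense of what "a proof as a program" means, here is a minimal sketch in Lean 4, the formal language DeepMind has said AlphaProof works in. The statements below are illustrative toy examples chosen for this article, not AlphaProof's output or IMO problems, and the theorem name `add_comm_example` is made up for the illustration.

```lean
-- A statement and its proof. The checker accepts this only because
-- `Nat.add_comm a b` really is a proof of `a + b = b + a`.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A small arithmetic fact, verified by direct computation.
example : 2 + 2 = 4 := rfl

-- A false claim cannot be proved this way. If the next line were
-- uncommented, the file would fail to check:
-- example : 2 + 2 = 5 := rfl
```

The point of the design is that the checker, not a human reader, decides whether a proof is valid, which is what makes it possible to reward a learning system automatically for correct proofs.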
AlphaGeometry 2, on the other hand, specialized in geometry problems and combined a language model with a symbolic reasoning engine. It was notably swift, solving a geometry problem in just 16 seconds using an unconventional but effective method. Its solution involved constructing a circle around a point to connect various elements of the problem, a step that initially puzzled experts but was later deemed elegant.
These advancements underscore the potential for AI to complement human expertise in mathematics, offering new tools and methods to tackle complex problems.