Hello everyone! Hello! Hello everyone!
Some of you, dear contestants, are very young, so perhaps you do not know who Professor Terence Tao is. So, just a few words of introduction. He participated at the IMO for the first time when he was 11 years old and received a bronze medal. The next year he came back and received a silver medal. After that, at the age of 13, he received a gold medal, and he was the youngest participant ever to receive a gold medal. Then he, I don't know, got bored, went to university, and didn't participate at the IMO anymore.
Now he is a professor at the University of California, Los Angeles, and I can say that he is definitely the biggest IMO star and, of course, one of the most influential mathematicians of our time. So, especially for you: Professor Terence Tao.
Thank you. I'm very happy to be back here at an IMO. The time I had at the IMO was one of the most fun times of my life.
I still look back on it fondly. I hope you all had fun, not just in the competition, whether you got a good score or not, but also in the social activities; they always host a really good event here. So my talk is on AI and, more generally, machine assistance in mathematics. You've all heard about AI and how it's changing everything. I think earlier today there was a talk by DeepMind on how they released a new product, AlphaGeometry, that can answer some IMO geometry questions now. And there will be a presentation on the AI Math Olympiad right after my talk, actually, so please stay for that. I'll be talking more about how these tools are beginning to change research mathematics, which is different from competition mathematics. Instead of having a few hours, whatever, to solve a problem, you take months, and sometimes you don't solve the problem and you have to change the problem.
It is definitely not the same as math competitions, although there's some overlap in skills. So it's all very exciting and it's beginning to be transformative, but on the other hand, there's also a sense of continuity. We've actually been using computers and machines to do mathematics for a long time, and it's just the nature, the way in which we're doing it is changing, but it actually is following on a long tradition of machine assistance.
So, okay, so here's a question. How long have we been using machines to do mathematics? And the answer is thousands of years. Here's a machine that the Romans used to do mathematics: the abacus, one of the earliest machines. There are even some earlier ones.
Okay, so that's kind of boring. That's not a very smart machine. What about computers? How long have we been using computers to do mathematics?
That's about 300, 400 years. I think that's a bit weird because, you know, our modern computers, we didn't have the electronic computers until the 1930s and 1940s. But computers weren't always electronic.
Before that they were mechanical, and before that they were human. "Computer" was actually a job title: someone who computes. Here is a cluster of computers during the war. To compute ballistics and other things, they had a whole cluster of computers, who were usually girls, because the men were fighting in the war, working with these adding machines. And there were programmers who basically just told the girls what to do. And the basic unit of computational power at the time was not the CPU, it was the kilogirl: a kilogirl-hour is how much computation you can do with 1,000 girls working like this for one hour.
But we've actually been using computers even before that. As I said, since the 1700s or even earlier.
The most basic use of computers in those times was to build tables. So you may have heard of the logarithm tables of Napier. You know, if you wanted to compute sines and cosines and so forth, you used a computer to generate these huge tables.
When I was still in high school, we still, in our curriculum, learned how to use these tables. They were just being phased out. Because now we had calculators and now computers.
We still use tables today. I mean in mathematical research we rely on tables. We call them databases now, but they're still the same thing.
There are many, many important results in mathematics that were first discovered through tables. One of the most fundamental results in number theory is called the prime number theorem. It tells you roughly how many primes there are up to a large number x. It was discovered by Legendre and Gauss; they couldn't prove it, but they conjectured it to be true, because Gauss and others basically had computers, in fact Gauss himself was a bit of a computer, compute tables of the first million primes, and they tried to find patterns. The prime number theorem was eventually proven around 1896. And then, a couple of centuries later, another really central problem in number theory, called the Birch and Swinnerton-Dyer conjecture, which I won't talk about here, was also first discovered by looking at lots and lots of tables, this time tables of data about elliptic curves.
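For reference, the prime number theorem mentioned here can be stated as follows (the formula is standard, though it wasn't written out in the talk):

```latex
\pi(x) \;\sim\; \frac{x}{\ln x},
\qquad\text{i.e.}\qquad
\lim_{x\to\infty}\frac{\pi(x)}{x/\ln x} = 1,
```

where \pi(x) denotes the number of primes up to x.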
A table that lots and lots of mathematicians use now, including myself, is something called the Online Encyclopedia of Integer Sequences. Maybe you've encountered it yourself. You may recognize many integer sequences just from memory: if I tell you the sequence one, one, two, three, five, eight, 13, you know what that is, the Fibonacci sequence.
And the OEIS is a database of hundreds of thousands of sequences like this. And many times when a mathematician is working on a research problem, there is some natural sequence of numbers associated. Maybe there's a sequence of spaces depending on n, and you compute the dimension or the cardinality of a set or something. And you can compute the first five or six of these.
or 10 of these numbers, and then look them up in the OEIS, and if you're lucky, this sequence has already been put there by somebody else, discovered from a completely different source, from studying some other mathematical problem. And that really gives you a big clue that there's a connection between the two problems, and much promising, productive research has come up that way. Okay, so tables: one of the earliest ways we've been using computers. Most famously, when you think of using computers to do mathematics, you think of number crunching.
The formal name for this is scientific computation. You want to do a very big calculation, lots and lots of arithmetic, so you outsource it to a computer. And we've been doing that since the 1920s.
Maybe the first person to really do a scientific computation was Hendrik Lorentz. He was tasked by, I think, the Dutch government to figure out what was going to happen when they built a really giant dike, the Afsluitdijk. They wanted to know what would happen to the water flow, and so they had to model some fluid equations. And so he used a whole bunch of human computers, actually, to work this out.
He had to invent floating-point arithmetic to do this. He realized that if you wanted to get a lot of people to do a lot of calculations very quickly, you should represent numbers of different magnitudes as floating points. So of course we now use computers to model all kinds of things, solving lots of linear equations or partial differential equations.
You want to do some combinatorial calculations. You can also solve algebra problems. So in principle, not quite all, but many of the geometry questions you see at Olympiads can be solved by a scientific computation. There are these computer algebra packages: you can turn any geometry problem, say, involving 10 points and some lines and circles, into a system of polynomial equations in some 20 real variables, and just whack it into Sage or Maple or something. Unfortunately, once it gets to that size, the complexity becomes exponential, or even doubly exponential. And so until recently, it was not really feasible to just brute-force these problems with standard computer algebra packages, but now with AI systems, maybe it's more promising. So you heard a talk about that this morning.
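As a toy illustration of that reduction (my own sketch, not from the talk), here is how one small geometric fact, that the perpendicular bisectors of a triangle's sides meet in a single point, becomes a polynomial computation that a computer algebra system like SymPy can grind through:

```python
# Encode a geometry fact as polynomial algebra and let a computer algebra
# system (SymPy) do the grinding.
# Claim: a point equidistant from A and B and from B and C is automatically
# equidistant from A and C (so the three perpendicular bisectors concur).
import sympy as sp

ax, ay, bx, by, cx, cy, x, y = sp.symbols('ax ay bx by cx cy x y')

# "Equidistant" conditions written as polynomial equations in the unknowns x, y.
eq1 = sp.Eq((x - ax)**2 + (y - ay)**2, (x - bx)**2 + (y - by)**2)
eq2 = sp.Eq((x - bx)**2 + (y - by)**2, (x - cx)**2 + (y - cy)**2)

# Solve the two bisector equations for the candidate circumcenter (x, y).
sol = sp.solve([eq1, eq2], [x, y], dict=True)[0]

# Check the third condition |PA|^2 - |PC|^2 = 0 after substituting the solution.
diff = ((x - ax)**2 + (y - ay)**2) - ((x - cx)**2 + (y - cy)**2)
print(sp.simplify(diff.subs(sol)))  # prints 0 for a non-degenerate triangle
```

Real Olympiad geometry problems produce much larger systems, which is where the exponential blow-up mentioned above kicks in.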
Another type of scientific computation that has become quite powerful is what are called SAT solvers, satisfiability solvers. These basically solve logic puzzles. If you have 10 statements, or maybe 1,000 statements, that are true or false, and you know that, say, if the third statement is true and the sixth statement is true, then the seventh statement must be false, and you're given a whole bunch of constraints like that, a SAT solver will take all this information and try to conclude whether you can prove a certain combination of these sentences or not. And then there's a fancier version of a SAT solver called an SMT solver, Satisfiability Modulo Theories, where you also have some variables, x, y, and z, and you assume some laws, like maybe there's an addition operation which is commutative and associative. You plug in these laws as well as some other facts, and you try to brute-force whether you can deduce some conclusion out of some finite set of hypotheses.
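Here is a small sketch of my own (not from the talk) of the kind of thing an SMT solver does, using the Z3 solver's Python bindings: you feed it propositions and constraints and ask whether a conclusion is forced.

```python
# A tiny SAT/SMT-style deduction with the Z3 solver (pip install z3-solver).
from z3 import Bools, Int, Solver, Implies, Not, unsat

p, q, r = Bools('p q r')
x = Int('x')

s = Solver()
s.add(Implies(p, q))         # if p then q
s.add(Implies(q, r))         # if q then r
s.add(p)                     # p holds
s.add(x > 3, x < 7, x != 5)  # some arithmetic side constraints (the "theories")

# To prove r, assert its negation and check that the result is unsatisfiable.
s.push()
s.add(Not(r))
print(s.check() == unsat)    # True: r follows from the hypotheses
s.pop()

print(s.check())             # sat: the remaining constraints are consistent
print(s.model())             # a witness, e.g. x = 4 or x = 6
```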
So those are quite powerful, but unfortunately they don't scale well at all. Again, the time complexity grows exponentially, and so once you get past 1,000 or so propositions, it becomes really, really hard for these solvers to run in any reasonable amount of time.
But they can actually solve some problems. One recent success, for example, is a problem in combinatorics which probably will only ever be solved by computer; this will not be possible to solve by a human, I think, unassisted. It concerned what is called the Boolean Pythagorean triples problem, which was unsolved until this big computer SAT-solver calculation. The question is: you take the natural numbers and you color them with two colors, red or blue. Is it true that no matter how you color the natural numbers, one of the colors must contain a Pythagorean triple, a, b, c, three numbers satisfying a² + b² = c², like 3, 4, 5? This was not known to be true.
We have no human proof of this, but we have a computer proof. It is now known that, in fact, you don't need all the natural numbers: you just need to go up to 7,825. No matter how you color the numbers from 1 to 7,825 into two color classes, one of them will contain a Pythagorean triple. Now, there are 2 to the power 7,825 such colorings, so you can't do it by brute force; you have to be somewhat clever, but it is possible. So once you go up to 7,825, you must have a monochromatic Pythagorean triple, and there is an example of a coloring up to 7,824 where there's no Pythagorean triple in either class. So that's provable.
At the time, I think it was the world's longest proof ever; I think now it's the second longest. The proof required several CPU-years of computation, and they generated a proof certificate, so the actual proof is 200 terabytes long, although it has since been compressed to a mere 68 gigabytes. So this is one way in which we use computers: just to do enormous case analyses.
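To make the statement concrete, here is a toy brute-force version of my own for very small ranges; it only illustrates the question, since the real result at 7,825 is far beyond brute force and needed the SAT solver plus a lot of cleverness.

```python
# Brute-force check of the Boolean Pythagorean triples property for tiny n:
# does every red/blue coloring of {1, ..., n} contain a monochromatic triple
# with a^2 + b^2 = c^2?  There are 2^n colorings, so this only works for
# very small n; the actual theorem (n = 7825) needed a massive SAT computation.
from itertools import product

def pythagorean_triples(n):
    return [(a, b, c) for a in range(1, n + 1)
                      for b in range(a, n + 1)
                      for c in range(b, n + 1) if a * a + b * b == c * c]

def every_coloring_has_mono_triple(n):
    triples = pythagorean_triples(n)
    for coloring in product([0, 1], repeat=n):   # coloring[i-1] = color of i
        if not any(coloring[a - 1] == coloring[b - 1] == coloring[c - 1]
                   for a, b, c in triples):
            return False   # found a coloring with no monochromatic triple
    return True

print(every_coloring_has_mono_triple(12))  # False: e.g. (3,4,5) can be split
```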
So that's a fairly obvious way to use computers. But in recent years we're beginning to use computers in more creative ways. There are three ways in which computers are being used to do mathematics which I think are really exciting, particularly when they are combined with each other and with the more classical tools: databases and tables, symbolic computation, and scientific computation.
So first of all, we're using machine learning, neural networks, to discover new connections, to find ways in which different types of mathematics are correlated that you would not see as a human, or would be unlikely to see. Then there are large language models, which are in some sense very large versions of machine learning algorithms that can take natural language, like ChatGPT and Claude and so forth, and they can sometimes generate possible proofs or approaches to problems; sometimes they work, sometimes they don't. You'll see a few more examples of this in the talk after mine. But then there's also another technology which is just becoming usable by the everyday mathematician, called formal proof assistants. So these are languages: you're used to computer languages for writing executable code, programs that do things. Formal proof assistants are languages that you write to check things, to check whether a certain argument is actually true and actually gives you the conclusion from the data. And these have been fairly annoying to use until very recently.
And they're now becoming somewhat easy to use, and they are facilitating a lot of interesting math projects that wouldn't have been possible without these proof assistants. And they will combine very well in the future with the other tools I described here. So I want to talk about these more modern ways to use machines and computers to do mathematics. I'll start with proof assistants.
So the first really computer-assisted proof, maybe, in history was the proof of the four-color theorem: every planar map can be colored using only four colors. That was proven in 1976. This was before proof assistants, and it wouldn't really be called a computer-verified proof nowadays.
It's a proof involving a massive computation, like half of which was done by a computer and half of which was done by humans. The way they proved the four-color theorem is that you basically induct on the number of countries, and you show that if you have a massive map, it must contain a certain kind of subgraph of countries. They produced a list of about 1,000 to 2,000 special subgraphs, and every big graph of countries had to contain, in some sense, one of these subgraphs.
That was one thing they had to check. And then they had to check that every time you had one of these subgraphs, you could replace it with something simpler, and if you could four-color the simpler thing, you could four-color the main thing. And you had to check these properties, I think they're called discharging and reducibility, for each of these 1,000 or so graphs.
I think one of these tasks they could do by computer, although this was a 1970s computer; I think they had to enter each graph by hand into this one program and check it. The other task was actually done by a human computer: one of the daughters of one of the authors actually had to spend hours and hours manually checking, I think, the reducibility. It was very tedious, and the process was not perfect.
There were lots of little mistakes, and they had to keep updating the tables. So it was not, by modern standards, a computer-verified proof. That only came much later, in the 90s, when there was a simpler proof using a mere 700 or so graphs; but now all the things that needed to be checked formed a very precise, well-defined list of properties, and you could write code in your favorite computer language, C or Python or something, a couple hundred lines of code, and check it in a few minutes on a modern computer. And then to actually provide a proof that goes all the way down to the axioms of mathematics, that was done in 2005 using a proof assistant language called Coq; I think it has now been renamed Rocq. So that was one of the first formally verified proofs. And you see there's a huge gap between when the proof first appeared and when we could completely verify it by computer.
Another famous example is the Kepler conjecture for sphere packing. This is a really old conjecture of Kepler from the 17th century. It's very easy to state: you take a whole bunch of unit spheres and you want to pack three-dimensional space as efficiently as possible. There's sort of an obvious way to try to pack spheres, the hexagonal packing, like the way you pack oranges at a grocery store. There's also a related packing, the cubic packing, which has the same density. It gives a density of about 74%, and it's sort of the obvious packing.
And the question is: is that the best possible? This turns out to be a surprisingly hard problem. In two dimensions, it's not too hard to show that the hexagonal packing is the best. Only very recently, in eight and 24 dimensions, do we know the answer, thanks to the great work of Maryna Viazovska; maybe she talked about it yesterday. But three is the only other case that we know of, except for dimension one, which is trivial. It was surprisingly difficult to prove. Again, there is no completely human-readable proof of this conjecture. There is a strategy.
So of course the problem is that there's infinitely many of these spheres, and the density is an asymptotic thing. So it's not a finite problem that you can just throw at a computer. But you can try to reduce it to a finite problem.
So there's a strategy proposed by Fejes Tóth in the 50s. Every time you have a packing, it subdivides space into these polyhedra called Voronoi regions. The Voronoi polytope of a sphere is just all the points which are closer to the center of that sphere than to the centers of all the other spheres. So you can split up space into all these polytopes, and these polytopes have certain volumes. You can also count their faces and their surface areas and so forth, so they all have these statistics. And the packing density is very closely related to the average volume of these regions. So if you can say something about how the volumes of these polytopes behave on average, then you could maybe get at least some upper bound on how efficient these packings can be. And you can try to find relations between these polytopes. If one polytope is very big, maybe this forces the nearby ones to be very small, and so maybe you can find some inequalities connecting the volume of one Voronoi cell to another. And so Tóth's proposal was: maybe you should just collect lots of these inequalities and then do some linear program or something, and hopefully you can derive the right bound, this magic pi over 3 root 2, which is the right density, from all these inequalities.
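For reference, the target density mentioned here, the density of the orange-stacking packings that Hales ultimately had to match exactly, is

```latex
\frac{\pi}{3\sqrt{2}} \;=\; \frac{\pi}{\sqrt{18}} \;\approx\; 0.7405.
```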
So people tried this. There were many attempts. Some even claimed success, but none have been accepted as actual proofs.
So the problem was eventually solved by Thomas Hales and his co-author. He used basically the same strategy, but with lots of technical tweaks. He changed from Voronoi cells to slightly fancier cells, and instead of taking the volume, he invented this thing called a score that you assign to each of them: it's the volume plus or minus lots of little ad hoc adjustments. But again, the aim was to create all these linear inequalities between the different scores, to eventually get upper bounds on the density, and to hopefully hit exactly the optimal density.
So it's a very flexible method. It's almost too flexible, because there are too many things you can try; there are so many ways you can set up a score and so forth.
So there's a quote here. Sam Ferguson was the co-author; sorry, I forgot to mention the name. So Hales and Ferguson realized that every time they encountered problems in trying to minimize their functionals and so forth, they could just change the score and try it again.
But then all the things that they checked already, they had to redo. And so the scoring function became more and more complicated. They worked on this for almost a decade, I think. And it became more and more complicated.
But with each change, we cut months or years from our work. This incessant fiddling was unpopular with our colleagues. Every time I presented my work in progress at a conference, I was minimizing a different function.
Even worse, the function was mildly incompatible with what I did in earlier papers, and this required going back and patching the earlier papers. But they eventually did it.
So in 1998, they announced that they had finally found a score which obeyed a whole bunch of linear inequalities in 150 variables, which they minimized and got their bound. Initially, they did not plan to make this a computer-assisted proof, but as the project became more and more complicated it was inevitable: they had to use more and more computers.
Yeah, so the proof was enormous by the standards of 1998. It was 250 pages of notes and three gigabytes of computer programs and data. It actually had a very tough time getting refereed. Yeah, so it got sent to the top journal of mathematics, one of the top, the Annals of Mathematics. And yeah, it took four years to referee with a panel of 12 referees. And at the end they said they were 99% certain of the correctness of the proof, but they could not certify the correctness of the computer calculations.
And they did a very unusual thing, actually: they published the paper with a little caveat from the editors saying this. They have since removed that caveat. At the time, there was a lot more controversy as to whether a computer-assisted proof qualified as an actual proof; now, I think we are much more comfortable with it. But even after it was published, there was doubt about whether it was really a proof. So this was maybe the first major high-profile problem where there was really a big incentive to formalize it completely, all the way down to first principles, in a formal proof language.
And so Hales in fact created a project to do this, using a modification of existing languages. He called it the Flyspeck project. He estimated it would take 20 years to formalize this proof, but with the help of 21 collaborators he actually finished it in a mere 12 years, and it finally appeared in 2014. Okay, so we now have complete confidence in this particular result.
But it was quite a painful thing to do. So moving now to the last few years, we've now figured out sort of a better workflow for how to formalize. It's still tedious, but it is getting better.
So Peter Scholze, who is a very prominent young mathematician, a Fields Medalist for instance: he is famous for many, many things, but he created this amazing, promising area of mathematics called condensed mathematics. It deploys the power of category theory and all the tools from algebra and applies them to functional analysis, the theory of function spaces like Banach spaces and so forth, an area which has really been resistant to the methods of algebra.
But this area of mathematics could, in principle, allow one to solve questions, at least in functional analysis, certain types of questions, with algebraic methods. So he set up this whole category of these things called condensed abelian groups and condensed vector spaces. It would take too long to explain what condensed means. Basically, his thesis is that the categories of function spaces that we learn in our graduate classes are incorrect; they're not the natural ones, and there are alternatives with better properties. So he set up this theory. But there was this one
very important vanishing theorem, which he needed to prove. Okay, so I've stated it here, but I'm not going to explain what any of these words mean, or symbols mean. But there was a very technical vanishing of a certain category-theoretic group that he needed to compute.
And without this, the whole theory doesn't have any interesting consequences. So this was sort of a foundation of his theory. So... So he wrote a blog post about this result.
He said he spent a whole year getting obsessed with the proof of this theorem, going almost crazy over it. "In the end, we were able to get an argument pinned down on paper, but no one has dared to look at the details of this, so I still have lingering doubts. With this theorem, the hope that the condensed formalism can be fruitfully applied to functional analysis stands or falls. This theorem is of the utmost foundational importance, so being 99.9% sure is not enough." He said he was happy to see many study groups on condensed mathematics throughout the world, but they all stopped short of the proof of this theorem; this proof is not much fun. So he says: this may be my most important theorem to date; better be sure it's correct.
So he was also very incentivized to formalize this theorem, now in a more modern proof assistant language called Lean. Lean is a language that has been developed quite a lot in recent years.
There's a crowdsourced effort to develop a massive math library for it. Rather than deriving everything from the axioms of mathematics, which becomes very tedious the more advanced you go, and this type of mathematics is very advanced, there's a central math library in Lean that has already proved lots of intermediate results, the type of things you would see in, say, an undergraduate math course, like the fundamental theorem of calculus, or basic things in group theory or topology and so forth. These have already been formalized, and so you have a standing base. You're not starting from the axioms; you're starting from roughly an undergraduate-level math education. There's still a big gap to where you need to go, but it helps.
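As a tiny illustration (my own sketch, assuming a current Lean 4 plus Mathlib setup), this is what "starting from an undergraduate education" means in practice: standard results are already in the library and can simply be invoked rather than rebuilt from the axioms.

```lean
import Mathlib

-- Mathlib already knows Euclid's theorem that there are infinitely many primes:
-- for every n there is a prime p with n ≤ p.
example (n : ℕ) : ∃ p, n ≤ p ∧ Nat.Prime p :=
  Nat.exists_infinite_primes n
```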
But in order to formalize this, they had to add many extra things; the math library was not complete, and it's still not complete. There are lots of areas of mathematics, I've listed some, homological algebra, sheaf theory, topos theory, that needed to be added to this library. But in a mere 18 months, they were able to formalize this theorem.
And they found that the proof was basically correct. There were some minor technical issues, but nothing really major was discovered, and they found some nice simplifications.
There were some technical steps that were just too hard to formalize, and so they were forced to find some shortcuts. But actually, the value of this project was more indirect. Firstly, they greatly added to Lean's math library.
So now this math library can handle lots of abstract algebra to a much greater extent than it could before. But also, there was other supporting software that got set up that future formalization projects have started using, including some that I did. For example, one tool that was set up in the course of this project was what's called a blueprint. Taking a huge, say 50-page, proof and trying to directly formalize it is really painful.
You have to keep the whole proof in your head. So what we've realized is that the right workflow is to take a big proof and first write what's called a blueprint, which breaks up the proof into hundreds of tiny little steps. Each step you can formalize separately, and then you just put them all together, as in the sketch below.
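A minimal sketch of that blueprint idea in Lean (my own toy example, with made-up lemma names): each `sorry` is a separate bubble that a different contributor can claim and fill in independently, while the compiler checks that the pieces really fit together.

```lean
import Mathlib

-- Two tiny steps of a hypothetical larger argument; each `sorry` is an
-- independent task in the blueprint graph.
lemma step1 (n : ℕ) : n ≤ n + 1 := sorry

lemma step2 (n : ℕ) (h : n ≤ n + 1) : n < n + 2 := sorry

-- The main statement only glues the steps together, so it can be written
-- (and type-checked) before the individual steps are proved.
theorem main_statement (n : ℕ) : n < n + 2 :=
  step2 n (step1 n)
```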
So you try to break up a huge argument into lots of little pieces. You write the blueprint first, and then different people on your team can formalize different steps of the argument. As a byproduct of this formalization, they also wrote a very nice blueprint; this is why, if you actually want to read the proof as a human, the blueprint is probably the best place to go now. Another spin-off of this: there's now this formal proof, which is tens of thousands of lines long, but there are also efforts to try to convert that back into a human-readable proof.
So another thing that's been developed is that there are now tools where you can take a proof that's been written in, say, this language Lean and convert it back. Here's an example where there's a proof of a topological problem, and they converted it back to a human-readable proof. All this text here is computer generated from the formal proof. So it looks like a human proof; it uses the same sort of mathematical language, but it's much more interactive.
You can click on any location. I can't do it here because this is a static PDF, but you can click on any location, and it will tell you, wherever you are, what the hypotheses are, what you're trying to prove, and what the variables are. If there's a step that is too terse, you can expand it, and it will explain where it came from, and you can go all the way down to the axioms if you want. I think this is great; I think in the future textbooks will be written in this interactive style. You formalize them first, and then you can produce much more interactive textbooks than we currently have. Inspired by this, I myself started a project to formalize something. Last year I solved a problem in combinatorics with several people, including Tim Gowers, who's here in the audience. It's a problem in combinatorics; it's not super important what the problem is.
Okay, there's a subset of an elementary abelian 2-group, so think of Z mod 2 to the n, what's called a Hamming cube. And if it obeys a property called small doubling, then there's a certain limit to how big it can be. Okay, but it doesn't really matter what the statement is.
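For the curious, the statement (the polynomial Freiman-Ruzsa conjecture over F_2, as I recall it; the exact constants are from memory, not from the talk's slides) is roughly:

```latex
\text{If } A \subseteq \mathbb{F}_2^n \text{ and } |A+A| \le K|A|, \text{ then } A \text{ is covered by at most } 2K^{12}
\text{ cosets of some subgroup } H \le \mathbb{F}_2^n \text{ with } |H| \le |A|.
```

The exponent 12 here is the constant that gets improved to 11 later in the talk.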
But it's a concrete statement, and we proved it. The proof is about 33 pages, and we formalized it in relatively record time, in three weeks, in a group project of about 20 people; it's probably still the fastest formalization of an actual research paper,
using all this blueprint machinery that had been developed in Scholze's project. And it makes the task of proving things much more open and collaborative.
And you get all these nice visualizations. As I said, the first thing you do is take your big theorem and break it up into lots and lots of little pieces. The theorem that we proved, we call it PFR; I won't explain why. It corresponds to this little bubble at the bottom of this graph here. And then we introduce all these other statements, so the proof of PFR depends on several other previous statements, and those depend on previous statements as well.
So there's this dependency graph, and the bubbles have different colors depending on whether they have been formalized or not. A green bubble is a statement that has already been formally proven in the language. A blue bubble is one that hasn't yet been formalized, but is ready to be formalized: all the definitions are in place, so if someone wants to, they can go ahead and do it. And a white bubble is one where even the statement hasn't been formalized yet; someone has to write in the statement. So you get this tree of tasks. And the beauty of this project is that you can get all these people to collaborate on different pieces of this graph independently.
Every little bubble corresponds to some statement, and you don't need to understand the whole proof in order to work on your little piece. This was a problem in combinatorics, but among the people who contributed there were people from probability, and there were people who were not even mathematicians, they were computer programmers, but they were just very good at doing these little mini-puzzle-type things. And so everyone just picked one bubble that they thought they could do, and they did it, and in three weeks we did the whole thing. It was a really exciting project. And in mathematics, we don't normally collaborate with this many people.
Maybe five people is the most I've normally seen. It's because when you collaborate on a big project, you have to trust that everyone's math is correct, and past a certain size, this is just not feasible. But with a project like this, the Lean compiler automatically checks everything.
You cannot upload anything that doesn't compile. It will get rejected. So you can collaborate with people that you never met before.
So yeah, I met a lot of people. I actually wrote a lot of letters of recommendation coming out of this project. Yeah, so here is an example.
This is one little piece of the proof; this is what a proof looks like in Lean. If you know the language, it's human readable, but it looks a little bit unusual. But it really decouples the task of proving things into many different, disjoint skills. You can have some people who see the big picture and organize things into little pieces, and then you have people who don't necessarily know all the mathematics, but who can just work on little pieces at a time.
So I think this will be a more and more common way of doing mathematics going forward. It is still painful to do. The tools are getting better and more user-friendly, but you still need to have some expertise in programming.
I would say it takes maybe 10 times longer to formalize a proof than to write it by hand. On the other hand, it helps if you want to change your proof. For example, there was a constant 12 that showed up in this theorem. We later improved this 12 to an 11; we got a slightly stronger theorem. Normally, if you do that, you have to rewrite the whole proof, or you could maybe cut and paste the 12 into an 11, but then you have to check you didn't make any mistakes when you did that. But actually, once we had formalized this and got the improvement, it only took a few days to change the 12 to an 11. We just changed the 12 to an 11 somewhere, and then the compiler complained in like five different places: these very specific parts of the proof were no longer working, and we could just do some targeted fixes.
So, in fact, for some specific kinds of mathematical work, the formal approach is already actually faster. Now there are quite a few big proof formalization projects going on right now. The biggest is Kevin Buzzard's project.
He's just got a big grant to formalize Fermat's last theorem into Lean. He says it will take five years to do the most important parts of this proof. He doesn't claim to do the whole thing in five years. But that is a...
That'd be an interesting project. It's already underway, actually. So that's it for proof assistants; let me talk about machine learning. Machine learning uses neural networks to predict answers to various questions, and you can use it in many, many ways.
I think I'll skip the first way I was going to discuss, which is using neural networks to guess solutions to partial differential equations; it is a very exciting new tool in PDE, but I will skip it. I will instead talk about another application of machine learning, to knot theory.
So knot theory is quite a fun area of mathematics. It's interesting in that it brings together many, many different fields of mathematics that don't really talk to each other. A knot is just a loop of string, or really a closed curve in space. And two knots are equivalent if there's some way to
continuously deform one knot to another, in a way in which you're not allowed to cross, the string is not allowed to cross itself. And so the basic questions in knot theory are when are two knots equivalent? If I give you two knots, is there some way to turn one into the other? And the way you approach this question normally is that you develop these things called knot invariants.
So these are various numbers, sometimes also polynomials, that you can attach to a knot, and these numbers don't change no matter how you continuously deform the knot. So if two knots have different invariants, they cannot be equivalent. There are many, many types of knot invariants. There's something called the signature: roughly, you flatten the knot and count crossings, whether the crossings go over or under, and you create a certain matrix and so forth, and out of this comes a certain integer called the signature. So that is one type of knot invariant. There are some famous polynomials called the Jones polynomial and the HOMFLY polynomial, which are connected to many areas of mathematics, but which I won't talk about. And then there are these hyperbolic invariants, which come from geometry.
You can take the complement of the knot, and that is usually what's called a hyperbolic space; it comes with a certain geometric structure. It has a notion of distance, and you can compute its volume and some other invariants. These are invariants that are real or complex numbers. So every knot comes with some combinatorial invariants, like the signature, and with these geometric invariants, the hyperbolic invariants. So here is a whole list of knots with various of the hyperbolic invariants: there's something called the hyperbolic volume, and the cusp shape, and so forth.
And these are real and complex numbers. But no one knew of any link between these two. So there are these two separate ways to create statistics of knots. And there was no connection between them.
So it was only very recently that people started using machine learning to attack this problem. They created databases of millions of knots, which was already a slightly non-trivial task, and they trained a neural network on this data. And they found that, after training, you could give the network all the hyperbolic geometry invariants, and like 90% of the time it would guess the right signature. So it created this black box, which told you that the signature was somehow hidden somewhere in these geometric invariants.
But it didn't tell you how. It was a black box. But that's still useful, because once you have this black box, you can just play with it.
So what they did next is actually a very simple analysis; it's what's called a saliency analysis. What this black box does is take about 20 different inputs, one for each hyperbolic invariant, and return one output, the signature.
So once you have this black box, you can just tweak each input. You just say, well, what if I change one input? How likely is it to change the output?
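A minimal sketch of that kind of perturbation-based saliency analysis (my own illustration, with a stand-in model in place of the real trained network):

```python
# Toy saliency analysis on a black-box predictor: perturb one input feature at
# a time and measure how much the output moves on average.
# `model` is a stand-in; in the knot work it was the trained network mapping
# ~20 hyperbolic invariants to the signature.
import numpy as np

def model(x):                        # placeholder black box
    return 3.0 * x[:, 0] - 0.5 * x[:, 2] + 0.01 * x[:, 5]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # 1000 "knots", 20 invariants each

baseline = model(X)
saliency = []
for i in range(X.shape[1]):
    X_pert = X.copy()
    X_pert[:, i] += rng.normal(scale=1.0, size=X.shape[0])  # wiggle feature i
    saliency.append(np.mean(np.abs(model(X_pert) - baseline)))

# Features with the largest scores matter most; in the knot example only
# three of the ~20 invariants turned out to be important.
print(np.argsort(saliency)[::-1][:3])
```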
And they found that only three of the inputs actually played a really major role in the output; the other 17 were barely relevant. And it wasn't the three that they expected. They expected the volume, for example, to be very important, and the volume turns out to be almost irrelevant. Instead, there were three invariants, the longitudinal translation and the real and imaginary parts of the meridional translation, that were the most important. So once they identified the ones that mattered most, they could just plot the signature directly against those three particular inputs. And then they could eyeball it: rather than a neural network, they used a human network to see, oh, okay, there are some obvious patterns here.
And so by staring at these graphs, they could actually make conjectures as to what was actually going on. They made a conjecture based on this, which actually turned out to be wrong. But then they went back to the neural network, I think, to show that it was wrong, and from the way that it failed, they could correct it. They found a corrected version of their conjecture, which did explain this phenomenon. And once they found the right statement, they were able to prove it. So they now have a theoretical explanation of why the signature is so closely related to these particular statistics.
So this, I think, is a way in which machine learning is being increasingly used in mathematics. It doesn't directly solve the problem for you, but it gives you all these really useful hints as to what the connections are and where to look for them. But you still need the human to actually make the connections.
And then finally we have the large language models, which are the most splashy and have maybe made the most news. Neural networks have been around for 20 years, and language models have also been around for five or so years, but their output has only become sort of human-level very recently. So you've all probably heard of GPT-4.
This is ChatGPT's current model. Very famously, when GPT-4 came out, there was a paper describing its capabilities, and they fed it basically a question from the 2022 IMO. It's a slightly simplified version. If you studied the 2022 IMO, it's not exactly the same form, but it's a simplified form. And for this particular question, actually, you give it the question, and it actually gives a complete correct solution to this question.
It actually solved an IMO question. Unfortunately, this is an extremely cherry-picked example. I think...
I think they tested hundreds of IMO-level questions, and I think they had a success rate of about 1%. So this particular problem they were able to solve, and they had to format the problem in the right way to get the solution. But still, this is quite amazing.
On the other hand, the funny thing about these tools is that things that humans find difficult, AI can sometimes do very easily, but things that humans find easy, AI often struggles with. It is a very orthogonal way of solving problems. In a related paper or presentation, they asked the same model to do a basic arithmetic computation, seven times four plus eight times eight, and the model, which is just guessing the most likely output based on the input, basically just guessed: oh, the answer is 120. And then it paused and said, okay, maybe I should give an explanation of why it's 120, and did it step by step. But when it did it step by step, it actually arrived at the actual answer, which is 92, and not the answer it started with. And when asked, wait, but you said it was 120, it said, oh, that was a typo, sorry, the correct answer is 92. So these models are not solving the problem from first principles; they're just guessing, at each step of the output, what is the most natural thing to say next. And the amazing thing is that sometimes that works, but often it doesn't.
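For the record, the correct step-by-step computation is

```latex
7 \times 4 + 8 \times 8 \;=\; 28 + 64 \;=\; 92,
```

which is why the initial guess of 120 was wrong.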
And it's still an ongoing question how to make these models more accurate. People are trying all kinds of things. You can connect these models to other, more reliable software; in fact, you will see a presentation afterwards where a large language model is connected so that it doesn't do the computation itself, it outsources it to Python, in that case. Another thing you can do is force the language model to only produce correct answers by forcing it to output in one of these proof assistant languages. And if the output doesn't compile, you send it back to the AI and the AI has to try again. Or you can try to teach it directly the same problem-solving techniques that you use to solve IMO problems: try simple examples, proof by contradiction, try to prove things step by step, and so forth. So people are trying all kinds of things.
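A minimal sketch of that generate-check-retry loop (my own pseudocode; `ask_llm` and `compile_lean` are hypothetical stand-ins for whatever model API and proof checker one actually uses):

```python
# Sketch of the "generate, check, retry" loop described above.
# ask_llm() and compile_lean() are hypothetical stand-ins, not real APIs.
def prove_with_feedback(statement: str, max_attempts: int = 5) -> str | None:
    prompt = f"Write a Lean proof of: {statement}"
    for _ in range(max_attempts):
        candidate = ask_llm(prompt)              # model proposes a formal proof
        ok, error_message = compile_lean(candidate)
        if ok:
            return candidate                     # checker accepted it, so it is correct
        # feed the compiler's complaint back to the model and try again
        prompt = (f"Your proof failed to compile with error:\n{error_message}\n"
                  f"Please fix it. Statement: {statement}")
    return None                                  # give up after max_attempts
```

The point of the design is that the proof checker, not the language model, is the source of trust: whatever survives the loop is guaranteed correct.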
We're still nowhere near being able to solve the large majority of, say, math competition problems, let alone math research problems, but we are making progress. Besides being able to actually solve problems directly, these models are also useful just as a muse. I've used them myself; I've experimented with various problems.
I had a combinatorial problem where I was trying a few things and they weren't working. So as an experiment, I just asked GPT: what other techniques would you suggest to solve this question? And it gave me a list of ten techniques, five of which I had already tried or were obviously not helpful, and, well, a couple of others were not helpful either. But there was one technique I had not tried, which was to use generating functions for this particular question, and once it was suggested, I realized it was the right approach; I had just missed it.
So just as someone to converse with, it is somewhat useful. It is not great right now, but it is not completely useless. There's another type of AI assistance that has become very useful for proof assistants, actually. As I said, writing formal proofs is a very tedious task; it's like any really fussy computer language.
You have to get the syntax exactly right. If you miss a step, it doesn't compile. But there are tools. So I use one called GitHub Copilot, where you can write down half of a proof, and it will try to guess what the next line is.
And about 20% of the time, it actually guesses something close to being correct. And then you can just say, I'll accept that, and say, okay. So in this case, I was trying to prove the statement here, and the lines in gray are the ones that Copilot suggested.
It turns out the first line is useless, but the second line, which you can't quite see, actually did solve this particular problem. You still have to check it; you can't just blindly accept the suggestion, because it won't necessarily compile. But if you already know roughly how the code works, it saves you a lot of time.
So these tools are getting better. Right now, if a proof is one or two lines long, they can often fill it in automatically. There are now experiments where you iterate: an AI suggests a proof, you feed it to the compiler, and if it doesn't compile, you send the error message back. With this method, proofs that are four or five lines long are beginning to be found automatically. Of course, a big proof is tens of thousands of lines, so we're nowhere near the point where you can just instantly get your whole proof formalized, but it is already a useful tool. Okay, so where are we now?
There are people hoping that in a few years we can use computers to actually solve math problems directly. I think we are still a long way from that. For very narrowly focused problems, you can set up specialized AIs to handle a very narrow band of problems, but even then they're not fully reliable, though they can be useful.
But still, for the next few years at least, they're basically going to be really useful assistants, beyond the brute-force computational assistance that we're already familiar with. People are trying all kinds of creative things. One direction which I find particularly exciting, though it hasn't really been successful yet, is the hope that these AIs will become very good at generating good conjectures.
We already saw a little example of this with the knots, where the model could already sort of conjecture connections between two different sets of statistics. So there's the hope that you can just create these enormous data sets and feed them into an AI, and it will automatically generate lots of nice conjectured connections between different mathematical objects. We don't really know how to do this yet, partly because we don't have these massive data sets.
But I think this is something that will eventually be possible. Another thing I'm excited about is a way of doing mathematics that just doesn't exist yet.
Right now, because proving theorems is such a painful, painstaking process, we prove one theorem at a time, or maybe two or three if you're efficient. But with AIs, you could imagine that in the future, instead of trying to solve one problem, you take a class of 1,000 similar problems and tell your AI: try to solve these 1,000 problems with this technique. And it will report back: oh, I can solve 35% of these problems with this technique. What about this other technique? Now I can solve this percentage of the problems. Or if I combine them, I can do this. You can start exploring the space of problems rather than just each problem separately. And this is something that you either cannot do right now, or that you do over a process of decades, with dozens and dozens of papers slowly figuring out what you can and cannot do with various techniques. But with these tools, you could really do it.
I mean, you could really start doing mathematics on a scale which is really unprecedented. So the future is going to be really exciting, I think. We will still be proving theorems the old-fashioned way; in fact, we'll have to, because we won't be able to guide these AIs unless we also know how to do these things ourselves. But we'll be able to do lots of things that we can't do right now. Okay, I think I will stop there.
So thank you very much. Yeah, so we are on a tight schedule, but I was told we have time for maybe three or so questions. So if people could raise their hands.
And there's someone over there. Thank you. Can you hear me? Thank you.
That was a beautiful talk. I particularly loved the part about formalizing mathematics, but one thing that you didn't mention was Voevodsky, who left algebraic geometry because he made a mistake, and started formalizing homotopy type theory.
I would be interested to know if you have studied this and have any comments on it. Right, yeah.
Yeah, he was worried about a crisis in certain areas of mathematics, including some that he created, that the proofs were so abstract and sophisticated that there was no way to verify that they're completely true. Yeah, so he proposed changing the foundation of mathematics to a higher-order type theory, homotopy type theory, which is more robust. If you change the underlying axiom of mathematics, a lot of what you prove in this theory is still true. There are proof-assistant languages that are based on this sort of homotopy-type theory. Lean is not actually by design, because Lean wants to formalize a lot of traditional mathematics, which is not written in this language.
I do hope that in the future there will be, well, there are already multiple proof assistant languages with different strengths and weaknesses. One thing we don't have right now is an automatic way to translate a proof in one language to another. That's actually one place where AI, I think, will be very useful. Once we have that, then if you have a different philosophy of the foundations of mathematics, hopefully we can just translate a proof that has been formalized in one language into another, and then everyone will be convinced, including Voevodsky, if he were still alive.
Yeah. So, I mean, there are multiple approaches to formalizing mathematics, and certainly we shouldn't just fix on one particular standard just yet.
Well, I, um... Okay. It's sort of a random process, but okay.
Well, this may not be quite relevant to the topic of the talk, but I was recently applying for PhDs, and the advice I was given by professors was basically along the lines of "the longer the better." So it seems like there's a kind of general agreement that mathematicians somehow need time to grow up to big ideas. From that perspective, how do you think about your decision to go to university at such a young age? How do you think it influenced you as a mathematician and as a human being?
Well, I had some very good advisors, at the high school, undergraduate, and graduate levels. I mean, I don't think it's a race. You go to university when you're ready to go; you shouldn't go just because you were told you need X years or something to do this.
I think it's different for different people. It was very important for me that, when I went to undergraduate at 13, it was at a university very close to where I lived, so I lived with my parents, and they drove me quite a lot, actually, to the university for all my classes.
If I didn't have that, I don't think I would have had a good experience. So it really depends. I did my university studies at a very young age; it doesn't mean everybody should do that. There's no single answer to this question. Thank you. Okay.
So another, kind of more general, question. Given that you've contributed to truly so many mathematical fields, how do you go about choosing your next research topic and the problems you want to solve? And also, what's your Erdős number?
Okay, my Erdős number is two, that's the easy part. But I actually don't know; I mean, early on in my career, I had advisors who suggested problems to me. Nowadays, it often just comes up; I mean, math is a very social activity.
You go to lots of events. I mean, after this, I'm going to a math conference in Edinburgh. And I'm going to talk to a lot of people in an area connected to this PFR conjecture thing, actually. And likely, I'll have some interesting math conversations, and maybe some research questions will come out of it.
I do have some long-term projects, things that I would like to solve, but increasingly I find that it's the questions that just come up in conversation with other mathematicians. You know, I didn't know I'd be talking so much about AI until about two years ago, for instance.
Yeah, I think the future is going to require more flexibility. I mean, there will still be people who specialize in one topic and just want to be the world expert in X, but increasingly, I think there will be more and more people who move around over time.
And they will find interesting new mathematics every few years just by talking to other people. Okay, I think we have to move on, actually. So the next speaker is Simon Coyle from XTX, who will talk about the AI Math Olympiad.