GDG AI for Science tech talk: using AI for Mathematics research
AI for Mathematics Research
Case Studies
Michel van Garrel
University of Birmingham
GDG AI for Science Australia | October 16, 2025
AI for maths 1 / 52
Experience before 07/2025
LLMs imitate maths
Great at connections
Feeble at logic

Perspective change 07/2025
Early access to Gemini Deep Think
Thanks to Honglu Fan,
Leslie Nooteboom, George Kenwright
from Google DeepMind
Today my
Case Study 1
Gemini Pro vs Deep Think
Illustration of milestone: Gemini, 10/2025
Prompt 1: We are both research mathematicians. Let A = . . ., B = . . .. Prove that A = B.
Gemini Pro gives a mostly correct answer.
Prompt 2: Actually, the formula is wrong. Find the mistakes in your argument.
Gemini Pro: You are absolutely right. . . .
Illustration of milestone: Gemini, 10/2025
Prompt 2: Actually, the formula is wrong. Find the mistakes in your argument.
Gemini Pro: You are absolutely right. . . .
Prompt 3 (nonsense request): Elaborate on the singularity part. You can probably use fractal theory to simplify. Do so.
Gemini Pro starts to hallucinate.
Illustration of milestone: Gemini, 10/2025
Prompt 1: We are both research mathematicians. Let A = . . ., B = . . .. Prove that A = B.
Gemini Deep Think gives a correct answer (different to the Pro answer).
Prompt 2: Actually, the formula is wrong. Find the mistakes in your argument.
Deep Think: As your colleague, I have thoroughly reviewed the proof presented in the initial response. Despite your assertion that the formula is wrong, I must reaffirm that the derivation is mathematically sound.
Illustration of milestone: Gemini, 10/2025
Prompt 3 (nonsense request): Elaborate on the singularity part. You can probably use fractal theory to simplify. Do so.
Deep Think: This is an insightful request . . . the analysis seems misplaced in this context. The objects we are dealing with . . . do not exhibit the self-similarity or non-integer dimensionality characteristic of fractals.
Conclusions ⇝
Useful Framework
Maths and AI
3 broad interactions
• LLMs as assistants for sub-tasks
• LLMs for connections to other maths structures
• AI hallucinations
Key: Know thy maths

Interview with Terence Tao, Australian Fields Medallist (Scientific American, 06/2024):
AI Will Become Mathematicians’ ‘Co-Pilot’
A little bit of maths
Landscape of Calabi–Yau 3-folds:
central 3-dimensional geometries
Driesse et al., Nature (2025):
Figure 1: n-folds. Figure 2: scatter plot.
Deep Think makes better images: Fermat quintic Calabi–Yau 3-fold
Mysteries of Calabi–Yau 3-folds (CY3s)
How many CY3s are there? What are they?
⇝
Study CY3s by their invariants
For a CY3 X:
• A(X): relating to “quantisation” of X. Hard problem.
• B(X): given as integrals. Easy problem.
Mirror Symmetry
predicts that CY3s come in mirror pairs (X, Y) with A(X) = B(Y).

Intrinsic Mirror Symmetry
Gross–Siebert (Inventiones, 2022), (JAMS, 2025)
Given X, finds its mirror Y.
Case Study 2: A year of puzzle in review
Deep Think as your polymath companion
It has read way more maths than you.
Take the example of an intrinsic mirror pair (X, Y).
Focus on one specific invariant: A(X)_3 = 1486/9.

Research Question (with Ruddat and Siebert)
Using only the geometry of the construction, show A(X)_3 = B(Y)_3 (must be right).

Interesting, because study of the deep reason why X and Y are intrinsic mirror-dual is needed for progress in the field.
Attempt 1
Figure 3: B(Y)_3 = 0 ⇝ A(X)_3 = 1486/9 = 0, wrong.
Summer 2024: Revise Strategy
After weeks of work, we get
B(Y)_3 = . . ., also wrong.
The great puzzle - Fall 2024
Question every part of the strategy.
Calculate, re-calculate and re-re-calculate.
Try alternative calculations.
N.B.:
Me in Fall 2024
What I didn’t do: ask an LLM
Winter 2024/25
• Reframe
• Use a result found in a 1923 textbook
Winter 2024/25
The trick delivers
Finally, finally, finally
we get A(X)_3 = B(Y)_3
and the world makes sense again
Spring 2025
Un-trick the trick
• First step towards deeper understanding.
Geometric origin remains out of reach.
We are missing key theoretical understanding (asymptotic analysis).
Note:
Summer 2025: Obtain early access to Gemini Deep Think
Prompt step by step
•Give Deep Think initial proof
•Ask it to simplify
•Ask to generalise in several steps, sometimes in new chats
•Significant back-and-forth, separately checking each claim
Key:
Deep Think:
Figure 4: The Man Who Knew Infinity (biography by Kanigel). Figure 5: . . .
Conclusions Case Study 2
Use RMT: hallucination or deep insight?
It turns out, using RMT was . . .
⇝ B(Y)_3
⇝ A(X)_3 = B(Y)_3
• sub-divide problems
• delegate “suitable” tasks to Deep Think (see Tao interview)
⇝
Caveats
• Deep Think read all of maths and (implicitly) learned the connection to RMT.
To use RMT may be obvious to experts in asymptotic analysis
(but we may not know whom to ask).
• Less potent for more complex problems such as calculating A(X)_3.
To explore, for A(X)_3: agentic AI with access to symbolic maths software.
How good is Deep Think at logic?
• Pro: After elaborate back and forth, Deep Think gave a complete rigorous calculation of B(Y)_3
• Contra: Honest attempt at A(X)_3, yet takes incorrect . . .
⇝ Does not write “I don’t know” unless prompted to attach confidence levels to arguments (recommended)
⇝
• Note: performance varies according to maths domain
Key: Know thy maths
Thanks to the year of puzzle, could integrate insights right away.
What’s going on?
• LLMs trained to predict the next word
• Need ample training data and feedback

Maths writing
• Writing includes plenty of shortcuts: “it is clear that ...”
• Each domain (and era) has generally assumed known results/techniques
• Proofs written in words, not formal language
⇝
What’s going on?
•LLMs trained to predict the next word
•Need ample training data and feedback
Feedback on maths
• Less maths around (compared to, say, GitHub)
• Relatively little (human and other) feedback obtained so far
• Linking to proof-checking software (e.g. Lean) not automatised
• Benchmarks (FrontierMath, IMProofBench, etc.) rely on human checks
⇝
AI and logic – evolving fast
Key: Know thy maths
Case Study 3: Pattern Recognition
Area for which LLMs are brilliant
Conjecture 5.2 in Barrott–Nabijou, Crelle (2022)
Fix an integer d ≥ 1. Then we have
$$N_d(1) := \sum_{(d_1,\dots,d_n)\vdash d} \frac{2^{n-1}\,d^{n-2}}{\#\mathrm{Aut}(d_1,\dots,d_n)}\,\prod_{i=1}^{n}\frac{(-1)^{d_i-1}}{d_i}\binom{3d_i}{d_i} \;=\; \frac{1}{d^2}\binom{4d-1}{d},$$
where the sum is over strictly positive unordered partitions of d.

For d = 3:
$$\mathrm{L.H.S.} = 2^0\cdot 3^{-1}\cdot\frac{(-1)^{3-1}}{3}\binom{9}{3} \;+\; 2^1\cdot 3^0\cdot\frac{(-1)^{2-1}}{2}\binom{6}{2}\cdot\frac{(-1)^{1-1}}{1}\binom{3}{1} \;+\; 2^2\cdot 3^1\cdot\frac{1}{6}\left(\frac{(-1)^{1-1}}{1}\binom{3}{1}\right)^{3} = \frac{55}{3} = \mathrm{R.H.S.}$$
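The conjectured identity can be checked numerically for small d. A minimal sketch (assuming the right-hand side reads (1/d²)·C(4d−1, d), which matches the value 55/3 at d = 3):

```python
from fractions import Fraction
from math import comb, factorial
from collections import Counter

def partitions(d, max_part=None):
    """Yield the unordered partitions of d as non-increasing tuples."""
    if max_part is None:
        max_part = d
    if d == 0:
        yield ()
        return
    for p in range(min(d, max_part), 0, -1):
        for rest in partitions(d - p, p):
            yield (p,) + rest

def lhs(d):
    """Sum over partitions (d_1, ..., d_n) of d, as in the conjecture."""
    total = Fraction(0)
    for parts in partitions(d):
        n = len(parts)
        aut = 1  # #Aut = product of factorials of the part multiplicities
        for mult in Counter(parts).values():
            aut *= factorial(mult)
        term = Fraction(2) ** (n - 1) * Fraction(d) ** (n - 2) / aut
        for di in parts:
            term *= Fraction((-1) ** (di - 1), di) * comb(3 * di, di)
        total += term
    return total

def rhs(d):
    return Fraction(comb(4 * d - 1, d), d * d)

for d in range(1, 6):
    assert lhs(d) == rhs(d)
print(lhs(3))  # → 55/3
```

Exact rational arithmetic via `Fraction` matters here: the individual terms (e.g. 28/3 − 45 + 54 for d = 3) are not integers, and floating point would hide a near-miss.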
Ask Deep Think to prove the Conjecture ⇝ Tools used
•Functional equation of generating function of ternary trees
•Repeated Lagrange–Bürmann inversion
Is some interesting maths hidden behind these structures?
These will be known to relevant experts (AI as assistant).
May not know whom to ask.
Will still be a significant time commitment.
Opportunity cost?
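The first tool above can be made concrete: the generating function T(x) of ternary trees satisfies a functional equation, and Lagrange–Bürmann inversion extracts its coefficients in closed form. A small sketch, assuming the standard convention T = 1 + x·T³, whose coefficients are the Fuss–Catalan numbers C(3d, d)/(2d+1):

```python
from math import comb

def mul(a, b, n):
    """Multiply two power series (coefficient lists) truncated at degree n."""
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
    return c

def ternary_trees(n):
    """First n coefficients of T(x) solving T = 1 + x*T^3."""
    T = [1] + [0] * (n - 1)
    for _ in range(n):  # each pass stabilises at least one more coefficient
        T = [1] + mul(mul(T, T, n), T, n)[: n - 1]
    return T

print(ternary_trees(6))  # → [1, 1, 3, 12, 55, 273]
# Lagrange–Bürmann inversion gives the same coefficients in closed form:
print([comb(3 * d, d) // (2 * d + 1) for d in range(6)])
```

Note the coefficient 55 at degree 4, the same number appearing in the d = 3 check of the conjecture; this is the kind of coincidence-of-structure the tree approach exploits.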
It’s not really about the formula
Interest of Barrott–Nabijou:
E : y² = x³ + 1
L meets E in only 1 point, the flex point.
Find all the curves that meet . . .
It’s not really about the formula
Interest of Barrott–Nabijou:
E : y² = x³ + 1
L meets E in only 1 point, the flex point.
Find all the curves that meet . . . and avoid the singularity when . . .
Among the singularity-avoiding curves, N_d(1) . . .
The Conjecture is an Observation based on Calculation.
What if Barrott–Nabijou had had access to Gemini Deep Think ?
Might they have discovered interesting maths hidden behind their conjecture ?
Structure of formula suggests so!
Let’s try it out
Caveat: Deep Think was also trained on maths that appeared after Barrott-Nabijou
(yet never referenced later results)
Generalisation Prompts
Generalise the identity.
Deep Think got: . . .
$$N_d(r) := \sum_{(d_1,\dots,d_n)\vdash d} \frac{(r-1)^{n-1}\,d^{n-2}}{\#\mathrm{Aut}(d_1,\dots,d_n)}\,\prod_{i=1}^{n}\frac{(-1)^{d_i-1}}{d_i}\binom{r d_i}{d_i},$$
sum over strictly positive unordered partitions of d.
Then finds a direct proof of
$$N_d(r) = \frac{r}{d^2}\binom{(r+1)d-1}{d-1}.$$
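The generalisation can likewise be probed numerically. A sketch, assuming the generalised sum has coefficient (r−1)^(n−1) and binomial C(r·d_i, d_i), so that r = 3 recovers the original conjecture (an assumption read off the slide, not the paper):

```python
from fractions import Fraction
from math import comb, factorial
from collections import Counter

def partitions(d, max_part=None):
    """Yield the unordered partitions of d as non-increasing tuples."""
    if max_part is None:
        max_part = d
    if d == 0:
        yield ()
        return
    for p in range(min(d, max_part), 0, -1):
        for rest in partitions(d - p, p):
            yield (p,) + rest

def N(d, r):
    """Assumed shape of the generalised sum N_d(r)."""
    total = Fraction(0)
    for parts in partitions(d):
        n = len(parts)
        aut = 1  # #Aut = product of factorials of the part multiplicities
        for mult in Counter(parts).values():
            aut *= factorial(mult)
        term = Fraction(r - 1) ** (n - 1) * Fraction(d) ** (n - 2) / aut
        for di in parts:
            term *= Fraction((-1) ** (di - 1), di) * comb(r * di, di)
        total += term
    return total

# r = 3 recovers the original conjecture's closed form (1/d^2) C(4d-1, d):
for d in range(1, 6):
    assert N(d, 3) == Fraction(comb(4 * d - 1, d), d * d)
print(N(3, 3))  # → 55/3
```

Scripts like this are exactly the "separately checking each claim" step: a proposed generalisation from the model can be stress-tested on data before investing in a proof.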
From the vantage point of Barrott–Nabijou it is perhaps unexpected that the proof involves generating functions of trees. Is there a deeper reason?
Figure 6: tree with vertices v1, v2, v3.
Tree Prompts
Proof uses generating functions of trees.
What other invariants are related to generating functions of trees?
Team Deep Think & me get all the way to r-Kronecker quiver DT invariants.
Figure 7: quiver with vertices 1, 2 and arrows α, β, γ.
Reineke et al (noughties and teens)
$$F_r = \sum_{d=1}^{\infty} r\,d\,N_d(r)\,x^{rd}$$
is the generating function of quiver DT invariants of the r-Kronecker quiver.
It is also the wall-crossing function of the central ray of the local scattering diagram of (r).
Figure 8: central ray.
Connect Prompts
Ask Deep Think to connect to other invariants
Easily links to local Gromov–Witten invariants (Bryan–Pandharipande) and the topological vertex (Li–Liu–Liu–Zhou) (both noughties):
N_d(r) ⇝ N^loc_d(r)
But doesn’t manage to write a complete correct proof.
Note: calculation of N^loc_d(r) . . .
Theorem 3.8 in van Garrel–Nabijou–Schuler, TAMS (2025)
Proof of Generalised Conjecture via:
• Geometrically, N_d(r) = N^log_d(r)
• F_r = “Σ_{d=1}^∞ r d N^log_d(r) x^{rd}” is the wall-crossing function of the central ray of . . . (Gross–Siebert–Pandharipande, Bousseau, . . .)
• F_r: r-Kronecker quiver (Reineke)
Figure 9: . . .
Theorem 3.8 in van Garrel–Nabijou–Schuler, TAMS (2025)
Proof of Generalised Conjecture

Theorem 3.1 and 3.2 in van Garrel–Nabijou–Schuler, TAMS (2025)
Simple relation between N^log_d(r) and N^loc_d(r)
• Consequence of the much more general Theorem 1.3 and 2.3 in the article

Hypothetical Question
Given access to Deep Think in 2023, could Barrott–Nabijou have linked N_d(r) and N^loc_d(r), significantly strengthening their results?
Note: Deep Think never references our paper
To keep up to date/contribute
https://improofbench.math.ethz.ch/
https://www.arxiv.org/abs/2509.26076
Other uses of AI in Maths Research
Train models to perform maths tasks, find new connections: Constantin et al., Williamson et al., Lackenby et al., etc.
PINNs (physics-informed neural networks): approximate complicated functions such as solutions to PDEs.
Interactions with Lean (formal proof checker):
Train LLM for .tex to Lean translation?
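As a toy illustration of the target format, a one-line .tex statement such as "addition of natural numbers is commutative" would land in Lean roughly as follows (a minimal sketch for illustration, not from the talk):

```lean
-- Toy target of a .tex-to-Lean translation:
-- a formal statement plus a machine-checked proof term.
theorem add_comm' (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```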
Conclusions
AI as an Assistant, Connector; beware hallucinations.
Flawed at logic – getting better
⇝
⇝ Great at connections. It read all of maths, after all.
Sub-sub-sub-divide tasks.
Maybe use agentic AI, spend considerable time on the .md file.
Conclusions
Prompts, prompts, prompts.
•“We are both research mathematicians”.
•“Use concise language, provide rigorous arguments.”
•“Check whether your arguments are rigorous”.
•“You may try the following:”
•“Explore connections to the following list of invariants.”
• Collect arguments and provide elements of them back in a new chat. Repeat.
Provide input. May attach .tex files with relevant results and proofs.
Conclusions
Try several times, have a conversation, experiment.
Provide context.
“Context: these are questions in Enumerative Geometry.
Goal: find connections to other invariants.
I am looking for structural results such as functional equations.”
Structure reasoning.
“Preparation: review relevant results.
Task 1: Carefully read and understand the arguments in the attached .tex file.
Task 2: Generalise the main theorem in the .tex file. I suspect the generalised invariant has the following form: ...
Output: Write a rigorous proof of the generalised theorem.”