ChatGPT Ain't Got $%@& On Me! by Andy Pavlo

ScyllaDB · 31 slides · Oct 21, 2025

About This Presentation

This talk presents research on a new generation of autonomous tuning agents that optimize more parts of a database in less time and at lower costs than previous methods. We will describe our automated methods for tuning nearly every possible option available in PostgreSQL: system/table/index knobs, ...


Slide Content

A ScyllaDB Community
ChatGPT Ain't Got $%@& On Me!
Andy Pavlo
Associate Professor

ChatGPT Ain't Got SHIT On Me!
Next Generation Automated Database Tuning
[email protected]

DATABASES ARE HARD
Automation is Necessary!

Lifecycle Mgmt.: Scale Up/Out, Migrations, Upgrades, Resource Allocation, Policies
Physical Design: Indexes, Partitioning, Mat. Views
System Knobs
Query Options: Plan Hints, Query Knobs, Rewriting, Optimizer Flags

Knob tuning tools: iTuned (2009), PGTune (2014), OtterTune (2017), CDBTune (2019), DBTune (2022)
Physical design tools: Partitioner (1976), AutoAdmin (1997), DB2 Advisor (2002), Dexter (2017), Microsoft DTA (2020), TiDB TiAdvisor (2023)
Query tuning tools: Bao (2021), EverSQL (2023), AutoSteer (2023), DBMarlin (2024)

[Timeline: past → present → frontier → future, spanning 1975–2025, ending with LLMs]

AUTOMATED DATABASE TUNERS
PAST

[Diagram: a tuning agent takes the Workload, Objective, and Environment as inputs, applies a Configuration to the DBMS, and receives Feedback in return]

DATABASE TUNING TOOLS
PAST
PostgreSQL v16 - Join Order Benchmark
↓ Lower is Better
Source: Will Zhang

PGTune (2014): ▼30%
Microsoft DTA (2020): ▼48%
AutoSteer (2023): ▼36%

SEQUENTIAL TUNING
PAST
Invoke a series of single-purpose tuning agents one after another.
→ Each agent is unaware of the reasoning of the previous agent.
→ Example: Knobs → Indexes → Queries

The order of agent invocation greatly influences the overall efficacy of the tuning process.
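The sequential pipeline can be pictured as a chain of independent agents, each seeing only the configuration its predecessor left behind, never the reasoning behind it. A minimal sketch, where the agent functions are hypothetical stand-ins for real tuners:

```python
# Minimal sketch of sequential tuning: each single-purpose agent mutates
# the configuration without seeing why earlier agents made their choices.
# The agent functions below are hypothetical stand-ins, not real tuners.

def tune_knobs(config):
    return {**config, "shared_buffers": "8GB"}

def tune_indexes(config):
    return {**config, "indexes": ["CREATE INDEX ON t (a)"]}

def tune_queries(config):
    return {**config, "plan_hints": ["HashJoin(t s)"]}

def sequential_tune(agents, config):
    # Invoke agents one after another; ordering matters because each
    # agent optimizes greedily against whatever state it inherits.
    for agent in agents:
        config = agent(config)
    return config

result = sequential_tune([tune_knobs, tune_indexes, tune_queries], {})
```

Reordering the `agents` list changes what state each agent optimizes against, which is exactly why invocation order affects overall efficacy.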

DATABASE TUNING TOOLS
PAST
PostgreSQL v15 - Join Order Benchmark
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼30%, ▼48%, ▼36%, ▼66%, ▼80%]


MULTI-ROUND SEQUENTIAL TUNING
PRESENT
Coordinate multiple agents targeting different parts of a database.
→ Can revisit an agent in subsequent rounds.
→ Combine each tool's local optima into a holistic configuration.

Different Scheduling Policies:
→ Round Robin
→ Expected Benefit

UDO: UNIVERSAL DATABASE OPTIMIZATION USING REINFORCEMENT LEARNING (VLDB 2021)
A UNIFIED AND EFFICIENT COORDINATING FRAMEWORK FOR AUTONOMOUS DBMS TUNING (SIGMOD 2023)

DATABASE TUNING TOOLS
PRESENT
PostgreSQL v15 - Join Order Benchmark
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼30%, ▼48%, ▼36%, ▼66%, ▼80%, ▼67%]

OBSERVATION
Multi-round sequential tuning relies on a separate
cost model to determine agent prioritization.

A better approach is to use a single agent that
considers all possible tuning options at the same
time to account for the implicit dependencies
between those options.
PRESENT

STRAWMAN: COMBINE AND TUNE
PRESENT
System Knobs × Physical Design × Query Options

Problem #1: Dimensionality
→ Too many configurations to consider!

Problem #2: Sparsity
→ Few configurations perform well. Most are garbage.
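To make Problem #1 concrete, a back-of-the-envelope sketch (the counts are illustrative numbers of my own, not measurements from the talk): even modest per-component spaces multiply into an enormous joint space.

```python
# Back-of-the-envelope: the joint configuration space is the Cartesian
# product of the per-component spaces, so its size is the product of
# their sizes. The counts below are illustrative, not from the talk.
knob_settings = 10 ** 6        # dozens of knobs, each with many values
physical_designs = 10 ** 4     # candidate index/partitioning combinations
query_options = 10 ** 3        # per-query hints and flags

joint_space = knob_settings * physical_designs * query_options
print(joint_space)  # 10**13 candidate configurations
```

No tuner can evaluate a meaningful fraction of 10^13 configurations by trial, which is why the strawman fails without some way to generalize across actions.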

LEARNING FROM SIMILAR ACTIONS
PRESENT
Many tuning actions are related:
→ BufferPool=32GB ≅ BufferPool=33GB
→ Index(a) ≅ Index(a,b)
→ Q1:HashMem=1.0 ≅ Q1:HashMem=1.1

Instead of evaluating each action individually, a better approach is to infer the impacts of similar actions without trying them.
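One way to picture this idea (a toy sketch, not Proto-X's actual encoding): represent actions as numeric vectors and treat nearby actions as having similar effects, so a reward measured for one action can be propagated to its untried neighbors.

```python
import math

# Toy sketch: represent tuning actions as feature vectors and assume
# nearby actions (high cosine similarity) have similar effects.
# The vectors and rewards are illustrative, not from the talk.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

actions = {
    "BufferPool=32GB": [32.0, 0.0],
    "BufferPool=33GB": [33.0, 0.0],
    "Index(a)":        [0.0, 1.0],
}
measured = {"BufferPool=32GB": -0.40}  # observed runtime change (reward)

def estimate(action):
    # Predict an untried action's reward from the most similar tried one.
    best = max(measured, key=lambda m: cosine(actions[m], actions[action]))
    return measured[best]

print(estimate("BufferPool=33GB"))  # -0.4: inherits its near-twin's reward
```

This is the intuition behind organizing actions in a latent space: similarity becomes distance, and untried regions inherit estimates from their measured neighbors.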

PROTO-X TUNING AGENT
PRESENT
Holistic tuning agent that encodes all tuning actions in a high-dimensional model and then uses actor-critic reinforcement learning to navigate the organized space for promising configurations.

Similar actions are close to each other in the latent space, so the agent can better reason about exploration vs. exploitation.

THE HOLON APPROACH FOR SIMULTANEOUSLY TUNING MULTIPLE COMPONENTS IN A SELF-DRIVING DATABASE MANAGEMENT SYSTEM WITH MACHINE LEARNING VIA SYNTHESIZED PROTO-ACTIONS (VLDB 2024)

DATABASE TUNING TOOLS
PRESENT
PostgreSQL v15 - Join Order Benchmark
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼30%, ▼48%, ▼36%, ▼66%, ▼80%, ▼67%, ▼88%]

OBSERVATION
PRESENT
Proto-X's models are local-only and specific to a single database / application.
It has to retrain its models for each new database or if the current database undergoes major changes (DDL migrations, new queries).
→ Database-specific options are one-hot encoded.

Ideally, we want to use the world's collective intelligence to tune anything without retraining…


LLMS FOR DATABASE TUNING
FRONTIER
Knobs + Physical Design Tuning
→ DB-BERT (2022), GPTuner (2023), λ-Tune (2025), LLMIdxAdvis (2025)

Query Rewriting / Hints
→ EverSQL (2023), GenRewrite (2024), LLM-R2 (2024), R-Bot (2024), TiDB (2025)

Performance Monitoring / Debugger
→ D-Bot (2023), Supabase Assistant (2024), DBMarlin (2024), Tembo (2025)

Upgrade Advisor
→ CMU-DB Prototype (2025)

KNOBS + PHYSICAL DESIGN TUNING
FRONTIER

Prompt:
Recommend configuration parameters for Postgres to optimize the system's performance. The machine has 8 CPU cores and 32 GB of RAM.
${ADDITIONAL_CONTEXT}

↓ LLM (optionally fine-tuned) ↓

Output:
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
wal_buffers = 16MB

CREATE INDEX index1 ON …
CREATE INDEX index2 ON …

Λ-TUNE: HARNESSING LARGE LANGUAGE MODELS FOR AUTOMATED DATABASE SYSTEM TUNING (SIGMOD 2025)
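A minimal sketch of how such a loop could be wired up. The prompt text comes from the slide; the parsing helpers and names are my own assumptions, and real systems like λ-Tune do considerably more validation before applying anything:

```python
import re

# Sketch: build the tuning prompt from the slide and parse an LLM reply
# of the form shown in the Output box into knob settings and DDL.
# The helpers are illustrative, not lambda-Tune's actual code.

def build_prompt(cores, ram_gb, extra_context=""):
    return (
        f"Recommend configuration parameters for Postgres to optimize "
        f"the system's performance. The machine has {cores} CPU cores "
        f"and {ram_gb} GB of RAM.\n{extra_context}"
    )

def parse_reply(reply):
    knobs, ddl = {}, []
    for line in reply.splitlines():
        line = line.strip()
        m = re.match(r"(\w+)\s*=\s*(\S+)", line)
        if m:
            knobs[m.group(1)] = m.group(2)
        elif line.upper().startswith("CREATE INDEX"):
            ddl.append(line)
    return knobs, ddl

# A reply shaped like the slide's Output box:
reply = """shared_buffers = 8GB
effective_cache_size = 24GB
CREATE INDEX index1 ON t (a)"""
knobs, ddl = parse_reply(reply)
print(knobs["shared_buffers"], len(ddl))  # 8GB 1
```

The interesting engineering is not the prompt but what happens after parsing: sanity-checking knob values against machine limits and benchmarking before committing the DDL.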

DATABASE TUNING TOOLS
FRONTIER
PostgreSQL v15 - Join Order Benchmark
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼30%, ▼48%, ▼36%, ▼66%, ▼80%, ▼67%, ▼88%, ▼59%]
Proto-X: 12 hrs vs. λ-Tune: 14 min

SELECT MIN(mi.info) AS release_date,
MIN(miidx.info) AS rating,
MIN(t.title) AS german_movie
FROM company_name AS cn,
company_type AS ct,
info_type AS it,
info_type AS it2,
kind_type AS kt,
movie_companies AS mc,
movie_info AS mi,
movie_info_idx AS miidx,
title AS t
WHERE cn.country_code ='[de]'
AND ct.kind ='production companies'
AND it.info ='rating'
AND it2.info ='release dates'
AND kt.kind ='movie'
AND mi.movie_id = t.id
AND it2.id = mi.info_type_id
AND kt.id = t.kind_id
AND mc.movie_id = t.id
AND cn.id = mc.company_id
AND ct.id = mc.company_type_id
AND miidx.movie_id = t.id
AND it.id = miidx.info_type_id
AND mi.movie_id = miidx.movie_id
AND mi.movie_id = mc.movie_id
AND miidx.movie_id = mc.movie_id;
SELECT MIN(zeta.jrn),
MIN(echo.mux),
MIN(alpha.vqk)
FROM delta, bravo, hotel, golf, zeta,
alpha, charlie, foxtrot, echo
WHERE udf_str_3() = delta.lmn
AND delta.zxc = charlie.qwe
AND bravo.tyu = charlie.poi
AND charlie.asd = alpha.ghj
AND alpha.ghj = zeta.rty
AND zeta.rty = hotel.bnm
AND alpha.ghj = foxtrot.klo
AND golf.vbn = alpha.mnb
AND udf_str_2() = golf.kjh
AND udf_str_1() = hotel.mju
AND alpha.ghj = echo.iop
AND hotel.bnm = echo.hgf
AND zeta.rty = echo.iop
AND zeta.rty = charlie.asd
AND udf_str_0() = bravo.lkj;
BENCHMARKING CHALLENGES
FRONTIER
Measuring the efficacy of LLM-based tuning agents is difficult because LLMs know how to optimize queries from common database benchmarks.
→ TPC-H, TPC-DS, JOB, ClickBench

Name/ordering obfuscation is not enough to hide a query's source.


LEARNING FROM PRIOR KNOWLEDGE
FUTURE
LLMs are generalized models that contain unstructured and incomplete information about a lot of databases.

Such generality means they (currently) underperform specialized algorithms for clearly defined and measurable tasks.

The future will be a combination of specialized algorithms controlled by reasoning models.

AGENT BOOSTING
FUTURE
Use LLMs to discover and extract relevant information to accelerate specialized tuning agents.

Bootstrap a tuning agent's initial configuration so that it does not start from scratch each time.
→ Facilitates knowledge transfer as a database / application evolves over time.
→ Also supports transfer of tuning decisions across workloads, environments, and versions.

THIS IS GOING TO SOUND CRAZY, BUT WHAT IF WE USED LARGE-LANGUAGE MODELS TO BOOST AUTOMATIC DATABASE TUNING ALGORITHMS BY LEVERAGING PRIOR HISTORY? WE WILL FIND BETTER CONFIGURATIONS MORE QUICKLY THAN RETRAINING FROM SCRATCH! (SIGMOD 2026)
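The boosting idea could be pictured like this (a toy sketch under my own assumptions; the talk's experiments use LLM embeddings such as Voyage3-large, while this uses hand-made vectors): embed the new workload, retrieve the most similar previously tuned workload, and hand its configuration to the tuner as a starting point.

```python
import math

# Toy sketch of agent boosting: warm-start a tuner with the configuration
# of the most similar previously tuned workload instead of starting from
# scratch. Embeddings and configs are illustrative placeholders; a real
# system would embed workloads with an LLM embedding model.

history = [
    {"embedding": [0.9, 0.1], "config": {"shared_buffers": "8GB"}},
    {"embedding": [0.1, 0.9], "config": {"shared_buffers": "2GB"}},
]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def warm_start(new_embedding):
    # Retrieve the prior run whose workload looks most like the new one.
    best = max(history, key=lambda run: cosine(run["embedding"], new_embedding))
    return dict(best["config"])  # copy: the tuner will refine it further

initial = warm_start([0.8, 0.2])
print(initial)  # {'shared_buffers': '8GB'}
```

The retrieved configuration is only a starting point; the specialized tuning agent still refines it, which is where the speedup over retraining from scratch comes from.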

AGENT BOOSTING
FUTURE
PostgreSQL v16 - Join Order Benchmark
Boosted with Voyage3-large + llama-3.2-8b
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼30%, ▼48%, ▼36%, ▼66%, ▼80%, ▼67%, ▼88%, ▼59%]

AGENT BOOSTING
FUTURE
PostgreSQL v16 - Join Order Benchmark
Boosted with Voyage3-large + llama-3.2-8b
↓ Lower is Better
Source: Will Zhang

[Chart bars: ▼45%, ▼38%, ▼65%]

OBSERVATION
FUTURE
None of the approaches we discussed today handle the lifecycle management issues of running a production database fleet.

The agents do not consider the temporal and relational aspects of the choices they make.
→ Emergency vs. Long-term Care
→ Dependencies Between Instances
→ Black Swans vs. Scheduled Events

AGENTIC REASONING FOR DATABASES
FUTURE
SO-YOU-DONT-HAVE-TO (SYDHT) is our next-generation self-driving database platform that supports holistic optimization and real-time intervention across fleets.

Leverages a reasoning model to identify what/when/how to apply actions to maintain a DBMS.
https://sydht.ai

CONCLUSION
Tuning a database holistically (all at once) is much better than tuning sequentially (one at a time).

Reusing prior experiences improves the time and efficacy of database tuning agents.

Reasoning agents + specialized algorithms are the correct path forward to achieving autonomous databases.
