Measuring Query Latency the Hard Way: An Adventure in Impractical Postgres Monitoring by Simon Notley
ScyllaDB
20 slides
Oct 15, 2025
About This Presentation
Sampling the session state (as exposed by pg_stat_activity) is a surprisingly powerful way to understand how your Postgres instance spends its time. It is something I can wholeheartedly recommend to any Postgres DBA that needs a lightweight way to monitor query performance in production. However, it's a terrible way to measure query latency, fraught with complexity and weird statistical biases that could be avoided by simply using an extension built for the job, or even log analysis. But pursuing terrible ideas can be fun, so in this talk, I dive into my adventures in measuring query latency from session sampling, generate some extremely funky charts, and end up unexpectedly performing a vector similarity search.
Size: 2.18 MB
Language: en
Added: Oct 15, 2025
Slides: 20 pages
Slide Content
A ScyllaDB Community
Measuring Query Latency the Hard Way: An Adventure in Impractical Postgres Monitoring
Simon Notley
Observability and Optimization
Simon Notley (he/him)
Observability and Optimization PM at EDB
■Something cool: Gained the freedom of Tryfan
■Perspective on P99s: The time 1 in every 100 of your users is waiting and wondering if it’s broken
■Another thing: I used to race bicycles
■Away from work: dad stuff
Pursuing terrible ideas can be fun
Take a good idea
Understand its strengths and weaknesses
Undeterred, try to apply it to something you know it’s no good at
Have fun
Session sampling
Good at proportions, bad at details
Query latency
“fun”
Time-domain sampling, huh! What is it good for?
23%
20%
Time-domain sampling, huh! What is it good for?
Details
Proportions
…and I will now use it for details…
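To make the "proportions" strength concrete, here is a minimal sketch of turning periodic session samples into time shares. The labels and sample counts are hypothetical illustrations, not data from the talk: each sample is assumed to be a category derived from a pg_stat_activity row, such as its wait_event_type, with "CPU" used here as an assumed label for an active session with no wait event.

```python
from collections import Counter


def time_share(samples):
    """Estimate the fraction of sampled time spent in each category.

    Each element of `samples` is the category observed at one sample
    point. The share of samples per category approximates the share of
    time per category.
    """
    counts = Counter(samples)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical samples for illustration only (not measurements).
samples = ["CPU"] * 6 + ["IO"] * 3 + ["Lock"] * 1
shares = time_share(samples)

# Print categories from largest to smallest share.
for label, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {share:.0%}")
```

Nothing here needs to know when any individual query started or finished, which is why sampling is cheap for proportions and awkward for per-query details.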
Diagram (time axis): points labelled "We know the query started here", "It was still running here" and "It wasn’t running here"; intervals marked a and b
Even an end has a start
Diagram (time axis): intervals marked a, b and c
estimated_duration = 2a + c
Unbiased!
For a 1000 ms query, for a range of sample periods, calculate our estimated duration for all possible relative positions of the query and the samples
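The experiment described above can be sketched roughly as follows. This is not the talk's code, and it does not use the slide's `2a + c` formula: it assumes a plain midpoint estimator, which guesses that the query ended halfway between the last sample that saw it running and the next sample that did not. The function name, the `steps` parameter and the list of periods are all hypothetical.

```python
import statistics


def midpoint_estimates(duration_ms, period_ms, steps=1000):
    """Midpoint duration estimates over evenly spaced sample phases.

    The query starts at t=0 (start time assumed known exactly) and runs
    for duration_ms. Samples fall at phase, phase + period_ms, and so on.
    Returns one estimate for every phase in which at least one sample
    saw the query running.
    """
    estimates = []
    for i in range(steps):
        # Time of the first sample after the query starts.
        phase = (i + 0.5) * period_ms / steps
        if phase >= duration_ms:
            continue  # no sample landed inside the query: never seen
        # Last sample that still saw the query running.
        k = int((duration_ms - phase) // period_ms)
        last_seen = phase + k * period_ms
        # First sample at which the query was gone.
        first_gone = last_seen + period_ms
        # Assumed estimator: the end lies midway between the two samples.
        estimates.append((last_seen + first_gone) / 2)
    return estimates


for period in [100, 250, 500, 1000, 2000]:
    est = midpoint_estimates(1000, period)
    q25, _median, q75 = statistics.quantiles(est, n=4)
    print(f"period={period} ms  seen={len(est)}  "
          f"mean={statistics.mean(est):.1f}  box={q25:.1f}..{q75:.1f}")
```

With this naive estimator the mean stays at the true duration while the sample period is no longer than the query, and the spread grows with the period. Once the period exceeds the duration, some phases miss the query entirely and the estimates for the queries that were seen skew high.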
Box: 25th - 75th percentile