Shared Nothing Databases at Scale by Nick Van Wiggeren
ScyllaDB
About This Presentation
This talk will discuss how PlanetScale scaled databases in the cloud, focusing on a shared-nothing architecture that is built around expecting failure. Nick will go into how to deliver low-latency high-throughput systems that span multiple nodes, availability zones, and regions, while maintaining sub-millisecond response times. This starts at the storage layer and builds all the way up to micro-optimizing the load balancer, with a lot of learning at every step of the way.
Slide Content
A ScyllaDB Community
Shared Nothing
Databases at Scale
Nick Van Wiggeren
CTO
Nick Van Wiggeren
CTO @ PlanetScale
■Based in Seattle, WA
■I spend a lot of my day thinking about tail latency
■Outside of work, I like getting as far away from
computers as possible
Vitess & PlanetScale
Vitess: Scale-out MySQL
Vitess is a CNCF-graduated project that scales MySQL. With Vitess, an application
can speak to tens of thousands of MySQL instances with a single connection
string.
■Built for OLTP workloads, designed for low-latency and high availability
■Fully MySQL compatible
■Improves on MySQL with transparent query rewriting, caching, connection
pooling and much more
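A minimal sketch of what that single connection string looks like from application code, using Go's database/sql with the go-sql-driver/mysql driver (VTGate speaks the MySQL wire protocol); the host, credentials, and keyspace name below are placeholders, not real endpoints:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // VTGate speaks the MySQL wire protocol
)

func main() {
	// One DSN for the whole cluster, however many shards sit behind VTGate.
	// Host, credentials, and keyspace name here are placeholders.
	db, err := sql.Open("mysql", "app_user:app_pass@tcp(vtgate.internal:3306)/commerce")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var total int
	if err := db.QueryRow("SELECT COUNT(*) FROM customers").Scan(&total); err != nil {
		log.Fatal(err)
	}
	fmt.Println("customers:", total)
}
```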
Vitess Architecture
PlanetScale: Applied Vitess
PlanetScale runs petabyte-scale Vitess clusters in AWS and GCP for some of the
web’s largest businesses.
■Individual clusters running at over 5M QPS peak with <3ms p99
●Sustained throughput of 1,000,000s of IOPS and 10s of GB/s of file system throughput
■We execute 1000s of MySQL failovers every day
■This puts us in a unique position of being able to talk about what it takes to
run workloads at scale in the cloud
How?
Shared Nothing
What is “shared nothing?”
Shared nothing is the only way to scale beyond a single machine
■Build self-contained units of scale instead of systems of coordination
■Modern consensus algorithms are great, but they can’t beat the speed of light
■Linear scalability and consensus are like oil and water - it’s not possible to
achieve one and keep the other
■When state needs to be shared, keep it isolated, fault tolerant, and off the
query path
What is “shared nothing?”
Do one tiny thing really well, thousands and thousands of times
■A write operation follows a simple path:
●Client -> VTGate -> MySQL Primary -> Disk (+ replicas)
●No consensus, no distributed commits, no cross-node coordination
■At scale, this means:
●10,000 shards = 10,000 independent failure domains
●10,000 shards = 10,000 parallel write paths
●10,000 shards = 1 database that performs like 10,000 databases
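To make the parallel-write-path idea concrete, here is a toy Go sketch that maps a sharding key to one of N independent shards with a plain FNV hash. This is not Vitess's actual vindex logic; it only shows that choosing a write's destination is a local computation with no cross-node coordination:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a sharding key to one of shardCount independent shards using a
// simple 64-bit FNV hash. Vitess uses its own vindex functions; this is only a
// stand-in to show that routing a write needs no consensus round trip.
func shardFor(customerID string, shardCount uint64) uint64 {
	h := fnv.New64a()
	h.Write([]byte(customerID))
	return h.Sum64() % shardCount
}

func main() {
	const shards = 10000
	for _, id := range []string{"cust-42", "cust-1001", "cust-31337"} {
		fmt.Printf("%s -> shard %d of %d\n", id, shardFor(id, shards), shards)
	}
}
```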
What is “shared nothing?” in Vitess
Build thousands of individual MySQL clusters, not one single huge one
■Every Vitess shard is an isolated MySQL cluster, with its own replication
topology and ability to independently failover
■If a query reads or writes data from an individual shard, only that shard needs
to be available to serve the query.
■If a query reads data from multiple shards, only those involved are contacted
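A hedged illustration of that routing rule, assuming a hypothetical `orders` table sharded on `customer_id`: a statement that pins the sharding key is served by a single shard, while the same table queried without it fans out to every shard before results can be merged:

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	db, err := sql.Open("mysql", "app_user:app_pass@tcp(vtgate.internal:3306)/commerce")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Single-shard: the predicate pins the sharding key, so only that shard's
	// cluster needs to be available to serve this write.
	if _, err := db.Exec(
		"UPDATE orders SET status = 'shipped' WHERE customer_id = ? AND order_id = ?",
		42, 1001,
	); err != nil {
		log.Fatal(err)
	}

	// Scatter: no sharding key in the predicate, so the gateway has to ask
	// every shard and merge the answers.
	rows, err := db.Query("SELECT COUNT(*) FROM orders WHERE status = 'shipped'")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
}
```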
What is “shared nothing?” in Vitess
Deploy horizontally scalable query planners + routers, not a monolithic service
■VTGate reads from the “topology” server (etcd, consul, etc) and builds a view
of the world
■If the system doesn’t change, it can continue to operate without topology
updates
■Horizontally scalable: you can run one, you can run thousands of them, it
scales as far as key/value lookups
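A rough sketch of that idea in Go, with a hypothetical `Shard` type and `fetch` function standing in for VTGate's real topology code: the gateway caches the shard map in memory, refreshes it in the background, and keeps routing from the cached view if the topology store is briefly unreachable:

```go
package main

import (
	"log"
	"sync"
	"time"
)

// Shard is a hypothetical stand-in for a topology record.
type Shard struct {
	Name    string
	Primary string // address of the current primary
}

// TopoCache keeps the last known shard map in memory so routing decisions
// never block on the topology store.
type TopoCache struct {
	mu     sync.RWMutex
	shards map[string]Shard
}

// Refresh polls the topology store; on failure the last known view keeps serving.
func (c *TopoCache) Refresh(fetch func() (map[string]Shard, error)) {
	shards, err := fetch()
	if err != nil {
		log.Printf("topology refresh failed, keeping cached view: %v", err)
		return
	}
	c.mu.Lock()
	c.shards = shards
	c.mu.Unlock()
}

// Lookup answers routing questions from the in-memory copy.
func (c *TopoCache) Lookup(name string) (Shard, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	s, ok := c.shards[name]
	return s, ok
}

func main() {
	// fetch is a placeholder for an etcd or Consul read in a real gateway.
	fetch := func() (map[string]Shard, error) {
		return map[string]Shard{"-80": {Name: "-80", Primary: "mysql-a:3306"}}, nil
	}

	cache := &TopoCache{shards: map[string]Shard{}}
	cache.Refresh(fetch)
	go func() {
		for range time.Tick(5 * time.Second) {
			cache.Refresh(fetch)
		}
	}()

	if s, ok := cache.Lookup("-80"); ok {
		log.Printf("route writes for shard -80 to %s", s.Primary)
	}
}
```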
Achieving “Shared Nothing”
In order to achieve this, you need to design your data just as much as you need to
pick the right system
■Data needs to be partitioned by a functional unit: user, organization, etc
■Use a separate system for summing and counting: HTAP isn't real, and you're
better off pairing the best OLTP database with the best OLAP database
Lessons Learned
The Network is Slow
In a distributed system, it’s impossible to match the latency of someone running
MySQL on their laptop, and it’s really easy to do a lot worse
■AWS Latencies
●500 microseconds between two EC2 instances in the same zone
●1-2 milliseconds between two EC2 instances in the same region, different zones
■Every “hop” between your user and a write - whether it’s to move data or reach
consensus - is a cost.
■The network is variable: these are median numbers, the P99 has no SLA
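Back-of-the-envelope hop math with the median figures above (assumed values; they carry no SLA) shows how quickly the network alone sets a floor on write latency, before any disk or CPU work:

```go
package main

import "fmt"

func main() {
	const (
		sameZoneRTT  = 0.5 // ms, median round trip between EC2 instances in one AZ
		crossZoneRTT = 1.5 // ms, median round trip across AZs in one region (1-2 ms)
	)

	// Hypothetical write path: client and VTGate in one AZ, MySQL primary in
	// the same AZ, plus one acknowledgment from a replica in another AZ.
	hops := []struct {
		name string
		ms   float64
	}{
		{"client -> VTGate", sameZoneRTT},
		{"VTGate -> MySQL primary", sameZoneRTT},
		{"primary -> cross-AZ replica ack", crossZoneRTT},
	}

	total := 0.0
	for _, h := range hops {
		total += h.ms
		fmt.Printf("%-32s %.1f ms\n", h.name, h.ms)
	}
	fmt.Printf("network floor before any disk or CPU work: ~%.1f ms\n", total)
}
```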
Network-backed storage is harmful
In order to give our users the performance they demand, we had to abandon
network-attached storage
■AWS gp3 promises “single-digit millisecond latency” 99 percent of the time
●This means you can't guarantee a P99 of < 10ms if a gp3 volume is in your write path!
■You can pay 4-10x more for io2 and still get worse performance than a
handful of consumer SSDs
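A small probe along these lines makes the gap visible (an illustration, not PlanetScale's tooling): time repeated 4 KB write-plus-fsync calls against a directory on the volume under test, then compare the tail latency of a local NVMe mount with a network-attached one:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"sort"
	"time"
)

func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: fsyncprobe <directory on the volume to test>")
	}
	f, err := os.Create(filepath.Join(os.Args[1], "fsync-probe.tmp"))
	if err != nil {
		log.Fatal(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	const iterations = 1000
	buf := make([]byte, 4096)
	latencies := make([]time.Duration, 0, iterations)

	// Time a 4 KB write followed by fsync, the pattern a database commit depends on.
	for i := 0; i < iterations; i++ {
		start := time.Now()
		if _, err := f.WriteAt(buf, 0); err != nil {
			log.Fatal(err)
		}
		if err := f.Sync(); err != nil {
			log.Fatal(err)
		}
		latencies = append(latencies, time.Since(start))
	}

	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	fmt.Printf("p50=%v p99=%v\n", latencies[iterations/2], latencies[iterations*99/100])
}
```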
The Cloud is Unpredictable
It’s not just network and storage - it’s everything. In the datacenter, you can touch
your physical drives and know where every machine lives. In the cloud, assume
everything is ephemeral.
■Test your failover paths daily, and not just planned failovers
■Partial failure is a constant
●Are you prepared for an EBS volume to only achieve 10% of its provisioned throughput for 30 minutes?
Your P99 is a lie
Measuring percentiles in steady-state isn’t useful - over a year, your P99 is
dominated by failure scenarios
■Have you measured your P99:
●During peak load
●While orchestrating a failover
●At the same time as an unplanned maintenance event is happening in the cloud
■Measure from the client, not the server: just as much can go wrong
■The fastest P99s I’ve seen are when the database returned errors quickly
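One way to act on "measure from the client" is to time every query attempt in the application and keep the failed attempts in the distribution; the helper below is a hypothetical sketch, not a Vitess or PlanetScale API, with a simulated workload standing in for real queries:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"sort"
	"time"
)

// clientStats records what the application actually experienced, including the
// attempts that failed; those are part of the user's latency story too.
type clientStats struct {
	durations []time.Duration
	errors    int
}

// observe times one query attempt from the client side.
func (s *clientStats) observe(query func() error) {
	start := time.Now()
	if err := query(); err != nil {
		s.errors++
	}
	s.durations = append(s.durations, time.Since(start))
}

// p99 reports the client-observed 99th percentile latency.
func (s *clientStats) p99() time.Duration {
	sorted := append([]time.Duration(nil), s.durations...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	return sorted[len(sorted)*99/100]
}

func main() {
	s := &clientStats{}
	for i := 0; i < 1000; i++ {
		s.observe(func() error {
			if i%200 == 0 { // simulate a brief failover window
				time.Sleep(25 * time.Millisecond)
				return errors.New("primary not serving")
			}
			time.Sleep(time.Duration(rand.Intn(2)+1) * time.Millisecond)
			return nil
		})
	}
	fmt.Printf("client-observed p99=%v errors=%d\n", s.p99(), s.errors)
}
```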
In Summary
Scaling Beyond a Single Machine is Hard
■Pick your technology wisely
●The wrong database will add complexity and give you no benefits
■Measure your steady-state P99, care about your edge cases
Thank you! Let’s connect.
Nick Van Wiggeren [email protected]
@NickVanWig