Surviving Majority Loss: When a Leader Fails by Konstantin Osipov
ScyllaDB
29 slides
Mar 10, 2025
About This Presentation
In this lightning talk I will present all common combinations of ScyllaDB deployments, single- and multi-DC, and how well they work with Raft-based topology management. I'll present zero-token nodes, a new feature in ScyllaDB 2024.2, and discuss how they can be used to improve resilience to failure in complex deployments. Another feature, which we call "dynamic voter selection" and which is currently available in scylladb.git and our nightly builds, helps reduce the load of Raft on large clusters and dynamically adjusts Raft to node starts and stops, always maintaining maximal Raft availability.
Slide Content
A ScyllaDB Community
Surviving Majority Loss:
When a Leader Fails
Konstantin Osipov
Director of Engineering
Konstantin Osipov
■ ScyllaDB geek since 2019
■ All things consistency
■ Aquarist and a father of three
Previous Episodes
■ ScyllaDB consistent cluster management recap
■ Choices for production cluster topology
■ … and disaster recovery consequences of those
Presentation Agenda
Strong consistency recap
Data vs metadata
Metadata:
■ Schema information: table, view, type definitions
■ Topology information: nodes, tokens
■ Replicated everywhere
■ Not commutative
■ Changes rarely
Data:
■ Static and regular rows, counters
■ Partitioned
■ Commutative
■ Changes frequently
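The "commutative" versus "not commutative" distinction above can be illustrated with a toy model. This is only a sketch, not ScyllaDB's internal representation: `apply_counter`, `apply_schema`, and `outcomes` are hypothetical helpers that apply the same set of operations in every possible order and collect the distinct final states.

```python
from itertools import permutations


def apply_counter(value, delta):
    # Counter updates are additions, so the order does not matter.
    return value + delta


def apply_schema(columns, change):
    # A schema change adds or drops a column; the order matters.
    op, name = change
    cols = set(columns)
    if op == "add":
        cols.add(name)
    elif op == "drop":
        cols.discard(name)
    return frozenset(cols)


def outcomes(apply, start, ops):
    # Apply the operations in every possible order and collect
    # the distinct final states.
    results = set()
    for order in permutations(ops):
        state = start
        for op in order:
            state = apply(state, op)
        results.add(state)
    return results


counter_results = outcomes(apply_counter, 0, [5, -2, 7])
schema_results = outcomes(
    apply_schema,
    frozenset({"id"}),
    [("add", "email"), ("drop", "email")],
)

print(len(counter_results))  # one final state regardless of order
print(len(schema_results))   # two different final states
```

In this model the counter always converges to the same value, while the schema ends up with or without the `email` column depending on which change is applied last. That is why metadata changes need a single agreed order, whereas data updates can be applied in whatever order they arrive.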
Consistency of Metadata
[Diagram: ScyllaDB cluster, replication_factor=2]
The Centralized Topology Coordinator
■ Runs alongside Raft leader
■ Highly available
■ Drives the progress of topology changes
■ Performs linearizable reads and writes of the topology
■ Request coordinators still use the local view on topology
■ No extra coordination when executing user requests
All the fun ways to deploy
… and what happens next
The new recovery procedure - 2025.2
1. Identify alive nodes
2. Choose recovery leader
3. Erase group0 state from system tables
4. Restart the nodes with recovery_leader option
Recap
● Adaptive voter selection algorithm makes ScyllaDB clusters even more resilient to failures
● Zero token nodes allow improving resilience at low cost
● A new disaster recovery procedure makes it possible to form a new majority automatically and supports tablets
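Why a zero token node can improve resilience cheaply can be sketched with plain majority arithmetic. The example below assumes that every listed node is a Raft voter and that the zero token node sits in a third location and votes without holding data. The layouts, the `majority` and `survives_dc_loss` helpers, and the DC names are illustrative assumptions, not taken from the slides.

```python
def majority(voters: int) -> int:
    # Smallest number of voters that forms a Raft majority.
    return voters // 2 + 1


def survives_dc_loss(voters_per_dc: dict, lost_dc: str) -> bool:
    # True if the remaining voters still form a majority
    # after every voter in lost_dc becomes unavailable.
    total = sum(voters_per_dc.values())
    remaining = total - voters_per_dc[lost_dc]
    return remaining >= majority(total)


# Hypothetical layouts: two symmetric DCs, with and without
# a single zero token node placed in a third location.
two_dcs = {"dc1": 3, "dc2": 3}
with_arbiter = {"dc1": 3, "dc2": 3, "dc3": 1}

for name, layout in [("two DCs", two_dcs),
                     ("two DCs + zero token node", with_arbiter)]:
    for dc in layout:
        print(f"{name}: lose {dc} -> majority kept: "
              f"{survives_dc_loss(layout, dc)}")
```

Under these assumptions the symmetric two-DC layout loses its majority whenever either DC fails, while adding one voter in a third location lets the cluster keep a majority after the loss of any single location.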
Recap
● Certain roll-out scenarios, such as:
○ Even number of nodes
○ Two data centers
○ Low number of racks
… are more error-prone than others and should be avoided.
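The warning about an even number of nodes follows from how a Raft majority is counted. As a minimal sketch, assuming every node is a voter, the hypothetical helper `tolerated_failures` shows how many voters can fail before the majority is lost:

```python
def tolerated_failures(voters: int) -> int:
    # A Raft group needs floor(n / 2) + 1 voters to make progress;
    # everything above that number can fail without losing it.
    needed = voters // 2 + 1
    return voters - needed


for n in range(1, 8):
    print(f"{n} voters tolerate {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 voters, or from 5 to 6, adds a node that can fail without raising the number of failures the group tolerates.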