Elasticity, Speed & Simplicity: Get the Most Out of New ScyllaDB Capabilities
ScyllaDB
Sep 13, 2024
About This Presentation
What users need to know about tablets and Raft
The latest releases of ScyllaDB feature major architectural shifts. Tablets build upon a multiyear project to re-architect our legacy ring architecture. And our metadata is now fully consistent, thanks to the assistance of the Raft consensus protocol. Together, these changes enable new levels of elasticity, speed, and operational simplicity.
But what does your team need to know at a purely pragmatic level?
Join us for a technical walkthrough of exactly what’s changed from the user/operator perspective. We’ll cover:
- The best use cases (and a few antipatterns) for the new capabilities
- Recommendations for configuring related settings
- Tips for working with the new drivers, monitoring metrics, and more
- What admin tasks you no longer need to worry about
- Differences across open source and Enterprise + ScyllaDB Cloud
Slide Content
Elasticity, Speed
& Simplicity
Felipe Cardeneti Mendes, Technical Director, ScyllaDB
Lucas Martins Guimarães, Customer Success Manager, ScyllaDB
Get the Most Out of New ScyllaDB Capabilities
Predictable
Performance at
Scale
Scale Predictably
ScyllaDB and Memcached
Trillions of messages
60% fewer nodes
5ms p99
“It’s a much more
efficient database —
we’re going from running
177 Cassandra nodes to
just 72 ScyllaDB nodes.”
2015 – Million
2016 – Billion
2019 – Trillion
Bo Ingram
How Discord Stores Billions and Trillions of Messages
+400 Gamechangers Leverage ScyllaDB
Seamless experiences
across content + devices
Connecting people all
around the globe
Corporate fleet
management
Real-time analytics
2,000,000 SKU e-commerce
management
Video recommendation
management
Threat intelligence service
using JanusGraph
Real time fraud detection
across 6M transactions/day
Uber scale, mission critical
chat & messaging app
Network security threat
detection
Power ~50M X1 DVRs with
billions of reqs/day
Precision healthcare via
Edison AI
Inventory hub for retail
operations
Property listings and
updates
Unified ML feature store
across the business
Cryptocurrency exchange
app
Geography-based
recommendations
Global operations: Avon,
Body Shop + more
Predictable performance for
on sale surges
GPS-based exercise
tracking
Serving dynamic live
streams at scale
Powering India's top
social media platform
Personalized
advertising to players
Distribution of game
assets in Unreal Engine
Presenters
Felipe Cardeneti Mendes, Technical Director
+Puppy Lover
+Database Performance at Scale Co-author
+Open Source Enthusiast
Lucas Martins Guimarães, Customer Success Manager
+Scrum Master
+Former Product Owner
+An Engineer after all!
Agenda
+New ScyllaDB capabilities
+What you need to know
+New Enterprise features
What's New in ScyllaDB?
Elasticity, Speed, and Operational Simplicity
Driver Support
■Drivers are tablet-aware
■But without reading the tablets table
■Driver contacts a random node/shard
■On miss, gets updated routing information for that tablet
■Ensures fast start-up even with 100,000 entries in the
tablets table
■Driver Support
■Java 3.x >= 3.11.5.2, Java 4.x >= 4.18.0.2
■Python >= 3.26.6
■GoCQL >= 1.13.0
■Rust >= 0.13.0
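The lazy routing behavior described above can be sketched as a toy simulation. This is only an illustration of the idea (contact a random node, learn the tablet's owner on a miss, reuse it afterwards), not the code of any real driver. `FakeCluster`, `TabletAwareDriver`, the token boundaries, and the round-robin ownership are all hypothetical.

```python
import bisect
import random

MIN_TOKEN = -2**63  # assumed lower bound of the token space


class FakeCluster:
    """Hypothetical stand-in for a cluster that owns the real tablet map."""

    def __init__(self, nodes, boundaries):
        self.nodes = list(nodes)
        # boundaries[i] is the inclusive last token of tablet i (sorted).
        self.boundaries = list(boundaries)
        # Assumption for the sketch: tablets are assigned round-robin.
        self.owners = [self.nodes[i % len(self.nodes)]
                       for i in range(len(self.boundaries))]

    def execute(self, node, token):
        """Serve a request; return routing info only when `node` is not the owner."""
        i = bisect.bisect_left(self.boundaries, token)
        owner = self.owners[i]
        if node == owner:
            return None
        first = self.boundaries[i - 1] + 1 if i > 0 else MIN_TOKEN
        return (first, self.boundaries[i], owner)


class TabletAwareDriver:
    """Toy driver that never reads the tablets table up front."""

    def __init__(self, cluster, seed=None):
        self.cluster = cluster
        self.rng = random.Random(seed)
        self.ranges = []  # sorted list of (last_token, first_token, owner)
        self.misses = 0

    def _lookup(self, token):
        # Find the first cached range whose last_token is >= token.
        i = bisect.bisect_left(self.ranges, (token,))
        if i < len(self.ranges):
            last, first, owner = self.ranges[i]
            if first <= token:
                return owner
        return None

    def execute(self, token):
        """Send a request and return the node that was contacted."""
        node = self._lookup(token)
        if node is None:
            node = self.rng.choice(self.cluster.nodes)  # no routing info yet
        info = self.cluster.execute(node, token)
        if info is not None:  # miss: cache the routing info for this tablet
            self.misses += 1
            first, last, owner = info
            bisect.insort(self.ranges, (last, first, owner))
        return node


if __name__ == "__main__":
    cluster = FakeCluster(["n1", "n2", "n3"], [99, 199, 299])
    driver = TabletAwareDriver(cluster)
    contacted = [driver.execute(150) for _ in range(5)]
    print(contacted, "misses:", driver.misses)
```

Because only the tablets that are actually touched get cached, start-up cost in this model does not depend on how many entries the tablets table holds.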
Schema Commit Log
■Motivation (#10333):
■Applying schema mutations involves flushing all schema
■Schema merge is very sensitive to fdatasync latency
■Flushing a single memtable involves many syncs
■Mandatory since #16254 (hard dependency on strongly consistent
topology changes with Raft – #16334)
■Introduces /var/lib/scylla/commitlog/schema/SchemaLog-*
ScyllaDB SSTable Tools
■Replaces and extends sstabledump & sstablemetadata to:
■Retrieve owner shard
■Dump SSTable components (Data, Index, Metadata,
CompressionInfo, Statistics, Summary)
■Validate and Scrub
■Decompress / Write
ScyllaDB SSTable
Docs – REST API Reference
API Reference
ScyllaDB Maintenance Mode
In this mode, the node is not reachable from the outside:
■Refuses all incoming RPC connections
■No communication with other nodes
■Cluster-wide operations are disabled for the node
■Can't read or write data from/to other replicas
■Listens on neither CQL nor Alternator (DynamoDB)
Use 'maintenance_mode: true' & 'maintenance_socket: workdir'
and connect via cqlsh with:
■cqlsh /var/lib/scylla/cql.m
Audience: People Familiar with ScyllaDB Internals
Mutation Fragments – CQL Extension
■Diagnostics tooling to debug performance or
correctness problems:
■SELECT * FROM
MUTATION_FRAGMENTS(keyspace.table);
■mutation_source can be {memtable, row-cache,
sstable}
■metadata – content of the mutation-fragment in
JSON
■value – value of the mutation-fragment,
represented as JSON
Reading Mutation Fragments
Dynamic Bloom Filters
Used to determine which SSTables do not contain a partition key,
speeding up reads:
■Held in memory
■Size depends on data
■Many smaller partitions require larger filters
■Trade-off: Allocated memory vs. filter efficiency
■Drop Bloom filters when their memory consumption exceeds the limit:
■#17747 – components_memory_reclaim_threshold: 0.2
■Reload when space is available again (e.g., due to compactions):
■#18186 – Reload reclaimed bloom filters when memory is available
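The memory versus efficiency trade-off for Bloom filters can be made concrete with the textbook sizing formula, m = -n ln(p) / (ln 2)^2 bits for n keys and false-positive rate p. The sketch below is generic Bloom filter math, not ScyllaDB's actual implementation, and the data volume and partition sizes are made-up values for illustration.

```python
import math


def bloom_filter_bits(num_keys, fp_rate):
    """Textbook optimal Bloom filter size in bits for a target false-positive rate."""
    return math.ceil(-num_keys * math.log(fp_rate) / (math.log(2) ** 2))


def bloom_filter_mib(num_keys, fp_rate):
    """Same size expressed in MiB."""
    return bloom_filter_bits(num_keys, fp_rate) / 8 / 2**20


if __name__ == "__main__":
    data_bytes = 2**40  # hypothetical: 1 TiB of data on a node
    for partition_bytes in (2**10, 2**20):  # 1 KiB vs 1 MiB partitions
        keys = data_bytes // partition_bytes
        for fp_rate in (0.1, 0.01):
            print(f"{keys:>13,} partitions, fp={fp_rate}: "
                  f"{bloom_filter_mib(keys, fp_rate):,.1f} MiB")
```

For the same volume of data, the filter grows linearly with the number of partition keys, which is why many small partitions need much larger filters than a few large ones.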
Materialized Views & Index Improvements
■Synchronous Indexes and Views (since 5.1+):
■ALTER MATERIALIZED VIEW ks.my_view WITH synchronous_updates = true;
■ALTER ks.idx WITH synchronous_updates = true;
■Secondary index of collection columns (since 5.2+):
■CREATE INDEX ON test(keys(somemap));
■CREATE INDEX ON test(values(somemap));
■CREATE INDEX ON test(entries(somemap));
■CREATE INDEX ON test(values(somelist));
■CREATE INDEX ON test(values(someset));
New ScyllaDB Enterprise
Features
Even Faster, Security Hardened, and Efficient
Performance Improvements
■Up to 1.5x Higher Throughput than Open Source
■Up to 35% Lower Latencies (mean and P99)
Network (RPC) Compression Improvements
■Improved network compression for RPC traffic
■Option of Zstd instead of LZ4
■Periodically trained dictionaries, instead of
per-message compression
■See Łukasz Paszkowski on Cheating the Cloud: 50%
Savings with Compression Dictionaries at P99 CONF
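Why a shared dictionary helps with small messages can be shown with a short sketch. It uses zlib's preset-dictionary support from the Python standard library as a stand-in, since Zstd and LZ4 are not in the standard library. It does not reflect how ScyllaDB trains or distributes its dictionaries, and the message shape and the naive "dictionary" built by concatenating samples are assumptions made for illustration.

```python
import zlib


def make_message(i):
    """Hypothetical small RPC-like payload; only the key changes between messages."""
    return ('{"keyspace": "ks", "table": "events", "op": "write", '
            f'"partition_key": {i}, "consistency": "QUORUM"}}').encode()


def compress(message, zdict=b""):
    """Compress one message, optionally priming the compressor with a dictionary."""
    comp = zlib.compressobj(zdict=zdict) if zdict else zlib.compressobj()
    return comp.compress(message) + comp.flush()


def decompress(blob, zdict=b""):
    """Inverse of compress(); the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=zdict) if zdict else zlib.decompressobj()
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    # Naive "training": concatenate earlier traffic samples into a dictionary.
    dictionary = b"".join(make_message(i) for i in range(20))
    message = make_message(12345)

    per_message = compress(message)
    with_dict = compress(message, dictionary)
    assert decompress(with_dict, dictionary) == message

    print("raw bytes:         ", len(message))
    print("per-message bytes: ", len(per_message))
    print("with dictionary:   ", len(with_dict))
```

A compressor that sees each small message in isolation has little redundancy to exploit, while a dictionary built from earlier traffic lets it encode the recurring structure as back-references. Both sides must hold the same dictionary to decompress.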
Upcoming: Tablet File-based streaming
■Similar to Cassandra Zero-copy Streaming
■But better ;-)
■Tablets are always owned by the replica
■Simply copy, done.
■Up to 75% faster than Open Source for Streaming
Security Hardening
■Amazon KMS Integration for Encryption at Rest
■Transparent Data Encryption (TDE)
■Encrypts all system resources (commitlog, hints, system
and user tables)
■Authentication & Authorization via TLS Certificates
■FIPS Tolerant
■Since 2023.1.1 for Linux deployments
■Since 2024.1 includes scylladb/scylla-enterprise-pro
container image
Poll
Which database do you currently use in
production?
Guilherme Nogueira
Keep Learning
scylladb.com/category/engineering
Register now at p99conf.io
Visit our blog for
more on ScyllaDB
engineering
Thank you
for joining us today.
@scylladb
scylladb/
slack.scylladb.com
company/scylladb/