Database Migration Strategies and Pitfalls by Patrick Bossman

ScyllaDB, 31 slides, Mar 10, 2025

About This Presentation

Jump start your migration to ScyllaDB! In this talk, we will discuss how to migrate from Cassandra, DynamoDB, as well as other sources. Review tooling available to assist with Scylla Migrations. Review approaches and considerations for online and offline migrations, and how to plan for a faster m...


Slide Content

A ScyllaDB Community
Database Migration
Strategies and Pitfalls
Patrick Bossman
Senior Customer Engineer

Patrick Bossman (he/him)
■30+ years experience helping enterprise customers
design and deploy highly performant, reliable,
available solutions
■Passion for building new things, solving problems
■2 x girl dad

Presentation Agenda
■Overview of types of migrations
■How to migrate from
  ■Cassandra (CQL)
  ■DynamoDB
  ■Others
■Tuning migrations

Migration Overview

Demystifying migration
We can do it! Migrations don’t have to be feared.
■We’re successfully migrating customers all the time
■Frequently with no/minimal application changes
■Rich tooling to assist with the process
■Experienced professionals to lend a hand

Migration options
■Online - using Scylla and DC-to-DC migration (way to go!)
  ■set replication factor (== dual writes)
  ■start rebuild (== historic data load)
  ■repair (== validation)
■Online / Live Migration (way to go)
  ■dual writes
  ■historic data forklifting
  ■validation
■Offline / Downtime Migration (but why?)
  ■historic data forklifting
  ■validation
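
Each option above ends in validation. As a minimal in-memory sketch (the function name `diff_tables` is hypothetical, plain dicts stand in for tables, and this is not how repair or any ScyllaDB tool is implemented), validation boils down to looking for rows that are missing, mismatched, or unexpected on the target:

```python
def diff_tables(source: dict, target: dict) -> dict:
    """Compare two key -> row mappings and report the differences."""
    missing = sorted(k for k in source if k not in target)
    mismatched = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    extra = sorted(k for k in target if k not in source)
    return {"missing": missing, "mismatched": mismatched, "extra": extra}


# Toy data: "b" was never copied, "c" differs, "d" exists only on the target.
source = {"a": 1, "b": 2, "c": 3}
target = {"a": 1, "c": 30, "d": 4}
report = diff_tables(source, target)
```

An empty report in all three lists is the signal that the two sides agree for the rows compared.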

Cassandra (CQL) Migration

Cassandra migration
■Show example of online migration
  ■Can preserve source writetime and TTL
  ■Bonus - backfill will not “overwrite” new writes
■Discuss tooling
  ■Zero Downtime Migration (ZDM) - application-transparent dual writes
  ■Scylla Migrator - Spark-based performant backfill and validation
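
The "backfill will not overwrite new writes" bonus follows from CQL's last-write-wins reconciliation: every cell carries a write timestamp, and the newer timestamp wins. A minimal in-memory sketch of that rule, assuming hypothetical names `Cell` and `apply_write` and ignoring timestamp ties for simplicity (this is an illustration, not ScyllaDB code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: str
    writetime: int  # write timestamp; larger means newer


def apply_write(table: dict, key: str, incoming: Cell) -> None:
    """Last-write-wins: keep the incoming cell only if it is newer."""
    current = table.get(key)
    if current is None or incoming.writetime > current.writetime:
        table[key] = incoming


target = {}
# A live dual write reaches the target first, with a recent timestamp.
apply_write(target, "user:1", Cell("new-email", writetime=2_000))
# The backfill arrives later but carries the ORIGINAL source writetime,
# so it loses against the newer live write.
apply_write(target, "user:1", Cell("old-email", writetime=1_000))
# A row that received no live write is simply filled in by the backfill.
apply_write(target, "user:2", Cell("historic", writetime=500))
```

If the backfill instead stamped rows with the time of the copy, the second write above would win and silently roll the row back, which is why preserving the source writetime matters.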

[Diagram: Live Migration timeline. Labels: Migrate Schema; Enable Dual Writes; Dual Writes (Application or ZDM); Write to Cassandra; Read from Cassandra.]

[Diagram: Live Migration timeline. Labels: Enable Dual Writes; Dual Writes (Application or ZDM); Write to Cassandra; Write to ScyllaDB; Read from Cassandra.]

[Diagram: Live Migration timeline. Labels: Forklifting Existing Data (Scylla Migrator); DBs in Sync; Dual Writes (Application or ZDM); Write to Cassandra; Write to ScyllaDB; Read from Cassandra.]

[Diagram: Live Migration timeline. Labels: Forklifting Existing Data; Validation; DBs in Sync; Dual Reads; Dual Writes (Application or ZDM); Write to Cassandra; Write to ScyllaDB; Read from Cassandra; Read from ScyllaDB.]

[Diagram: Live Migration timeline. Labels: Forklifting Existing Data; Validation; DBs in Sync; Dual Reads; Fade off Cassandra; Dual Writes (Application or ZDM); Write to Cassandra; Write to ScyllaDB; Read from Cassandra; Read from ScyllaDB.]

zdm-proxy - Zero Downtime Migration
An alternative to implementing dual writes within the application.
■Application connects to zdm-proxy
■Proxy dual-writes to source and target
■Optionally dual-reads from the target (to validate target performance)
[Diagram labels: zdm-proxy, SQL, NoSQL]
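
The dual-write idea can be sketched in a few lines. The toy below is an assumption-laden illustration rather than zdm-proxy's implementation: `DualWriter` is a hypothetical class, plain dicts stand in for the two databases, and the optional dual read only times the target lookup while the caller still gets the source's answer.

```python
import time


class DualWriter:
    """Sends every write to both stores; reads are answered by the source."""

    def __init__(self, source: dict, target: dict):
        self.source = source
        self.target = target
        self.target_read_latencies = []  # seconds, one entry per dual read

    def write(self, key, value):
        self.source[key] = value
        self.target[key] = value

    def read(self, key, dual_read=False):
        result = self.source.get(key)
        if dual_read:
            # Exercise the target too, but discard its answer.
            started = time.perf_counter()
            self.target.get(key)
            self.target_read_latencies.append(time.perf_counter() - started)
        return result


source, target = {"k0": "historic"}, {}
proxy = DualWriter(source, target)
proxy.write("k1", "v1")
value = proxy.read("k1", dual_read=True)
```

Note that `k0` never reaches the target here: dual writes only cover new traffic, so historic data still has to be forklifted separately.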

ScyllaDB Migrator
■Highly resilient to failures
■Accesses compatible databases using a native connector
■Scalable, high-performance parallelized reads and writes
■Supports transformations
■Open source!
[Diagram labels: SQL, NoSQL]

Other options
■Slower - but may be OK for smaller datasets
■DSBulk
  ■Bulk unload/load tool
  ■Can also pipe unload output straight into load
■CQLSH COPY TO/FROM
  ■Copies to/from CSV

DynamoDB Migration

DynamoDB migration
■ScyllaDB Alternator is API-compatible with DynamoDB
■Frequently, no application changes required
■Show example of online migration using Scylla Migrator
■Alternatives

[Diagram: Live Migration timeline. Labels: Migrate Schema; Enable Streams; Updates from DynamoDB Stream; Write to DynamoDB; Read from DynamoDB.]

[Diagram: Live Migration timeline. Labels: Migrate Schema; Forklifting Existing Data; Apply updates from DynamoDB Streams; DBs in Sync; Updates from DynamoDB Stream; Write to DynamoDB; Read from DynamoDB.]

[Diagram: Live Migration timeline. Labels: Forklifting Existing Data; Apply updates from DynamoDB Streams; Validation; DBs in Sync; Dual Reads; Fade off DynamoDB; Updates from DynamoDB Stream; Write to DynamoDB; Write to ScyllaDB; Read from DynamoDB; Read from ScyllaDB.]

ScyllaDB Migrator
■Highly resilient to failures
■Accesses compatible databases using a native connector
■High-performance parallelized reads and writes
■Supports transformations
■Open source!
[Diagram labels: SQL, NoSQL]

Alternatives
■Existing application fed from a queue (Kafka)
  ■Load a backup or run Scylla Migrator from the source
  ■Replay the queue from an offset into the target database
■Scylla Migrator now supports loading a DynamoDB S3 backup (Parquet) into ScyllaDB
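
The queue-based alternative can be pictured as "restore a snapshot, then replay everything after it". A minimal sketch under stated assumptions: `restore_and_replay` is a hypothetical helper, a Python list stands in for the Kafka topic, and events are simple idempotent puts and deletes.

```python
def restore_and_replay(backup: dict, queue: list, start_offset: int) -> dict:
    """Load a backup, then re-apply queued events from start_offset onward."""
    target = dict(backup)
    for op, key, value in queue[start_offset:]:
        if op == "put":
            target[key] = value
        elif op == "delete":
            target.pop(key, None)
    return target


queue = [
    ("put", "a", 1),        # offset 0, already reflected in the backup
    ("put", "b", 2),        # offset 1, already reflected in the backup
    ("put", "a", 10),       # offset 2, written after the backup was taken
    ("delete", "b", None),  # offset 3
]
backup = {"a": 1, "b": 2}  # snapshot taken when the consumer was at offset 2
target = restore_and_replay(backup, queue, start_offset=2)
replayed_from_start = restore_and_replay(backup, queue, start_offset=0)
```

Because these events are idempotent, replaying from an offset earlier than strictly necessary gives the same result, so it is safer to err on the early side when the exact snapshot offset is uncertain.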

Migration Tuning

Migration tuning considerations
■Spark and ScyllaDB are scalable!
  ■Scylla Migrator breaks work into segments
  ■Allocate larger/more workers to process faster
  ■Consider over-provisioning Spark/ScyllaDB during the migration phase
■Configuration (during migration)
  ■DynamoDB - temporarily disable TTL during backfill
  ■Compactions: increase min_threshold, set compaction_static_shares: 100
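
To illustrate what "breaking work into segments" can look like for a CQL source, the sketch below splits the signed 64-bit token range used by the Murmur3 partitioner into contiguous slices that independent workers could scan in parallel. It is an illustration of the general idea only: `split_token_range` is a hypothetical helper, and Scylla Migrator's actual splitting is driven by its own configuration and Spark connector.

```python
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_range(segments: int) -> list[tuple[int, int]]:
    """Split [MIN_TOKEN, MAX_TOKEN] into contiguous inclusive ranges."""
    if segments < 1:
        raise ValueError("segments must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN + 1  # 2**64 tokens in all
    bounds = [MIN_TOKEN + (total * i) // segments for i in range(segments + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(segments)]


ranges = split_token_range(4)
```

More segments mean smaller units of work, which makes it easier to keep a larger pool of workers busy and to retry a failed slice without redoing the whole scan.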

Validate migration throughput
■Test migration throughput for a few minutes, then terminate
■Observe impact on source
■Verify throughput meets migration duration requirements
■Practice validation
■Truncate target, then plan/execute the real migration
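
Checking that throughput "meets migration duration requirements" is simple arithmetic once the short test run has produced a rows-per-second figure. A small sketch with hypothetical helper names and made-up numbers (7.2 billion rows, 100,000 rows/s measured, a 10-hour window):

```python
def estimated_hours(total_rows: int, rows_per_second: float) -> float:
    """Projected backfill duration at the measured throughput."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return total_rows / rows_per_second / 3600


def required_rows_per_second(total_rows: int, window_hours: float) -> float:
    """Throughput needed to finish inside the allowed window."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    return total_rows / (window_hours * 3600)


# Illustrative numbers only, not measurements from the talk.
hours = estimated_hours(total_rows=7_200_000_000, rows_per_second=100_000)
needed = required_rows_per_second(total_rows=7_200_000_000, window_hours=10)
```

With these example figures the projection is 20 hours against a 10-hour window, so the run would need roughly double the throughput, for instance by adding workers or over-provisioning as described in the tuning slide.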

Other Migrations

Other migrations
■SQL, MongoDB, …
■Data modeling and application changes likely required
■ETL-style migration with application considerations required
■Follow the dual-write and backfill principles discussed earlier

Wrap it up

Summary
■ScyllaDB scales well horizontally and vertically

■Many migrations successful with no application changes!
■Rich tooling available to perform migration

■You can do it!
■Don’t be afraid to reach out and ask questions!

Stay in Touch
Patrick Bossman
[email protected]
linkedin.com/in/bossman