MongoDB to ScyllaDB: Technical Comparison and the Path to Success

ScyllaDB 460 views 31 slides Jun 20, 2024

About This Presentation

What can you expect when migrating from MongoDB to ScyllaDB? This session provides a jumpstart based on what we’ve learned from working with your peers across hundreds of use cases. Discover how ScyllaDB’s architecture, capabilities, and performance compare to MongoDB’s. Then, hear about your...


Slide Content

MongoDB vs ScyllaDB: Path to Success. Felipe Mendes, Solution Architect at ScyllaDB. Pete Aven, Solution Architect at ScyllaDB.

Meet The Team. Felipe Cardeneti Mendes: Published Author, ScyllaDB Committer, IT Specialist & Solution Architect. Pete Aven: Data Engineer, Application Development, ScyllaDB Solution Architect.

Soccer Football Goal?

Survival Rule #1

Where Do You Wanna Be? Lower Node Count: efficient hardware usage results in direct savings. Predictable, Low Latencies: consistent single-digit millisecond p99 latencies. Autonomous Self-Tuning: self-optimizing, smaller footprint, easy to use. Always-On Architecture: active-active deployments for mission-critical workloads.

Architecture Differences

Data Model & Query Language (MongoDB): MQL (Mongo Query Language); documents stored in BSON format; schemaless; favors query flexibility over performance.

Data Model & Query Language (ScyllaDB): CQL or DynamoDB API; wide-column representation; schema-oriented; favors performance over query flexibility.

MongoDB – Replica Sets (diagram: REPLICA SET, 3x replication; one node labeled Reads, two nodes labeled Reads (optional)): Primary-Secondary; impact on write throughput; election on failure; scaling limitations; up to 50 members; arbiters; non-voting members; limited storage per replica*; affects read scaling. *MongoDB Atlas

MongoDB – Sharded Cluster (diagram: SHARDED CLUSTER, 3 shards, 3x replication): a shard is a replica set; adds overhead & complexity; further availability concerns; underutilized capacity; sharding performed manually; hashed vs ranged; still requires a query router.
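
To make the "hashed vs ranged" choice concrete, here is a toy sketch of the two routing strategies. It is not MongoDB's implementation: the shard names, the split points, and the use of MD5 with a modulo are all assumptions chosen only to show why ranged keys keep neighbours together while hashed keys scatter them.

```python
import bisect
import hashlib

# Hypothetical shard names, used only for this illustration.
SHARDS = ["shard0", "shard1", "shard2"]


def ranged_shard(key: str, split_points=("h", "p")) -> str:
    """Route by key range: keys below 'h' go to shard0, below 'p' to shard1, the rest to shard2."""
    return SHARDS[bisect.bisect_right(split_points, key)]


def hashed_shard(key: str) -> str:
    """Route by a hash of the key (MD5 is a stand-in hash, not what any database necessarily uses)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


for group_id in ("comics", "matrix", "simpsons"):
    print(group_id, ranged_shard(group_id), hashed_shard(group_id))
```

Ranged routing favors range scans but can concentrate load when keys are written in order; hashed routing spreads writes at the cost of scattering range queries across shards.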

ScyllaDB – Sharding & Replication: simplified architecture; equal nodes; sharded to the core; maximizes infrastructure power; unlimited scale; always-on; no limits; fully autonomous.

Independent Benchmark Results: "ScyllaDB outperforms MongoDB in 132 of 133 measurements." – Daniel Seybold, benchANT CTO

"MongoDB is Web Scale" " The scalability results demonstrate that ScyllaDB achieves up to near linear scalability, while MongoDB shows less efficient horizontal scalability. " – Daniel Seybold , benchANT CTO Read the full report

Migration Options

It’s a Simple Matter of Programming(™)

Connectors Connecting ScyllaDB Shard-Aware Drivers

Collections vs Tables

Mental Exercise
MemeGroup
[
  { id: simpsons, title: 'The Simpsons', author: uid1, tags: ['homer', 'liza'], photos: [pid1, pid2], views: 200 },
  { id: comics, title: 'Comics', author: uid2, tags: ['drawing', 'cartoon'], photos: [pid1, pid2, pid3], views: 300 },
  { id: matrix, title: 'Matrix', author: uid1, tags: ['neo', 'morpheus', 'pill'], photos: [pid4, pid5], views: 50 }
]
Memes
[
  { id: homer_grass, title: 'Homer Hiding', location: p1, tags: ['hiding', 'grass'], likes: 30, views: 200 },
  { id: morpheus_pill, title: 'Choose Destiny', location: p2, tags: ['pills'], likes: 99, views: 300 },
  { id: trollface, title: 'Troll face', location: p3, tags: ['troll', 'face'], likes: 9999999, views: NaN }
]
Queries:
Display all Memes posted by a given Author, sorted by time
Retrieve all Groups of Memes matching a particular Tag

Mental Exercise
Queries:
Display all Memes posted by a given Author, sorted by time
Retrieve all Groups of Memes matching a particular Tag
CREATE TABLE MemeByAuthor (
  author uuid,
  meme_id uuid,
  post_time timeuuid,
  title text,
  tags frozen<set<text>>,
  path text,
  PRIMARY KEY (author, post_time)
);
CREATE TABLE MemeGroup (
  group_id uuid,
  author_id uuid,
  title text,
  tags set<text>,
  memes uuid,
  PRIMARY KEY (group_id, memes)
);
CREATE INDEX ON MemeGroup (tags);
Adapted from StackOverflow
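
One way to picture the move from a document to the MemeGroup table is to explode each group document into one row per meme reference, since `memes` is the clustering column. The sketch below assumes that mapping (the slide does not spell it out), uses the slide's placeholder string ids instead of real UUIDs, and the helper name `group_to_rows` is hypothetical.

```python
from typing import Any, Dict, List


def group_to_rows(group: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Explode one MemeGroup document into one row per entry of its photos array."""
    shared = {
        "group_id": group["id"],
        "author_id": group["author"],
        "title": group["title"],
        "tags": set(group.get("tags", [])),
    }
    # Group-level fields are repeated on every row: denormalization in exchange for
    # reads that hit a single partition. 'views' is dropped because the table on the
    # slide has no such column.
    return [{**shared, "memes": meme_id} for meme_id in group.get("photos", [])]


doc = {
    "id": "simpsons", "title": "The Simpsons", "author": "uid1",
    "tags": ["homer", "liza"], "photos": ["pid1", "pid2"], "views": 200,
}
rows = group_to_rows(doc)
for row in rows:
    print(row["group_id"], row["memes"], sorted(row["tags"]))
```

Each resulting dictionary corresponds to one row under the `(group_id, memes)` primary key, so a two-photo group becomes two rows in the same partition.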

ScyllaDB JSON Support: available in SELECT and INSERT statements, for example: INSERT INTO mytable JSON '{ "\"_id\"": 1, "title": "Reservoir Dogs", "director": ...}'
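
The escaped `"\"_id\""` key on the slide exists because `_id` is not a plain lowercase CQL identifier and so has to be double-quoted inside the JSON key. A minimal sketch of building such a payload from a MongoDB-style document follows; the helper names `json_key` and `insert_json` are hypothetical, and passing the payload through a `?` bind marker of a prepared statement is an assumption about how the statement would be executed.

```python
import json
import re
from typing import Any, Dict, Tuple

# Column names that need no quoting: lowercase letters, digits, underscores, starting with a letter.
_PLAIN = re.compile(r"^[a-z][a-z0-9_]*$")


def json_key(column: str) -> str:
    """Wrap a column name in double quotes when it is not a plain lowercase identifier."""
    if _PLAIN.match(column):
        return column
    return '"' + column.replace('"', '""') + '"'


def insert_json(table: str, doc: Dict[str, Any]) -> Tuple[str, str]:
    """Return an INSERT ... JSON statement with a bind marker, plus its JSON payload."""
    payload = json.dumps({json_key(k): v for k, v in doc.items()})
    return f"INSERT INTO {table} JSON ?", payload


statement, payload = insert_json("mytable", {"_id": 1, "title": "Reservoir Dogs"})
print(statement)
print(payload)
```

Only top-level keys are handled here; nested documents would need a mapping to UDTs or collections, which this sketch leaves out.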

Leave JSON “As-Is” ID #1 document

Hybrid Approach ID #1 Document Document ID #1

Before Migrating: Understand the differences between a document and a wide-column model. Denormalize and design a query-driven approach. Optimize your model for fast retrieval & ingestion. Take the opportunity to re-evaluate your access patterns! Get to know the available data types, indexes, collections and UDTs in ScyllaDB. Implement highly concurrent asynchronous code in your application. BLAST ScyllaDB – It WILL handle it!
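
The advice about highly concurrent asynchronous code can be sketched with a bounded number of in-flight requests. The example below uses only the standard library: `fake_write` is a hypothetical stand-in for a real asynchronous driver call, and the concurrency limit is an arbitrary value that would need tuning against an actual cluster.

```python
import asyncio
from typing import Any, Dict, Iterable, List


async def fake_write(row: Dict[str, Any]) -> Any:
    """Hypothetical stand-in for an asynchronous database write."""
    await asyncio.sleep(0.001)  # simulated network round trip
    return row["id"]


async def write_all(rows: Iterable[Dict[str, Any]], max_in_flight: int = 64) -> List[Any]:
    """Issue all writes concurrently, keeping at most max_in_flight requests outstanding."""
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(row: Dict[str, Any]) -> Any:
        async with gate:
            return await fake_write(row)

    # gather returns results in the same order as the input rows.
    return await asyncio.gather(*(guarded(row) for row in rows))


results = asyncio.run(write_all([{"id": i} for i in range(500)], max_in_flight=128))
print(len(results))
```

The semaphore keeps the client from queuing unbounded work while still keeping many requests in flight at once.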

Migration Walkthrough (diagram labels): Application: Writes & Reads. ETL: forklift historical data. Source Connector: MongoDB Change Streams. ScyllaDB Sink Connector.
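
The change-stream leg of the walkthrough amounts to translating each MongoDB change event into a write on the target table. The sketch below is not the connector's code: the target table name, its `id` key column, and the one-to-one column mapping are assumptions. The event fields `operationType`, `documentKey` and `fullDocument` follow MongoDB's change event format; update events carry `fullDocument` only when the stream is opened with the `updateLookup` option.

```python
from typing import Any, Dict, Optional, Tuple

Statement = Tuple[str, Tuple[Any, ...]]


def event_to_cql(event: Dict[str, Any], table: str = "memes") -> Optional[Statement]:
    """Map one change event to a CQL statement and its bind values, or None to skip it."""
    op = event["operationType"]
    if op not in ("insert", "replace", "update", "delete"):
        return None  # e.g. drop or invalidate events are ignored in this sketch
    key = event["documentKey"]["_id"]
    if op == "delete":
        return f"DELETE FROM {table} WHERE id = ?", (key,)
    doc = event.get("fullDocument")
    if doc is None:
        return None  # update without updateLookup: nothing to write
    # A CQL INSERT is an upsert, so replaying the same event is harmless.
    columns = {"id": key, **{k: v for k, v in doc.items() if k != "_id"}}
    names = ", ".join(columns)
    marks = ", ".join("?" for _ in columns)
    return f"INSERT INTO {table} ({names}) VALUES ({marks})", tuple(columns.values())


insert_event = {
    "operationType": "insert",
    "documentKey": {"_id": "homer_grass"},
    "fullDocument": {"_id": "homer_grass", "title": "Homer Hiding", "likes": 30},
}
print(event_to_cql(insert_event))
```

Because inserts, replaces and updates all become upserts, the same function can serve both the historical forklift and the live tail of the stream.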

Success Stories

Trillions of messages. 60% fewer nodes. 5ms p99. "We knew we were not going to use MongoDB sharding because it is complicated to use and not known for stability" – Stanislav Vishnevskiy, How Discord Stores Billions and Trillions of Messages. Timeline: 2015 – Million; 2016 – Billion; 2019 – Trillion.

Unleashing the Power of Data with Scylla "We were looking for something really specific. A highly scalable, and high performance NoSQL database. The answer was simple, ScyllaDB is a better fit for our use case." João Pedro Voltani

MongoDB Versus ScyllaDB, in Production at Numberly. "We are happy users of both, because we select the use cases based on their strengths." "We use ScyllaDB for real-time and latency sensitive pipelines, mixed batch and real-time workloads, and web backends in GraphQL (imposing a schema). We see a trend into ScyllaDB today." "MongoDB is very strong in web backends and real time queries over unpredictable behavioral data. We need to query and operate on top of this data. This is where MongoDB flexibility is good for us." – Alexys Jacob. Timeline (2008 to 2019): MongoDB OSS (AGPL), MongoDB 1.4 stable, Numberly adoption, MongoDB 2.0, MongoDB 3.0, WiredTiger (3.2), MongoDB 3.6, MongoDB 4.0 (SSPL), MongoDB 4.2. Cassandra OSS (Apache), Cassandra 1.0, Cassandra 2.0, Cassandra 3.0. ScyllaDB OSS (AGPL), ScyllaDB 1.0, Numberly adoption, ScyllaDB 2.0, ScyllaDB 3.0.

Takeaways: ScyllaDB is the database for Performance. ScyllaDB and MongoDB are different databases. Knowing the differences is key for a successful transition. Identify the optimal data model. Follow a query-driven & performance-oriented design approach. Both have a large driver and connectors ecosystem. The path for migration is straightforward, and we're here to help! Customers met their Performance challenges with ScyllaDB.

Stay in Touch. Felipe Cardeneti Mendes: @cardeneti82118, @fee-mendes, find me on LinkedIn! Pete Aven: @peteaven, @wpaven, find me on LinkedIn!
