Lightweight Transactions in Scylla versus Apache Cassandra
42 slides
Jan 24, 2020
About This Presentation
Lightweight transactions (LWT) have been a long-anticipated feature for Scylla. Join Scylla VP of Product Tzach Livyatan and Software Team Lead Konstantin Osipov for a webinar introducing the Scylla implementation of LWT, a feature that brings strong consistency to our NoSQL database.
In this webinar we will cover the tradeoffs typically made between database consistency, availability and latency; how to use lightweight transactions in Scylla; the similarities and differences between Scylla’s Paxos implementation and Cassandra’s, and what it all means to users.
From attending this live webinar you’ll learn…
The advantages and disadvantages of various consistency options
Scylla lightweight transactions: syntax and semantics
A design and implementation overview, changes in Paxos
Performance comparisons with Apache Cassandra
Scylla’s future roadmap for LWT beyond Paxos
Slide Content
Lightweight Transactions in Scylla vs Apache Cassandra
Tzach Livyatan, ScyllaDB VP of Product
Konstantin Osipov, Software Team Lead

About ScyllaDB
The Real-Time Big Data Database
Drop-in replacement for Apache Cassandra and Amazon DynamoDB
10X the performance & low tail latency
Open Source, Enterprise and Cloud options
Founded by the creators of KVM hypervisor
HQs: Palo Alto, CA, USA; Herzelia, Israel; Warsaw, Poland

Presenters
Konstantin Osipov, Software Team Lead: Kostja is a well-known expert in the DBMS world who has spent most of his career developing open-source DBMSs, including Tarantool and MySQL. At ScyllaDB his focus is transaction support and synchronous replication.
Tzach Livyatan, VP of Product: Tzach has a 15-year career in development, system engineering and product management. He has worked in the Telecom domain, focusing on carrier-grade systems, signalling, policy and charging applications for Oracle and others.

Agenda
About ScyllaDB
Eventual consistency in Scylla
LWT at a glance
LWT benchmarks
Under the hood: Scylla optimizations
  Only 3 round trips in most cases
  LWT are always durable with commit log
  Reduced contention
Roadmap
Q&A

Eventual Consistency in Scylla
NoSQL - By Availability vs Consistency
Pick Two: Availability, Partition Tolerance, Consistency

Data Replication
Replication Factor (RF): Number of nodes where data is replicated
Done automatically
Per Data Center (DC)
Keyspace level setting

Consistency Level (CL)
Determines the number of replica node responses required for a query to be deemed successful
CL of ONE: Wait for a response from one replica node
CL of ALL: Wait for a response from all replica nodes
CL of QUORUM: Wait for floor(#replicas/2)+1 responses
CL of LOCAL_QUORUM: Wait for floor(#dc_replicas/2)+1 responses

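The consistency-level arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration of the formulas above; the function names and the `rf_per_dc` dictionary are made up for the example and are not part of any Scylla API.

```python
def quorum(replicas: int) -> int:
    """Majority of replicas: floor(replicas / 2) + 1."""
    if replicas < 1:
        raise ValueError("replicas must be positive")
    return replicas // 2 + 1


def required_responses(cl: str, rf_per_dc: dict, local_dc: str) -> int:
    """Replica responses needed for one query at the given consistency level.

    rf_per_dc maps a data center name to its replication factor
    (hypothetical input shape, chosen for this example).
    """
    total = sum(rf_per_dc.values())
    if cl == "ONE":
        return 1
    if cl == "ALL":
        return total
    if cl == "QUORUM":
        return quorum(total)
    if cl == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    raise ValueError(f"unsupported consistency level: {cl}")


rf = {"dc1": 3, "dc2": 3}
for level in ("ONE", "LOCAL_QUORUM", "QUORUM", "ALL"):
    print(level, required_responses(level, rf, "dc1"))
```

With RF 3 in each of two data centers, LOCAL_QUORUM needs 2 responses while QUORUM needs 4, which is why LOCAL_QUORUM is the cheaper choice when cross-DC latency matters.
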
Consistency Level - Write
CL of ONE: Wait for at least one of the replicas
[Diagram: Client, R1, R2, R3]

Consistency Level - Write
CL of QUORUM: Wait for at least two of the replicas
[Diagram: Client, R1, R2, R3]

Consistency Level - Read
CL of QUORUM: Wait for at least two of the replicas
[Diagram: Client, R1, R2, R3]

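Pairing a QUORUM write with a QUORUM read matters because, with three replicas, any two replicas that acknowledged the write must share at least one member with any two replicas that answer the read. The brute-force check below is a small sketch of that argument; the helper name is invented for the example.

```python
from itertools import combinations


def quorums_always_overlap(replicas, write_acks, read_acks):
    """True if every possible write set shares a replica with every read set."""
    for written in combinations(replicas, write_acks):
        for read in combinations(replicas, read_acks):
            if not set(written) & set(read):
                return False
    return True


replicas = ["R1", "R2", "R3"]
print(quorums_always_overlap(replicas, 2, 2))  # QUORUM write + QUORUM read
print(quorums_always_overlap(replicas, 1, 1))  # ONE write + ONE read
```

The first call prints True and the second prints False: with CL ONE on both sides a read can land on a replica that has not yet seen the write.
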
Eventual Consistency - Anti Entropy
Hinted Handoff ⇒ Node Revive
Read Repair ⇒ Inconsistent read
Repair ⇒ Recurrent operation (weekly)
Why Repair? The resurrection problem

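The resurrection problem can be illustrated with a toy model, assuming last-write-wins reconciliation and deletes stored as tombstones that are eventually garbage-collected. Everything here (plain dictionaries as replicas, the helper names) is a simplification for illustration and not how Scylla stores data.

```python
TOMBSTONE = object()  # marker for a deleted cell


def reconcile(versions):
    """Last-write-wins: newest (timestamp, value) among replicas that have the cell."""
    present = [v for v in versions if v is not None]
    return max(present, key=lambda v: v[0]) if present else None


def read(replicas, key):
    winner = reconcile([r.get(key) for r in replicas])
    if winner is None or winner[1] is TOMBSTONE:
        return None
    return winner[1]


def repair(replicas, key):
    """Copy the newest version of the cell to every replica."""
    winner = reconcile([r.get(key) for r in replicas])
    if winner is not None:
        for r in replicas:
            r[key] = winner


def gc_tombstones(replicas, key):
    """Drop expired tombstones (expiry is assumed to have passed)."""
    for r in replicas:
        if key in r and r[key][1] is TOMBSTONE:
            del r[key]


def scenario(repair_before_gc):
    r1, r2, r3 = {}, {}, {}
    replicas = [r1, r2, r3]
    for r in replicas:
        r["k"] = (1, "v1")            # write reaches all replicas
    for r in (r1, r2):
        r["k"] = (2, TOMBSTONE)       # delete misses R3, which was down
    if repair_before_gc:
        repair(replicas, "k")
    gc_tombstones(replicas, "k")
    return read(replicas, "k")


print(scenario(repair_before_gc=True))
print(scenario(repair_before_gc=False))
```

With repair before garbage collection the key stays deleted (None). Without it, the stale copy on R3 outlives the tombstones and the deleted value "v1" comes back.
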
What about Conditional Updates?
LWT at a Glance
Try Scylla LWT Now!
Download Scylla 3.2, Launch EC2 AMI, or use Docker
https://www.scylladb.com/download/open-source/
https://hub.docker.com/r/scylladb/scylla/
Use --experimental and follow the docs:
https://docs.scylladb.com/operating-scylla/scylla-yaml/
CQL Reference
https://docs.scylladb.com/getting-started/dml/#if-condition

CQL Avoids Slow Reads
> UPDATE employees SET join_date = '2018-05-19'
  WHERE firstname = 'John' AND lastname = 'Doe';
> SELECT * FROM employees ...;

 firstname | lastname | join_date
-----------+----------+------------
 John      | Doe      | 2018-05-19

CQL Conditional Statement
> UPDATE employees SET join_date = '2018-05-19'
  WHERE firstname = 'John' AND lastname = 'Doe'
  IF join_date != null;

 [applied]
-----------
 False

What Statements Can Be Conditional?
Any INSERT, UPDATE or DELETE can have an IF clause:
> UPDATE employees SET join_date = … IF EXISTS;
> INSERT INTO bookings (id, item, client, quantity) VALUES (…) IF NOT EXISTS;
> UPDATE inventory SET state = 'Used' WHERE itemid = ?
  IF state = 'Unused' AND check = 'Passed';
> DELETE FROM tasks WHERE project_id = ? AND task_id = ?
  IF task['state'] IN ('Complete', 'Abandoned');

Conditional Batches
> BEGIN BATCH
>   UPDATE tasks SET n_abandoned = 0 WHERE project_id = 1
>   IF n_abandoned >
>   DELETE FROM tasks WHERE project_id = 1
>   AND state = 'Abandoned'
> APPLY BATCH;

 [applied] | project_id | state     | task_id | n_abandoned
-----------+------------+-----------+---------+-------------
 True      | 1          | Abandoned | 693     | 2

Scylla result metadata vs Cassandra
Scylla:
> INSERT INTO lwt (k,v,x) VALUES (1, 2, 3) IF NOT EXISTS;

 [applied] | k    | x    | v
-----------+------+------+------
 True      | null | null | null

Cassandra:
> INSERT INTO lwt (k,v,x) VALUES (1, 2, 3) IF NOT EXISTS;

 [applied]
-----------
 True

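The Scylla result shape shown on this slide, an [applied] flag followed by the previous values of the row, can be modeled with an in-memory dictionary. This is a sketch of the semantics only: the function names are invented, the table is a plain dict keyed by primary key, and the assumption that a rejected statement returns the existing row is labeled in the code.

```python
def insert_if_not_exists(table, key, row):
    """INSERT ... IF NOT EXISTS, returning [applied] plus the previous row."""
    old = table.get(key)
    if old is None:
        table[key] = dict(row)
        # No previous row: every column of the old row is null, as on the slide.
        return {"[applied]": True, **{col: None for col in row}}
    # Assumption: when not applied, the existing row is returned.
    return {"[applied]": False, **old}


def update_if(table, key, changes, condition):
    """UPDATE ... IF <condition>; condition is a predicate over the old row."""
    old = table.get(key)
    applied = condition(old or {})  # a missing row is seen as all-null columns
    if applied:
        table[key] = {**(old or {}), **changes}
    return {"[applied]": applied, **(old or {})}


lwt = {}
print(insert_if_not_exists(lwt, 1, {"k": 1, "v": 2, "x": 3}))
print(insert_if_not_exists(lwt, 1, {"k": 1, "v": 9, "x": 9}))
print(update_if(lwt, 1, {"v": 5}, lambda row: row.get("v") == 2))
```

Returning the old row in every case lets a client see why a condition failed without issuing a second read.
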
Steps of conditional statement execution
Assume leadership for this key
[Repair previous round if necessary]
Search the old row and then check IF conditions
Build mutation with max known timestamp and send it to peers
Commit transaction on peers

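The steps on this slide can be sketched as a toy, single-process Paxos-style compare-and-set. It is a simplified model under stated assumptions, not Scylla's implementation: all replicas are reachable, the old row is read from one replica instead of a quorum, timestamps are omitted, and the `Replica`, `Contention` and `cas` names are invented for the example.

```python
class Contention(Exception):
    """Raised when another coordinator holds a newer ballot; the caller retries."""


class Replica:
    def __init__(self):
        self.promised = 0      # highest ballot this replica has promised
        self.accepted = None   # (ballot, mutation) accepted but not committed
        self.row = None        # committed row

    def prepare(self, ballot):
        if ballot <= self.promised:
            return False, None
        self.promised = ballot
        return True, self.accepted

    def accept(self, ballot, mutation):
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted = (ballot, mutation)
        return True

    def commit(self, ballot, mutation):
        self.row = {**(self.row or {}), **mutation}
        if self.accepted and self.accepted[0] <= ballot:
            self.accepted = None


def cas(replicas, ballot, condition, mutation):
    need = len(replicas) // 2 + 1

    def replicate(m):
        # Send the mutation to peers, then commit once a majority accepted it.
        if sum(r.accept(ballot, m) for r in replicas) < need:
            raise Contention()
        for r in replicas:
            r.commit(ballot, m)

    # Assume leadership for this key.
    replies = [r.prepare(ballot) for r in replicas]
    granted = [accepted for ok, accepted in replies if ok]
    if len(granted) < need:
        raise Contention()

    # Repair the previous round if one was left unfinished.
    unfinished = [a for a in granted if a is not None]
    if unfinished:
        replicate(max(unfinished, key=lambda a: a[0])[1])

    # Search the old row and check the IF condition (simplified: one replica).
    old = replicas[0].row or {}
    if not condition(old):
        return False

    # Build the mutation, send it to peers, and commit.
    replicate(mutation)
    return True


nodes = [Replica() for _ in range(3)]
print(cas(nodes, 1, lambda row: row.get("state") is None, {"state": "Unused"}))
print(cas(nodes, 2, lambda row: row.get("state") == "Unused", {"state": "Used"}))
print(cas(nodes, 3, lambda row: row.get("state") == "Unused", {"state": "Used"}))
```

The third call returns False because the row is already 'Used'. A coordinator that reuses a stale ballot gets `Contention` and would retry with a higher one.
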
Setup: Single Region
Amazon EC2, availability zone eu-west-1
3 nodes i3.8xlarge: 32 vcores, 244GB RAM, 4 x 1.9 TB NVMe SSD
Replication strategy: Simple
Replication factor: 3
5k row size: blogposts use case
cassandra-stress user profile=cqlstress-lwt-example.yaml, client t3.8xlarge
1-182 connections

Uncontended Write - Bandwidth
Uncontended Write - Latency
Under the Hood
Introducing Paxos
[Diagram: replicas R1, R2, R3. Phase labels: "Can I propose a value?", "Accept new value", "Learn decision", "Decision made". Reply labels: "ok!", "ok!", "ok!", "ack", "ok!"]

Introducing Paxos
[Diagram: replicas R1, R2, R3. Phase labels: "Can I propose a value?", "Check condition", "Accept new value", "Learn decision", "Decision made". Reply labels: "ok!", "ok!", "row", "checksum", "ok!", "ack", "ok!"]

Scylla Optimizations
3 round trips in most cases
LWT are always durable with commit log
Reduced contention

Improved Speeds: Fewer Paxos Rounds
[Diagram: replicas R1, R2, R3. Phase labels: "Can I propose a value?", "Check condition", "Accept new value", "Learn decision", "Decision made". Reply labels: "Ok, checksum", "Ok, row", "ok!", "ack", "ok!"]

Improved Durability: Always Durable
Existing commitlog_sync modes: batch or periodic
LWT statements always sync the commit log, in any mode
Always durable

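The durability rule on this slide can be sketched as follows, assuming the usual meaning of the two modes: in batch mode a write is synced to disk before it is acknowledged, while in periodic mode the sync happens later on a timer. The function below is a hypothetical illustration, not Scylla code; it only shows that an LWT write forces the sync regardless of mode.

```python
import os
import tempfile


def append_to_commit_log(log, record: bytes, mode: str, is_lwt: bool) -> bool:
    """Append a record; return True if it was synced to disk before returning."""
    if mode not in ("batch", "periodic"):
        raise ValueError(f"unknown commitlog_sync mode: {mode}")
    log.write(record + b"\n")
    # LWT statements sync in any mode; other writes sync only in batch mode.
    sync_now = mode == "batch" or is_lwt
    if sync_now:
        log.flush()
        os.fsync(log.fileno())
    return sync_now


with tempfile.TemporaryFile() as log:
    print(append_to_commit_log(log, b"regular write", "periodic", is_lwt=False))
    print(append_to_commit_log(log, b"lwt write", "periodic", is_lwt=True))
```

In periodic mode the regular write returns before any sync (False), while the LWT write is synced first (True).
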
Reduced Contention
[Diagram: Client, R1, R2, R3]

Future Work
Scylla RAFT
New replication strategy
Tablet partitioning scheme
Requested explicitly in CREATE TABLE
No client-side timestamps
Provides isolation for ALL queries

Road Map to LWT
Today: Scylla 3.2 - Experimental LWT
Q1 2020: Scylla 4.0 - Production ready LWT
Q2 2020: Scylla Enterprise 2020.1 - Production ready LWT

Try Scylla LWT Now!
Download Scylla 3.2, Launch EC2 AMI, or use Docker
https://www.scylladb.com/download/open-source/
https://hub.docker.com/r/scylladb/scylla/
Use --experimental and follow the docs:
https://docs.scylladb.com/operating-scylla/scylla-yaml/
CQL Reference
https://docs.scylladb.com/getting-started/dml/#if-condition