CQL: SQL In Cassandra

jericevans 15,257 views 37 slides Dec 06, 2011
Slide 1
Slide 1 of 37
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37

About This Presentation

CQL presentation from Cassandra NYC, December 6, 2011


Slide Content

CQL: SQL for Cassandra
Cassandra NYC
December 6, 2011
Eric Evans
[email protected]
@jericevans, @acunu

●Overview, history, motivation
●Performance characteristics
●Coming soon (?)
●Drivers status

What?
●Cassandra Query Language
●aka CQL
●aka /ˈsēkwəl/
●Exactly like SQL (except where it's not)
●Introduced in Cassandra 0.8.0
●Ready for production use

SQL? Almost.
–- Inserts or updates
INSERT INTO Standard1 (KEY, col0, col1)
VALUES (key, value0, value1)
vs.
–- Inserts or updates
UPDATE Standard1
SET col0=value0, col1=value1 WHERE KEY=key

SQL? Almost.
–- Get columns for a row
SELECT col0,col1 FROM Standard1 WHERE KEY=key
–- Range of columns for a row
SELECT col0..colN
FROM Standard1 WHERE KEY=key
–- First 10 results from a range of columns
SELECT FIRST 10 col0..colN
FROM Standard1 WHERE KEY=key
–- Invert the sorting of results
SELECT REVERSED col0..colN
FROM Standard1 WHERE KEY=key

Why?

Interface Instability

(Un)ease of use
Column col = new Column(ByteBuffer.wrap(“name”.getBytes()));
col.setValue(ByteBuffer.wrap(“value”.getBytes()));
col.setTimestamp(System.currentTimeMillis());
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
Mutation.setColumnOrSuperColumn(cosc);
List mutations = new ArrayList<Mutation>();
mutations.add(mutation);
Map mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map cf_map = new HashMap<String, List<Mutation>>();
cf_map.set(“Standard1”, mutations);
mutations.put(ByteBuffer.wrap(“key”.getBytes()), cf_map)

CQL
INSERT INTO Standard1 (KEY, col0)
VALUES (key, value0)

Why? How about...
●Better stability guarantees
●Easier to use (you already know it)
●Better code readability / maintainability

Why? How about...
●Better stability guarantees
●Easier to use (you already know it)
●Better code readability / maintainability
●Irritates the NoSQL purists

Why? How about...
●Better stability guarantees
●Easier to use (you already know it)
●Better code readability / maintainability
●Irritates the NoSQL purists
●(Still )irritates the SQL purists

Performance

Thrift RPC
Column col = new Column(ByteBuffer.wrap(“name”.getBytes()));
col.setValue(ByteBuffer.wrap(“value”.getBytes()));
col.setTimestamp(System.currentTimeMillis());
ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
cosc.setColumn(col);
Mutation mutation = new Mutation();
Mutation.setColumnOrSuperColumn(cosc);
List mutations = new ArrayList<Mutation>();
mutations.add(mutation);
Map mutations_map = new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
Map cf_map = new HashMap<String, List<Mutation>>();
cf_map.set(“Standard1”, mutations);
mutations.put(ByteBuffer.wrap(“key”.getBytes()), cf_map)

Your query, it's a graph

CQL
INSERT INTO Standard1 (KEY, col0)
VALUES (key, value0)

Hotspot
Quoted string literals
UPDATE table SET 'name' = 'value'
WHERE KEY = 'somekey'

Hotspot
Quoted string literals
UPDATE table SET 'name' = 'value'
WHERE KEY = 'somekey'

Hotspot
Quoted string literals
UPDATE table SET 'name' = 'value'
WHERE KEY = 'somekey'
●Anything that appears between quotes
●Inlined Java constructs a StringBuilder to store
the contents (slow not fast)
●Incurred multiple times per statement

Hotspot
Marshalling
UPDATE table SET 'clear' = 'abffaadd10'
WHERE KEY = 'acfe12ff'

Hotspot
Marshalling
UPDATE table SET 'clear' = 'abffaadd10'
WHERE KEY = 'acfe12ff'
ascii blob

Hotspot
Marshalling
UPDATE table SET 'clear' = 'abffaadd10'
WHERE KEY = 'acfe12ff'
●Terms are marshalled to bytes by type
●String.getBytes is slow (AsciiType)
●Hex conversion is fast faster (BytesType)
●Incurred multiple times per statement
ascii blob

Hotspot
Copying / Conversion
execute_cql_query(
ByteBuffer query, enum compression)
●Query is binary to support compression (is it worth it?)
●And don't forget the String → ByteBuffer conversion on
the client-side
●Incurred only once per statement!

Achtung!
(These tests weren't perfect)
●Uneeded String → ByteBuffer → String
●No query compression implemented
●Co-located client and server

Insert 20M rows, 5 columns
Avg rate Avg latency
RPC20,953/s 1.6ms
CQL19,176/s (-8%)1.7ms (+9%)

Insert 10M rows, 5 cols (indexed)
Avg rate Avg latency
RPC9,850/s 5.3ms
CQL9,290/s (-6%)5.5ms (+4%)

Counts, 10M rows, 5 cols
Avg rate Avg latency
RPC18,052/s 1.7ms
CQL17,635/s (-2%)1.7ms

Reading 20M rows, 5 cols
Avg rate Avg latency
RPC22.726/s 2.0ms
CQL20,272/s (-11%)2.3ms (+10%)

In Summary
Don't step over dollars to pick up pennies!

Coming Soon(ish)

Roadmap
●Prepared statements (CASSANDRA-2475)
●Compound columns (CASSANDRA-2474)
●Custom transport / protocol (CASSANDRA-2478)
●Performance testing (CASSANDRA-2268)
●Schema introspection (CASSANDRA-2477)
●Multiget support (CASSANDRA-3069)

Drivers

Drivers
●Hosted on Apache Extras (Google Code)
●Tagged cassandra and cql
●Licensed using Apache License 2.0
●Conforming to a standard for database
connectivity (if applicable)
●Coming soon, automated testing and
acceptance criteria

Drivers
Driver Platform Status
cassandra-jdbc Java Good
cassandra-dbapi2 Python Good
cassandra-ruby Ruby New
cassandra-pdo PHP New
cassandra-node Node.js Good
http://code.google.com/a/apache-extras.org/hosting/search?q=label%3aCassandra

The End