Percona XtraDB Cluster

Grypyrg 2,839 views 71 slides May 16, 2014
Slide 1
Slide 1 of 71
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17
Slide 18
18
Slide 19
19
Slide 20
20
Slide 21
21
Slide 22
22
Slide 23
23
Slide 24
24
Slide 25
25
Slide 26
26
Slide 27
27
Slide 28
28
Slide 29
29
Slide 30
30
Slide 31
31
Slide 32
32
Slide 33
33
Slide 34
34
Slide 35
35
Slide 36
36
Slide 37
37
Slide 38
38
Slide 39
39
Slide 40
40
Slide 41
41
Slide 42
42
Slide 43
43
Slide 44
44
Slide 45
45
Slide 46
46
Slide 47
47
Slide 48
48
Slide 49
49
Slide 50
50
Slide 51
51
Slide 52
52
Slide 53
53
Slide 54
54
Slide 55
55
Slide 56
56
Slide 57
57
Slide 58
58
Slide 59
59
Slide 60
60
Slide 61
61
Slide 62
62
Slide 63
63
Slide 64
64
Slide 65
65
Slide 66
66
Slide 67
67
Slide 68
68
Slide 69
69
Slide 70
70
Slide 71
71

About This Presentation

Built-in MySQL Replication is known for its capability to enable to scale reads easily. However, there are some limitations and known issues with this solution because of the asynchronous nature of this replication. This talk will describe another way of doing MySQL replication, by using synchronous...


Slide Content

Synchronous*Replica1on*for*MySQL
Kenny%Gryp

<[email protected]>%
29%Oct%2013
1

Agenda
•Default%asynchronous%MySQL%replication%
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations%
•Future
2

Percona
•Percona%is%the%oldest%and%largest%independent%MySQL%
Support,%Consulting,%Remote%DBA,%Training,%and%
Software%Development%company%with%a%global,%24x7%
staff%of%over%100%serving%more%than%2,000%customers%
in%50+%countries%since&2006%!%
•Our%contributions%to%the%MySQL%community%include:%
•Percona*Server,*Percona*XtraDB*Cluster*
•Percona*XtraBackup:*online*backup*
•Percona*Toolkit,*Percona*Playback…*
•books,%and%research%published%on%the%MySQL%Performance%Blog.
3

Agenda
•Default*asynchronous*MySQL*Replication*
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations%
•Future
4

MySQL*Replication
If your HA is based on MySQL Replication -!
You may be playing a dangerous game !
5

Traditional*Replication*Approach
Server*1 Server*2
replication*stream
“master” “slave”
6

MySQL*Replication
7
1 2
3 4
5 6
•Common%Topologies:%
•MastercMaster%(Only%1%active%master)%
•1%or%more%layers%of%replication

•Slaves%can%be%used%for%reads:%
•asynchronous,%stale%data%is%the%rule%
•data%loss%possible%(*semicsync)
MySQL*Replication
8
1 2
3 4
5 6

•nonctrivial:%
•external%monitoring%
•scripts%for%failover%
•add%node%==%restore%backup%
•much%better%in%

MySQL%5.6:%GTIDs
MySQL*Replication
1 2
3 4
5 6
9

Agenda
•Default%asynchronous%MySQL%Replication%
•Percona*XtraDB*Cluster:*
•Introduction*/*Features*/*Load*Balancing*
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations%
•Future
10

11Percona*XtraDB*Cluster

12Percona*XtraDB*Cluster
•All%nodes%have%a%full%copy%of%the%data%
•Every%node%is%equal%
•No%central%management,%no%SPOF

13CAP*Theorem*
•MySQL%(Asynchronous)%Replication:%
•Availability%
•Partition%Tolerance%
•Percona%XtraDB%Cluster%
•Consistency%
•Availability
Consistency
Availability
Partition%
Tolerance

What*is*Percona*XtraDB*Cluster*?
•Percona%Server%
•+%WSREP%patches%
•+%Galera%library%
•+%Utilities%(init,%SST%and%cluster%check%scripts)%
14

15
•This%is%a%free$open$source%solution,%Percona%Server%is%a%
MySQL&alternative%which%offers%breakthrough%
performance,%scalability,%features,%and%instrumentation.%
Selfctuning%algorithms%and%support%for%extremely%highc
performance%hardware%make%it%the%clear%choice%for%
organisations%that%demand%excellent%performance%and%
reliability%from%their%MySQL%database%server.
Percona&Server

WSREP&and&Galera
•WSREP&API%is%a%project%to%develop%generic%replication%
plugin%interface%for%databases%(WriteSet%Replication)%
•Galera%is%a%wsrep%provider%that%implements%multicmaster,%
synchronous%replication
16

What&is&Percona&XtraDB&Cluster&?
Full&
compatibility&&
with&existing&
systems
17

What&is&Percona&XtraDB&Cluster&?
Minimal&efforts&
to&migrate
18

What&is&Percona&XtraDB&Cluster&?
Minimal&efforts&
to&return&back&
to&MySQL
19

20Features
•Synchronous%Replication%
•Multi%Master%
•Parallel%Applying%
•Quorum%Based%
•Certification/Optimistic%Locking%
•Automatic%Node%Provisioning

21Features
•Synchronous&Replication&
•Multi%Master%
•Parallel%Applying%
•Quorum%Based%
•Certification/Optimistic%Locking%
•Automatic%Node%Provisioning

22(Virtual)&Synchronous&Replication
•Writesets%(transactions)%are%replicated%to%all%
available%nodes%on&commit%(and%queued%on%
each)%
•Writesets%are%individually&“certified”&on%every%
node,%deterministically.Either%it%is%committed%
on%all%nodes%or%no%node%at%all%(NO%2PC)%
•Queued%writesets%are%applied%on%those%nodes%
independently%and%asynchronously%
•Flow&Control&avoids%too%much%‘lag’

23(Virtual)&Synchronous&Replication

24(Virtual)&Synchronous&Replication
•Reads%can%read%old%data%
•Flow%Control%(by%default%16%trx)%avoids%lag%
•wsrep_causal_reads-can%be%enabled%to%
ensure%full%synchronous%reads%
•Latency:%writes%are%fast,%only%at%COMMIT,%
communication%with%other%nodes%happen

25Stale&Reads

26Latency

27Features
•Synchronous%Replication%
•Multi&Master&
•Parallel%Applying%
•Quorum%Based%
•Certification/Optimistic%Locking%
•Automatic%Node%Provisioning

MultiPMaster&Replication
•You%can%write%to%any%node%in%your%cluster*%
•Writes%are%ordered%inside%the%cluster%
writes
writes
writes
28

29Features
•Synchronous%Replication%
•Multi%Master%
•Parallel&Applying&
•Quorum%Based%
•Certification/Optimistic%Locking%
•Automatic%Node%Provisioning

Parallel&Replication
•Standard%MySQL
Application 

writes N threads
Apply&1&thread&
(MySQL&5.6:&
max&1&thread&&
per&schema)
30

Parallel&Replication
•PXC%/%Galera
Application 

writes N threads
Apply M threads 

(wsrep_slave_threads)
31

32Features
•Synchronous%Replication%
•Multi%Master%
•Parallel%Applying%
•Quorum&Based&
•Certification/Optimistic%Locking%
•Automatic%Node%Provisioning

Quorum&Based
•If%a%node%does%not%see%more%than%50%%of%the%total%
amount%of%nodes:%reads/writes%are%not%accepted.%
•Split%brain%is%prevented%
•This%requires%at%least%3%nodes%to%be%effective%
•a%node%can%be%an%arbitrator%(garbd),%joining%the%
communication,%but%not%having%any%MySQL%running%
•Can%be%disabled%(but%be%warned!)
33

•Loss%of%connectivity
Quorum&Based
34
Network%Problem
Does%not%accept%Reads%&%Writes

•4%Nodes
Quorum&Based
35

•Default%quorum%configuration:

4%Nodes,%0%Nodes%have%quorum
Quorum&Based
36
Network%Problem

37Features
•Synchronous%Replication%
•Multi%Master%
•Parallel%Applying%
•Quorum%Based%
•Certification/Optimistic&Locking&
•Automatic%Node%Provisioning

38Certification

39Optimistic&Locking
•Communication%to%the%other%nodes%of%the%
cluster%only%happens%during%COMMIT,%this%
affects%locking%behavior.%
•Optimistic%Locking%is%done:%
•InnoDB%Locking%happens%local%to%the%node%
•During%COMMIT/Certification,%the%other%
nodes%bring%deadlocks

40Optimistic&Locking
•Some%Characteristics:%
•also%COMMIT%and%SELECT’s%can%fail%on%
deadlock&
•Might%require%application%changes:

Not%all%applications%handle%this%properly

41Traditional&InnoDB&Locking
system 1
Transaction 1
Transaction 2
BEGIN
Transaction1
BEGIN
UPDATE t WHERE id=14
UPDATE t WHERE id=14
...
COMMIT
Waits on COMMIT in trx 1

42Traditional&InnoDB&Locking
system 1
Transaction 1
Transaction 2
BEGIN
Transaction1
BEGIN
UPDATE t WHERE id=14
UPDATE t WHERE id=14
...
COMMIT
DML,COMMIT or SELECT
ERROR due row conflict
system 2
...

43Traditional&InnoDB&Locking
system 1
Transaction 1
Transaction 2
BEGIN
Transaction1
BEGIN
UPDATE t WHERE id=14
UPDATE t WHERE id=14
...
COMMIT
COMMIT
ERROR due row conflict
system 2
...ERROR 1213 (40001): Deadlock found when trying
to get lock; try restarting transaction

44Optimistic&Locking

45Features
•Synchronous%Replication%
•Multi%Master%
•Parallel%Applying%
•Quorum%Based%
•Certification/Optimistic%Locking%
•Automatic&Node&Provisioning

Automatic&Node&Provisioning
•When%a%node%joins%the%cluster:%
•the%data%is%automatically%copied%%
•when%finished:%the%new%node%is%automatically%
ready%and%accepting%connections%
•2%different%types%of%joining:%
–SST%(state-snapshot-transfer):%full%copy%of%the%data%
–IST%(incremental-state-transfer):%send%only%the%
missing%writesets%(if-available)-
46

StateTransfer&Summary
Full data!
SST
Incremental!
IST
New node
Node long!
time!
disconnected
Node!
disconnected!
short time
47

Snapshot&State&Transfer
mysqldump
Small!
databases
rsync
Donor!
disconnected!
for copy time
Faster
XtraBackup
Donor!
available
Slower
48

Incremental&State&Transfer
Node was!
in the cluster
Disconnected!
for maintenance
Node!
crashed
49

Automatic&Node&Provisioning
writes
writes
writes
new node joining
data is copied via SST or IST
50

Automatic&Node&Provisioning
writes
writes
writes
new node joining
when ready
writes
51

PXC&with&a&Load&balancer
•PXC%is%often%integrated%with%a%load%balancer%
•service%can%be%checked%using%clustercheck%or%
pyclustercheck-
•The%load%balancer%can%
•be%a%dedicated%layer%
•integrated%at%application%layer%
•integrated%at%database%layer
52

Dedicated&shared&HAProxy&
application server 1 application server 2 application server 3
PXC node 1 PXC node 2 PXC node 3
HA PROXY
53

Dedicated&shared&HAProxy&
application server 1 application server 2 application server 3
PXC node 1 PXC node 2 PXC node 3
HA PROXY
54

Dedicated&shared&HAProxy&
application server 1 application server 2 application server 3
PXC node 1 PXC node 2 PXC node 3
HA PROXY
SST
available_when_donor=0
5

HAProxy&on&application&side
56

Agenda
•Default%asynchronous%MySQL%Replication%
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use&Cases:&
•High&Availability&/&WAN&Replication&/&Read&Scaling&
•Limitations%
•Future
57

58Use&Cases
•High%Availability%
•WAN%Replication%
•Read%Scaling

High&Availability&
59
•Each%node%is%the%same%(no%mastercslave)%
•Consistency%ensured,%no%data%loss%
•Quorum%avoids%splitcbrain%
•Cluster%issues%are%immediately%handled%on%
•no%‘failover’%necessary%
•no%external%scripts,%no%SPOF

60WAN&replication
MySQL
MySQL
MySQL
•No%impact%on%reads%
•No%impact%within%a%trx%
•Communication%only%happens%during

COMMIT%(or%if%autocommit=1)%
•Use%higher%timeouts%and

send%windows

•Beware%of%increased%latency%
•Within%EUROPE%EC2%
•COMMIT:%0.005100%sec%
•EUROPE%<c>%JAPAN%EC2%
•COMMIT:%0.275642%sec
61WAN&replication&P&latency
MySQL
MySQL
MySQL

62
WAN&replication&with&MySQL&asynchronous&
replication
MySQL
MySQL
MySQL
•You%can%mix%both%types%of%replication%
•Good%option%on%slow%WAN%link%
•Requires%more%nodes%
•If%binlog%position%is%lost,%

full%cluster%must%be%

reprovisioned%(*)
MySQL
MySQL
MySQL
MySQL
MySQLMySQL

Agenda
•Default%asynchronous%MySQL%Replication%
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations&
•Future
63

64Limitations
•Supports%only%InnoDB%tables%
•MyISAM%support%will%most%likely%stay%in%alpha.%
•The%weakest%node%limits&write&performance&
•All%tables%must%have%a%Primary&Key!

65Limitations
•Large%Transactions%are%not%recommended%if%
you%write%on%all%nodes%simultaneously%
•Long&Running&Transactions&
•If%the%workload%has%a%hotspot%then%
(frequently%writing%to%the%same%rows%across%
multiple%nodes)%
•Solution:%Write%to%only%1%node

66Limitations
•WAN%Replication:%All%nodes%connect%to%all%
nodes,%causing%some%network%overhead%
•Mixing%Galera%with%asynchronous%replication%
is%hard%to%manage%(no%GTID%support)

Agenda
•Default%asynchronous%MySQL%Replication%
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations%
•Future
67

68Galera&3.0&P&Currently&BETA
•MySQL%5.6%Support%
•GTID:%solves%many%issues%with%mixing%
asynchronous%replication.%
•Improved%WAN%support%(cluster%segmentation)%
•Performance%improvements%
•Better%large%TRX%handling

69Credits
•WSREP%patches%and%Galera%library%is%
developed%by%Codership%Oy

http://www.codership.com%
•Percona%&%Codership%will%present%on%Percona%
Live%UK%2013,%Nov%11c12%

http://www.percona.com/live/londonc2013/

Summary
•Default%asynchronous%MySQL%Replication%
•Percona%XtraDB%Cluster:%
•Introduction%/%Features%/%Load%Balancing%
•Use%Cases:%
•High%Availability%/%WAN%Replication%/%Read%Scaling%
•Limitations%
•Future
70

71Resources
•Percona%XtraDB%Cluster%website:%

http://www.percona.com/software/perconacxtradbccluster/%
•Codership%website:%

http://www.codership.com/wiki/doku.php%
•PXC%articles%on%mysqlperformanceblog:%

http://www.mysqlperformanceblog.com/category/
perconacxtradbccluster/%
•Test%it%now%using%Vagrant%!%

https://github.com/grypyrg/vagrantcperconacplayground

https://github.com/lefred/perconaccluster

https://github.com/percona/xtradbcclusterctutorial/tree/v2
Questions?
Kenny%Gryp

<[email protected]>