Scale-out RabbitMQ cluster can improve performance while keeping high availability

Uploaded by bu3ny, Aug 31, 2024

About This Presentation



Slide Content

Copyright © NTT & NTT Communications Corporation. All rights reserved.
Scale-out RabbitMQ cluster can improve performance while keeping high availability.

Agenda
▸Background
▸Goal / Actions
▸Ops Side
▹RabbitMQ verification
▹OpenStack messaging inspection
▹Instance increase load testing
▸Dev Side
▹Nova’s messaging problem and solutions against it
▹Implementations
▸Conclusion

About us
Masahito Muroi <[email protected]>
NTT, Software Innovation Center
▸Cloud Architect
▸Blazar PTL
▸Congress core reviewer

Mahito Ogura <[email protected]>
NTT Communications, Technology Development
▸DevOps Engineer
▸Distributed System Architect
▸OpenStack Contributor

Background
▸About 3 years ago, we got a report that “Message Queue (MQ) became a bottleneck at 10K VMs and it is difficult to create VMs after that”
▹This issue became a topic at past OpenStack Summits & Ops Meetups
▹Solutions are known, such as using Nova Cells, tuning MQs, and dividing some components onto different MQs
▹However, it is difficult for operators to run RabbitMQ clusters
▸NTT Communications is planning to expand its cloud environment
▹1000+ nodes (about 10K+ VMs) per region
▹We need to improve MQ performance and operations

Goal / Actions
Goal:
▸Run 1000+ nodes with a single RabbitMQ cluster
Actions:
▸Research & verify RabbitMQ’s performance / functions
▸Analyze messaging inside OpenStack at high load
▸Verify RabbitMQ improvement methods on OpenStack


What is RabbitMQ?
RabbitMQ is
▸a widely deployed OSS message broker
▸using the Advanced Message Queuing Protocol (a.k.a. AMQP)
▸able to build a cluster for load balancing
▸able to use High Availability (HA) mode by mirroring queues [1]
[1]: https://www.rabbitmq.com/ha.html

RabbitMQ cluster can distribute queues
A RabbitMQ cluster distributes queues across nodes for Read/Write load distribution.
[Figure: a single RabbitMQ node holding queues 1–6, versus a cluster where the same queues are spread across nodes and accessed by clients]

RabbitMQ cluster can make mirrored queues
A RabbitMQ cluster can be set to HA mode:
▸all: the queue is mirrored across all nodes in the cluster.
▸exactly: the queue is mirrored to count nodes in the cluster.
▸nodes: the queue is mirrored to the nodes listed in node names.
This time, we set “HA = exactly” (mirror = 2) and verified whether it is possible to scale out while keeping HA.
[Figure: applying set_policy with ha-mode: exactly and ha-params: 2 leaves each queue mirrored on two nodes of the cluster]
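The “exactly” policy can be applied with `rabbitmqctl set_policy` or via the RabbitMQ management HTTP API. A minimal Python sketch follows; the endpoint URL, vhost (`%2F`) and policy name are illustrative assumptions, not taken from the deck:

```python
import json

def exactly_policy(pattern: str = ".*", count: int = 2) -> dict:
    """Build the body for a classic mirrored-queue policy, ha-mode "exactly"."""
    return {
        "pattern": pattern,           # regex of queue names the policy matches
        "definition": {
            "ha-mode": "exactly",     # mirror each matched queue to `count` nodes
            "ha-params": count,
            "ha-sync-mode": "automatic",
        },
        "apply-to": "queues",
    }

if __name__ == "__main__":
    # Applying the policy needs a live broker; this PUT against the
    # management API is equivalent to `rabbitmqctl set_policy`.
    import urllib.request
    req = urllib.request.Request(
        "http://localhost:15672/api/policies/%2F/ha-exactly-2",  # example URL
        data=json.dumps(exactly_policy()).encode(),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    # urllib.request.urlopen(req)  # uncomment with real host and credentials
```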

“HA = exactly” can scale out
“HA = exactly” can improve performance by scaling out [1]
[Figures 1–3: message rate (msg/s) over time and latency (μs) for clusters of 3, 4, and 5 nodes]
[1]: We used HAProxy to avoid unbalanced access to a specific node

“HA = all” can scale out too, but ...
“HA = all” can scale out, but less than “HA = exactly” [1]
[Figures 1–3: message rate (msg/s) over time and latency (μs) for clusters of 3, 4, and 5 nodes]
[1]: We used HAProxy to avoid unbalanced access to a specific node

Results of RabbitMQ verification 1/2
▸RabbitMQ cluster can scale out
▹“HA = exactly” scales performance better than “HA = all”
▸RabbitMQ behavior when adding a node to the cluster
▹A new queue may be placed on the new node
▹Existing queues are neither rebalanced nor mirrored to the new node
▹Rebalancing requires manual operation or a cluster rebuild
▹Rebalancing large queues may cause high load
▹RabbitMQ clients need a config update & restart
▹When using a load balancer, you have to update its config so it can reach the new node

Results of RabbitMQ verification 2/2
▸Behavior when removing a node from the cluster
▹Existing queues are mirrored properly and no queue is lost
▹Errors increase for a while on the client side
▹Clients should consider error handling
▸Behavior at failure
▹Similar to “removing a node from the cluster”
▹Existing queues are re-mirrored because of the node decrease
▹Rebalancing didn't occur even after nodes were added for recovery
▹Recovery requires manual rebalancing or a cluster rebuild


OpenStack messaging inspection
OpenStack services communicate via messaging over MQ.
Messaging patterns:
▸API call
▹User request
▹Call from a periodic task
▹etc.
▸Periodic task
▹Status check/update
▹Heal instance info cache
▹etc.
[Figure: messaging between Nova and Cinder]

Messaging Kinds in OpenStack
3 messaging kinds are used in OpenStack:

Kind    | Targets | Reply
cast    | 1 : 1   | No
call    | 1 : 1   | Yes
fanout  | 1 : n   | No

Kinds of queue names that OpenStack creates:
▸“ServiceName”
▸“ServiceName_fanout_<<uuid>>” for fanout
▸“ServiceName.HostName”
▸“reply_<<uuid>>” for call replies
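The three kinds and the queue-naming scheme can be illustrated with a toy, single-process sketch. Real OpenStack uses oslo.messaging on top of RabbitMQ, and fanout queues end in a per-consumer uuid; here the host name is used as the suffix for readability, and all names are illustrative:

```python
import queue
import uuid

class Broker:
    """Toy in-memory stand-in for RabbitMQ, showing cast / call / fanout."""
    def __init__(self):
        self.queues = {}

    def q(self, name):
        # Create the named queue on first use, like a broker would.
        return self.queues.setdefault(name, queue.Queue())

    def cast(self, target, msg):
        # 1:1, no reply: drop the message on the service queue and return.
        self.q(target).put(msg)

    def call(self, target, msg):
        # 1:1 with reply: create a reply_<uuid> queue and wait on it.
        reply_q = f"reply_{uuid.uuid4().hex}"
        self.q(target).put({"msg": msg, "reply_to": reply_q})
        # The server side normally runs concurrently; handled inline here
        # to keep the sketch self-contained.
        req = self.q(target).get()
        self.q(req["reply_to"]).put(f"handled:{req['msg']}")
        return self.q(reply_q).get()

    def fanout(self, service, msg, hosts):
        # 1:n, no reply: one fanout queue per consumer.
        for host in hosts:
            self.q(f"{service}_fanout_{host}").put(msg)

broker = Broker()
broker.cast("conductor", "update_db")
result = broker.call("scheduler", "pick_host")
broker.fanout("compute", "rebuild_cache", hosts=["host1", "host2"])
```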

Pattern 1: API call messaging
A user request, or a call from a periodic task, goes through the API.
[Figure: flow of instance creation across nova-api, nova-conductor, nova-scheduler and nova-compute, via the RabbitMQ queues <conductor>, <scheduler>, <reply_xxx> and <compute.”host”>]
① cast to “conductor”
② call to “scheduler”
③ reply to ②
④ call to a compute node
⑤ reply to ④

Pattern 2: Periodic task messaging
There are tasks that are executed periodically in OpenStack services. Some of them send messages to other services by RPC (Remote Procedure Call) or API.

The number of periodic tasks in the main components:
▸Nova: 24 (messaging 23, non-messaging 1)
▸Neutron: 2 (messaging 2)
▸Cinder: 2 (messaging 1, non-messaging 1)
▸Glance: none
▸Keystone: none

Pattern 2: Periodic task messaging
A periodic task regularly executes API calls and/or messaging to other services.
[Figure: flow of instance information synchronization to nova-scheduler, via the RabbitMQ queues <conductor>, <reply_xxx> and <scheduler>, with nova-conductor in front of the database]
① call to “conductor” for DB access
② reply to ①
③ cast to “scheduler”

Results of OpenStack messaging inspection
▸OpenStack has 2 messaging patterns, "API call" and "Periodic task". The two use a common set of queues.
▸The number of queues does not increase unless nodes or services are added.
▸The amount of messages depends on the number of API calls from users, the number of resources such as instances, block devices and virtual routers, and the number of service nodes.
▸The message size depends on the number of resources such as instances, block devices and virtual routers.


Environment
We created 1 to 1,000 instances to measure the messaging count and message size of each queue in a test environment.

▸nova-compute: 13 nodes
▸RabbitMQ v3.2.4: 3 nodes (HA: all)

Number of Messages & Message size
The number of messages & the message size (per hour) increase as the number of instances increases.
[Figures: message count and message size per unique queue vs. number of instances; the “conductor” queue grows fastest]

Copyright © NTT & NTT Communications Corporation. All rights reserved.
Problems in Nova’s messaging
▸heavily concentrated in ‘conductor’ queue
▹Becaues of scale-out reason of nova-conductor and
nova-compute
25
nova-compute
Queue
<conductor>
RabbitMQ nova-conductor
nova-compute
Database
nova-compute
nova-conductor
nova-compute
nova-conductor

Basic ideas against the problem
▸Divide the messaging workload into multiple queues
▹Connect nova-conductors and nova-computes in a mesh architecture
[Figure: nova-compute processes spread across multiple <conductor> queues, each consumed by nova-conductors in front of the database]


Implementation
▸Each nova-compute locally selects a leader nova-conductor
▹Based on the receiver of a previous RPC
▹All RPCs are called to the leader nova-conductor directly
[Figure: a nova-compute calling the per-host queue <conductor.host1> of its leader, with <reply_xxx> for replies, instead of the shared <conductor> queue]

Implementation
[Figure: multiple nova-computes, each bound to the per-host queue of its own leader nova-conductor (<conductor.host1>, <conductor.host2>, <conductor.host3>), with <reply_xxx> for replies]

Advantages of the locally selected leader RPC
▸No central manager
▸Automatic load balancing among nova-conductors
▸Automatic rebalancing in case a nova-conductor goes down


Conclusion 1/2
▸A scale-out RabbitMQ cluster can improve performance while keeping high availability
▹Load balancing distributes reads/writes to queues within the cluster
▹When the performance limit of a specific queue is reached, you have to respond by scaling up
▸The number of messages & the message size increase as instances & services increase
▹Especially the messages which nova-compute sends to nova-conductor for DB access will increase fast

Conclusion 2/2
▸The pile-up of messages in the ‘conductor’ queue can be resolved by changing the messaging style
▹Sample code: https://github.com/muroi/nova/commits/conductor-message-bus
▹Not yet proposed to the Nova community
▸There is not much information on “HA mode = exactly”
▹We need to confirm through large-scale, long-term stability tests that there are no problems

Thank you for your attention.


Reference: OpenStack Summit Barcelona 1/2
▸RabbitMQ at Scale, Lessons Learned
▹Cisco reported that they have run 800+ compute nodes using a single RabbitMQ cluster
▹3 RabbitMQ nodes, with HA mode “all”
▹It is possible to run 800+ compute nodes by tuning the Linux kernel, the Erlang VM and RabbitMQ

Reference: OpenStack Summit Barcelona 2/2
▸Chasing 1000 Nodes Scale
▹The default number of API/RPC workers in OpenStack services wouldn't work if it were tightened to the number of cores
▹When CPU and RAM are set high enough for the OpenStack services, MySQL and RabbitMQ aren't a bottleneck at all
▹Scheduler performance/scalability issues remain

Thank you!!