Mule Runtime: Performance Tuning

MuleSoft · 27 slides · Apr 21, 2017

About This Presentation

This presentation explores case studies from our most demanding deployments and provides a best-practice approach to designing and tuning applications for optimal performance.


Slide Content

Performance Tuning – Rupesh Ramachandran, Roy Prins

What we'll cover today
- What does Mule runtime performance look like?
- Why does it perform so well?
- How do you tune for high performance: testing process, quick wins, fine tuning

Mule runtime performance – real-world customer examples
- Financial institution: ~100 million transactions per day, on-premise, elastic scale
- Large fast-food chain: ~40 million transactions per day, CloudHub

Performance characteristics: stability under concurrency is very important for APIs.

Performance characteristics: batch jobs and bulk loads are still relevant.

Scalability


Why does it perform this well? (technology)
- Non-blocking architecture
- Streaming
- Log4j2
- Kryo serialization
- Optimized XPath, JMS, Jetty, etc.
- Plus many others

Why does it perform this well? (people, process)
- World-class performance engineering team
- Regression tests for every release
- Realistic customer projects and production traffic
- Profile and optimize new capabilities
- Pass on lessons learnt: Performance Tuning Guide, optimized defaults for tunables

Tuning Mule Runtime

Define Performance Goals
- Memory footprint: large payloads at steady state; less important on 64-bit; prevent swapping
- Startup time: somewhat important (e.g. in production); microservices and containerization require elastic scale up/down
- Throughput: transactions processed over time (e.g. TPS); almost always important
- Responsiveness: latency, round-trip time, user wait time, etc.; very important for APIs
- Common mistake: using single-message latency as an indication of machine performance; it does not account for concurrency overheads
- You can only pick two: minimize memory footprint, minimize latency, minimize overhead

Performance Tuning Process
- Repeatability: use a controlled environment
- High-performance test driver
- External dependencies: mock them if susceptible to outside constraints
- Keep CPU utilization under 75%
- Find the inflection point
- Tune one parameter at a time

Performance Tunables

Mule Performance Tunables – Overview
- Mule design-time tuning: Mule flows, Mule configuration settings
- Mule runtime tuning: JVM, GC
- Infrastructure tuning: OS, network

Performance Test Lifecycle
- Design Mule flows
- Functional tests
- Tune test infra and tools
- Take baseline
- Iterative test, monitor, profile: tune test infra, tune backend, Mule flow/config, JVM/GC
- Longevity tests
- Scalability tests
- Final report

Quick Wins

80/20 Rule
- Runtime tuning: JVM (-Xmx, -Xms), GC (CMS for APIs)
- If large machines (cores/RAM), consider running more than one Mule per machine
- Infrastructure tuning: ulimit -n (file descriptors for high socket-connection counts)

Example wrapper.conf entries from the slide:
wrapper.java.additional.16=-XX:NewSize=1536m
wrapper.java.additional.17=-XX:MaxNewSize=1536m
wrapper.java.additional.20=-Xms2048m
wrapper.java.additional.21=-Xmx2048m
# GC Tuning
wrapper.java.additional.22=-XX:+UseConcMarkSweepGC
wrapper.java.additional.23=-XX:CMSInitiatingOccupancyFraction=65
wrapper.java.additional.xz=-XX:+UseCMSInitiatingOccupancyOnly

Fine tuning Mule Runtime

Design-time tuning – Mule flows
- Session variables: fewer and smaller
- Payload formats: Java objects are fastest
- Data extraction: MEL preferred to scripting languages
- Message transformers: DataWeave is the preferred option, then XSLT, then Java (a sketch follows below)
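A minimal sketch of the two preferred options above, assuming Mule 3.x syntax; the field names (payload.customer.id, etc.) are hypothetical:

<!-- MEL for simple data extraction, instead of a scripting component -->
<set-variable variableName="customerId" value="#[payload.customer.id]"/>

<!-- DataWeave as the message transformer -->
<dw:transform-message doc:name="Map customer to JSON">
    <dw:set-payload><![CDATA[%dw 1.0
%output application/json
---
{ id: payload.customer.id, name: payload.customer.name }]]></dw:set-payload>
</dw:transform-message>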

Design-time tuning – Mule flows: integration patterns and their performance implications
- Scatter-Gather: two or more independent data operations with a single source; the results of the operations are combined (a sketch follows below)
- Async scope
- Cache scope
- Transactional scope (has a cost)
- Stateful components (resequencer, idempotent filter, etc.)
- Batch module for bulk loads
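A minimal Scatter-Gather sketch for the first pattern above, assuming Mule 3.x syntax; the flow-ref targets are hypothetical:

<scatter-gather>
    <!-- each route runs concurrently; results are combined into a single message collection -->
    <flow-ref name="getCustomerProfileFlow"/>
    <flow-ref name="getOrderHistoryFlow"/>
</scatter-gather>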

Design-time tuning – Mule configurations: HTTP
- Connections: don't use the Jetty connector; use the latest runtime for best NIO performance
- NIO acceptor threads hand off to worker threads; the acceptor-thread default is the number of cores, which works well
- For worker-thread tuning, adjust maxThreadsActive on the HTTP listener (default 128); a configuration sketch follows below
- HTTP keep-alive: on by default for HTTP outbound; the test driver or HTTP clients should use the same
- HTTPS: use the latest TLS 1.1
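A sketch of the worker-thread adjustment on a Mule 3.x HTTP listener configuration; the config name, host/port, and the value 256 are illustrative, and the worker-threading-profile element should be checked against your runtime version:

<http:listener-config name="apiListenerConfig" host="0.0.0.0" port="8081">
    <!-- acceptor threads default to the number of cores; raise worker threads for high concurrency -->
    <http:worker-threading-profile maxThreadsActive="256"/>
</http:listener-config>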

Design-time tuning – Mule configurations: messaging
- Producer Mule flows: JMS connection pooling (sessionCacheSize parameter)
- Consumer Mule flows: JMS 'numberOfConsumers', AMQP 'numberOfChannels' (a sketch follows below)
- JMS server: disable message persistence if not needed; durable subscribers and message filtering have a cost
- VM endpoints: use flow references instead of VM endpoints within the same app, unless an HA save point is functionally required
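A sketch of the consumer-side setting, using the Mule 3.x ActiveMQ connector as an example; the connector name, broker URL, and consumer count are illustrative:

<jms:activemq-connector name="jmsConnector"
                        brokerURL="tcp://localhost:61616"
                        specification="1.1"
                        numberOfConsumers="8"/>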

Design-time tuning – Mule configurations: threading and logging
- Threading profiles: synchronous vs asynchronous processing, thread pools
- Logging: asynchronous logger vs synchronous (a log4j2 sketch follows below)
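For the async-logging point, a minimal log4j2.xml sketch that switches the root logger to asynchronous delivery; the appender name and pattern are illustrative, and async loggers rely on the LMAX Disruptor library being on the classpath:

<Configuration>
  <Appenders>
    <Console name="console">
      <PatternLayout pattern="%d [%t] %-5p %c - %m%n"/>
    </Console>
  </Appenders>
  <Loggers>
    <!-- AsyncRoot queues log events so flow threads do not block on log I/O -->
    <AsyncRoot level="INFO">
      <AppenderRef ref="console"/>
    </AsyncRoot>
  </Loggers>
</Configuration>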

Runtime tuning – mind-map of the 80/20 rule for GC tuning (Oracle HotSpot). Note: the PermGen sizing flags are deprecated in JDK 8; use MetaspaceSize instead to bound native space use.
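On JDK 8 the equivalent bound goes on Metaspace rather than PermGen; illustrative wrapper.conf entries (the property indexes and sizes are examples, not values from the slide):

wrapper.java.additional.30=-XX:MetaspaceSize=256m
wrapper.java.additional.31=-XX:MaxMetaspaceSize=512m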


Thank you!