P99 Publish Performance in a Multi-Cloud NATS.io System

ScyllaDB, Jun 25, 2024

About This Presentation

This talk walks through the strategies and improvements made to the NATS server to meet P99 latency goals for persistent publishing to NATS JetStream, replicated across all three major cloud providers over private networks.


Slide Content

P99 Publish Performance in Multi-Cloud NATS.io
Derek Collison, Founder & CEO, Synadia

Derek Collison, Founder & CEO, Synadia
Something cool I've done: NATS.io, CloudFoundry, Google AJAX APIs, TIBCO Rendezvous and EMS
My perspective on P99s: a systems-level problem
Another thing about me: conducted federal trials for the first intra-arterial blood gas device
What I do away from work: travel, boating

GOAL

<20ms P99 Streaming Publish across Multi-Cloud
Use NATS / JetStream to publish messages which will be stored in multiple cloud providers at the same time through consensus
Requirement of using all three major cloud providers as a single system
P99 under load of <20ms
1 Gbit private network
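As a rough illustration of how a <20ms P99 target can be checked from the client side, here is a minimal Go sketch that publishes synchronously to JetStream in a loop, records each publish-to-ACK latency, and computes percentiles from the samples. It assumes the nats.go client, a hypothetical server URL, and a stream already listening on the hypothetical subject events.bench; it is not the benchmark harness used in the talk.

```go
package main

import (
	"fmt"
	"sort"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Hypothetical URL; in the talk's setup this would resolve to one of the
	// nine servers via per-provider DNS.
	nc, err := nats.Connect("nats://nats.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		panic(err)
	}

	const samples = 10000
	latencies := make([]time.Duration, 0, samples)
	payload := []byte("hello")

	for i := 0; i < samples; i++ {
		start := time.Now()
		// Publish blocks until the stream leader returns a PubAck,
		// i.e. after the replicas have reached consensus on the message.
		if _, err := js.Publish("events.bench", payload); err != nil {
			panic(err)
		}
		latencies = append(latencies, time.Since(start))
	}

	sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
	pct := func(p float64) time.Duration {
		return latencies[int(float64(len(latencies)-1)*p)]
	}
	fmt.Printf("P50=%v P75=%v P99=%v\n", pct(0.50), pct(0.75), pct(0.99))
}
```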

What is NATS.io

NATS.io and JetStream

NATS.io and JetStream
NATS is a modern connective technology for adaptive edge and distributed systems
Can do microservices, streaming, key-value and object stores
Based on a core pub/sub engine with location-independent addressing
At-most-once, at-least-once, and exactly-once* delivery
Static binary; servers can link to form any topology and are multi-operator
Secure multi-tenancy
JetStream is our persistence and clustering / consensus technology
More info: nats.io / youtube.com/@SynadiaCommunications
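For readers new to NATS, here is a minimal core pub/sub sketch in Go using the nats.go client (no JetStream yet), illustrating the subject-based, location-independent addressing mentioned above. The server URL and subject name are hypothetical.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	// Connect to any server in the system; subjects, not hosts, address traffic.
	nc, err := nats.Connect("nats://demo.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Subscribe to a subject and make sure the server has processed the SUB.
	sub, err := nc.SubscribeSync("orders.created")
	if err != nil {
		panic(err)
	}
	if err := nc.Flush(); err != nil {
		panic(err)
	}

	// Publish to the same subject (at-most-once core delivery).
	if err := nc.Publish("orders.created", []byte("order 42")); err != nil {
		panic(err)
	}

	// Receive the message.
	msg, err := sub.NextMsg(time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg.Data))
}
```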

NATS Topology

NATS Topology
This is a single NATS system spanning 9 servers, with 3 servers per cloud provider
Cloud providers are connected via private 1 Gbit links, 3-5ms RTT
Topology is actually a full-mesh, one-hop NATS cluster
All have persistence through JetStream
All have tags that allow placement of JetStream streams and consumers
Anti-affinity tags allow R3 streams across all 3 providers and different AZs
DNS for the system and for each provider
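One simple way to sanity-check latency from a client's point of view is the nats.go RTT helper, which round-trips a PING/PONG to whichever server the client is connected to. Note this measures client-to-server RTT, only a rough proxy for the 3-5ms inter-cloud links above, and the per-provider DNS names below are hypothetical stand-ins for the deck's "DNS for system and each provider".

```go
package main

import (
	"fmt"

	"github.com/nats-io/nats.go"
)

func main() {
	// Hypothetical per-provider DNS names; the real system exposes one name
	// per cloud provider plus one for the whole system.
	urls := []string{
		"nats://aws.nats.example.internal:4222",
		"nats://gcp.nats.example.internal:4222",
		"nats://azure.nats.example.internal:4222",
	}
	for _, u := range urls {
		nc, err := nats.Connect(u)
		if err != nil {
			fmt.Println(u, "connect error:", err)
			continue
		}
		// RTT round-trips a PING/PONG to the connected server.
		rtt, err := nc.RTT()
		if err != nil {
			fmt.Println(u, "rtt error:", err)
		} else {
			fmt.Printf("%s -> %s, rtt=%v\n", u, nc.ConnectedUrl(), rtt)
		}
		nc.Close()
	}
}
```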

Creating a Stream (Anti-affinity tags)

Creating a Stream
Stream creation uses anti-affinity to place an R3 stream with one replica in each cloud provider (CP)
Applications can connect to any NATS server
JetStream runs a meta-layer for assignments; all servers have a copy (think of it as the system's DNA)
HA / R3 assets run a consensus algorithm built upon NRGs
NRGs are the NATS implementation of Raft
Fast cooperative transfer and storage fault detection
A leader is selected and it listens / subscribes for the subjects assigned to the stream
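As a sketch of what R3 stream creation with placement tags can look like from the nats.go client, the snippet below creates a 3-replica stream and requests placement by tag. The stream name, subject, and tag value are hypothetical, and the actual anti-affinity behavior (spreading replicas across providers and AZs) comes from how the servers are tagged and configured, as described in the deck, not from this client call alone.

```go
package main

import (
	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://nats.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		panic(err)
	}

	// Create an R3 stream; the JetStream meta-layer picks three peers for it.
	// Placement tags (hypothetical value here) steer that selection toward
	// servers carrying matching tags.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:     "EVENTS",
		Subjects: []string{"events.>"},
		Replicas: 3,
		Placement: &nats.Placement{
			Tags: []string{"tier:primary"},
		},
	})
	if err != nil {
		panic(err)
	}
}
```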

Publishing to a Stream
Clients can connect to any server
They publish / send a message to a subject that the stream is listening on
If the message has a reply subject, we respond with an ACK
The ACK is only sent after consensus has been reached
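Since the ACK only comes back after consensus, a publisher that needs throughput as well as a latency bound will typically pipeline publishes and wait for the acks. Below is a hedged nats.go sketch using async publish; the URL, subject, and pending limit are hypothetical choices, not values from the talk.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://nats.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Cap how many publishes may be in flight awaiting their acks.
	js, err := nc.JetStream(nats.PublishAsyncMaxPending(256))
	if err != nil {
		panic(err)
	}

	// Pipeline publishes; each ack arrives only after the R3 replicas
	// have agreed on the message.
	for i := 0; i < 1000; i++ {
		if _, err := js.PublishAsync("events.demo", []byte(fmt.Sprintf("msg %d", i))); err != nil {
			panic(err)
		}
	}

	// Wait for all outstanding acks (or give up after a timeout).
	select {
	case <-js.PublishAsyncComplete():
		fmt.Println("all messages acknowledged")
	case <-time.After(5 * time.Second):
		fmt.Println("timed out waiting for acks")
	}
}
```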

GOAL <20ms P99 Streaming Publish

How did we do?

First Results
System started out not heavily loaded
P50 - P75 were decent
Load was then increased (more streams, more publishers, more consumers)
P50 still OK, P75 started to drift
P99 was > 5s in some instances - NOT GOOD

Digging In..

Triage
NATS servers <= 2.9.x have one connection between them
All traffic from multiple accounts is mux'd across that connection
A NATS server running JetStream could have many internal subscriptions
Many of these would run inline, and hence be susceptible to delays
Disk IO was a major culprit
Network IO was already offloaded due to the NATS server architecture

Removing IO from the hot path
Work was done to remove any internal callbacks into JetStream from routes
Storing a message had already been removed from the inline path, but the NRG traffic was still inline
Moved to just a lock and a copy inline with all route processing
Queued up internally for another execution context to process
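The pattern described above, take a lock, copy the message, append it to an internal queue, and let a different execution context do the expensive work, can be sketched in plain Go roughly as follows. This is an illustrative toy with made-up names, not the NATS server's actual IPQ implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// workQueue is a toy version of the idea: the hot path only locks, copies,
// appends, and signals; a worker goroutine drains batches and does the
// expensive processing (e.g. storage or NRG handling) off the hot path.
type workQueue struct {
	mu     sync.Mutex
	items  [][]byte
	signal chan struct{} // capacity 1: "there is work"
}

func newWorkQueue() *workQueue {
	return &workQueue{signal: make(chan struct{}, 1)}
}

// push is what the route/hot path calls: lock, copy, append, signal.
func (q *workQueue) push(msg []byte) {
	cp := append([]byte(nil), msg...) // copy; the caller may reuse its buffer
	q.mu.Lock()
	q.items = append(q.items, cp)
	q.mu.Unlock()
	select {
	case q.signal <- struct{}{}:
	default: // a wakeup is already pending
	}
}

// worker drains the queue in batches in its own execution context.
func (q *workQueue) worker(process func([]byte)) {
	for range q.signal {
		q.mu.Lock()
		batch := q.items
		q.items = nil
		q.mu.Unlock()
		for _, m := range batch {
			process(m)
		}
	}
}

func main() {
	q := newWorkQueue()
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		count := 0
		q.worker(func(m []byte) {
			count++
			if count == 3 {
				fmt.Println("processed 3 messages off the hot path")
			}
		})
	}()

	q.push([]byte("a"))
	q.push([]byte("b"))
	q.push([]byte("c"))

	// In a real server the queue lives for the process lifetime; here we just
	// close the signal channel so the worker drains and exits.
	close(q.signal)
	wg.Wait()
}
```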

Second Results
We saw improvements
P50 - P90 were looking decent
Under load we were still getting not-great results for P95 and P99

Digging Back In..

Second Triage
Determined that some coarse-grained locks were experiencing contention
Interest-based retention streams were causing issues with NRG snapshots
This was due to a bad initial design for storing state on interior deletes

Removing coarse-grained locks
Work was done to add additional fine-grained locks in the hot path
In some instances we used atomics (NATS is written in Go)
The only lock / atomic required was now for the internal queues
Each function that needed an execution context had its own dedicated IPQ
Go channels are too slow for this section of the server
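The claim that channels were too slow for this particular hot path is easy to probe on your own hardware. The toy benchmark below compares only the producer-side enqueue cost of a buffered channel send against a mutex-guarded slice append (the rough shape of an IPQ-style enqueue); it handles the consumer side differently in each case and is not the measurement the NATS team did, so treat any numbers as indicative only.

```go
// Save as ipq_test.go in a package named ipq and run: go test -bench=. -benchmem
package ipq

import (
	"sync"
	"testing"
)

// Producer-side cost of a buffered channel send, with a consumer goroutine
// draining on the other end.
func BenchmarkChannelSend(b *testing.B) {
	ch := make(chan int, 1024)
	done := make(chan struct{})
	go func() {
		for range ch {
		}
		close(done)
	}()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		ch <- i
	}
	b.StopTimer()
	close(ch)
	<-done
}

// Producer-side cost of a mutex-guarded append; the drain is simulated inline
// by resetting the slice each batch, so this under-counts consumer work.
func BenchmarkMutexAppend(b *testing.B) {
	var mu sync.Mutex
	const batch = 1024
	q := make([]int, 0, batch)
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		mu.Lock()
		q = append(q, i)
		if len(q) == batch {
			q = q[:0] // pretend a consumer drained the batch
		}
		mu.Unlock()
	}
}
```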

Switch to Limits Retention
Interest-based retention can cause interior deletes
Massive interior deletes caused issues with NRG snapshots for <= 2.9
The original design did not properly account for these
Large Key-Value stores also had issues
Redesigned in 2.10, with ~1600x improvement in time and ~50,000x in space
Switched to limits-based retention for now
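For reference, this is roughly what choosing limits-based retention looks like when defining a stream from nats.go. Retention policy generally cannot be changed on an existing stream, so in practice this means creating (or recreating) the stream with limits retention rather than flipping a flag on a live interest-based stream. The stream name, subject, and limit values below are hypothetical.

```go
package main

import (
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://nats.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	js, err := nc.JetStream()
	if err != nil {
		panic(err)
	}

	// Limits-based retention: messages age out by count/bytes/age instead of
	// being removed as consumers acknowledge them, which avoids the interior
	// deletes that interest-based retention can generate.
	_, err = js.AddStream(&nats.StreamConfig{
		Name:      "EVENTS_LIMITS",
		Subjects:  []string{"events.>"},
		Replicas:  3,
		Retention: nats.LimitsPolicy,
		MaxMsgs:   10_000_000,
		MaxBytes:  10 * 1024 * 1024 * 1024, // 10 GiB
		MaxAge:    24 * time.Hour,
	})
	if err != nil {
		panic(err)
	}
}
```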

Third Results
We saw more improvements 🙂
P50 - P95 were looking decent
Under load we were still getting so-so results for P99 and above

Third Triage
Determined that the memory system was now our bottleneck 🤨
Never seen this before (sans maybe the very early days of Go, the 0.52 days)

Removing memory contention
Work was done to use sync.Pools for objects we needed to post on IPQs
These are thread-friendly in terms of contention
Functions now have their own IPQ with a dedicated lock or atomic and a sync.Pool
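A minimal sketch of the sync.Pool approach, assuming made-up types: objects that cross the queue boundary are taken from a pool on the hot path and returned once the worker is done with them, so the allocator and GC see far less churn. A plain channel stands in for the IPQ here to keep the example short; this is illustrative, not the server's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// pubEntry is a made-up stand-in for the per-message object posted on an IPQ.
type pubEntry struct {
	subject string
	data    []byte
}

var entryPool = sync.Pool{
	New: func() any { return new(pubEntry) },
}

// enqueue grabs a pooled object on the hot path instead of allocating.
func enqueue(subject string, data []byte, q chan<- *pubEntry) {
	e := entryPool.Get().(*pubEntry)
	e.subject = subject
	e.data = append(e.data[:0], data...) // reuse the backing buffer when possible
	q <- e
}

// worker processes entries and returns them to the pool for reuse.
func worker(q <-chan *pubEntry, done chan<- struct{}) {
	for e := range q {
		fmt.Printf("processing %s (%d bytes)\n", e.subject, len(e.data))
		entryPool.Put(e)
	}
	close(done)
}

func main() {
	q := make(chan *pubEntry, 64)
	done := make(chan struct{})
	go worker(q, done)

	enqueue("events.1", []byte("hello"), q)
	enqueue("events.2", []byte("world"), q)

	close(q)
	<-done
}
```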

Final Results
We saw more improvements 🙂
P50 - P99 were looking decent and we reached our goal
Under load we could still get good results
However, not all was perfect 🤨
The system reacted poorly on certain setups:
Clumping of client connections
Clumping of JetStream asset leaders (streams or consumers)
Large interior deletes on NRG snapshots

Next Steps
JetStream peer selection can already be modified, and we try to balance
Peer selection and placement can be changed during runtime as well
Balancing client connections is possible but manual; we will automate it
Each JetStream asset leader can be instructed to step down and select a new leader
This process also needs to be automated
NATS v2.10 has multi-path routing, with mux'd and pinned routes
NATS v2.10 has redesigned NRG snapshot logic that is >1000x better in latency and size
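One of the manual operations mentioned above, asking a stream leader to step down so a new leader is elected, can be driven over the JetStream API. The sketch below sends a request to what I understand to be the step-down API subject ($JS.API.STREAM.LEADER.STEPDOWN.<stream>); treat the subject, the hypothetical stream name "EVENTS", and the response handling as assumptions, and check the JetStream API reference (or use the nats CLI) before relying on it.

```go
package main

import (
	"fmt"
	"time"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect("nats://nats.example.internal:4222")
	if err != nil {
		panic(err)
	}
	defer nc.Drain()

	// Assumed JetStream API subject for a stream leader step-down; the
	// stream name ("EVENTS") is hypothetical.
	subj := "$JS.API.STREAM.LEADER.STEPDOWN.EVENTS"
	resp, err := nc.Request(subj, nil, 2*time.Second)
	if err != nil {
		panic(err)
	}
	// The raw JSON response indicates whether the step-down was accepted.
	fmt.Println(string(resp.Data))
}
```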

NATS v2.10 is released!

Derek Collison
[email protected]
@derekcollison
nats.io / synadia.com
Thank you! Let's connect.