Evolving Atlassian Confluence Cloud for Scale, Reliability, and Performance by Bhakti Mehta
ScyllaDB
25 slides
Mar 05, 2025
About This Presentation
Explore how Confluence Cloud scaled to handle billions of requests while ensuring performance & reliability. Learn about microservice sharding, dependency scaling, failure isolation, and optimizing metrics for real customer impact.
Slide Content
A ScyllaDB Community
Evolving Confluence Cloud for
Scale, Reliability and
Performance
Bhakti Mehta
Senior Chief Architect
Confluence Cloud
Bhakti Mehta
■Senior Chief Architect Confluence Cloud.
■Working at Atlassian for the past 8.5 years
■Author of two books and regular conference speaker
■Past experiences:
■Platform lead at BlueJeans Network
■Sun Microsystems/Oracle for 13 years
Agenda
Confluence Cloud architecture
Architecting for Scale
Reliability at Scale
Performance
Balancing Cost with Reliability and
Scale
Confluence Cloud at 30000 ft
Forked codebase
from on-prem
offering
Multi-tenant
environment
13 region
deployments
Billions of requests
per day with low
latency
requirements
Initial high-level
architecture
Architecting for Scale
Challenges we faced
■Noisy neighbor issues
■Resource starvation, high CPU utilization on Databases
■Only vertically scalable solution initially
■Dependencies bringing down our services
■Scaling challenges during peak traffic
Where are we now?
Support for 150K users on a single
tenant
Architecting for Scale
Made the following changes
■Transitioned from Amazon RDS for PostgreSQL to Amazon Aurora and
leveraged read replicas
■Decomposed a few services into low-latency microservices backed by
DynamoDB
■Revisited scaling policies to handle peak traffic
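One way to picture the read-replica change is a small router that sends reads to replicas and everything else to the writer. This is a minimal sketch, not the Confluence implementation: the class name `ReadWriteRouter` and the endpoint names are hypothetical, and classifying a statement by its leading `SELECT` is a deliberate simplification.

```python
import itertools


class ReadWriteRouter:
    """Chooses a database endpoint per statement (illustrative only)."""

    def __init__(self, writer, readers):
        self.writer = writer
        # Cycle through replicas round-robin; fall back to the writer if none exist.
        self._readers = itertools.cycle(readers) if readers else None

    def endpoint_for(self, sql):
        # Simplification: treat any statement starting with SELECT as a read.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._readers is not None:
            return next(self._readers)
        return self.writer


# Hypothetical endpoint names, used only for this example.
router = ReadWriteRouter("writer-endpoint", ["reader-1", "reader-2"])
print(router.endpoint_for("SELECT * FROM pages WHERE id = 1"))
print(router.endpoint_for("UPDATE pages SET title = 'x' WHERE id = 1"))
```

Replicas can lag behind the writer, so reads that must see a just-committed write (and statements such as `SELECT ... FOR UPDATE`) would still need to go to the writer in a real system.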
Made the following changes
■Feature gates for all code changes
■Resiliency for failures from dependencies via timeouts and circuit
breakers
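The circuit-breaker idea named above can be sketched in a few lines. This is an illustration under stated assumptions, not the code Confluence runs: the class names, the threshold of 3 failures and the 30-second cool-down are all hypothetical, and the wrapped function is assumed to enforce its own timeout (for example via its client library's timeout setting) and raise when it is exceeded.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                # Fail fast instead of waiting on an unhealthy dependency.
                raise CircuitOpenError("dependency call skipped: breaker is open")
            # Cool-down elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each dependency call, for example `breaker.call(fetch_permissions, user_id)` with a hypothetical `fetch_permissions`, and serve a fallback when `CircuitOpenError` is raised, so a slow dependency does not tie up the calling service.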
Architecting for Scale
Best practices for
Scale
Architecture reviews for newer features to identify bottlenecks
earlier
Constant monitoring of critical metrics such as tenant placement,
CPU utilization, DB connections
Load testing for features to address bottlenecks before
release to production
Reliability at Scale
How do we address it?
■Public SLA for core experiences with 99.95% reliability
■Budget is 21 min (downtime + experience failures per tenant)
■Focusing on end-to-end customer experience and tracking failures
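The 21-minute budget follows from the 99.95% target. A short calculation shows the arithmetic, assuming a 30-day measurement window (the slide does not state the window length, and the function name is hypothetical):

```python
def error_budget_minutes(slo_percent, window_days=30):
    """Minutes of allowed unreliability for a given SLO over the window."""
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


# 30 days = 43,200 minutes; 0.05% of that is about 21.6 minutes.
print(round(error_budget_minutes(99.95), 1))  # prints 21.6
```

Under that assumption the budget is roughly 21.6 minutes, in line with the 21 minutes quoted on the slide.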
Best practices for
Resiliency
Focus on early detection and
monitoring
Clear dashboards to
identify impact
Rendering metrics to be
insightful of customer pain
Focus on daily errors
proactively
Performance
How do we architect for performance?
■Focus on Time to Visually Complete (TTVC), a metric for measuring
the performance of web page loads, page transitions and interactions.
■How long does it take for everything on the page to fully render
and stop changing visually?
■We have detailed dashboards to measure and quantify impact
Performance
How do we architect for performance?
■Cumulative Layout Shift (CLS) is a metric used to measure the
visual stability of a webpage.
■It quantifies the largest burst of unexpected layout shifts that
occur during the entire lifecycle of a page.
■These shifts can negatively impact user experience by causing
elements to move unexpectedly, leading to a jarring browsing
experience.
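The "largest burst" wording can be made concrete with a small calculation. The sketch below follows the publicly documented session-window definition of CLS: shifts less than 1 second apart are grouped into a window capped at 5 seconds, shifts that follow recent user input are ignored, and the score is the largest window sum. The function name and the input tuple layout are assumptions for illustration; in a browser these values come from layout-shift performance entries.

```python
def cumulative_layout_shift(shifts, max_gap_ms=1000, max_window_ms=5000):
    """shifts: iterable of (timestamp_ms, score, had_recent_input) tuples."""
    best = 0.0
    window_sum = 0.0
    window_start = None
    prev_ts = None
    for ts, score, had_recent_input in sorted(shifts):
        if had_recent_input:
            continue  # shifts caused by user input are expected, so not counted
        starts_new_window = (
            window_start is None
            or ts - prev_ts >= max_gap_ms
            or ts - window_start >= max_window_ms
        )
        if starts_new_window:
            window_start = ts
            window_sum = 0.0
        window_sum += score
        prev_ts = ts
        best = max(best, window_sum)
    return best


# Two shifts close together form one burst; the later shift starts a new window.
example = [(0, 0.05, False), (500, 0.10, False), (3000, 0.02, False), (3200, 0.5, True)]
print(round(cumulative_layout_shift(example), 2))  # prints 0.15
```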
Performance
How do we architect for performance?
■Reducing JS bundle size to cut download time (network),
evaluation time (CPU) and memory consumption
■React 18 and SSR streaming to push rendering start time close
to request start time.
■Parallel fetching of macros along with Content
■Prioritization score for which items to pick based on traffic
percentage, viewport size and TTVC
■Fixing CLS for some elements which helped in TTVC
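The parallel-fetching bullet above can be illustrated with concurrent requests: macro data is requested at the same time as the page content rather than after it. This is a language-neutral sketch written in Python with `asyncio`; the function names are hypothetical and the `asyncio.sleep` calls stand in for network requests. The actual Confluence frontend is not shown in the slides.

```python
import asyncio


async def fetch_content(page_id):
    await asyncio.sleep(0.05)  # stands in for a network call
    return f"<body of {page_id}>"


async def fetch_macro(macro_id):
    await asyncio.sleep(0.05)  # stands in for a network call
    return f"<macro {macro_id}>"


async def load_page(page_id, macro_ids):
    # Start the content fetch and every macro fetch together, then wait for all.
    content, *macros = await asyncio.gather(
        fetch_content(page_id),
        *(fetch_macro(m) for m in macro_ids),
    )
    return {"content": content, "macros": dict(zip(macro_ids, macros))}


page = asyncio.run(load_page("p1", ["toc", "jira"]))
print(page["content"])
```

Because the fetches overlap, the total wait is close to the slowest single request instead of the sum of all of them.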
Balancing Cost with Reliability and Scale
Special projects aimed
at cost reduction
Sharing successes across teams to
adopt similar practices
Metrics & Measurement
With attribution to the right
microservices
Stay in Touch
Bhakti Mehta [email protected]
bhakti_mehta
www.linkedin.com/in/bhaktihmehta