Alluxio + S3: A Tiered Architecture for Latency-Critical, Semantically-Rich Workloads



About This Presentation

AI/ML Infra Meetup
Sep. 30, 2025
Organized by Alluxio

For more Alluxio Events: https://www.alluxio.io/events/

Speaker:
- Bin Fan (VP of Technology @ Alluxio)

In this talk, Bin Fan, VP of Technology at Alluxio, presents on building tiered architectures that bring sub-millisecond latency to S3-...

Added: Oct 02, 2025

Slides: 20 pages

Slide Content

Alluxio + S3: A Tiered Architecture
for Latency-Critical,
Semantically-Rich Workloads
Bin Fan,
VP of Technology @Alluxio
Sept 30th, 2025

Alluxio Confidential
Amazon S3 Becomes the Hard Disk for Cloud

400+ trillion objects stored
150+ million requests per second
11 nines durability (99.999999999%)
~$23/TB/month cost-effective pricing
But Modern Workloads Demand More...
As AI training, inference, and real-time analytics evolve, S3, and object storage in general, shows strain under latency-critical and semantically rich operations.

Modern AI & Realtime Workload Requirements

What an Architect is asking for today:
●Sub-millisecond SLAs for online queries, feature stores, agentic memory, etc.
●Efficient write-ahead logs and checkpointing for large objects
●High-performance metadata operations across millions of objects
●All while maintaining S3's pricing, scalability, and operational simplicity.

Where S3 Shows Strain

Latency: GetObject TTFB typically 30-200ms — acceptable for batch, painful
for inference and real-time access
Limited Semantics: Rename operations require copy + delete; append writes not
supported in Standard S3
Metadata Operations: Standard S3 directories are prefixes, making large-scale
listing expensive and slow
The Reality: S3 excels as a capacity store, but real-time and latency-critical
workloads require sub-millisecond response times.
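
The "rename requires copy + delete" point can be made concrete with a short Python sketch. It assumes a boto3-style client exposing `copy_object` and `delete_object`; the `InMemoryS3` class is a hypothetical stand-in so the example runs without AWS credentials, and it is not part of any library.

```python
class InMemoryS3:
    """Hypothetical in-memory stand-in for a boto3-style S3 client."""

    def __init__(self):
        self.objects = {}  # (bucket, key) -> bytes

    def copy_object(self, Bucket, Key, CopySource):
        src = (CopySource["Bucket"], CopySource["Key"])
        self.objects[(Bucket, Key)] = self.objects[src]

    def delete_object(self, Bucket, Key):
        self.objects.pop((Bucket, Key), None)


def rename_object(s3, bucket, src_key, dst_key):
    """Emulate a rename on an object store: copy the object, then delete the source.

    The two calls are not atomic, and the cost grows with object size,
    because the data is rewritten under the new key.
    """
    s3.copy_object(
        Bucket=bucket,
        Key=dst_key,
        CopySource={"Bucket": bucket, "Key": src_key},
    )
    s3.delete_object(Bucket=bucket, Key=src_key)


# Demo against the in-memory stand-in.
client = InMemoryS3()
client.objects[("bucketA", "data/part-0.tmp")] = b"payload"
rename_object(client, "bucketA", "data/part-0.tmp", "data/part-0.parquet")
```

Renaming a "directory" repeats this pair of calls for every key under the prefix, which is why rename-heavy commit protocols are slow on plain object storage.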

Our Solution: Augment, Don't Replace

Add a transparent, distributed caching and augmentation layer on top of S3,
combining the best of multiple worlds:
-Mountable experience of AWS FSx for Lustre
-Ultra-low latency of AWS S3 Express One Zone
-Cost efficiency of standard S3 buckets
-No data migration required

What is Alluxio, indeed?
A shim layer on S3 (or other cloud storage) that provides sub-ms read
latency, single-digit ms write latency, and enhanced semantics,
driven by modern data-intensive workloads

Journey of Alluxio Since Inception

[Timeline figure spanning three eras: Big Data Analytics, Cloud Adoption, Generative AI. Year markers on the slide: 2014, 2019, 2023, 2024.]
Milestones shown:
●Alluxio open source project founded at UC Berkeley AMPLab
●Baidu deploys 1000+ node cluster
●Alluxio scales to 1 billion files
●7/10 top internet brands accelerated by Alluxio
●AliPay accelerates model training
●1000+ OSS contributors
●Meta accelerates Presto workloads
●9/10 top internet brands accelerated by Alluxio
●Alluxio scales to 10+ billion files
●Leading ecommerce brand accelerates model training
●Fortune 5 brand accelerates model training
●Zhihu accelerates LLM model training

Alluxio in AI & Analytics Ecosystem
Various API Support
(S3, HDFS, POSIX, etc.)

Mount an S3 bucket
Mount s3://bucketA/data to the Alluxio path alluxio:///s3:

$ bin/alluxio mount add \
--path /s3/ \
--ufs-uri s3://bucketA/data/

Alluxio for Low Latency Caching
Alluxio is the industry-leading sub-ms time to first byte (TTFB) solution on S3-class storage
How much better is Alluxio? (Details next slide)
➔45x Lower Latency than S3 Standard
➔5x Lower Latency than S3 Express One Zone
➔Unlimited, linear scalability

Test environment references

Alluxio EE
● Version/Spec: Alluxio Enterprise AI 3.6 (50TB
cache)
● Test env: 1 FUSE (C5n.metal, 100Gbps
network) and 1 Worker (i3en.metal)
AWS S3
● Version/Spec: AWS S3 bucket (Standard Class)
● Test env: 1 FUSE (C5n.metal, 100Gbps
network)
AWS S3 Express One Zone
● Version/Spec: AWS bucket (S3 Express One
Zone Class)
● Test env: 1 FUSE (C5n.metal, 100Gbps
network)
Alluxio for Low Latency Caching

➔ 45x Lower Latency than S3 Standard
➔ 5x Lower Latency than S3 Express One Zone

Core Feature 1: Distributed Caching
[Diagram: clients (Big Data Query, Big Data ETL, Model Training) read s3://bucket/file1 and s3://bucket/file2 through Alluxio Workers 1 through n; the files' data blocks A, B, and C are spread across the workers.]
Select worker based on consistent hashing
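
Worker selection by consistent hashing can be sketched in a few lines of Python. This is a generic hash ring with virtual nodes, written under assumptions: the worker names, the MD5 hash, and the 100 virtual nodes per worker are illustrative choices, not details of Alluxio's implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, workers, vnodes=100):
        if not workers:
            raise ValueError("at least one worker is required")
        # Each worker owns `vnodes` points on the ring to even out the load.
        self._points = sorted(
            (_hash(f"{worker}#{i}"), worker)
            for worker in workers
            for i in range(vnodes)
        )
        self._positions = [pos for pos, _ in self._points]

    def worker_for(self, path: str) -> str:
        """Return the worker owning the first ring point after the path's hash."""
        idx = bisect.bisect_right(self._positions, _hash(path))
        return self._points[idx % len(self._points)][1]


ring = HashRing(["worker-1", "worker-2", "worker-3"])
owner = ring.worker_for("s3://bucket/file1")
```

The useful property is stability: when a worker leaves, only the paths it owned are reassigned, so the rest of the cache stays warm.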

Core Feature 2: Filesystem Namespace Virtualization
●Alluxio can be viewed as a logical file system
○Multiple different storage services can be mounted into the same logical Alluxio namespace
●An Alluxio path is backed by a persistent storage address
○alluxio://ip:port/Data/Sales <-> hdfs://service/salesdata/Sales

[Diagram: the Alluxio namespace has root / containing Data and Users. /Users (Alice, Bob) is backed by s3://bucket/Users in AWS us-east-1; /Data (Reports, Sales) is backed by hdfs://service/salesdata in an on-prem data warehouse.]
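
The mapping from a logical Alluxio path to its backing storage address can be illustrated with a small mount table that resolves paths by longest matching mount point. This is a minimal sketch: `MountTable` and its methods are hypothetical names for illustration, and the two mounts mirror the example addresses on this slide.

```python
class MountTable:
    """Hypothetical mount table: logical path prefix -> backing storage URI."""

    def __init__(self):
        self._mounts = {}

    def add(self, alluxio_path: str, ufs_uri: str) -> None:
        # Normalize away trailing slashes; the root mount "/" becomes "".
        self._mounts[alluxio_path.rstrip("/")] = ufs_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Translate a logical path using the longest matching mount point."""
        best = None
        for mount_point in self._mounts:
            matches = path == mount_point or path.startswith(mount_point + "/")
            if matches and (best is None or len(mount_point) > len(best)):
                best = mount_point
        if best is None:
            raise KeyError(f"no mount covers {path}")
        return self._mounts[best] + path[len(best):]


table = MountTable()
table.add("/Data", "hdfs://service/salesdata")
table.add("/Users", "s3://bucket/Users")
sales = table.resolve("/Data/Sales")
alice = table.resolve("/Users/Alice")
```

Applications see one tree under `/`, while each subtree can live in a different storage system or region.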

Low Latency Read Accelerator on S3

Common theme:
●Use Apache Parquet format for fast point-query lookup into structured data
○Industry standard today for data lakes
●Store PB-scale Parquet files on S3
●Reading directly from S3 has poor tail latency
[Diagram: SQL/Pandas/Polars clients read through the Distributed Cache (~1 ms), which fronts the Data Lake on AWS (30 ms - 200 ms).]
See talk from Greg Lindstrom @ Blackout Power Trading
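
To see why the cache hit ratio matters for these numbers, here is a back-of-the-envelope Python sketch of mean read latency. It is a simplification under stated assumptions: a hit costs the slide's ~1 ms, a miss is charged a flat 100 ms (an assumed value inside the slide's 30 ms to 200 ms range), and queueing and tail effects are ignored.

```python
def mean_latency_ms(hit_ratio: float, cache_ms: float = 1.0, s3_ms: float = 100.0) -> float:
    """Expected read latency for a given cache hit ratio (simplified model).

    cache_ms and s3_ms are illustrative defaults, not measured values.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * s3_ms


# Mean latency at a few hit ratios, from a cold cache to a fully warm one.
table = {ratio: mean_latency_ms(ratio) for ratio in (0.0, 0.5, 0.9, 0.99, 1.0)}
```

Under this model the misses dominate the mean until the hit ratio is very high, which is one reason to size the cache so that the hot working set fits.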

Low-latency & “Reliable” Write Buffer on S3

Common theme:
●Writes can be overwrites or appends
●Data is either kept replicated in Alluxio space or uploaded to S3 asynchronously
[Diagram: RocksDB/S3 clients append to the Distributed Cache (~5 ms append), which uploads to the Data Lake on AWS in the background.]
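
The write-buffer pattern on this slide (acknowledge the append from the cache, upload in the background) can be sketched in Python as follows. This is a single-process illustration under assumptions: `WriteBehindBuffer` is a hypothetical name, the `upload` callable stands in for an object-store PUT, and replication, retries, and crash recovery are left out.

```python
import queue
import threading


class WriteBehindBuffer:
    """Hypothetical write-behind buffer: fast local appends, background upload."""

    def __init__(self, upload):
        self._upload = upload          # callable(key, data); stand-in for a PUT to S3
        self._buffers = {}             # key -> bytes accumulated so far
        self._lock = threading.Lock()
        self._pending = queue.Queue()  # keys waiting to be uploaded
        self.errors = []               # (key, exception) for failed uploads
        threading.Thread(target=self._drain, daemon=True).start()

    def append(self, key: str, chunk: bytes) -> None:
        """Append to the buffered object and return without waiting for the upload."""
        with self._lock:
            self._buffers[key] = self._buffers.get(key, b"") + chunk
        self._pending.put(key)

    def read(self, key: str) -> bytes:
        with self._lock:
            return self._buffers[key]

    def flush(self) -> None:
        """Block until every scheduled upload has been attempted."""
        self._pending.join()

    def _drain(self) -> None:
        while True:
            key = self._pending.get()
            try:
                with self._lock:
                    snapshot = self._buffers[key]
                # The whole object is re-uploaded, since the backing store cannot append.
                self._upload(key, snapshot)
            except Exception as exc:
                self.errors.append((key, exc))
            finally:
                self._pending.task_done()


backing_store = {}
buf = WriteBehindBuffer(backing_store.__setitem__)
buf.append("wal/000001.log", b"put k1 v1\n")
buf.append("wal/000001.log", b"put k2 v2\n")
buf.flush()
```

The quotation marks around "Reliable" apply here too: until the background upload completes, durability depends on the buffer itself, which is why the slide pairs the pattern with replication in Alluxio space.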

Alluxio: Bringing Performance and Semantics to S3
A software layer that transparently sits between applications and S3 (or any object
store). It offers both POSIX and S3-compatible APIs.
Benefits (on top of S3) and the capability behind each:
●Zero-migration: Mount existing S3 buckets as-is; no data move required
●Low-latency accelerator: Achieves sub-ms latency for S3 objects
●Semantic bridge: Enables append, async writes, and cache-only updates
●Minimal-hardware requirement: Pools local SSDs/NVMes for intelligent, cost-efficient caching
●Kubernetes-native: Deploy via Operator; integrated metrics, tracing, and observability

Alternatives on AWS: Side-by-Side Comparison

Feature | S3 Standard | S3 Express One Zone | FSx Lustre + S3 | Alluxio + S3
Latency (TTFB) | 100+ ms | 1–10 ms | 1 ms | 1 ms
Multi-cloud | ❌ | ❌ | ❌ | ✅
POSIX API | ❌ | ❌ | ✅ | ✅
S3 API | ✅ | ✅ | ❌ | ✅
Support WALs (Append) | ❌ | ✅ | ✅ | ✅ (via POSIX)
Data Migration Required | No | High (creation time choice) | No | No
Cost ($/TB/mo) | ~$23 [1] | ~$110 [2] | ~$143 [3] | ~$23 [4] to ~$41 [5]

[1] Assumes S3 Standard is the source of truth, holding full data
[2] Assumes S3 Express One Zone holds full data, as this needs to be decided at bucket creation time
[3] Assumes the 1,000 MB/s/TiB class, with FSx Lustre holding 20% hot data while S3 keeps full data
[4] Assumes Alluxio deployed on GPU spare disks holding 20% hot data, with no additional hardware cost, while S3 keeps full data
[5] Assumes a separate Alluxio cluster holding 20% hot data using i3en.6xlarge instances (1 yr reserved), while S3 keeps full data
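
The blended figures in the cost row are consistent with a simple formula: S3 Standard for the full dataset plus a faster tier sized for the 20% hot subset. The Python sketch below reproduces them. The per-TB rates for the hot tier ($600 for FSx Lustre, $90 for a dedicated i3en cluster) are assumptions back-solved from the slide's totals, not figures stated in the talk.

```python
def blended_cost_per_tb(s3_rate: float, hot_tier_rate: float, hot_percent: int) -> float:
    """Monthly $/TB when S3 keeps all data and a hot tier holds hot_percent of it."""
    if not 0 <= hot_percent <= 100:
        raise ValueError("hot_percent must be between 0 and 100")
    return s3_rate + hot_tier_rate * hot_percent / 100


S3_STANDARD = 23           # $/TB/month, from the slide
FSX_LUSTRE_ASSUMED = 600   # $/TB/month, assumed; back-solved from the ~$143 total
I3EN_CACHE_ASSUMED = 90    # $/TB/month, assumed; back-solved from the ~$41 total

fsx_plus_s3 = blended_cost_per_tb(S3_STANDARD, FSX_LUSTRE_ASSUMED, 20)
alluxio_dedicated = blended_cost_per_tb(S3_STANDARD, I3EN_CACHE_ASSUMED, 20)
alluxio_spare_disks = blended_cost_per_tb(S3_STANDARD, 0, 20)  # no added hardware cost
```

The same function shows how sensitive each option is to the hot-data fraction: the more expensive the hot tier per TB, the faster the blended cost climbs as that fraction grows.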

A joint engineering collaboration between Alluxio and Salesforce. For more details:
https://www.alluxio.io/whitepaper/meet-in-the-middle-for-a-1-000x-performance-boost-querying-parquet-files-on-petabyte-scale-data-lakes

Takeaway
●S3 remains essential but wasn't designed for latency-critical, semantically
rich workloads
●You don't need to migrate away from S3; augment it with intelligent caching
and performance layers
●Alluxio bridges the gap between S3's cost-effectiveness and the
performance demands of modern AI/ML workloads
●Proven results: sub-millisecond latency and real-world production
deployments
●Maximize ROI: get premium performance without premium costs or
operational complexity
