Alluxio + S3: A Tiered Architecture for Latency-Critical, Semantically-Rich Workloads

Alluxio 0 views 20 slides Oct 02, 2025

About This Presentation

AI/ML Infra Meetup
Sep. 30, 2025
Organized by Alluxio

For more Alluxio Events: https://www.alluxio.io/events/

Speaker:
- Bin Fan (VP of Technology @ Alluxio)

In this talk, Bin Fan, VP of Technology at Alluxio, presents on building tiered architectures that bring sub-millisecond latency to S3-...


Slide Content

Alluxio + S3: A Tiered Architecture
for Latency-Critical,
Semantically-Rich Workloads
Bin Fan,
VP of Technology @Alluxio
[email protected]
Sept 30th, 2025

Alluxio Confidential
Amazon S3 Becomes the Hard Disk for Cloud

400+ trillion objects stored
150+ million requests per second
11 nines durability (99.999999999%)
~$23/TB/month cost-effective pricing
But Modern Workloads Demand More...
As AI training, inference, and real-time analytics evolve, S3 (and object storage in general) shows strain under latency-critical and semantically rich operations.

Modern AI & Realtime Workload Requirements

What an Architect is asking for today:
●Sub-millisecond SLAs for online queries, feature stores, agentic memory, etc
●Efficient write-ahead logs and checkpointing for large objects
●High-performance metadata operations across millions of objects
●All while maintaining S3's pricing, scalability, and operational simplicity.

Where S3 Shows Strain

Latency: GetObject TTFB typically 30-200ms — acceptable for batch, painful
for inference and real-time access
Limited Semantics: Rename operations require copy + delete; append writes not
supported in Standard S3
Metadata Operations: Standard S3 directories are prefixes, making large-scale
listing expensive and slow
The Reality: S3 excels as a capacity store, but real-time and latency-critical
workloads require sub-millisecond response times that S3 alone does not provide.
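
To make the rename and listing points concrete, here is a minimal in-memory sketch, assuming a flat key-to-bytes model of an object store. `FlatObjectStore` and its methods are hypothetical names for illustration only; they are not an S3 or Alluxio API, and the request counter is a toy stand-in for API calls.

```python
class FlatObjectStore:
    """Toy model of an object store: a flat map from key to bytes."""

    def __init__(self):
        self.objects = {}
        self.requests = 0  # count of simulated API calls

    def put(self, key, data):
        self.requests += 1
        self.objects[key] = data

    def list_prefix(self, prefix):
        # "Directories" are only key prefixes; this toy scans every key.
        self.requests += 1
        return sorted(k for k in self.objects if k.startswith(prefix))

    def rename_prefix(self, src, dst):
        # Renaming a "directory" means one copy and one delete per object.
        for key in self.list_prefix(src):
            self.requests += 2  # one copy, one delete
            self.objects[dst + key[len(src):]] = self.objects.pop(key)


store = FlatObjectStore()
for i in range(3):
    store.put(f"logs/tmp/part-{i}", b"x")

before = store.requests
store.rename_prefix("logs/tmp/", "logs/final/")
rename_cost = store.requests - before  # grows with the number of objects
```

In this model the cost of a rename grows linearly with the number of objects under the prefix, whereas a file system with real directories can rename with a single metadata update.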

Our Solution: Augment, Don't Replace

Add a transparent, distributed caching and augmentation layer on top of S3,
combining the best of multiple worlds:
-Mountable experience of AWS FSx for Lustre
-Ultra-low latency of AWS S3 Express One Zone
-Cost efficiency of standard S3 buckets
-No data migration required

What is Alluxio, indeed?
A shim layer on S3 (or other cloud storage) that provides sub-ms read latency, single-digit ms write latency, and enhanced semantics, driven by modern data-intensive workloads

Journey of Alluxio Since Inception

[Timeline figure with year markers 2014, 2019, 2023, and 2024, spanning three eras: Big Data Analytics, Cloud Adoption, Generative AI. Milestones shown:]
●Alluxio open source project founded at UC Berkeley AMPLab
●Baidu deploys 1000+ node cluster
●Alluxio scales to 1 billion files
●7/10 top internet brands accelerated by Alluxio
●AliPay accelerates model training
●1000+ OSS contributors
●Meta accelerates Presto workloads
●9/10 top internet brands accelerated by Alluxio
●Alluxio scales to 10+ billion files
●Leading ecommerce brand accelerates model training
●Fortune 5 brand accelerates model training
●Zhihu accelerates LLM model training

Alluxio in AI & Analytics Ecosystem
Various API Support
(S3, HDFS, POSIX etc)

Mount an S3 bucket
Mount s3://bucketA/data to the Alluxio path alluxio:///s3:


$ bin/alluxio mount add \
--path /s3/ \
--ufs-uri s3://bucketA/data/

Alluxio for Low Latency Caching
Alluxio is the industry-leading sub-ms time to first byte (TTFB) solution on S3-class storage
How much better is Alluxio? (Details next slide)
➔45x Lower Latency than S3 Standard
➔5x Lower Latency than S3 Express One Zone
➔Unlimited, linear scalability

Test environment references

Alluxio EE
● Version/Spec: Alluxio Enterprise AI 3.6 (50TB cache)
● Test env: 1 FUSE (C5n.metal, 100Gbps network) and 1 Worker (i3en.metal)
AWS S3
● Version/Spec: AWS S3 bucket (Standard Class)
● Test env: 1 FUSE (C5n.metal, 100Gbps network)
AWS S3 Express One Zone
● Version/Spec: AWS bucket (S3 Express One Zone Class)
● Test env: 1 FUSE (C5n.metal, 100Gbps network)
Alluxio for Low Latency Caching



➔ 45x Lower Latency than S3 Standard
➔ 5x Lower Latency than S3 Express One Zone

Core Feature 1: Distributed Caching
[Diagram: Big Data Query, Big Data ETL, and Model Training clients read through Alluxio Worker 1, Worker 2, ... Worker n. Data from s3://bucket/file1 and s3://bucket/file2 is cached as blocks A, B, and C spread across the workers.]
●Select worker based on consistent hashing
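
Worker selection by consistent hashing can be sketched as follows. This is a generic hash ring with virtual nodes, not Alluxio's actual implementation; the `HashRing` class, the worker names, the MD5-based hash, and the choice of 100 virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable 64-bit hash; MD5 is used here only for even spreading.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative)."""

    def __init__(self, workers, vnodes=100):
        # Each worker owns `vnodes` points on the ring.
        self._ring = sorted(
            (_hash(f"{w}#{i}"), w) for w in workers for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    def worker_for(self, path: str) -> str:
        # Walk clockwise to the first ring point after the path's hash.
        idx = bisect.bisect_right(self._points, _hash(path)) % len(self._ring)
        return self._ring[idx][1]


paths = [f"s3://bucket/file{i}" for i in range(1000)]
ring3 = HashRing(["worker-1", "worker-2", "worker-3"])
ring4 = HashRing(["worker-1", "worker-2", "worker-3", "worker-4"])

# Paths whose owner changes when a fourth worker joins.
moved = [p for p in paths if ring3.worker_for(p) != ring4.worker_for(p)]
```

The point of the technique is that adding a worker only reassigns the paths that land on the new worker; everything else keeps its cached location, so the cache is not invalidated wholesale when the cluster is resized.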

Core Feature 2: Filesystem Namespace Virtualization
●Alluxio can be viewed as a logical file system
○Multiple different storage services can be mounted into the same logical Alluxio namespace
●An Alluxio path is backed by a persistent storage address
○alluxio://ip:port/Data/Sales <-> hdfs://service/salesdata/Sales

Alluxio Namespace
[Diagram: one logical tree rooted at / with two mounted subtrees.]
/Data (Reports, Sales) <-> hdfs://service/salesdata (Reports, Sales), on-prem data warehouse
/Users (Alice, Bob) <-> s3://bucket/Users (Alice, Bob), AWS us-east-1
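
The mapping from a logical Alluxio path to its backing storage address can be sketched as a longest-prefix lookup in a mount table. The `MOUNTS` dictionary and `resolve` function below are hypothetical and only mirror the two mounts shown on the slide; Alluxio's real mount table is managed by the system itself.

```python
# Mount table mirroring the slide: logical path -> backing storage address.
MOUNTS = {
    "/Data": "hdfs://service/salesdata",
    "/Users": "s3://bucket/Users",
}


def resolve(alluxio_path: str) -> str:
    """Translate a logical path to its backing address (longest prefix wins)."""
    best = None
    for mount_point in MOUNTS:
        inside = alluxio_path == mount_point or alluxio_path.startswith(
            mount_point + "/"
        )
        if inside and (best is None or len(mount_point) > len(best)):
            best = mount_point
    if best is None:
        raise KeyError(f"no mount covers {alluxio_path}")
    return MOUNTS[best] + alluxio_path[len(best):]


sales = resolve("/Data/Sales")
alice = resolve("/Users/Alice")
```

Applications only see paths such as /Data/Sales; which storage system sits behind each subtree is a property of the mount table, not of the application code.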

Low Latency Read Accelerator on S3
Common theme:
●Use Apache Parquet format for fast point-query lookup into structured data
○Industry standard today for data lakes
●Store PB-scale Parquet files on S3
●Reading directly from S3 suffers from poor tail latency
[Diagram: SQL/Pandas/Polars clients read through the distributed cache at ~1 ms; reads that go to the data lake on AWS take 30 ms - 200 ms.]
See talk from Greg Lindstrom @ Blackout Power Trading
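
The read path in the diagram follows the read-through caching pattern, sketched below under simplifying assumptions: a single process, no eviction, and a dictionary standing in for S3. `ReadThroughCache` and the example object key are hypothetical names, not part of Alluxio.

```python
class ReadThroughCache:
    """Serve repeat reads from memory; fetch from the backing store on a miss."""

    def __init__(self, fetch_from_store):
        self._fetch = fetch_from_store  # callable(key) -> bytes, the slow path
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Miss: pay the backing-store latency once, then keep a copy.
        self.misses += 1
        data = self._fetch(key)
        self._cache[key] = data
        return data


# A dict stands in for the object store in this sketch.
backing = {"s3://bucket/table/part-0.parquet": b"parquet-bytes"}
cache = ReadThroughCache(backing.__getitem__)

first = cache.read("s3://bucket/table/part-0.parquet")   # miss, goes to store
second = cache.read("s3://bucket/table/part-0.parquet")  # hit, served locally
```

Only the first read of an object pays the object-store latency; subsequent point lookups against the same data are served from the cache tier.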

Low-latency & “Reliable” Write Buffer on S3
Common theme:
●Writes can be overwrite or append
●Either keep replicas in Alluxio space or upload asynchronously to S3
[Diagram: RocksDB/S3 clients append to the distributed cache at ~5 ms; data is uploaded to the data lake on AWS in the background.]
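
The write path in the diagram can be sketched as an append-capable buffer that uploads whole objects later. This is a single-process illustration with an explicit `flush()` standing in for the background upload; `WriteBuffer`, the log key, and the dictionary acting as the object store are assumptions, and replication inside the cache is not modeled.

```python
class WriteBuffer:
    """Accept appends locally; upload complete objects to the store later."""

    def __init__(self, upload):
        self._upload = upload  # callable(key, data) that writes a whole object
        self._buffers = {}
        self._dirty = set()

    def append(self, key, data):
        # Appends land in the buffer only, so they return quickly.
        self._buffers.setdefault(key, bytearray()).extend(data)
        self._dirty.add(key)

    def read(self, key):
        return bytes(self._buffers[key])

    def flush(self):
        # Stand-in for the background upload: one whole-object write per key.
        for key in sorted(self._dirty):
            self._upload(key, bytes(self._buffers[key]))
        self._dirty.clear()


object_store = {}  # a dict stands in for S3
wal = WriteBuffer(object_store.__setitem__)
wal.append("wal/000001.log", b"put k1;")
wal.append("wal/000001.log", b"put k2;")

visible_before_flush = "wal/000001.log" in object_store
wal.flush()
```

The object store only ever sees whole-object writes, which is what standard S3 supports, while the client sees append semantics against the buffer.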

Alluxio: Bringing Performance and Semantics to S3
Alluxio is a software layer that transparently sits between applications and S3 (or any object store). It offers both POSIX and S3-compatible APIs.

Benefit (on top of S3) | Capability
Zero-migration | Mount existing S3 buckets as-is; no data move required
Low-latency accelerator | Achieves sub-ms latency for S3 objects
Semantic bridge | Enables append, async writes, and cache-only updates
Minimal-hardware requirement | Pools local SSDs/NVMes for intelligent, cost-efficient caching
Kubernetes-native | Deploy via Operator; integrated metrics, tracing, and observability

Alternatives on AWS: Side-by-Side Comparison

Feature | S3 Standard | S3 Express One Zone | FSx Lustre + S3 | Alluxio + S3
Latency (TTFB) | 100+ ms | 1–10 ms | 1 ms | 1 ms
Multi-cloud | ❌ | ❌ | ❌ | ✅
POSIX API | ❌ | ❌ | ✅ | ✅
S3 API | ✅ | ✅ | ❌ | ✅
Support WALs (Append) | ❌ | ✅ | ✅ | ✅ (via POSIX)
Data Migration Required | No | High (creation-time choice) | No | No
Cost ($/TB/mo) | ~$23 [1] | ~$110 [2] | ~$143 [3] | ~$23 [4] to ~$41 [5]

[1] Assumes S3 Standard is the source of truth, holding full data
[2] Assumes S3 Express One Zone holding full data, as it needs to be decided at bucket creation time
[3] Assumes for 1,000 MB/s/TiB class, FSx Lustre holding 20% hot data, while S3 keeps full data
[4] Assumes Alluxio deployed on GPU spare disks holding 20% hot data, no additional hardware cost, while S3 keeps full data
[5] Assumes separate Alluxio cluster holding 20% hot data using i3en.6xlarge instances (1 yr reserved), while S3 keeps full data
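
The cost row can be read as a blended figure: full data on S3 Standard plus a fast tier holding the hot fraction. The sketch below works that arithmetic backward from the table's numbers. The blending formula itself is an assumption about how the slide's figures were derived, not something the slide states, and the function names are hypothetical.

```python
S3_STANDARD = 23.0   # $/TB/month, from the table
HOT_FRACTION = 0.20  # share of data held in the fast tier, from the footnotes


def blended_cost(fast_tier_cost_per_tb, hot_fraction=HOT_FRACTION,
                 base=S3_STANDARD):
    """Full data on S3 plus a fast tier sized for the hot fraction."""
    return base + hot_fraction * fast_tier_cost_per_tb


def implied_fast_tier_cost(total, hot_fraction=HOT_FRACTION, base=S3_STANDARD):
    """Invert the blend: what fast-tier $/TB/month yields this total?"""
    return (total - base) / hot_fraction


# Back out the per-TB fast-tier cost implied by the table's totals.
fsx_implied = implied_fast_tier_cost(143.0)      # FSx Lustre + S3 column
alluxio_implied = implied_fast_tier_cost(41.0)   # dedicated Alluxio cluster
gpu_spare_disks = blended_cost(0.0)              # no extra hardware cost
```

Under this reading, the FSx Lustre tier works out to roughly $600/TB/month and the dedicated Alluxio cluster to roughly $90/TB/month of cached capacity, while the spare-disk deployment adds nothing on top of the S3 Standard price.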

A joint engineering collaboration between Alluxio and Salesforce. For more details:
https://www.alluxio.io/whitepaper/meet-in-the-middle-for-a-1-000x-performance-boost-querying-parquet-files-on-petabyte-scale-data-lakes

Takeaway
●S3 remains essential but wasn't designed for latency-critical, semantically
rich workloads
●You don't need to migrate away from S3; augment it with intelligent caching
and performance layers
●Alluxio bridges the gap between S3's cost-effectiveness and the
performance demands of modern AI/ML workloads
●Proven results: Sub-millisecond latency, and real-world production
deployments
●Maximize ROI: Get premium performance without premium costs or
operational complexity
