Achieving Double-Digit Millisecond Offline Feature Stores with Alluxio

About This Presentation

AI/ML Infra Meetup
Sep. 30, 2025
Organized by Alluxio

For more Alluxio Events: https://www.alluxio.io/events/

Speaker:
- Greg Lindstrom (VP ML Trading @ Blackout Power Trading)

Greg Lindstrom shared how they achieved double-digit millisecond offline feature store performance using Alluxio.


Slide Content

Achieving Double-Digit Millisecond Offline Feature Stores with Alluxio
Greg Lindstrom, VP ML Trading
Alluxio Meetup | September 30, 2025

What if…
●What if S3 were 10x faster? 30x faster? (bandwidth and latency)

●What if S3 data was served at in-memory latency?

●What complexity could be reduced, how much money and time
could be saved?

But First… Power Trading 101

10k+ Tradeable Locations
Source: Yes Energy

Locational Marginal Price (LMP)
●Prices are the result of a huge optimization model solving for optimal dispatch (minimizing cost)

●Marginal generation unit sets price

Day-Ahead Versus Real Time Power
●Day-ahead (DA) power scheduled one day in advance

●Real-time (RT) power is scheduled every 5 minutes

●Trading the spread between DA and RT power
○Why are these markets different?

●Think of an air traffic controller scheduling planes one day in advance

Day-Ahead Power Trading Primer
●Daily blind auction

●Who participates:
○Utilities
○Asset owners
○Speculators (optimizers)

●Speculators physically change generator dispatch

Competitive Market
●Using the latest data
○Weather forecasts
○Renewables forecasts
○Outages
○Pricing
○Etc

●Latest renewables forecast comes out 30 minutes before
market close

Market Window Pressure Cooker
●30 minutes between last forecast and market close
=
●15 minutes to run inference across thousands of small ML models
+
●15 minutes to review, manually adjust for risk profiles and
human insights, and submit

The Feature Join Problem
Benchmark Join:
●20 feature tables, 4 columns from each table, plus 1 primary key: an 81-column result
●24 rows for inference and 70k rows for training

Join Example in SQL
SELECT
t1.col1, t1.col2, t1.col3, t1.col4,
t2.col1, t2.col2, t2.col3, t2.col4,
...
t20.col1, t20.col2, t20.col3, t20.col4
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.t1_id
JOIN table3 t3 ON t2.id = t3.t2_id
JOIN table4 t4 ON t3.id = t4.t3_id
...
JOIN table20 t20 ON t19.id = t20.t19_id;

S3 Becoming Inefficient
●Currently 5k models (~6/sec) -> growing to 100k (~110/sec)

●The benchmark join query averaged 3.8 seconds for inference

●Reading model artifacts and saving results took 3+ seconds

●~80% of the time was spent waiting on I/O

Why Have a Feature Store?
●Simplicity
○Ingest
○Query

●Metadata

●Feature Views

●Versioning

Standard Feature Store Pipeline
Source: https://www.tecton.ai/blog/what-is-a-feature-store/

Online Feature Stores
Pros:
●Very high performance inference
●Ingesting streaming data is simple

Cons:
●Limited data
●Complexity
○Data lifecycle
○Split sources of truth
●Expensive
●Training data serving is still very slow
●Volatile

Offline Feature Stores
Pros:
●Relatively cheap
●Low complexity
●Durable
●All data

Cons:
●Slow
●Ingesting streaming data is more complex

What if it wasn't slow?

What Makes Offline Slow?
1. Storage latency / bandwidth

2. Storage format

3. Feature joining

4. Post-join data transfer

What if we optimized for speed?

1. Solving Storage Latency / Bandwidth
[Alluxio enters the chat]
●Cache files on NVMe drives for low latency

●Bandwidth now constrained by EC2 instance types (how does 10 GB/s sound?)

●High availability + maintains S3 durability

●Scales linearly

●Lots of tuning options depending on workload
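
As a rough sketch of what this looks like from the client side: Alluxio can expose an S3-compatible endpoint, so a reader such as Polars can point at the cache instead of S3 itself. The endpoint URL, bucket, and credentials below are illustrative assumptions, not details from the talk.

import polars as pl

# Point the S3 client at the Alluxio cache rather than AWS. The
# endpoint and credentials here are placeholders (assumed setup).
storage_options = {
    "aws_endpoint_url": "http://alluxio-proxy:39999/api/v1/s3",
    "aws_access_key_id": "dummy",        # seen by the proxy, not AWS
    "aws_secret_access_key": "dummy",
    "aws_region": "us-east-1",
    "aws_allow_http": "true",            # plain HTTP inside the cluster
}

# Cold read: Alluxio fetches the blocks from S3 and caches them on NVMe.
# Hot read: the same call is served from the local cache at low latency.
df = pl.read_parquet(
    "s3://features/table1.parquet",
    storage_options=storage_options,
)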

2. Solving Storage Format
●Contenders: Parquet, Delta Lake, Iceberg, Hudi, Avro

●Parquet is easily the winner for speed; however:
○Can't handle concurrent writes
○Queries may see partial results (partitioned tables)
○No version history

Can we still make Parquet work?
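
One common way to square this circle (an assumption about the approach, not something stated in the slides) is to write each snapshot to a fresh file and publish it with an atomic rename of a small manifest, so readers see either the old version or the new one, never a partial file. This relies on a filesystem-like layer with atomic rename, which plain S3 does not provide.

import os
import uuid
import polars as pl

def publish_snapshot(df: pl.DataFrame, table_dir: str) -> str:
    # Each writer gets a unique file name, so concurrent writers
    # never clobber each other's Parquet output.
    version = uuid.uuid4().hex
    data_path = os.path.join(table_dir, f"data-{version}.parquet")
    df.write_parquet(data_path)

    # Publish by atomically renaming a tiny manifest that points at
    # the current snapshot; older snapshots remain as version history.
    tmp = os.path.join(table_dir, f"_MANIFEST.{version}")
    with open(tmp, "w") as f:
        f.write(data_path)
    os.replace(tmp, os.path.join(table_dir, "_MANIFEST"))
    return data_path

def read_snapshot(table_dir: str) -> pl.DataFrame:
    # Readers resolve the manifest first, so they never observe a
    # half-written Parquet file.
    with open(os.path.join(table_dir, "_MANIFEST")) as f:
        return pl.read_parquet(f.read().strip())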

3. Solving Multi-Join Performance
●Contenders: Spark, Trino, Flink, Dask, DuckDB, Pandas, Polars

●Polars and DuckDB are the fastest, with Polars the clear winner

●Ideally zero-copy for post-join data transfer
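
A minimal Polars sketch of the benchmark-style chained join follows; the file and column names mirror the SQL example above and are illustrative. The lazy API lets Polars plan the whole 20-table join before reading anything.

import polars as pl

# Lazily scan all 20 feature tables; no data is read until .collect().
tables = [pl.scan_parquet(f"table{i}.parquet") for i in range(1, 21)]

joined = tables[0]
left_key = "id"
for i in range(2, 21):
    # table{i} carries a foreign key t{i-1}_id back to the previous
    # table, as in the SQL example. Polars drops the right-hand join
    # key and suffixes colliding column names from the right table.
    joined = joined.join(
        tables[i - 1],
        left_on=left_key,
        right_on=f"t{i - 1}_id",
        how="inner",
        suffix=f"_t{i}",
    )
    left_key = f"id_t{i}"  # table{i}'s id column after suffixing

df = joined.collect()  # 24 rows for inference, ~70k rows for training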

4. Solving Post-Join Data Transfer
●Format contenders: JSON, Parquet, Arrow, others

●Arrow streaming over gRPC (HTTP/2) outperforms all the others, which is a huge time savings due to…

●The Arrow Flight server
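
A hedged sketch of the Flight piece (pyarrow.flight is the actual library; the port, ticket, and in-memory table below are illustrative): record batches stream over gRPC straight into Arrow memory on the client, with no JSON or Parquet re-encoding on the wire.

import pyarrow as pa
import pyarrow.flight as flight

class FeatureFlightServer(flight.FlightServerBase):
    """Serves join results as an Arrow record-batch stream over gRPC."""

    def __init__(self, location="grpc://0.0.0.0:8815"):
        super().__init__(location)

    def do_get(self, context, ticket):
        # In the real pipeline this would be the post-join frame,
        # converted to Arrow (Polars' df.to_arrow() is cheap since
        # Polars is already Arrow-backed).
        table = pa.table({"id": [1, 2, 3], "col1": [0.1, 0.2, 0.3]})
        return flight.RecordBatchStream(table)

# Client side: one call streams the whole result into an Arrow table.
# client = flight.connect("grpc://feature-server:8815")
# table = client.do_get(flight.Ticket(b"inference-batch")).read_all()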

Putting It Together
●Kubernetes
●High Availability
●Offline Low Latency
●Scalable
●Durable

Performance
Query Type*            Without Alluxio   With Alluxio      With Alluxio
                       Query (ms)        Cold Query (ms)   Hot Query (ms)
Inference (24 rows)    3,727             99                45
Training (70k rows)    3,841             171               104

* 20-table join, 4 columns per table, 1 primary key, 81-column result

Operational Payoff
●~60x reduction in latency for inference

●~30x reduction in latency for training

●Scaling from 5,000 to 100,000+ models in the same 15-minute window

●No online feature store necessary

●Low-latency training data

Q&A
●Discussion

●Blog Post:
https://www.alluxio.io/blog/blackout-power-trading-achieved-low-latency-offline-feature-store-performance-with-alluxio-caching

●Get in touch: linkedin.com/in/greg-lindstrom