Bridging epoll and io_uring in Async Rust by Tzu Gwo

ScyllaDB · 23 slides · Oct 17, 2025

About This Presentation

Tokio dominates async Rust, but its epoll-based model makes io_uring hard to adopt. This talk explains why async Rust’s design creates that friction and introduces an approach that supports both I/O models: switching runtimes at compile time. With this method, I/O middleware can work with epoll and io_uring alike.


Slide Content

A ScyllaDB Community
Bridging epoll and io_uring
in Async Rust
Tzu Gwo
Co-founder, CEO

Tzu Gwo (he/him/his)

Co-founder, CEO
■Worked on the infra team at ByteDance, delivering
10+ PB/day of time-series data processing
■Love Sci-Fi, big fan of Peter Watts
■Find me on Twitter: @yochowgwo
■Founded tonbo.io in 2024

Background & Challenge

io_uring: unified disk and network I/O in an async way
■One async interface for both sockets and files
●network and storage I/O use the same completion queue.
■No thread-pool detour for files
●disk I/O can be issued asynchronously just like sockets.
■Other benefits
●Lower syscall and context-switch overhead
●Lower tail latency
●…

Async I/O in Rust Today
Unified Traits, Hidden Limitations

io_uring supports async I/O for both disk and network,
but not in the same shape as poll-based APIs
compio
monoio
tokio_uring

Why io_uring Doesn’t Fit Today’s Async Rust
■epoll: “Reading into the buffer still happens synchronously when you poll.”
●The kernel does not proactively fill the buffer.
●After user code is woken, it still has to call read(), and at that moment the data is
synchronously copied into the user-provided buffer.

■io_uring: “The kernel may fill the buffer asynchronously at any time, and you
only get a completion event once it’s done.”
●When submitting an SQE, the user already hands the buffer address to the kernel.
●The kernel or DMA can fill this buffer at any time after submission.
●Once complete, the kernel signals via a CQE and wakes the user-space Future.
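The contrast above comes down to buffer ownership, which a pair of illustrative trait signatures can make concrete. This is a std-only sketch: the trait and type names below are hypothetical, not from Tokio or any io_uring crate.

```rust
use std::io;

// Readiness model (epoll-style): the caller lends a buffer, and the
// copy into it happens synchronously inside `read` after a wakeup.
trait ReadinessRead {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize>;
}

// Completion model (io_uring-style): the caller hands over an owned
// buffer at submission time; the kernel may fill it at any moment,
// so ownership only comes back together with the result.
trait CompletionRead {
    fn read(&mut self, buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>);
}

// A toy in-memory source implementing both, to show the difference in use.
struct Source(Vec<u8>);

impl ReadinessRead for Source {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(self.0.len());
        buf[..n].copy_from_slice(&self.0[..n]);
        Ok(n)
    }
}

impl CompletionRead for Source {
    fn read(&mut self, mut buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>) {
        let n = buf.len().min(self.0.len());
        buf[..n].copy_from_slice(&self.0[..n]);
        (Ok(n), buf) // buffer ownership returns with the completion
    }
}

fn main() {
    let mut src = Source(b"hello".to_vec());

    // Readiness: the borrow of `lent` ends when `read` returns.
    let mut lent = [0u8; 8];
    let n = ReadinessRead::read(&mut src, &mut lent).unwrap();
    assert_eq!(&lent[..n], b"hello");

    // Completion: the buffer is gone until the operation hands it back.
    let (res, owned) = CompletionRead::read(&mut src, vec![0u8; 8]);
    let n = res.unwrap();
    assert_eq!(&owned[..n], b"hello");
}
```

The readiness signature cannot express io_uring safely: once an SQE is submitted, the kernel may write into the buffer after the `&mut [u8]` borrow has ended, which is why completion-style APIs take buffers by value.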

Existing Approaches

Tokio-uring’s approach
■A standalone runtime crate by Tokio team.
■Pros:
●Familiar to Tokio users, official experiment.
■Cons:
●Different API set (not drop-in).
●Low maintenance in recent years.

Monoio’s approach
■Built around io_uring from day one.
■Pros:
●High performance, no blocking detour.
●Modern async design.
■Cons:
●Incompatible with Tokio ecosystem (most crates expect tokio::io).
●Hard to reuse existing async Rust libraries.

Neon Hybrid
■Mechanism: io_uring → eventfd → epoll → Tokio AsyncFd → Future.
■Pros:
●Keeps Tokio as the executor.
●Eliminates spawn_blocking overhead for file I/O.
■Cons:
●Still needs buffer copies.
●Doesn’t expose registered buffer / batch features.
■Diagram: SQE → io_uring → eventfd → epoll → Tokio reactor → Future.

Tokio Official Integration
■Ongoing effort: integrate io_uring support into tokio::fs.
■Current status:
●Basic file ops (open/read/write) under development.
●Advanced features (registered buffers, batch submit, sharding) are future work.
■Implication:
●Goal is transparent replacement, but currently limited.

Our Approach: fusio

Fusio — Compile-time Switch
■Core idea:
●Define I/O traits (e.g. ReadAt, WriteAll) that abstract over the backend.
●Provide multiple backend implementations: Tokio (epoll), Tokio-uring, Monoio, etc.
●Use Cargo features and type aliases to select the backend at compile time.
■Takeaway: One API, multiple I/O engines. No app code changes.

Fusio — Compile-time Switch
■With --features=tokio → runs on epoll + spawn_blocking.
■With --features=tokio-uring → runs on io_uring completion.
■App code unchanged.

Pros & Cons
■Pros
●Clean abstraction, no runtime hacks.
●Middleware/app code unchanged.
●Cross-platform: use epoll where io_uring unavailable.
■Cons
●Backend chosen at compile time (not runtime).
●Advanced io_uring features (registered buffers, O_DIRECT, batching) still require new APIs.

Value for Databases
■Why it matters:
●Storage engines and DB middleware can target Fusio traits, not Tokio directly.
●Get io_uring benefits on Linux 5.10+ without rewriting code.
●Still compatible with existing Tokio ecosystem (when built with epoll backend).
■Takeaway:
●“Fusio makes async Rust storage code I/O-agnostic at compile time.”

Beyond fusio

The Buffer Problem
■Current async traits
●AsyncRead/Write expect caller-provided &mut [u8].
●Works for epoll (readiness: copy happens synchronously).
■Problem with io_uring
●Kernel may fill buffer at any time after submission.
●Runtime must guarantee buffer lifetime, alignment, pinning.
■Implication
●If we stick to current API, runtime has to copy from its own buffer → extra overhead.
■Takeaway: To fully unlock io_uring, we need new buffer semantics.

Runtime-owned, Refcounted Buffers
■Idea: Runtime manages a pool of pinned/aligned buffers.
■API sketch:
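One possible shape for such an API, as a std-only sketch: `Pool` and `Buf` are hypothetical names, and a real implementation would pin and align the memory and register it with io_uring rather than using plain `Vec<u8>`.

```rust
use std::sync::{Arc, Mutex};

// A toy runtime-owned buffer pool. Plain Vec<u8> stands in for
// pinned, aligned, io_uring-registered memory; only the ownership
// and RAII shape are the point here.
struct Pool {
    free: Mutex<Vec<Vec<u8>>>,
}

// RAII handle: holding a `Buf` keeps the kernel-visible memory alive;
// dropping it returns the memory to the pool instead of freeing it.
struct Buf {
    data: Option<Vec<u8>>,
    pool: Arc<Pool>,
}

impl Pool {
    fn new(count: usize, buf_size: usize) -> Arc<Pool> {
        Arc::new(Pool {
            free: Mutex::new((0..count).map(|_| vec![0u8; buf_size]).collect()),
        })
    }

    // Hand out a buffer, or None if the pool is exhausted.
    fn acquire(pool: &Arc<Pool>) -> Option<Buf> {
        pool.free.lock().unwrap().pop().map(|data| Buf {
            data: Some(data),
            pool: Arc::clone(pool),
        })
    }
}

impl Buf {
    fn as_mut_slice(&mut self) -> &mut [u8] {
        self.data.as_mut().unwrap()
    }
}

impl Drop for Buf {
    fn drop(&mut self) {
        if let Some(data) = self.data.take() {
            self.pool.free.lock().unwrap().push(data); // recycle, don't free
        }
    }
}

fn main() {
    let pool = Pool::new(2, 4096);
    {
        let mut a = Pool::acquire(&pool).unwrap();
        let _b = Pool::acquire(&pool).unwrap();
        assert!(Pool::acquire(&pool).is_none()); // exhausted while buffers live
        a.as_mut_slice()[0] = 42; // pretend the kernel filled this
    } // both Bufs drop here and return to the pool
    assert_eq!(pool.free.lock().unwrap().len(), 2);
}
```

Because the pool, not the caller, owns the memory, the runtime can guarantee the buffer outlives any in-flight SQE and hand the same registered buffers to io_uring repeatedly.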



■Buf properties
●Safe (RAII, pinned, refcounted).
●Can be pre-registered with io_uring.
●Reusable, batch-friendly, O_DIRECT compatible.
■Pros:
●True zero-copy, exploit io_uring’s strengths (registered buffers, batching).
■Cons:
●Diverges from today’s AsyncRead/Write API.
●Requires ecosystem adoption.
■Diagram: Buf pool → submit SQE → kernel fills → CQE → return Buf.

Conclusion

Conclusion
■Async Rust today → epoll-based, file I/O inconsistent.
■Existing approaches → Monoio (fast but isolated), Tokio-uring (separate
runtime), Neon hybrid (bridge in Tokio), Tokio official (in progress).
■Fusio → Clean compile-time abstraction, same API for epoll/io_uring,
middleware/app code unchanged.
■Beyond Fusio → To fully unlock io_uring: runtime-owned buffer APIs.

Thank you! Let’s connect.
Tzu Gwo
[email protected]
@yochowgwo
https://tonbo.io/blogs