Bridging epoll and io_uring in Async Rust by Tzu Gwo
ScyllaDB
Oct 17, 2025
About This Presentation
Tokio dominates async Rust, but its epoll-based model makes it hard to adopt io_uring. This talk explains why async Rust’s design creates that friction and introduces an approach to support both I/O models: switching runtimes at compile time. With this method, I/O middleware can work with epoll and io_uring without changing its code.
Slide Content
A ScyllaDB Community
Bridging epoll and io_uring in Async Rust
Tzu Gwo
Co-founder, CEO
Tzu Gwo (he/him/his)
Co-founder, CEO
■Worked on the infra team at ByteDance, delivering time-series data processing at 10+ PB per day
■Love Sci-Fi, big fan of Peter Watts
■Find me on Twitter: @yochowgwo
■Founded tonbo.io in 2024
Background & Challenge
io_uring: unified async disk and network I/O
■One async interface for both sockets and files
●network and storage I/O use the same completion queue.
■No thread-pool detour for files
●disk I/O can be issued asynchronously just like sockets.
■Other benefits
●Lower syscall and context-switch overhead
●Lower tail latency
●…
Async I/O in Rust Today
Unified Traits, Hidden Limitations
io_uring supports async I/O for both disk and network, but not in the same way as poll-based APIs
compio
monoio
tokio_uring
Why io_uring Doesn’t Fit Today’s Async Rust
■epoll: “Reading into the buffer still happens synchronously when you poll.”
●The kernel does not proactively fill the buffer.
●After user code is woken, it still has to call read(), and at that moment the data is synchronously copied into the user-provided buffer.
■io_uring: “The kernel may fill the buffer asynchronously at any time, and you only get a completion event once it’s done.”
●When submitting an SQE, the user already hands the buffer address to the kernel.
●The kernel or DMA can fill this buffer at any time after submission.
●Once complete, the kernel signals via a CQE and wakes the user-space Future.
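The difference between the two models shows up in who holds the buffer while the operation runs. A minimal sketch, using only the standard library and synchronous stand-ins: `read_when_ready` and `submit_read` are hypothetical names, not functions from Tokio or any io_uring crate, and a real completion-based runtime would return a future instead of completing inline.

```rust
use std::io::{self, Read};

/// Readiness style (epoll): the caller lends `buf` only for the duration of
/// the call, and the copy into `buf` happens synchronously inside it.
fn read_when_ready<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    src.read(buf)
}

/// Completion style (io_uring): the buffer is moved into the operation at
/// submission and handed back together with the result on completion.
/// Hypothetical shape; here the "kernel" work is simulated with a plain read.
fn submit_read<R: Read>(src: &mut R, mut buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>) {
    let res = src.read(&mut buf);
    (res, buf)
}

/// Runs both styles against the same in-memory source.
fn demo() -> (usize, usize, Vec<u8>) {
    let mut a: &[u8] = b"hello";
    let mut borrowed = [0u8; 8];
    let n1 = read_when_ready(&mut a, &mut borrowed).unwrap();

    let mut b: &[u8] = b"hello";
    let (res, owned) = submit_read(&mut b, vec![0u8; 8]);
    let n2 = res.unwrap();
    (n1, n2, owned[..n2].to_vec())
}
```

Passing the buffer by value is one way to keep it alive and untouched by user code while the kernel may still write into it; a borrowed `&mut [u8]` cannot express that guarantee across an await point.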
Existing Approaches
Tokio-uring’s approach
■A standalone runtime crate by the Tokio team.
■Pros:
●Familiar to Tokio users, official experiment.
■Cons:
●Different API set (not drop-in).
●Little maintenance activity in recent years.
Monoio’s approach
■Built around io_uring from day one.
■Pros:
●High performance, no blocking detour.
●Modern async design.
■Cons:
●Incompatible with Tokio ecosystem (most crates expect tokio::io).
●Hard to reuse existing async Rust libraries.
Neon Hybrid
■Mechanism: io_uring → eventfd → epoll → Tokio AsyncFd → Future.
■Pros:
●Keeps Tokio as the executor.
●Eliminates spawn_blocking overhead for file I/O.
■Cons:
●Still needs buffer copies.
●Doesn’t expose registered buffer / batch features.
■Diagram: show SQE -> uring -> eventfd -> epoll -> Tokio reactor -> Future.
Tokio Official Integration
■Ongoing effort: integrate io_uring support into tokio::fs.
■Current status:
●Basic file ops (open/read/write) under development.
●Advanced features (registered buffers, batch submit, sharding) are future work.
■Implication:
●Goal is transparent replacement, but currently limited.
Our Approach: fusio
Fusio — Compile-time Switch
■Core idea:
●Define I/O traits (e.g. ReadAt, WriteAll) that abstract over the backend.
●Provide multiple backend implementations: Tokio (epoll), Tokio-uring, Monoio, etc.
●Use Cargo features and type aliases to select the backend at compile time.
■Takeaway: One API, multiple I/O engines. No app code changes.
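The compile-time switch can be sketched with a trait, two backends, and a feature-gated type alias. Everything below is an assumption made for illustration: the trait is a simplified, synchronous stand-in named after the `ReadAt` example above (its signature is not fusio's actual API), `EpollFile` and `UringFile` are hypothetical in-memory backends, and the `uring` feature is assumed to be declared in Cargo.toml as `uring = []`.

```rust
use std::io;

/// Simplified positional-read trait; the signature is assumed for this sketch.
pub trait ReadAt {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<usize>;
}

/// Hypothetical stand-in for an epoll-based (Tokio) backend.
pub struct EpollFile { data: Vec<u8> }
/// Hypothetical stand-in for an io_uring-based backend.
pub struct UringFile { data: Vec<u8> }

/// Shared helper: copy as many bytes as fit, starting at `offset`.
fn copy_at(data: &[u8], buf: &mut [u8], offset: usize) -> usize {
    let start = offset.min(data.len());
    let n = buf.len().min(data.len() - start);
    buf[..n].copy_from_slice(&data[start..start + n]);
    n
}

impl ReadAt for EpollFile {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<usize> {
        Ok(copy_at(&self.data, buf, offset))
    }
}

impl ReadAt for UringFile {
    fn read_at(&self, buf: &mut [u8], offset: usize) -> io::Result<usize> {
        Ok(copy_at(&self.data, buf, offset))
    }
}

// The Cargo feature picks the concrete type; code below only names `File`.
#[cfg(feature = "uring")]
pub type File = UringFile;
#[cfg(not(feature = "uring"))]
pub type File = EpollFile;

pub fn open_in_memory(data: &[u8]) -> File {
    File { data: data.to_vec() }
}

/// Application code: identical under either backend.
pub fn read_header(file: &File) -> io::Result<[u8; 4]> {
    let mut header = [0u8; 4];
    file.read_at(&mut header, 0)?;
    Ok(header)
}
```

Building with `cargo build --features uring` would swap the alias without touching `read_header`. Because the alias is resolved at compile time, there is no dynamic dispatch, which also means the backend cannot change at runtime.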
Pros & Cons
■Pros
●Clean abstraction, no runtime hacks.
●Middleware/app code unchanged.
●Cross-platform: fall back to epoll where io_uring is unavailable.
■Cons
●Backend chosen at compile time (not runtime).
●Advanced io_uring features (registered buffers, O_DIRECT, batching) still require new APIs.
Value for Databases
■Why it matters:
●Storage engines and DB middleware can target Fusio traits, not Tokio directly.
●Get io_uring benefits on Linux 5.10+ without rewriting code.
●Still compatible with the existing Tokio ecosystem (when built with the epoll backend).
■Takeaway:
●“Fusio makes async Rust storage code I/O-agnostic at compile time.”
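To illustrate what "targeting traits, not Tokio directly" could look like for storage code, here is a minimal sketch of a log-append routine that is generic over a write trait. The trait is a hypothetical, synchronous simplification loosely named after the `WriteAll` example mentioned earlier, and `MemFile` and `append_record` are invented for this example; none of these are fusio's actual definitions.

```rust
use std::io;

/// Hypothetical sequential-write trait; synchronous for brevity.
pub trait WriteAll {
    fn write_all(&mut self, data: &[u8]) -> io::Result<()>;
}

/// In-memory backend standing in for a real epoll or io_uring file.
#[derive(Default)]
pub struct MemFile {
    pub bytes: Vec<u8>,
}

impl WriteAll for MemFile {
    fn write_all(&mut self, data: &[u8]) -> io::Result<()> {
        self.bytes.extend_from_slice(data);
        Ok(())
    }
}

/// Storage-engine code: appends a length-prefixed record to a log.
/// It names only the trait, so any backend implementing it can be plugged in.
pub fn append_record<W: WriteAll>(log: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = payload.len() as u32;
    log.write_all(&len.to_le_bytes())?; // 4-byte little-endian length prefix
    log.write_all(payload)
}
```

The same `append_record` would compile unchanged against any other type that implements the trait, which is the property the slide claims for middleware.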
Beyond fusio
The Buffer Problem
■Current async traits
●AsyncRead/AsyncWrite operate on caller-provided, borrowed buffers (e.g. &mut [u8] for reads).
●Works for epoll (readiness: copy happens synchronously).
■Problem with io_uring
●Kernel may fill buffer at any time after submission.
●Runtime must guarantee buffer lifetime, alignment, pinning.
■Implication
●If we stick to the current API, the runtime has to read into its own buffer and then copy into the caller's → extra overhead.
■Takeaway: To fully unlock io_uring, we need new buffer semantics.
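The extra copy can be made concrete with a small adapter. This is a sketch under stated assumptions: `CompletionSource` is a hypothetical, in-memory stand-in for an io_uring-style backend that only accepts owned buffers, and `BorrowedAdapter` is an invented wrapper that exposes the familiar borrowed-buffer read on top of it. Neither corresponds to a type in Tokio, tokio-uring, or fusio.

```rust
use std::io;

/// Hypothetical completion-style source: it takes ownership of the buffer
/// for the duration of the operation, as an io_uring-backed runtime would.
pub struct CompletionSource {
    data: Vec<u8>,
    pos: usize,
}

impl CompletionSource {
    pub fn new(data: &[u8]) -> Self {
        CompletionSource { data: data.to_vec(), pos: 0 }
    }

    /// Fills `buf` and hands it back with the number of bytes read.
    pub fn read_owned(&mut self, mut buf: Vec<u8>) -> (io::Result<usize>, Vec<u8>) {
        let n = buf.len().min(self.data.len() - self.pos);
        buf[..n].copy_from_slice(&self.data[self.pos..self.pos + n]);
        self.pos += n;
        (Ok(n), buf)
    }
}

/// Adapter that keeps the borrowed-buffer API by staging data internally.
pub struct BorrowedAdapter {
    inner: CompletionSource,
    staging: Vec<u8>,
    /// Bytes copied from the staging buffer into caller buffers so far.
    pub copied_bytes: usize,
}

impl BorrowedAdapter {
    pub fn new(inner: CompletionSource, staging_len: usize) -> Self {
        BorrowedAdapter { inner, staging: vec![0; staging_len], copied_bytes: 0 }
    }

    pub fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Step 1: the backend fills the runtime-owned staging buffer.
        let mut staging = std::mem::take(&mut self.staging);
        staging.resize(out.len(), 0);
        let (res, staging) = self.inner.read_owned(staging);
        // Step 2: the copy that the borrowed-buffer API forces on us.
        if let Ok(n) = res {
            out[..n].copy_from_slice(&staging[..n]);
            self.copied_bytes += n;
        }
        self.staging = staging;
        res
    }
}
```

Every byte delivered through `read` is counted in `copied_bytes`, which makes the overhead visible: it is exactly the copy that an API handing out runtime-owned buffers would avoid.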
Runtime-owned, Refcounted Buffers
■Idea: Runtime manages a pool of pinned/aligned buffers.
■API sketch: