Designing an Energy-efficient Architecture for Geo Databases by Yichen Wei

ScyllaDB, 32 slides, Oct 17, 2025

About This Presentation

The cost and efficiency of IP geolocation data services depend on careful system design. This talk covers how storage architecture, data compression, network optimization, memory layout, and CPU-level tuning all play a role. We also show how techniques like NUMA awareness, SIMD, zero-copy, and zero...


Slide Content

A ScyllaDB Community
Designing an Energy-efficient Architecture for Geo Databases
Yichen Wei
Engineer Manager

Yichen Wei
Engineer Manager
■HA Distributed Platform domains with machine learning and GenAI
■Evangelist of HPC, HTAP, Rust and FP/CT
■WASM/WASI, FaaS

Geo service

Challenge

■Critical Service
●Low latency
●High availability
●Backward and forward compatibility
■Cost
●Data size
●Query performance

Geo service workflow

Why not existing local databases?
Why not use SQLite, DuckDB, or RocksDB?
■Different use cases
●Daily/weekly full-size dumps
●No update operations
■Performance and cost concerns
●Read/Write amplification and memory footprint
●Long tail lookups

Solution

Geo service workflow

Optimization 1: Loading

Optimization 2: Parsing
■Encoding
●Dictionary encoding - Country, Region, City, Metro, Zip…
●Delta encoding - longitude and latitude
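
The two encodings named on this slide can be sketched as follows. This is a minimal illustration, not the speaker's implementation: the `Dict` type, the function names, and the choice to store coordinates as fixed-point `int32` values (degrees multiplied by 10000) are all assumptions made for the example.

```go
package main

// Dict maps repeated strings (country, region, city names, ...) to small
// integer IDs, so each record stores an ID instead of the full string.
// Hypothetical type, not taken from the slides.
type Dict struct {
	ids    map[string]uint32
	values []string
}

// NewDict returns an empty dictionary.
func NewDict() *Dict {
	return &Dict{ids: make(map[string]uint32)}
}

// Encode returns the ID for s, assigning the next free ID on first sight.
func (d *Dict) Encode(s string) uint32 {
	if id, ok := d.ids[s]; ok {
		return id
	}
	id := uint32(len(d.values))
	d.ids[s] = id
	d.values = append(d.values, s)
	return id
}

// Decode returns the string stored under id.
func (d *Dict) Decode(id uint32) string {
	return d.values[id]
}

// DeltaEncode keeps the first value as-is and stores every later value as
// the difference from its predecessor. Nearby coordinates give small deltas.
// Assumption: coordinates are fixed-point integers (degrees * 10000).
func DeltaEncode(vals []int32) []int32 {
	out := make([]int32, len(vals))
	var prev int32
	for i, v := range vals {
		out[i] = v - prev
		prev = v
	}
	return out
}

// DeltaDecode reverses DeltaEncode with a running sum.
func DeltaDecode(deltas []int32) []int32 {
	out := make([]int32, len(deltas))
	var prev int32
	for i, d := range deltas {
		prev += d
		out[i] = prev
	}
	return out
}
```

Small IDs and small deltas fit in narrower integer types, which is where the space saving would come from; the slides do not say which widths are used.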

Optimization 2: Parsing
■Memory layout and alignment
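
As a rough illustration of why field order matters for memory layout, the sketch below compares two Go structs holding the same fields. The record shape and field names are hypothetical; the slides do not show the actual record format.

```go
package main

import "unsafe"

// PaddedRecord orders its fields carelessly, so the compiler inserts
// padding to keep StartIP and CityID on their natural alignment.
// Hypothetical record shape, for illustration only.
type PaddedRecord struct {
	CountryID uint8  // offset 0, followed by 3 bytes of padding
	StartIP   uint32 // offset 4
	RegionID  uint8  // offset 8, followed by 1 byte of padding
	CityID    uint16 // offset 10
}

// PackedRecord holds the same fields sorted from widest to narrowest,
// which leaves no padding between them.
type PackedRecord struct {
	StartIP   uint32 // offset 0
	CityID    uint16 // offset 4
	CountryID uint8  // offset 6
	RegionID  uint8  // offset 7
}

// RecordSizes reports the in-memory size of each layout in bytes.
func RecordSizes() (padded, packed uintptr) {
	return unsafe.Sizeof(PaddedRecord{}), unsafe.Sizeof(PackedRecord{})
}
```

With the standard Go compiler the first layout takes 12 bytes and the second 8, so a third of the memory and cache traffic per record is saved by reordering alone.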

Optimization 3: Lookup

Optimization 3: Lookup
Chunk Size
■Graviton instance
●L1: 64K
●L2: 1M
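
One way cache sizes can drive a chunk size is sketched below. The slide gives only the L1 and L2 figures, so everything else here is an assumption: that keys are sorted IPv4 range starts stored as `uint32`, that a chunk is sized to fill L1, and that lookup is a two-level binary search. The `ChunkedIndex` type and its methods are hypothetical.

```go
package main

import "sort"

const (
	L1Bytes      = 64 * 1024          // L1 size from the slide
	KeyBytes     = 4                  // one IPv4 address stored as uint32
	KeysPerChunk = L1Bytes / KeyBytes // keys that fit in one L1-sized chunk
)

// ChunkedIndex splits a sorted key array into fixed-size chunks and keeps
// the first key of every chunk in a small top-level array.
type ChunkedIndex struct {
	starts []uint32 // sorted range start addresses
	firsts []uint32 // firsts[i] == starts[i*chunk]
	chunk  int      // keys per chunk, must be > 0
}

// NewChunkedIndex builds the top-level array for the given chunk size.
func NewChunkedIndex(starts []uint32, chunk int) *ChunkedIndex {
	idx := &ChunkedIndex{starts: starts, chunk: chunk}
	for i := 0; i < len(starts); i += chunk {
		idx.firsts = append(idx.firsts, starts[i])
	}
	return idx
}

// Lookup returns the position of the last start address <= ip,
// or -1 if ip is smaller than every start address.
func (c *ChunkedIndex) Lookup(ip uint32) int {
	// Step 1: pick the chunk by searching only the small firsts array.
	ci := sort.Search(len(c.firsts), func(i int) bool { return c.firsts[i] > ip }) - 1
	if ci < 0 {
		return -1
	}
	// Step 2: binary search inside that single chunk.
	lo := ci * c.chunk
	hi := lo + c.chunk
	if hi > len(c.starts) {
		hi = len(c.starts)
	}
	part := c.starts[lo:hi]
	off := sort.Search(len(part), func(i int) bool { return part[i] > ip }) - 1
	return lo + off
}
```

Under these assumptions a chunk holds 16384 keys, and the second search touches only memory that fits in L1. A real deployment would likely leave headroom for other data in the cache.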

Optimization 3: Lookup
SIMD

Optimization 3: Lookup
SIMD
■Truth table
●16-byte vector - 4 IPv4 values

IP1 IP2 IP3 IP4
T T T T
T T T F
T T F F
T F F F
F F F F
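
The truth table can be read as the result of comparing four sorted IPv4 values against a lookup key in one vector operation. The sketch below emulates that with a scalar loop in plain Go, since it is meant to show the logic rather than real SIMD. The comparison direction (`<=`) and the function names are assumptions; the slide shows only the T/F patterns.

```go
package main

import "math/bits"

// CompareLanes mimics one 16-byte vector compare over four uint32 lanes:
// bit i of the result is set when lanes[i] <= key.
// A real SIMD version would do all four comparisons in one instruction.
func CompareLanes(lanes [4]uint32, key uint32) uint8 {
	var mask uint8
	for i, v := range lanes {
		if v <= key {
			mask |= 1 << uint(i)
		}
	}
	return mask
}

// LanesAtOrBelow returns how many lanes are <= key (0 to 4).
// Because the lanes are sorted, the true bits are always contiguous from
// lane 0, so only five masks can occur, matching the five table rows.
func LanesAtOrBelow(lanes [4]uint32, key uint32) int {
	return bits.OnesCount8(CompareLanes(lanes, key))
}
```

The count of true lanes tells the search which of the four values brackets the key, so one vector compare replaces several branchy scalar comparisons.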

Optimization 3: Lookup
SIMD
■Does Golang have SIMD support?
●Yes and No

Optimization 3: Lookup
SIMD
■Cgo is not Go!

Optimization 3: Lookup
SIMD
■https://github.com/alivanz/go-simd
■`//go:linkname` directive to call C code, bypassing cgo

Optimization 3: Lookup - One More Thing
"Premature optimization is the root of all evil"

Optimization 3: Lookup - One More Thing
Understand the compiler before optimizing!

Looking Forward

■Compression & encoding
■Histogram Distribution
■Data prefetching
■Change Language

Takeaways

■No one knows your data better than you
■Don’t be afraid to build the infrastructure for your data
■Optimization is fun
●… and hard
■Golang doesn’t support SIMD well

Thank you! Let’s connect.
Yichen Wei