Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
sean_seannery
74 slides
Nov 18, 2015
About This Presentation
This talk was given for the Scalable Internet Services master's-level Computer Science class at UCLA and UCSB. It briefly covers the server architecture for the game League of Legends, then goes into depth on how the data warehouse holds petabytes of player data. Message queue architecture and scalability are discussed along the way.
Size: 5.23 MB
Language: en
Added: Nov 18, 2015
Slides: 74 pages
Slide Content
MOVING MOUNTAINS OF PLAYER DATA
SEAN MALONEY
RIOT GAMES
@SEAN_SEANNERY
SCALABLE INTERNET SERVICES
UCLA/UCSB - NOV 2015
SEAN MALONEY
BIG DATA ENGINEER
WHO IS THIS GUY?
Lead developer on Riot’s ETL tools
FUN FACT: Was a student in this class 4 years ago
Intern at Appfolio
MOVING MOUNTAINS OF DATA
1. INTRODUCTION
2. THE GAME PLATFORM: OUR MAIN DATA SOURCE
3. HOW WE INGEST AND QUERY DATA
4. HOW WE SCALE IN AWS
5. CONCLUSION - SEAN’S PRO TIPS
INTRODUCTION
WHAT IS LEAGUE OF LEGENDS?
2009 LAUNCH
ONLINE MULTIPLAYER
WINDOWS / OSX
40-50 MIN GAMES
THE TEAM
YOUR CHAMP
THE BATTLE GROUND
THE GAME PLATFORM
THE CLIENT.
[Diagram: game platform architecture]
Load Balancers and Firewalls
Services: CHAT, STORE, AUDIT, GAME, ETC.
ORACLE COHERENCE (IN MEMORY DB)
PRIMARY DB
HOT BACKUP DB
2nd BACKUP DB / ETL
OTHER DATA SOURCES
<REST>
DATA INGESTION
INGESTION: PUSH-BASED, PULL-BASED / ETL
STORAGE: MASTER WAREHOUSE
QUERY / VIEWS: BATCH QUERIES, SINGLE-ROW QUERIES, AGGREGATE QUERIES
VIZ. TOOLS
FuETL
- OLTP game data
- External Data Sources
HONU
- Anything pushed to it
- Server logs
DATA AUDITING
FuETL
- OLTP game data
- External Data Sources
Distributed ETL Software written in Ruby.
Scales Horizontally
Same ETL applied to multiple regions / datacenters
Self-Service UI with SQL query templating.
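One way to picture "same ETL applied to multiple regions" with SQL query templating is the minimal Python sketch below. It is an illustration only, not FuETL's actual code (FuETL is written in Ruby). The template text, the table name `game_stats`, and the region-to-schema mapping are all hypothetical.

```python
from string import Template

# Hypothetical template: the ${...} placeholders are filled per region and per run.
QUERY_TEMPLATE = Template(
    "SELECT * FROM ${schema}.game_stats "
    "WHERE updated_at >= '${start}' AND updated_at < '${end}'"
)

# Hypothetical region-to-schema mapping (names are illustrative only).
REGIONS = {"na": "platform_na", "kr": "platform_kr", "ru": "platform_ru"}


def render_queries(start, end):
    """Render one concrete query per region from the single shared template."""
    return {
        region: QUERY_TEMPLATE.substitute(schema=schema, start=start, end=end)
        for region, schema in REGIONS.items()
    }


queries = render_queries("2015-11-17", "2015-11-18")
for region, sql in queries.items():
    print(region, "->", sql)
```

The point of the design is that an ETL is defined once and only the substituted values differ per region. Plain string substitution is safe here only because the values are trusted. A real tool would likely bind user-supplied values as query parameters instead.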
NA Korea Russia
Create an ETL
FUETL CAN CONNECT TO
- Amazon S3
- SQS
- (S)FTP
- Hive
- Microsoft SQL Server
- MySQL
- DynamoDB
- Vertica
- Redshift
- REST websites
Create an ETL
[Diagram: FuETL application architecture]
Webapp
- View (backbone.js, Bootstrap CSS)
- Task / Helper / Controllers
Command Line Tool
Scheduler Process
Worker Process
Core Libraries
- Task Service, Tasks, Task DAO
- Helper Service, Helpers, Helper DAO
- Environment Service, Environment DAO
- Env. Task DAO, Env. Helper DAO
FuETL STATISTICS
14 TB DATA MOVED DAILY
5213 ACTIVE REGIONAL ETLS
23125 DAILY ETL RUNS
FuETL SCALING
Idempotency
Idempotent: an operation that produces the same result whether it is executed once or multiple times
EXAMPLE:
Non-Idempotent:
- x = x * 5;
- Submitting a purchase
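The difference matters for ETL because a failed or retried run should not change the final state. Below is a minimal Python sketch under stated assumptions: the in-memory "tables", the partition key, and the function names are hypothetical. The idempotent counterparts (plain assignment, overwriting a partition) are illustrative choices and do not come from the slides.

```python
x = 2
x = x * 5          # non-idempotent: each repeat changes the result again
x = x * 5
y = 5              # idempotent (assumed counterpart): repeating changes nothing
y = 5


def append_load(table, rows):
    """Non-idempotent load: every re-run appends the same rows again."""
    table.extend(rows)


def partition_overwrite_load(table, partition_key, rows):
    """Idempotent load: replace the whole partition, so re-runs converge."""
    table[partition_key] = list(rows)


rows = [{"player": 1, "games": 3}, {"player": 2, "games": 5}]

appended = []
append_load(appended, rows)
append_load(appended, rows)  # a simulated retry duplicates the data

partitioned = {}
partition_overwrite_load(partitioned, "2015-11-18", rows)
partition_overwrite_load(partitioned, "2015-11-18", rows)  # retry is harmless

print(len(appended))                   # 4
print(len(partitioned["2015-11-18"]))  # 2
```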
DATA AUDITING
REST micro-service built with Java and Docker.
Reports and visualizations we can use to find problems.
Source and target comparison.
[Diagram: Platform, Auditing Service, Warehouse]
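A source and target comparison can be sketched as follows. This is a hedged illustration in Python, not the auditing service itself (which the slides describe as a Java micro-service). The row format, the function names, and the choice of a row count plus an order-independent checksum are all assumptions.

```python
import hashlib

MODULUS = 2 ** 256


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    The checksum adds per-row SHA-256 digests modulo 2**256, so it does not
    depend on the order in which rows are read.
    """
    count, acc = 0, 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        acc = (acc + int.from_bytes(digest, "big")) % MODULUS
        count += 1
    return count, acc


def audit(source_rows, target_rows):
    """Compare source and target fingerprints and report the outcome."""
    src = table_fingerprint(source_rows)
    tgt = table_fingerprint(target_rows)
    return {"source_count": src[0], "target_count": tgt[0], "match": src == tgt}


source = [(1, "win"), (2, "loss"), (3, "win")]
target_ok = [(3, "win"), (1, "win"), (2, "loss")]   # same rows, different order
target_bad = [(1, "win"), (2, "loss")]              # one row missing

print(audit(source, target_ok))
print(audit(source, target_bad))
```

Counts catch missing or duplicated rows cheaply, while the checksum catches rows whose values differ. In practice each side would compute its fingerprint close to the data, so only the small summary crosses the network.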
HOW TO AUDIT
VISUALIZING
BATCH / OLAP / POINT
SCALING IN AWS
RESOURCE CONTENTION
SCALING
RDS
AWS Infrastructure Today
[Diagram; labels listed by column where recoverable]
EMR / EC2: Data Science, Analytics / Hue, ETL, Telemetry, Platfora, DynamoDB Loading, Auditing, ETL, Telemetry collectors, Data dictionary, Rocana (real time dashboard), Solr (real time), Point Data Service
Storage: Metastore, Data Science, Fraud, DYNAMODB, ETL App DB, Point Data Store, S3 (Source of “Truth”)
Networking: VPC, AWS Direct Connect
CONCLUSION
SEAN’S PRO TIPS OF THE DAY
DO
➔ Get an auditing solution for DW accuracy
➔ Allocate time for tuning AWS infrastructure
➔ Prepare for multiple data access patterns
➔ Keep idempotency in mind and use MQ architecture
DON’T
➔ Don’t wait. Create S3 permissions and naming standards early
➔ Don’t forget to track cost. AWS bills can surprise you
➔ Don’t underestimate simple problems in big data.
➔ Don’t stop. Believing
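The tip about combining idempotency with an MQ architecture can be illustrated with a small consumer that tolerates duplicate deliveries. This is a sketch under assumptions: Python's in-process `queue.Queue` stands in for a real message queue such as SQS, and the message fields (`id`, `key`, `value`) are hypothetical.

```python
import queue


def process_messages(mq, store, seen_ids):
    """Drain the queue, applying each message at most once per message id."""
    while True:
        try:
            msg = mq.get_nowait()
        except queue.Empty:
            break
        if msg["id"] in seen_ids:
            continue  # duplicate delivery: already applied, so skip it
        store[msg["key"]] = msg["value"]  # upsert by key rather than append
        seen_ids.add(msg["id"])


mq = queue.Queue()
for msg in [
    {"id": "m1", "key": "player:1", "value": 10},
    {"id": "m2", "key": "player:2", "value": 7},
    {"id": "m1", "key": "player:1", "value": 10},  # redelivered duplicate
]:
    mq.put(msg)

store, seen = {}, set()
process_messages(mq, store, seen)
print(store)
print(len(seen))
```

Because queues commonly deliver a message more than once after a timeout or retry, a consumer written this way can be scaled out or restarted without corrupting the target data.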
CHAMPION MASTERY
Custom rewards for mastering different champions
Intensive query that spans every game that every player has played
Improves player engagement
PLAYER SUPPORT
Full copy of our data warehouse in DynamoDB
Hive->DynamoDB Dynamic Partition
Support can answer questions faster than ever.
OFFENSIVE CHAT DETECTION
Data science team queries all chat messages in game
Sentiment analysis and classification
Identifies negative, offensive players and mutes them automatically.
QUESTIONS?
SMALONEY@RIOTGAMES.COM
@SEAN_SEANNERY
ENGINEERING BLOG: engineering.riotgames.com