Bringing Real-Time to the Enterprise with Hortonworks DataFlow

Hadoop_Summit 614 views 24 slides Jun 21, 2017

About This Presentation

TELUS is Canada’s fastest-growing national telecommunications company, with $12.9B of annual revenue and 12.7M customer connections. TELUS provides TELUS TV to more than 1.1 million customers in western Canada, with the goal of maximizing clients’ service experience. Every television set-top box...


Slide Content

Bringing Real-Time to the Enterprise with Hortonworks DataFlow
Cavan Loughran – TELUS
Oliver Meyn – T4G
June 2017

TELUSPublic 2
Agenda / Today’s Talk
1. Introduction to TELUS and Optik TV®
2. Starting state
3. Target state
4. Journey
5. Lessons learned

TELUSPublic 3
TELUS and Optik TV®
TELUS is Canada’s fastest-growing national telecommunications company, with $12.9B of annual revenue and 12.7M customer connections.
TELUS provides Optik TV® to 1.1M customers.
At TELUS, our goal is to delight our customers by continuously improving the service/viewing experience.

TELUSPublic 4
Watching TV @Home
[Diagram labels: Content Provider (e.g. NBC); Service Creation; Service Delivery; Encoding & DRM; IP/Internet/CDN Network; Access Technology (e.g. Fibre/Copper); Your Neighborhood; Your Home; Your TV; Home Gateway; IPTV STB (Smart Device)]
Video flows as data over IP
Streaming video monitoring

TELUSPublic 5
IPTV Set Top Box (STB)

TELUSPublic 6
Starting State
STBs report diagnostic logs every 12 hours.
Daily batch load into the Datalake using scp and Pig.
Through daily analysis we create and act on key insights for the network and individual services.

TELUSPublic 7
12-hour interval?
TV is a real-time streaming service. A checkpoint every 12 hours hides too many things from analysis and timely action.

TELUSPublic 8
Target State
Identify and take action to correct a problem within minutes.
Predict when a device and/or service is in danger of degrading or failing.
How to achieve Target State:
– Increase STB reporting frequency ~50X, from twice a day to 96 times per day.
– Streaming analytics (ML) to monitor each STB and take immediate action.
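The reporting figures on this slide can be checked with a few lines of arithmetic. This is only a sketch of the numbers stated above; the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60

def reporting_interval_minutes(reports_per_day):
    """Minutes between consecutive STB reports at a given daily rate."""
    return MINUTES_PER_DAY / reports_per_day

old_interval = reporting_interval_minutes(2)    # twice a day
new_interval = reporting_interval_minutes(96)   # 96 times per day
increase = 96 / 2                               # the "~50X" on the slide

print(old_interval, new_interval, increase)
```

So 96 reports per day works out to one report every 15 minutes, and the exact multiplier behind "~50X" is 48.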

TELUSPublic 9
Watching TV @Home
[Same diagram as slide 4, now annotated with "Target state". Labels: Content Provider (e.g. NBC); Service Creation; Service Delivery; Encoding & DRM; IP/Internet/CDN Network; Access Technology (e.g. Fibre/Copper); Your Neighborhood; Your Home; Your TV; Home Gateway; IPTV STB (Smart Device)]
Video flows as data over IP
Streaming video monitoring

TELUSPublic 10
Journey

TELUSPublic 11
Constraints
TELUS supports many enterprise use cases with a single “Datalake”, aka multi-tenancy.
HDP 2.4 (ships with Spark 1.6 and Kafka 0.9)
HDF 2.1 (always latest NiFi)
Encryption & Kerberos in and out of Datalake.
Not one tutorial on the web cares about security.

TELUSPublic 12
First Attempt – Kafka
[Diagram labels: Set Top Boxes; Hadoop network; Insights / Analytics; Actions / Alerts]

TELUSPublic 13
Kafka, thwarted!
Kafka 0.9 introduced SSL/TLS, but Spark 1.6 uses the Kafka 0.8 client library, so it can’t do SSL.
Solved by using Spark 2.x, but:
NiFi’s Kafka support can only use one Kerberos principal per NiFi server.
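To make the SSL point concrete, here is a minimal sketch of the client-side settings that a Kafka 0.9+ client understands and the 0.8 client does not. The property keys are standard Kafka client configuration names; the helper function, the truststore path, the password, and the service name value "kafka" are assumptions for illustration, not values from the talk.

```python
def kafka_client_properties(use_kerberos, truststore, truststore_password):
    """Build Kafka client settings for an encrypted connection.

    The keys are real Kafka client config names (available from 0.9);
    the values passed in are placeholders supplied by the caller.
    """
    props = {
        # SSL alone encrypts; SASL_SSL adds Kerberos authentication on top.
        "security.protocol": "SASL_SSL" if use_kerberos else "SSL",
        "ssl.truststore.location": truststore,
        "ssl.truststore.password": truststore_password,
    }
    if use_kerberos:
        # Assumed value: must match the principal name the brokers run as.
        props["sasl.kerberos.service.name"] = "kafka"
    return props

# Hypothetical path and password, for illustration only.
secure_props = kafka_client_properties(
    use_kerberos=True,
    truststore="/path/to/truststore.jks",
    truststore_password="changeit",
)
```

A client built against the 0.8 library has no equivalent of these settings, which is why the Spark 1.6 route was a dead end in a secured cluster.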

TELUSPublic 14
Multiple NiFi?
Site to Site (S2S) provides two-way SSL between NiFi instances.
Spark Streaming can stream using the S2S protocol.
Careful… No easy way to limit NiFi resources by user/group.

TELUSPublic 15
Second Attempt – NiFi Site to Site
NiFi outside the Datalake does the heavy work to process incoming STB logs.
NiFi within the Datalake does the lighter work:
– Store in HDFS
– Feed Spark Streaming using S2S via 2-way SSL
– Lighter load improves multi-tenancy (scale) for NiFi

TELUSPublic 16
Final Architecture – NiFi S2S
[Diagram labels: Set Top Boxes; Hadoop network; Insights & Analytics; Actions & Alerts; ML results; Store]

TELUSPublic 17
Problem Detection
No prior labels to demonstrate a failing STB
Unsupervised classifier as first pass
Spark ML using KMeans to build model
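The talk built its model with Spark ML's KMeans. As a rough illustration of the idea only, the sketch below is a plain-Python stand-in, not the Spark API: it clusters unlabeled readings, then flags any reading that lies far from every cluster centre. The two-feature readings, the threshold, the farthest-first seeding, and the distance-to-centroid scoring are all assumptions made for this example and are not described in the slides.

```python
import math

def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    index = min(range(len(centroids)), key=distances.__getitem__)
    return index, distances[index]

def init_centroids(points, k):
    """Deterministic farthest-first seeding, starting from points[0]."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: nearest(p, centroids)[1]))
    return centroids

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: assign each point, then recompute the centroids."""
    centroids = init_centroids(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)[0]].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

def flag_outliers(points, centroids, threshold):
    """Readings farther than threshold from every centroid are suspect."""
    return [p for p in points if nearest(p, centroids)[1] > threshold]

# Hypothetical two-feature readings per STB (made-up numbers).
readings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
centroids = kmeans(readings, k=2)
suspects = flag_outliers([(0.4, 0.6), (5.0, 5.0)], centroids, threshold=2.0)
```

Because there are no labels for a failing STB, the clusters only describe what typical readings look like; a reading that fits none of them is a candidate for closer inspection rather than a confirmed fault.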

TELUSPublic 18
Model Building

TELUSPublic 19
Lessons Learned

TELUSPublic 20
Lessons Learned
Java 8 everywhere
– SSL ‘fun’ with J7 vs J8
– NiFi >= 1.0 libraries compiled with J8
At least Spark 2.0, ideally 2.1 or newer
– SSL/TLS with Kafka
– Speed
– Improved Spark ML
– Structured Streaming

TELUSPublic 21
NiFi Kafka support one Kerberos principal
per NiFi server.
NiFi HDFS supports multiple Kerberos
principals, however, server process can read all
keytabs.
More Lessons Learned

TELUSPublic 22
More Lessons Learned
NiFi Site to Site squashes identity to that of the sending server.
Get good at SSL and how certificates work.
Highly secured multi-tenancy is hard!

TELUSPublic 23
Cavan Loughran, TELUS, www.telus.com
Oliver Meyn, T4G, www.t4g.com
Thank you!

TELUSPublic 24
the future is friendly.