Model driven telemetry

CiscoCanada 3,943 views 44 slides Nov 30, 2017

Model-Driven Telemetry
Jimmy Fanizzi
Systems Engineer, Global Service Provider
Cisco Connect, November 28, 2017
#ConnectCA
© 2016 Cisco and/or its affiliates. All rights reserved.

What Automation Without Visibility Looks Like
"Look Twice Before You Leap." (Charlotte Bronte)

Source: Google @ Bay Area OpenDaylight Meetup 06/16

Current challenges

Why Network Visibility Is Hard
syslog, SNMP, CLI:
• Too Slow
• Incomplete
• Network-Specific
• Hard to Operationalize

SNMP Polling Is Hard on Everybody
Request-ID 1: Sent, No Response
Request-ID 2: Sent, No Response
[Figure labels: Managers, Network, Routers]

What Happens When You Push SNMP Too Hard
• 10-second poll / push
• 3 pollers / telemetry receivers
• 30-minute measurement intervals
• 288 100GigE interfaces (line rate)
• SNMP: IF-MIB (query by row)

Telemetry fundamentals

Three Enablers for Telemetry
Push Not Pull
Analytics-Ready Data
Data-Model Driven

Push Means Don’t Wait To Be Asked
T1, interface stats
T2, interface stats
T1, interface stats
T2, interface stats
• Collect Once, Send Many
• Wait for a Period of Time
• Repeat
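The push cycle on this slide (collect once, send many, wait, repeat) can be sketched in a few lines of Python. This only illustrates the pattern and is not how IOS XR implements it. The names `push_loop`, `collect` and the receiver callables are hypothetical, and the sleep function is injectable so the demo runs without waiting.

```python
import itertools
import time
from typing import Callable, Dict, List

Sample = Dict[str, int]


def push_loop(collect: Callable[[], Sample],
              receivers: List[Callable[[Sample], None]],
              interval_s: float,
              rounds: int,
              sleep: Callable[[float], None] = time.sleep) -> None:
    """Run `rounds` push cycles: collect once, send to all receivers, wait."""
    for _ in range(rounds):
        sample = collect()        # collect once
        for send in receivers:    # send many
            send(sample)
        sleep(interval_s)         # wait for a period of time, then repeat


# Hypothetical data source: a counter that grows by 1000 on every collection.
ticks = itertools.count(1)
inbox_a: List[Sample] = []
inbox_b: List[Sample] = []

push_loop(collect=lambda: {"packets-received": 1000 * next(ticks)},
          receivers=[inbox_a.append, inbox_b.append],
          interval_s=10,
          rounds=2,
          sleep=lambda seconds: None)  # skip real waiting in this demo
```

Both inboxes end up with the same two samples, because each collection is performed once and fanned out to every receiver.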

Push Beats SNMP Pull for Speed & Scale
• 10-second poll / push
• 3 pollers / telemetry receivers
• 30-minute measurement intervals
• 288 100GigE interfaces (line rate)
• SNMP: IF-MIB (query by row)
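A back-of-the-envelope count shows why the test parameters above favor push. The sketch assumes the simplest possible SNMP behavior (one request per IF-MIB row per poller per cycle, with no retries and no bulk requests), so it gives rough counts of work and not the measured results from the slide.

```python
POLL_INTERVAL_S = 10      # 10-second poll / push
RECEIVERS = 3             # 3 pollers / telemetry receivers
WINDOW_S = 30 * 60        # 30-minute measurement interval
INTERFACES = 288          # 288 100GigE interfaces

cycles = WINDOW_S // POLL_INTERVAL_S

# SNMP pull (assumption): every poller asks for every interface row each cycle,
# and the router answers each request separately.
snmp_row_requests = INTERFACES * RECEIVERS * cycles

# Push: the router collects the data set once per cycle and sends that same
# collection to each receiver.
push_collections = cycles
push_transmissions = cycles * RECEIVERS

print(f"cycles in window:      {cycles}")
print(f"SNMP row requests:     {snmp_row_requests}")
print(f"push collections:      {push_collections}")
print(f"push transmissions:    {push_transmissions}")
```

Under these assumptions the router serves 155,520 row requests with SNMP, against 180 collections (540 transmissions) with push.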

“OSI model” of Telemetry (telemetry end-to-end)
“Layer” | “Data”
Data store | Native (raw) data inside a router’s database
Data model | Raw data mapped to a model (YANG native, OpenConfig, etc.)
Producer | Sending requested data in model format to the “Exporter” at defined intervals
Exporter | Encoding and delivering data to the collector(s) / destination(s)
Collector | Information collection for processing (e.g., data monitoring, automation, analytics)

YANG Is A Modeling Language
module ietf-interfaces {          // self-contained top-level hierarchy of nodes
  import ietf-yang-types {        // import or define data types
    prefix yang;
  }
  container interfaces {          // containers group related nodes
    list interface {              // lists for sequence of entries
      key "name";
      leaf name {                 // leaf nodes for simple data
        type string;
      }
      leaf enabled {
        type boolean;
        default "true";
      }
      // remaining nodes omitted (edited for brevity)
    }
  }
}
Other YANG Features
• RO or RW
• Optional nodes
• Choice
• Augment
• When
• Arbitrary XML
• RPC
• etc.
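To make the container / list / leaf vocabulary concrete, here is a minimal sketch of instance data shaped like the abbreviated `ietf-interfaces` snippet, with the `enabled` default applied. The `INTERFACE_LEAVES` table and both functions are hypothetical helpers written by hand for this illustration. A real toolchain would derive them from the YANG module.

```python
from typing import Any, Dict, List

# Hand-written mirror of the slide's snippet: leaf name -> (Python type, default).
INTERFACE_LEAVES = {
    "name": (str, None),       # list key, no default
    "enabled": (bool, True),   # default "true" in the model
}


def normalize_interface(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Check one list entry against the leaves and fill in defaults."""
    unknown = set(entry) - set(INTERFACE_LEAVES)
    if unknown:
        raise ValueError(f"unknown leaves: {sorted(unknown)}")
    result = {}
    for leaf, (py_type, default) in INTERFACE_LEAVES.items():
        value = entry.get(leaf, default)
        if value is None:
            raise ValueError(f"missing leaf: {leaf}")
        if not isinstance(value, py_type):
            raise TypeError(f"leaf {leaf} must be {py_type.__name__}")
        result[leaf] = value
    return result


def normalize_interfaces(data: Dict[str, Any]) -> Dict[str, Any]:
    """Walk container 'interfaces' -> list 'interface' keyed by 'name'."""
    entries: List[Dict[str, Any]] = data["interfaces"]["interface"]
    normalized = [normalize_interface(e) for e in entries]
    names = [e["name"] for e in normalized]
    if len(names) != len(set(names)):
        raise ValueError("duplicate list key 'name'")
    return {"interfaces": {"interface": normalized}}


instance = normalize_interfaces({
    "interfaces": {
        "interface": [
            {"name": "GigabitEthernet0/0/0/0"},                    # enabled omitted
            {"name": "GigabitEthernet0/0/0/1", "enabled": False},
        ]
    }
})
```

The first entry comes back with `enabled` set to `True`, which is the default the model declares.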

Models are Available in GitHub
You Should Do This
• Telemetry only cares about operational (*-oper.yang) models.
• 143 oper YANG models published for XR 6.1.1
• 151 oper YANG models for XR 6.1.2
• 177 oper YANG models for XR 6.2.1
• 180 oper YANG models for XR 6.2.2
• 198 oper YANG models for XR 6.3.1
https://github.com/YangModels/yang/tree/master/vendor/cisco/xr

Finding the Data You Want To Stream
$ pyang -f tree Cisco-IOS-XR-infra-statsd-oper.yang --tree-path infra-statistics/interfaces/interface/latest/generic-counters
telemetry model-driven
 sensor-group SGROUP1
  sensor-path Cisco-IOS-XR-infra-statsd-oper:infra-statistics/interfaces/interface/latest/generic-counters
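A sensor path is the YANG module name, a colon, and then the slash-separated node path that pyang prints. A small sketch of splitting it apart, for example to find which .yang file to open, is shown here. `SensorPath` and `parse_sensor_path` are hypothetical helper names.

```python
from typing import List, NamedTuple


class SensorPath(NamedTuple):
    module: str        # YANG module, e.g. Cisco-IOS-XR-infra-statsd-oper
    nodes: List[str]   # container/list nodes from the top of the tree


def parse_sensor_path(path: str) -> SensorPath:
    """Split '<module>:<node>/<node>/...' into its two parts."""
    module, sep, rest = path.partition(":")
    if not sep or not module or not rest:
        raise ValueError(f"not a sensor path: {path!r}")
    return SensorPath(module, rest.split("/"))


sp = parse_sensor_path(
    "Cisco-IOS-XR-infra-statsd-oper:"
    "infra-statistics/interfaces/interface/latest/generic-counters"
)
yang_file = sp.module + ".yang"   # file name as used in the pyang command
```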

What Will Be Pushed With That Config
{
"Timestamp": 1480547974706,
"Keys": {
"interface-name": "MgmtEth0/RP0/CPU0/0"
},
"Content": {
"applique": 0,
"availability-flag": 0,
"broadcast-packets-received": 25035,
"broadcast-packets-sent": 0,
"bytes-received": 165321050,
"bytes-sent": 233917498,
"carrier-transitions": 3,
"crc-errors": 0,
"framing-errors-received": 0,
"giant-packets-received": 0,
"input-aborts": 0,
"input-drops": 62,
"input-errors": 0,
"input-ignored-packets": 0,
"input-overruns": 0,
"input-queue-drops": 0,
"last-data-time": 1480547974,
"last-discontinuity-time": 1479244159,
"multicast-packets-received": 457,
"multicast-packets-sent": 0,
"output-buffer-failures": 0,
"output-buffers-swapped-out": 0,
"output-drops": 0,
"output-errors": 104,
"output-queue-drops": 0,
"output-underruns": 0,
"packets-received": 373156,
"packets-sent": 311583,
"parity-packets-received": 0,
"resets": 0,
"runt-packets-received": 0,
"seconds-since-last-clear-counters": 0,
"seconds-since-packet-received": 0,
"seconds-since-packet-sent": 0,
"throttled-packets-received": 0,
"unknown-protocol-packets-received": 0
}
}
Repeated for all interfaces
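On the collector side, a message like this is easy to consume. The sketch below parses an abbreviated copy of the message (only a few of the counters are kept), converts the millisecond timestamp, and picks out the non-zero drop and error counters. The selection rule based on the `-drops` and `-errors` suffixes is an assumption made for illustration.

```python
import json
from datetime import datetime, timezone

# Abbreviated copy of the message shown on the slide.
RAW = """
{
  "Timestamp": 1480547974706,
  "Keys": {"interface-name": "MgmtEth0/RP0/CPU0/0"},
  "Content": {
    "bytes-received": 165321050,
    "bytes-sent": 233917498,
    "crc-errors": 0,
    "input-drops": 62,
    "output-errors": 104,
    "packets-received": 373156,
    "packets-sent": 311583,
    "last-data-time": 1480547974
  }
}
"""

msg = json.loads(RAW)

# "Timestamp" is in milliseconds since the epoch; "last-data-time" is in seconds.
collected_at = datetime.fromtimestamp(msg["Timestamp"] / 1000, tz=timezone.utc)
interface = msg["Keys"]["interface-name"]

# Keep only drop/error counters that are non-zero.
problem_counters = {
    name: value
    for name, value in msg["Content"].items()
    if value and name.endswith(("-drops", "-errors"))
}

print(interface, collected_at.isoformat(), problem_counters)
```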

Configuring Destination
telemetry model-driven
 destination-group DGROUP1
  address family ipv4 192.168.1.1 port 2104
  ---- and/or ----
  address family ipv6 2001:db8::1 port 2104
   encoding self-describing-gpb
   protocol tcp
• address family: specify where you want to send your data
• encoding: specify what you want your data to look like
• protocol: specify how you want your data to be delivered

Basic Concept: Encoding
Encoding (or “serialization”) translates data (objects, state) into a format that
can be transmitted across the network. When the receiver decodes (“deserializes”)
the data, it has a semantically identical copy of the original data.
[Figure: DATA, “Encode”, network, “Decode”, DATA]
Encodings on IOS XR platforms
• Compact GPB
• Key-Value GPB
• JSON (6.3.1)
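The encode/decode round trip can be shown with JSON, one of the encodings listed above. This is a minimal sketch: the dictionary stands in for router state, and its field names are borrowed from the interface examples elsewhere in the deck.

```python
import json

# State on the sender (illustrative values).
state = {
    "interface-name": "GigabitEthernet0/0/0/0",
    "input-data-rate": 8,
    "input-packet-rate": 1,
}

# Encode ("serialize") into bytes that can cross the network.
wire = json.dumps(state, sort_keys=True).encode("utf-8")

# Decode ("deserialize") on the receiver.
decoded = json.loads(wire.decode("utf-8"))

# The receiver holds a semantically identical copy of the original data.
print(decoded == state, len(wire), "bytes on the wire")
```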

GPB Encoding
Google Protocol Buffers (GPB): call them “protobufs” for short
“Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler.”
Design Goals
• Simplicity
• Performance
• Forward/Backward Compatibility
Non-Goals
• Human-Readable
• Self-Describing
• Text-based

Telemetry Has Two GPB Encoding Options
GPB - “compact”
• 2X faster
• Operationally more complex (but not relative to SNMP!)
data_gpb {
  row {
    timestamp: 1485794640469
    keys: "\n\026GigabitEthernet0/0/0/0"
    content: "\220\003\010\230\003\001\240\003\002\250\003\000\260\003\000\270\003\000\300\003\000\310\003\000\320\003\300\204=\330\003\000\340\003\000\350\003\000\360\003\377\001"
  }
}
GPB - “self-describing”
• 3X larger
• Native models: still need heuristics for key names
data_gpbkv {
  timestamp: 1485793813389
  fields {
    name: "keys"
    fields { name: "interface-name" string_value: "GigabitEthernet0/0/0/0" }
  }
  fields {
    name: "content"
    fields { name: "input-data-rate" uint64_value: 8 }
    fields { name: "input-packet-rate" uint64_value: 1 }
    <<< 9 lines are skipped >>>
    fields { name: "input-load" uint32_value: 0 }
    fields { name: "reliability" uint32_value: 255 }
  }
}
...
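The compact `content` string is ordinary protobuf wire format, which is why it is small but opaque. As a rough illustration, the sketch below walks the bytes from the slide with a hand-written varint reader (standard protobuf encoding: the tag is field number shifted left by 3, ORed with the wire type). It recovers only field numbers and values. Turning field numbers into names requires the .proto file generated for the model, which is the operational complexity the slide mentions. The helper names are hypothetical, and only varint (wire type 0) fields are handled.

```python
from typing import List, Tuple

# The "content" bytes from the compact example (octal escapes as on the slide).
CONTENT = (b"\220\003\010\230\003\001\240\003\002\250\003\000\260\003\000"
           b"\270\003\000\300\003\000\310\003\000\320\003\300\204=\330\003\000"
           b"\340\003\000\350\003\000\360\003\377\001")


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry data
        if not byte & 0x80:                # high bit clear: last byte
            return result, pos
        shift += 7


def decode_varint_fields(buf: bytes) -> List[Tuple[int, int]]:
    """Return (field_number, value) pairs for a message of varint fields."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        value, pos = read_varint(buf, pos)
        fields.append((field_number, value))
    return fields


for number, value in decode_varint_fields(CONTENT):
    print(f"field {number}: {value}")
```

The bytes decode to 13 numbered fields (50 through 62), the same count as the key-value example's content (4 fields shown plus 9 skipped), but nothing in the message itself says which field is, say, `reliability`.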

Transport Options
Dial-Out
• TCP & gRPC (from 6.1.1)
• UDP (from 6.2.1)
Dial-In
• gRPC only (from 6.1.1)
[Figure: for each mode, a SYN / SYN-ACK / ACK handshake with the collector, followed by data sent to the collector]

gRPC: Like REST But Different
Runs over HTTP/2
• Optimize for page load time
• Server push, header compression, multiplexing, TLS
• RFC 7540 (May 2015)
• Preserves most HTTP/1.1 syntax
Defines Services (“RPCs”)
Encodes Using Google Protocol Buffers (“GPB” or “protobufs”)
• Services and Messages
• Auto-generate code in many languages
http://www.grpc.io/docs/#hello-grpc

A Telemetry Subscription
telemetry model-driven
 subscription Sub1
  sensor-group-id SGROUP1 sample-interval 30000
  destination-id DGROUP1
* Omit destination group for gRPC dial-in
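The three pieces shown so far (sensor group, destination group, subscription) fit together as one configuration. The sketch below renders them from parameters, using only the CLI lines that appear on the slides. `Destination` and `render_mdt` are hypothetical helper names, and the nesting of `encoding` and `protocol` under the address line is an assumption about the CLI hierarchy. Passing no destination produces the dial-in variant, where the destination group is omitted.

```python
from typing import List, NamedTuple, Optional


class Destination(NamedTuple):
    group: str
    family: str      # "ipv4" or "ipv6"
    address: str
    port: int
    encoding: str
    protocol: str


def render_mdt(sensor_group: str,
               sensor_paths: List[str],
               subscription: str,
               interval_ms: int,
               destination: Optional[Destination] = None) -> str:
    """Render a model-driven telemetry config; no destination means dial-in."""
    lines = ["telemetry model-driven"]
    if destination is not None:
        lines += [
            f" destination-group {destination.group}",
            f"  address family {destination.family} "
            f"{destination.address} port {destination.port}",
            f"   encoding {destination.encoding}",
            f"   protocol {destination.protocol}",
        ]
    lines.append(f" sensor-group {sensor_group}")
    lines += [f"  sensor-path {path}" for path in sensor_paths]
    lines.append(f" subscription {subscription}")
    lines.append(f"  sensor-group-id {sensor_group} sample-interval {interval_ms}")
    if destination is not None:
        lines.append(f"  destination-id {destination.group}")
    return "\n".join(lines)


PATH = ("Cisco-IOS-XR-infra-statsd-oper:"
        "infra-statistics/interfaces/interface/latest/generic-counters")

dial_out = render_mdt(
    "SGROUP1", [PATH], "Sub1", 30000,
    Destination("DGROUP1", "ipv4", "192.168.1.1", 2104,
                "self-describing-gpb", "tcp"),
)
dial_in = render_mdt("SGROUP1", [PATH], "Sub1", 30000)

print(dial_out)
```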

Telemetry on Cisco products

Cisco XR Telemetry overview
 | Classic XR ASR9k | Evolved XR ASR9k | NCS5500 | NCS6k
MDT support | 6.1.1 | 6.1.1 | 6.1.1 | 6.1.3
Data models | YANG (native, OC, IETF), link for models | YANG (native, OC, IETF), link for models | YANG (native, OC, IETF), link for models | YANG (native, OC, IETF), link for models
Transport (control protocols) | TCP (dial-out), UDP (dial-out)* | gRPC (dial-in, dial-out), TCP (dial-out), UDP (dial-out)* | gRPC (dial-in, dial-out), TCP (dial-out), UDP (dial-out)* | TCP (dial-out), UDP (dial-out)*
Encoding | GPB / GPB-KV / JSON** | GPB / GPB-KV / JSON** | GPB / GPB-KV / JSON** | GPB / GPB-KV / JSON**
Collectors | Pipeline*** | Pipeline*** | Pipeline*** | Pipeline***
* UDP support from 6.2.1
** JSON support from 6.3.1
*** Open-sourced and ready to use: https://github.com/cisco/bigmuddy-network-telemetry-pipeline

Cisco NXOS/XE Telemetry high-level overview
 | NX OS | IOS-XE
MDT support | 7.0(3)I6(1) | 16.6.1*
Data models | Data Management Engine, NX-API, YANG (native, OC, IETF) | YANG (native**, IETF), link for models
Transport (control protocols) | gRPC* (dial-out), UDP** (dial-out), HTTP*** (dial-out) | NETCONF (for YANG), gNMI (16.8.1), gRPC (16.9.1)
Formats | GPB / JSON | XML, GPB (16.9.1)
Collectors | Pipeline | TBD
Min sample interval | 5 sec | 1 sec
Max # of dial-out destinations | 5 | TBD
NX OS notes:
* gRPC supports GPB only
** UDP from 7.0(3)I7(1), supports both GPB and JSON
*** HTTP supports JSON only
IOS-XE notes:
* Supported on Catalyst 3650/3850/9300/9500, ASR1000, ISR4000
** Native models are different from YANG models in XR
https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/programmability/guide/b_Cisco_Nexus_9000_Series_NX-OS_Programmability_Guide_7x/b_Cisco_Nexus_9000_Series_NX-OS_Programmability_Guide_7x_chapter_011000.html
https://www.cisco.com/c/en/us/td/docs/ios-xml/ios/prog/configuration/166/b_166_programmability_cg/model_driven_telemetry.html

Some Commonly Used Models and Subtrees
BRKSPG-2333

Your infrastructure should be ready
Bytes are per a single collection
Data | NCS5516 numbers | gRPC/KVGPB BW (Mbps) | gRPC/GPB BW (Mbps) | UDP/JSON BW (Mbps)
Interface Oper State* | 2160 | 0.6806876 | 0.145277467 | 0.769932267
Interface Data Rate | 2160 | 0.6365896 | 0.151012 | 0.802756
Interfaces Stats | 2160 | 1.7155632 | 0.3023296 | 2.0160128
Optics Ports Info | 576 | 0.477302222 | 0.065839378 | 0.483407289
Uptime Info | 1 | 0.0002816 | 0.0002288 | 0.0005968
CPU State | 18 | 0.335916 | 0.053052 | 0.3043256
Memory Info | 18 | 0.004036 | 0.0011584 | 0.0055872
Processes Memory | 538 | 0.061503111 | 0.019817778 | 0.0832968
LLDP Info | 574 | 0.185312 | 0.0703288 | 0.2401824
IPv4 RIB Info* | 650,022 | 47.37393978 | 10.95847427 | 56.86328129
IPv6 RIB Info* | 14876 | 3.6415044 | 0.856933333 | 4.371237733
BGP IPv4 Routes Info | 650,000 | 0.0004744 | 0.0002424 | 0.000632
BGP IPv6 Routes Info | 12800 | 0.000472 | 0.00024 | 0.0006272
BGP IPv4 Neighbor Info | 2 | 0.0390272 | 0.0054728 | 0.0391232
MPLS-TE Tunnels Summary Info | 1003 | 0.004604 | 0.0005632 | 0.004692
RSVP Interface Info | 5 | 0.0015352 | 0.0004256 | 0.001972
NCS5500 NPU Stats | 96 | 1.59755976 | 0.255440227 | 1.726576827
NCS5500 NPU Resources Info | 96 | 0.2052016 | 0.0378816 | 0.2130816
Total | | ~ 57 Mbps | ~ 13 Mbps | ~ 68 Mbps
* Go with EDT starting from IOS XR 6.3.x
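As a sanity check on the table, the per-row bandwidth figures can be summed and compared with the totals on the slide. The values below are the per-row numbers as read from the flattened slide text, so the column splits are a reconstruction and not authoritative data.

```python
# (data set, gRPC/KVGPB Mbps, gRPC/GPB Mbps, UDP/JSON Mbps)
ROWS = [
    ("Interface Oper State",         0.6806876,   0.145277467, 0.769932267),
    ("Interface Data Rate",          0.6365896,   0.151012,    0.802756),
    ("Interfaces Stats",             1.7155632,   0.3023296,   2.0160128),
    ("Optics Ports Info",            0.477302222, 0.065839378, 0.483407289),
    ("Uptime Info",                  0.0002816,   0.0002288,   0.0005968),
    ("CPU State",                    0.335916,    0.053052,    0.3043256),
    ("Memory Info",                  0.004036,    0.0011584,   0.0055872),
    ("Processes Memory",             0.061503111, 0.019817778, 0.0832968),
    ("LLDP Info",                    0.185312,    0.0703288,   0.2401824),
    ("IPv4 RIB Info",                47.37393978, 10.95847427, 56.86328129),
    ("IPv6 RIB Info",                3.6415044,   0.856933333, 4.371237733),
    ("BGP IPv4 Routes Info",         0.0004744,   0.0002424,   0.000632),
    ("BGP IPv6 Routes Info",         0.000472,    0.00024,     0.0006272),
    ("BGP IPv4 Neighbor Info",       0.0390272,   0.0054728,   0.0391232),
    ("MPLS-TE Tunnels Summary Info", 0.004604,    0.0005632,   0.004692),
    ("RSVP Interface Info",          0.0015352,   0.0004256,   0.001972),
    ("NCS5500 NPU Stats",            1.59755976,  0.255440227, 1.726576827),
    ("NCS5500 NPU Resources Info",   0.2052016,   0.0378816,   0.2130816),
]

# Sum each bandwidth column across all data sets.
kvgpb_total, gpb_total, json_total = (
    sum(row[column] for row in ROWS) for column in (1, 2, 3)
)

print(f"gRPC/KVGPB: {kvgpb_total:.2f} Mbps")
print(f"gRPC/GPB:   {gpb_total:.2f} Mbps")
print(f"UDP/JSON:   {json_total:.2f} Mbps")
print(f"KVGPB is {kvgpb_total / gpb_total:.1f}x compact GPB")
```

The sums round to the slide's totals of about 57, 13 and 68 Mbps, and the IPv4 RIB row alone accounts for most of each column.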

Collectors and analytics platforms

Different Collection Models
Layers: Collection, Storage, Applications
• BYO: custom, proprietary or OS-based
• Open source, customizable: Kafka, Logstash, ElasticSearch, Kibana, Panda, Prometheus / InfluxDB, Grafana
• Commercial stack

Pipeline: An Open Source Collector
Pipeline
Kapacitor
Output to file, TSDB, Kafka…
Ingest, transform, filter
Self-monitoring, horizontally scalable

MDT in real time
Demo

Topology
• tgn-01: Pagent Traffic Generator
• xrv-01: XRv9k (Gi0/0/0/0, Gi0/0/0/1)
• rcv-01: IOS
• Pipeline: receives telemetry (QoS data, policy-map output)
[Figure labels: Traffic, Telemetry]

How does the stack work?
• Pipeline: transforms telemetry data for DB storage
• InfluxDB: stores telemetry data
• Grafana: data visualization and alerting
• Python: Flask app to trigger Ansible playbooks
• Ansible: automation engine
• NSO: Cisco Network Service Orchestrator
• IOS XR: core data source
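The "transforms telemetry data for DB storage" step can be pictured as turning one telemetry message into an InfluxDB line-protocol record: keys become tags, counters become fields, and the millisecond timestamp becomes nanoseconds. The sketch below is not Pipeline's actual implementation. The function names and the measurement name `generic-counters` are assumptions, and only integer and float fields are handled.

```python
from typing import Dict, Union

FieldValue = Union[int, float]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape the characters that are special in line protocol."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement: str,
                     tags: Dict[str, str],
                     fields: Dict[str, FieldValue],
                     timestamp_ms: int) -> str:
    """Build one line: measurement,tag=value field=value timestamp_ns."""
    tag_part = "".join(
        f",{_escape(key, ',= ')}={_escape(value, ',= ')}"
        for key, value in sorted(tags.items())
    )
    field_part = ",".join(
        # Integers carry an "i" suffix; floats are written as-is.
        f"{_escape(key, ',= ')}={value}i" if isinstance(value, int)
        else f"{_escape(key, ',= ')}={value}"
        for key, value in fields.items()
    )
    timestamp_ns = timestamp_ms * 1_000_000
    return f"{_escape(measurement, ', ')}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    measurement="generic-counters",
    tags={"interface-name": "MgmtEth0/RP0/CPU0/0"},
    fields={"bytes-received": 165321050, "bytes-sent": 233917498},
    timestamp_ms=1480547974706,
)
print(line)
```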

The power of a modular tool chain
• Data transformation
• Data storage
• Data visualization and alerting tool
• Webhook endpoint
• Automation engine
• Service orchestrator
• IOS XR: core data source
NSO = New Service Offering
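The webhook endpoint is the glue between alerting and automation: an alert arrives as a small JSON payload, and the handler decides which playbook to run. The sketch below shows only that decision step as a pure function. It builds the `ansible-playbook` command line but does not execute it. The payload keys (`state`, `ruleName`), the rule name and the playbook file name are all assumptions for illustration, and the demo's actual Flask app is not shown in the deck.

```python
from typing import Any, Dict, List, Optional

# Hypothetical mapping from alert rule name to remediation playbook.
PLAYBOOKS: Dict[str, str] = {
    "qos-output-drops": "remediate_qos.yml",
}


def command_for_alert(payload: Dict[str, Any]) -> Optional[List[str]]:
    """Return the ansible-playbook command for an alert, or None to ignore it."""
    if payload.get("state") != "alerting":
        return None                      # e.g. the alert has cleared
    rule = payload.get("ruleName", "")
    playbook = PLAYBOOKS.get(rule)
    if playbook is None:
        return None                      # no automation defined for this rule
    # -e passes an extra variable so the playbook knows what triggered it.
    return ["ansible-playbook", playbook, "-e", f"alert_rule={rule}"]


firing = command_for_alert({"state": "alerting", "ruleName": "qos-output-drops"})
cleared = command_for_alert({"state": "ok", "ruleName": "qos-output-drops"})
print(firing, cleared)
```

Keeping the decision separate from the HTTP handler and from process execution means each stage of the tool chain can be swapped or tested on its own, which is the point of the modular stack.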

How does the stack work? - IOS XR
telemetry model-driven
 destination-group dev
  address family ipv4 10.85.204.18 port 5432
   encoding self-describing-gpb
   protocol tcp
  !
 !
 sensor-group QoS
  sensor-path Cisco-IOS-XR-qos-ma-oper:qos/nodes/node/policy-map/interface-table/interface/output
 !
 !
 subscription Sub1
  sensor-group-id QoS sample-interval 30000
  destination-id dev
 !
Callouts: Pipeline address (destination-group), Sensors (sensor-group), Subscription (subscription)
Key Takeaways
• Speed & Scale Require Visibility
• It’s Not Hard to Beat SNMP
• Data Models Are Your Friends
• A Big Data Platform Is In Your Future

Resources
Tutorials, Blogs, VoDs
https://xrdocs.github.io/telemetry/
http://blogs.cisco.com/sp/the-limits-of-snmp
http://blogs.cisco.com/sp/boring-is-the-new-awesome
http://blogs.cisco.com/sp/why-you-should-care-about-model-driven-telemetry
https://youtu.be/tIN8BjHwpNs (NANOG 67: 10 Lessons from Telemetry)
YANG
https://github.com/YangModels/yang/tree/master/vendor/cisco (Cisco YANG models)
http://blogs.cisco.com/getyourbuildon/yang-opensource-tools-for-data-modeling-driven-management (YANG open source tools)
https://developer.cisco.com/site/ydk/ (YDK intro)
Telemetry Tools
https://github.com/cisco/bigmuddy-network-telemetry-pipeline
https://github.com/cisco/bigmuddy-network-telemetry-stacks
Demos and Lab
https://dcloud-cms.cisco.com/demo/mdt-ios-xr-611-v1 (dCloud)
https://youtu.be/F_S9-ctNFe0 (demo on NCS 5508)

Thank you.