M02 - Cisco HyperFlex Solution Deep Dive.pdf


About This Presentation

Cisco HyperFlex Solution Deep Dive


Slide Content

Cisco HyperFlex Deep Dive

Agenda
A day in the life…
Cisco HyperFlex Data Optimization

A day in the life…

All nodes have a dedicated SSD cache drive
About cache…
The cache drive has a Write Log
• 8 GB or 32 GB in hybrid nodes
  – the remaining space is read cache
• Equal to the full cache-disk capacity in all-flash nodes
  – no read cache is needed
• Half the write log is the Active partition, half the Passive partition
  – Writes are always made to the Active partition
  – Each partition has a Primary cache and one or two mirror caches
The controller VM has an L1 RAM cache used for read I/O (see the sketch below)
[Diagram: hybrid node and all-flash node, each running guest VMs, the IOVisor and a controller VM with a DRAM L1 read cache. The caching SSD holds the Write Log, split into Active and Passive partitions, each with a Primary copy and one or two Mirror copies; on the hybrid node the remainder of the SSD is an L2 read cache. Below sits the capacity tier.]
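To make the layout above concrete, here is a minimal Python sketch of those structures; the class and field names are invented for illustration and are not HyperFlex data structures or APIs.

from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class WriteLogPartition:
    """One half of the write log: a primary copy plus one or two mirrors."""
    primary: Dict[str, bytes] = field(default_factory=dict)   # key -> compressed block
    mirrors: List[Dict[str, bytes]] = field(default_factory=lambda: [{}, {}])

@dataclass
class CacheSSD:
    """Caching SSD: write log split into Active and Passive partitions.

    Hybrid nodes reserve the rest of the SSD as an L2 read cache;
    all-flash nodes use the whole device as write log (no read cache needed).
    """
    active: WriteLogPartition = field(default_factory=WriteLogPartition)
    passive: WriteLogPartition = field(default_factory=WriteLogPartition)
    l2_read_cache: Optional[Dict[str, bytes]] = None   # only present on hybrid nodes

@dataclass
class ControllerVM:
    """The controller VM owns the caching SSD and keeps an L1 read cache in RAM."""
    cache_ssd: CacheSSD
    l1_read_cache: Dict[str, bytes] = field(default_factory=dict)  # DRAM L1

hybrid_node    = ControllerVM(CacheSSD(l2_read_cache={}))  # write log + L2 read cache
all_flash_node = ControllerVM(CacheSSD())                  # whole cache SSD is write log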

Input/Output Lifecycles: Writes
1. The VM writes data to the HX datastore; the write is received by the IOVisor, hashed to determine the primary node for that block, and sent over the network to the primary node's controller VM (see the sketch below).
[Diagram: four nodes on the UCS fabric, each with guest VMs, an IOVisor and a controller VM owning the caching SSD (Write Log with Active/Passive partitions) and the capacity tier. The write is directed to the Primary write log, with Mirror 1 and Mirror 2 copies on two other nodes.]
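A rough sketch of step 1: the block is hashed and the digest is mapped onto one of the nodes, so every IOVisor independently agrees on the primary controller VM for that block. The SHA-256-modulo-node-count scheme shown is a simplification chosen for this example, not the actual HX placement algorithm.

import hashlib

NODES = ["node1", "node2", "node3", "node4"]   # four-node cluster on the UCS fabric

def fingerprint(block: bytes) -> str:
    """Checksum used as the block's identity (illustrative: SHA-256)."""
    return hashlib.sha256(block).hexdigest()

def primary_node_for(block: bytes) -> str:
    """Hash the block and map the digest onto one of the nodes.

    The IOVisor performs an equivalent lookup, so every node agrees on which
    controller VM is primary for a block without consulting a central table.
    """
    digest = int(fingerprint(block), 16)
    return NODES[digest % len(NODES)]

block = b"4KB of guest VM data..."
print(primary_node_for(block))   # e.g. 'node3' -> send the write there over the fabric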

[Diagram: the same four-node cluster; copies A1, A2 and A3 of the write land in the active write logs of the primary node and the two mirror nodes.]
Input/Output Lifecycles: Writes
2. The controller VM compresses the data, commits the write to the active local write log on the caching SSD, and sends duplicate copies to the other nodes using RPCs*.
Note: metadata is also written to cache.
*RPC: remote procedure call

[Diagram: the same four-node cluster; with all three copies (A1, A2, A3) committed, the acknowledgement flows back to the VM.]
Input/Output Lifecycles: Writes
3. After all three copies are committed, the write is acknowledged back to the VM via the IOVisor and the datastore, like a normal I/O (see the sketch below).
Note: metadata is also written to cache.
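Steps 2 and 3 together behave roughly like the sketch below: the primary controller compresses the block, commits it to its active write log, replicates it to the two mirror nodes, and acknowledges only after all three copies are committed. The Controller class, the in-process "RPC" calls and zlib compression are stand-ins for illustration, not HX code.

import zlib

class Controller:
    """Greatly simplified controller VM: just a name and an active write log."""
    def __init__(self, name: str):
        self.name = name
        self.active_write_log = {}     # key -> compressed block (metadata lives here too)

    def commit(self, key: str, compressed: bytes) -> bool:
        self.active_write_log[key] = compressed   # durable on the caching SSD
        return True

def write(primary: Controller, mirrors: list, key: str, data: bytes) -> str:
    """Stage one of the two-stage write: compress, commit locally, mirror, then ack."""
    compressed = zlib.compress(data)              # primary compresses the data
    acks = [primary.commit(key, compressed)]      # commit to the local active write log
    for mirror in mirrors:                        # "RPC" the duplicate copies out
        acks.append(mirror.commit(key, compressed))
    if all(acks):                                 # all 3 copies committed
        return "ACK"                              # acknowledged back to the VM
    raise IOError("write-log replication failed")

primary, m1, m2 = Controller("node1"), Controller("node2"), Controller("node3")
print(write(primary, [m1, m2], "A", b"block A contents"))   # -> ACK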

[Diagram: the same four-node cluster; the active write logs fill up with blocks A through I.]
Input/Output Lifecycles: Destage
4. Additional write I/O is handled the same way until the active local write-log segment of a node becomes full; then a destage operation is started.
Note: metadata is also written to cache.

[Diagram: the same four-node cluster; on each node the full Active and the empty Passive write-log partitions have swapped roles.]
Input/Output Lifecycles: Destage
5. The destage operation starts with the full Active and empty Passive partitions being flipped, allowing the Write Log to keep receiving new incoming writes.

[Diagram: the same four-node cluster; blocks from the passive write logs are committed to the capacity disks across the nodes.]
Input/Output Lifecycles: Destage
6. The Primary data is replicated, and new blocks are committed to the capacity-layer disks in a distributed fashion. Duplicates generate only a metadata update.
Did you notice anything about how I3 was destaged? This shows how the destaging process is also distributed and independent of the cache. Strive for balance: it is theoretically possible to have storage nodes without cache.

[Diagram: the same four-node cluster after destage; the destaged blocks now sit on the capacity tier and in the L2 read caches.]
Input/Output Lifecycles: Destage
7. Once all data has been destaged, the passive log segments are moved to the Read Cache (hybrid only) or purged (all-flash), ready to flip once the active segments become full again (see the sketch below).
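Steps 4 to 7 amount to the flow sketched below: flip the full Active and empty Passive partitions, flush new blocks to the capacity tier (duplicates need only a metadata update), then either promote the drained segment to the L2 read cache (hybrid) or purge it (all-flash). The data structures are the illustrative ones from the earlier sketches, not HX internals.

from types import SimpleNamespace

def destage(node, capacity, dedupe_index, hybrid=True):
    """Illustrative destage of one node's write log.

    node.active / node.passive : dict of key -> compressed block
    capacity                   : this node's persistent tier (key -> block)
    dedupe_index               : fingerprints already stored in the cluster
    """
    # Steps 4/5: the active segment is full, so flip Active and Passive;
    # new incoming writes keep landing in the (now empty) active segment.
    node.active, node.passive = node.passive, node.active

    # Step 6: commit new blocks to the capacity layer in a distributed fashion.
    for key, block in node.passive.items():
        if key in dedupe_index:
            # Duplicate: only the metadata is updated; no new data is written.
            continue
        capacity[key] = block          # new block committed to the capacity disks
        dedupe_index.add(key)

    # Step 7: the drained segment feeds the L2 read cache on hybrid nodes
    # and is simply purged on all-flash nodes.
    drained = dict(node.passive)
    node.passive.clear()
    return drained if hybrid else {}

node = SimpleNamespace(active={"A1": b"...", "B1": b"..."}, passive={})
l2_read_cache = destage(node, capacity={}, dedupe_index=set())
print(sorted(l2_read_cache))   # ['A1', 'B1'] can now serve reads from L2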

All nodes have a dedicated SSD cache drive
Input/Output Lifecycles: Reads
• When doing a read, the IOVisor knows which node has the primary copy of the block.
  – In all-flash nodes, the read always tries to get the primary block.
  – In hybrid nodes, it is possible to request the primary block or any of the copies, depending on which operation is faster*.
  – Reads are decompressed, cached in L1/L2 as appropriate, and returned to the requesting VM.
[Diagram: hybrid node and all-flash node cache layout, as shown earlier.]
*Hyperconverged Infrastructure Data Centers: Demystifying HCI, Sam Halabi, p. 186

[Diagram: the same four-node cluster; a VM issues a read for block A.]
Input/Output Lifecycles: Read Request – Found in L2 Cache
The VM's read request is received by the IOVisor; the metadata index directs the request to the primary node for the stripe unit holding the data. Requests are serviced in the following order:
1. Active write logs of the primary node
2. Active write logs of other nodes
3. Passive write logs
4. L1 (DRAM) cache
5. L2 read cache (hybrid only)
6. When found:
• decompress
• copy to L1 cache
• and return the read
[Diagram detail: block A is found in the L2 read cache and a copy is placed in the DRAM L1 cache.]

[Diagram: the same four-node cluster; a VM issues a read for block X, which is not in any cache.]
Input/Output Lifecycles: Read Request – Found in Capacity Storage
The VM's read request is received by the IOVisor; the metadata index directs the request to the primary node for the stripe unit holding the data. Requests are serviced in the following order:
1. Active write logs of the primary node
2. Active write logs of other nodes
3. Passive write logs
4. L1 (DRAM) cache
5. L2 read cache (hybrid only)
6. Capacity storage
7. When found:
• leave a copy in L2 (hybrid)
• decompress
• copy to L1 cache
• and return the read
[Diagram detail: block X is read from capacity storage, a copy is left in the L2 read cache, and X is copied into the DRAM L1 cache.]

All nodes have a dedicated SSD cache drive
Input/Output Lifecycles: Reads
• Requests to read data are also received by the IOVisor and serviced in the following order:
1. Active write logs of the primary node
2. Active write logs of other nodes
3. Passive write logs
4. L1 (DRAM) cache
5. L2 read cache (hybrid only)
6. Capacity SSDs/HDDs
• Reads are decompressed, cached in L1/L2 as appropriate, and returned to the requesting VM (see the sketch below).
[Diagram: hybrid node and all-flash node cache layout, as shown earlier.]
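The service order can be read as a simple fall-through search; the sketch below uses hypothetical dictionaries for the write logs, the L1/L2 caches and the capacity tier, and is only meant to illustrate the ordering and caching behaviour described above.

import zlib

def read(key, primary_active, other_actives, passives, l1, l2, capacity, hybrid=True):
    """Look a block up in the documented order, then cache it and return it."""
    search_order = (
        [primary_active] + other_actives + passives +   # 1-3: write logs
        [l1] +                                          # 4: DRAM L1 cache
        ([l2] if hybrid else []) +                      # 5: L2 read cache (hybrid only)
        [capacity]                                      # 6: capacity SSDs/HDDs
    )
    for tier in search_order:
        if key in tier:
            if tier is l1:
                return tier[key]                 # L1 already holds decompressed data
            compressed = tier[key]
            if hybrid and tier is capacity:
                l2[key] = compressed             # capacity hit: leave a copy in L2
            data = zlib.decompress(compressed)   # decompress...
            l1[key] = data                       # ...copy into the DRAM L1 cache...
            return data                          # ...and return the read to the VM
    raise KeyError(key)

capacity = {"X": zlib.compress(b"block X contents")}
print(read("X", {}, [], [], l1={}, l2={}, capacity=capacity))   # b'block X contents'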

The HXDP uses a two-stage write process. Which of the following statements best describe that process? (Choose two.)
A. The IOVisor sends the stripe unit to the primary controller, which compresses it and writes it to the active write log
B. The IOVisor compresses the stripe unit, then sends it to the three controllers, each of which writes it to its active write-log cache
C. When the active write log is filled, the stripe units are copied from the active write log and committed to capacity storage on each node; the active write log then becomes the passive write log
D. When the active write log is filled, the active write log becomes the passive write log, and the stripe units are copied from the primary passive write log and committed to capacity storage on each node

Controller VM Architecture with Hyper-V
[Diagram: HyperFlex data fabric for Microsoft Hyper-V. Three Windows Server 2019 hosts each run app VMs on VHDX files and a controller VM containing the IOVisor, an SMB proxy and StorFS running an SMB server; the hypervisor's SMB client mounts the HX datastore as an SMB file share.]

Cisco HyperFlex Data Optimization

[Diagram: VM1.vmdk on a four-node cluster; the HX Data Platform spreads its data across the SSDs of every node.]
HyperFlex…
…breaks up files into "stripe units" identified by a unique fingerprint (checksum)
•"Stripe units"
–are created and identified by a unique fingerprint which becomes the pointer to the
stored unit –are always written to a single disk
–written 2 or 3 times (on different nodes)
•Hyperflex DP uses a Pointer-Based Log Structured File System–Always writes to the end of the log–Changed "stipe units" are replaced by new ones
oand the stale ones marked for deletion
[Diagram detail: on the HX Data Platform, VM1.vmdk is made up of stripe units A through E, each stored as multiple copies (for example A1, A2, A3) across the cluster.]
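A toy version of the stripe-unit idea: a virtual disk is chopped into fixed-size units, each unit gets a fingerprint, and the file becomes an ordered list of pointers (fingerprints). The unit size and hash choice below are arbitrary and chosen for illustration only.

import hashlib

UNIT_SIZE = 4096   # illustrative stripe-unit size, not the real HX value

def to_stripe_units(data: bytes):
    """Split a virtual disk into stripe units and fingerprint each one.

    Returns the pointer list for the file and a store of fingerprint -> unit.
    Writing the same content twice just reuses the existing fingerprint.
    """
    pointers, store = [], {}
    for off in range(0, len(data), UNIT_SIZE):
        unit = data[off:off + UNIT_SIZE]
        fp = hashlib.sha256(unit).hexdigest()   # unique fingerprint (checksum)
        store.setdefault(fp, unit)              # each distinct unit is stored once
        pointers.append(fp)                     # the file is a list of pointers
    return pointers, store

vm1_vmdk = b"A" * UNIT_SIZE + b"B" * UNIT_SIZE + b"A" * UNIT_SIZE
ptrs, store = to_stripe_units(vm1_vmdk)
print(len(ptrs), len(store))   # 3 pointers, but only 2 distinct units stored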

[Diagram: VM1 moves to a different node; its stripe units stay where they are on the HX Data Platform.]
By using "stripe units" HyperFlex can…
…migrate VMs without moving any data
Result:
• Enterprise-grade performance
  – meaning no data needs to move just because a VM moves

By using "stripe units" HyperFlex can…
…minimise NAND SSD wear and HDD seek times
• …by distributing "stripe units" evenly across multiple disks across multiple nodes
• The HyperFlex Data Platform uses a pointer-based log-structured file system
  – It always writes to the end of the log
    o This speeds up HDD seek times
    o It correlates with how NAND SSDs operate, minimising wear
[Diagram: the same four-node cluster; VM1.vmdk's stripe units are spread evenly across the nodes' disks.]

The HyperFlex Data Platform (HXDP) minimises NAND SSD wear and speeds up spinning-disk performance because…
A. HXDP uses a log-structured file system and distributes "stripe units" evenly across multiple disks across multiple nodes
B. HXDP uses a UCS fabric to ensure low latency between nodes
C. HXDP offloads I/O to secondary controllers when more than one VM writes
D. HXDP uses erasure coding

One more thing
Because Cisco HyperFlex identifies every "stripe unit" by a unique fingerprint (checksum)…
…there is one more very important feature inherently built into the Cisco HyperFlex pointer-based log-structured file system…
– It requires NO extra hardware or licensing…

[Diagram: the same four-node cluster; VM1.vmdk's stripe units are distributed across the nodes' SSDs on the HX Data Platform.]
HyperFlex can…
…deduplicate on the fly, without extra hardware. Look at HX snapshots…
• When the first snapshot is taken, a SENTINEL snapshot is created to prevent VMware from ever creating a redo snapshot
• Looking at the second and later snapshots shows… (a sketch follows the diagram below)
Deduplication is inherent in the pointer-based file system
[Diagram: on the HX Data Platform, VM1.SENTINEL.vmdk and snapshots snap1 through snap5 are each just a list of pointers to stripe units; successive snapshots reuse the unchanged units (B, C, E, …) and add only the new ones (F, G, H, then I, J, K, L, M).]
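With files stored as pointer lists, a snapshot is just a copy of the pointer list, and later writes append new stripe units and repoint only the changed entries. A hypothetical sketch (the names and structures are invented, not StorFS code):

import hashlib

def take_snapshot(pointer_table, vm, snap_name):
    """A snapshot copies the VM's pointer list; no data blocks are copied."""
    pointer_table[snap_name] = list(pointer_table[vm])

def overwrite_unit(pointer_table, store, vm, index, new_unit):
    """A changed stripe unit is written as a new one; the stale unit awaits the cleaner."""
    fp = hashlib.sha256(new_unit).hexdigest()
    store[fp] = new_unit                # append-only: store the new unit
    pointer_table[vm][index] = fp       # repoint only the changed entry

table = {"VM1.vmdk": ["fpB", "fpC", "fpD", "fpE"]}
store = {}
take_snapshot(table, "VM1.vmdk", "VM1.snap1.vmdk")       # metadata only, takes seconds
overwrite_unit(table, store, "VM1.vmdk", 2, b"new D")    # only the changed unit is written
print(table["VM1.snap1.vmdk"])   # still ['fpB', 'fpC', 'fpD', 'fpE']: unchanged units are shared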

HyperFlex can…
…consolidate VM snapshots in seconds
Native HX snapshots are ALWAYS consolidated!
[Diagram: after consolidation, VM1.vmdk points directly at the latest stripe units (I, J, K, L, M); the intermediate snapshot pointer lists are no longer needed.]

HyperFlex can…
…consolidate VM snapshots in seconds
• Garbage collection happens every 24 hours
  – The cleaner schedule runs at 06:00 UTC by default (see the sketch below)
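Conceptually, the cleaner only needs to reclaim stripe units that no pointer list references any more, along the lines of this hypothetical mark-and-sweep sketch; everything here except the 24-hour / 06:00 UTC schedule noted above is invented for illustration.

def clean(store, pointer_table):
    """Remove stripe units that no file or snapshot points to any longer."""
    live = {fp for pointers in pointer_table.values() for fp in pointers}  # mark
    stale = [fp for fp in store if fp not in live]                         # sweep
    for fp in stale:
        del store[fp]
    return len(stale)   # number of stale units reclaimed

store = {"fpI": b"...", "fpJ": b"...", "fpD": b"old, unreferenced unit"}
table = {"VM1.vmdk": ["fpI", "fpJ"]}
print(clean(store, table))   # 1 -> the stale unit fpD is reclaimed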

HyperFlex can…
…deduplicate on the fly, without extra hardware. Look at HX ReadyClones…
Deduplication is inherent in the pointer-based file system.
[Diagram: ten ReadyClones (VM1.vmdk through VM10.vmdk) all point at the same stripe units I, J, K, L and M; no data is copied.]

HyperFlex can…
…facilitate VMs with a disk size greater than the total capacity of a single node
• No need to worry about whether there is enough room on a single node if the system has sufficient capacity
  – Made possible by even data distribution
[Diagram: VM1.vmdk's stripe units A through M are spread across all four nodes, so the virtual disk can exceed a single node's 8 TB capacity.]

Advantages of using the IOVisor
Read and write requests are distributed amongst the available CVMs.
• This eliminates hotspots
No controller VM necessary
• Facilitates diskless workstations
• If a CVM fails, load is distributed amongst the remaining CVMs, providing redundancy
The IOVisor calculates the correct primary CVM to send I/O to based on fingerprints
• No need for large tables in RAM
The IOVisor makes the existence of the hyperconvergence layer transparent to the hypervisor

How HX does faster Reads
• Traditional systems track inodes and offsets
• HX software tracks fingerprints and content
A fingerprinting algorithm is a procedure that maps an arbitrarily large data item (such as a computer file) to a much shorter bit string, its fingerprint, which uniquely identifies the original data for all practical purposes, just as human fingerprints uniquely identify people for practical purposes.
Source : https://en.wikipedia.org/wiki/Fingerprint_(computing)

Fingerprints: Advantages
In HX, the entire file-structure metadata is based on fingerprints
Fingerprints are stored in DRAM and on SSD and enable fast retrieval of data
If two objects point to the same fingerprint, we can leverage the data already fetched by someone else (see the sketch below)
Identical fingerprints are very common in VDI environments (boot storms, login storms)
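The "leverage data fetched by someone else" point comes down to indexing a cache by fingerprint rather than by file and offset: when many VDI clones read identical blocks, they all resolve to the same fingerprint and hit the same cache entry. A minimal, hypothetical sketch:

import hashlib

cache = {}   # fingerprint -> block, shared by every object on the node

def fetch(fingerprint, read_from_capacity):
    """Return the block for a fingerprint, touching capacity storage at most once."""
    if fingerprint not in cache:            # someone else may already have fetched it
        cache[fingerprint] = read_from_capacity(fingerprint)
    return cache[fingerprint]

boot_block = b"shared guest OS boot block"
fp = hashlib.sha256(boot_block).hexdigest()
capacity_reads = []

def slow_read(f):
    capacity_reads.append(f)    # count how often capacity is really touched
    return boot_block

fetch(fp, slow_read)            # first desktop in the boot storm: goes to capacity
fetch(fp, slow_read)            # every later desktop: served from the shared cache
print(len(capacity_reads))      # 1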

Log Structured Filesystem
[Diagram: a unique architecture in which the distributed file system spans the cluster, layered over a local file system on each node.]
Vendor: Cisco HX Data Platform
Local file system: StorFS (log structured)
Commentary:
• Compression friendly
• Snapshot friendly
• Better flash endurance
https://www.ingramflyhigher.com/campaigns/hyperflex-2.1/downloads/Cisco_HyperFlex_HX_Data_Platform_Software_White_Paper.pdf

Log Structured Filesystem
The de-duplicated data is grouped and compressed into self-addressable objects
These objects are written to disk in a log-structured, sequential manner
All incoming I/O, including random I/O, is written sequentially to both the caching and persistent tiers
The objects are distributed uniformly across all nodes
Sequential layout increases flash endurance

Log Structured Filesystem
Self-describing, variably sized, compressed objects
• Data blocks are compressed into objects and sequentially laid out in fixed segments
• Segments are laid out in a sequential manner
• Each object is uniquely addressable through a KEY
• Each key is fingerprinted and checksummed for data integrity (see the sketch below)
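One way to picture "self-describing, variably sized, compressed objects in fixed segments" is the sketch below: each object carries its key, fingerprint, checksum and length in a small header, and objects are appended into a fixed-size segment until it fills. The on-disk format shown is invented for illustration and is not the StorFS layout.

import hashlib, struct, zlib

SEGMENT_SIZE = 64 * 1024   # fixed segment size (illustrative)

def pack_object(key: bytes, data: bytes) -> bytes:
    """Build one self-describing object: header (key/fingerprint/checksum/length) + payload."""
    payload = zlib.compress(data)                      # variably sized after compression
    fp = hashlib.sha256(data).digest()                 # fingerprint of the content
    crc = zlib.crc32(payload)                          # checksum for data integrity
    header = struct.pack(">H32sII", len(key), fp, crc, len(payload))
    return header + key + payload

def append_to_segment(segment: bytearray, obj: bytes) -> bool:
    """Objects are written sequentially; a full segment is sealed and a new one started."""
    if len(segment) + len(obj) > SEGMENT_SIZE:
        return False                                   # caller seals this segment
    segment.extend(obj)
    return True

seg = bytearray()
append_to_segment(seg, pack_object(b"A", b"block A contents"))
append_to_segment(seg, pack_object(b"B", b"block B contents"))
print(len(seg), "bytes used of", SEGMENT_SIZE)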

Compression
Traditional write-in-place filesystems
• In traditional write-in-place filesystems there is a performance overhead:
  – The output of compression is variable length, and updating it in place creates inefficiencies on block boundaries: freed space causes fragmentation, and data that grows too large must be moved to other blocks.
  – When using large block sizes, you must do a read-decompress-modify-write.
Log-structured file system
• With an LSFS, all modifications are written to a new location:
  – No worry if the new data doesn't compress as well as the old data
  – Fragmentation is eliminated because compressed blocks are packed tightly together (see the sketch below)
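The difference shows up most clearly on updates: a write-in-place design must read, decompress, modify and rewrite the block in its old slot, while a log-structured design simply compresses the new version and appends it, leaving the stale object for the cleaner. A hypothetical sketch of the log-structured side:

import zlib

log = []          # the append-only log of compressed objects
index = {}        # key -> position of the live version in the log

def lsfs_update(key, new_data):
    """No read-decompress-modify-write: compress the new version and append it."""
    log.append(zlib.compress(new_data))   # always write to the end of the log
    index[key] = len(log) - 1             # repoint; the stale object awaits the cleaner

lsfs_update("A", b"version 1 of block A")
lsfs_update("A", b"version 2, which may compress worse - it does not matter")
print(len(log), index["A"])   # 2 objects in the log, key A points at the newest one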

End

Thank You