HEVC VIDEO CODEC By Vinayagam Mariappan

VinayagamMariappan1 3,374 views 60 slides Jun 07, 2016

About This Presentation

HEVC VIDEO CODEC FOR NEXT GENERATION BROADCASTING TECHNOLOGY


Slide Content

eSILICON LABS, INDIA
HEVC VIDEO CODEC
Vinayagam M
Video Electronics

2

3
Agenda
Video Coding
HEVC

4
VIDEO CODING

5
VIDEO CODING
•Lossless Compression
•Lossy Compression
•Transform Coding
•Motion Coding

6
VIDEO CODER ARCHITECTURE
•Image / Video Coding Based on Block-Matching
–Assume frame f-1 has been encoded and reconstructed, and frame f is the
current frame to be encoded
•Exploiting the redundancies
–Temporal
MC-Prediction (P and B frames)
–Spatial
Block DCT
–Color
Color Space Conversion
•Scalar quantization of DCT coefficients
•Zigzag scanning, runlength and Huffman coding of the nonzero
quantized DCT coefficients
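The zigzag scan and run-length step can be sketched as follows. This is a minimal sketch, assuming the JPEG-style zigzag order and (zero-run, level) pairs; the function names are illustrative, and the Huffman coding of the pairs is left out.

```python
def zigzag_order(n):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def run_length(coeffs):
    """Turn a scanned coefficient list into (zero_run, level) pairs."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    return pairs                                    # trailing zeros: end-of-block symbol


# Quantized 4x4 block (illustrative values only).
block = [[12, 5, 0, 0],
         [-3, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
scanned = [block[r][c] for r, c in zigzag_order(4)]
pairs = run_length(scanned)
print(pairs)
```

The scan groups the low-frequency coefficients first, so the zeros of the high-frequency region collapse into a few long runs.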

7
VIDEO CODER ARCHITECTURE…
•Video Encoder
–Divide frame f into equal-size blocks
–For each source block,
Find its motion vector using the block-matching algorithm based on the
reconstructed frame f-1
Compute the displaced frame difference (DFD) of the block
–Transmit the motion vector of each block to the decoder
–Compress the DFD of each block
–Transmit the encoded DFDs to the decoder
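The encoder steps can be sketched as a full-search block matcher. This is a minimal sketch under stated assumptions: the sum of absolute differences (SAD) is used as the matching cost, the search is exhaustive over a small window, and all function names and the test frames are illustrative.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between a block of cur and a displaced block of ref."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(n) for i in range(n))


def block_match(cur, ref, bx, by, n, search):
    """Full search: return the motion vector (dx, dy) with the smallest SAD."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= bx + dx and bx + dx + n <= w and
                      0 <= by + dy and by + dy + n <= h)
            if not inside:
                continue                      # candidate block leaves the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]


def dfd(cur, ref, bx, by, mv, n):
    """Displaced frame difference: current block minus its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i]
             for i in range(n)] for j in range(n)]


# ref plays the role of reconstructed frame f-1; cur is ref shifted by (2, 1).
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
mv = block_match(cur, ref, bx=2, by=2, n=4, search=2)
residual = dfd(cur, ref, 2, 2, mv, 4)
print(mv, residual)
```

When the match is exact, as in this synthetic example, the DFD is all zeros and costs almost nothing to code; real video leaves a small residual that goes through the transform and quantization stages.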

8
VIDEO CODER ARCHITECTURE…
•Video Encoder

9
VIDEO CODER ARCHITECTURE…
•Video Decoder
–Receive the motion vector of each block from the encoder
–Based on the motion vector, find the best-matching block from the
reference frame
i.e., find the predicted current frame from the reference frame
–Receive the encoded DFD of each block from the encoder
–Decode the DFD.
–Each reconstructed block in the current frame = its decompressed DFD +
the best-matching block

10
VIDEO CODER ARCHITECTURE…
•Video Decoder

11
VIDEO CODEC STANDARDS

12
VIDEO CODEC STANDARDS…
•Based on the same fundamental building blocks
–Motion-compensated prediction (I, P, and B frames)
–2-D Discrete Cosine Transform (DCT)
–Color space conversion
–Scalar quantization, runlengths, Huffman coding
•Additional tools added for different applications:
–Progressive or interlaced video
–Improved compression, error resilience, scalability, etc.
•MPEG-1/2/4, H.261/3/4
–Frame-based coding
•MPEG-4
–Object-based coding and Synthetic video

13
VIDEO CODEC STANDARDS…

14
HEVC

15
HEVC
•Video Coding Standards Overview
Next Generation Broadcasting

16
HEVC…
•MPEG-H
–High Efficiency Coding and Media Delivery in
Heterogeneous Environments: a new suite of
standards providing technical solutions for
emerging challenges in multimedia industries
–Part 1: System, MPEG Media Transport (MMT)
Integrated services with multiple components in a hybrid
delivery environment, providing support for seamless and
efficient use of heterogeneous network environments,
including broadcast, multicast, storage media and mobile
networks
–Part 2: Video, High Efficiency Video Coding
(HEVC)
Highly immersive visual experiences, with ultra high definition
displays that give no perceptible pixel structure even if
viewed from such a short distance that they subtend a large
viewing angle (up to 55 degrees horizontally for 4Kx2K
resolution displays, up to 100 degrees for 8Kx4K)
–Part 3: Audio, 3D-Audio
Highly immersive audio experiences in which the decoding
device renders a 3D audio scene. This may be using 10.2 or
22.2 channel configurations or much more limited speaker
configurations or headphones, such as found in a personal
tablet or smartphone.

17
HEVC…
•Transport/System Layer Integration
–Ongoing definitions (MPEG, IETF, …, DVB): benefit from H.264/AVC
–MPEG Media Transport (MMT)?

18
HEVC…
•HEVC = High Efficiency Video Coding
•Joint project between ISO/IEC/MPEG and ITU-T/VCEG
–ISO/IEC: MPEG-H Part 2 (23008-2)
–ITU-T: H.265
•JCT-VC committee
–Joint Collaborative Team on Video Coding
–Co-chairs: Dr. Gary Sullivan (Microsoft, USA) and Dr. Jens-Rainer Ohm (RWTH
Aachen, Germany)
•Target
–Roughly half the bit-rate at the same subjective quality compared to H.264/AVC (50%
over H.264/AVC)
–x10 complexity max for encoder and x2/3 max for decoder
•Requirements
–Progressive required for all profiles and levels
Interlaced support using field SEI message
–Video resolution: sub QVGA to 8Kx4K, with more focus on higher resolution video
content (1080p and up)
–Color space and chroma sampling: YUV420, YUV422, YUV444, RGB444
–Bit-depth: 8-14 bits
–Parallel Processing Architecture

19
HEVC…
•H.264 Vs H.265

20
HEVC…
•Potential applications
–Existing applications and usage scenarios
IPTV over DSL : Large shift in IPTV eligibility
Facilitated deployment of OTT and multi-screen services
More customers on the same infrastructure: most IP traffic is video
More archiving facilities
–New applications and usage scenarios
1080p60/50 with bitrates comparable to 1080i
Immersive viewing experience: Ultra-HD (4K, 8K)
Premium services (sports, live music, live events,…): home theater, Bars
venue, mobile
HD 3DTV Full frame per view at today’s HD delivery rates
What becomes possible with 50% video rate reduction?

21
HEVC…
•Tentative Timeline

22
HEVC…
•History

23
HEVC…
•H.264 Vs H.265

24
HEVC…
•H.264 Vs H.265

25
HEVC…
•HEVC Encoder

26
HEVC…
•HEVC Decoder

27
HEVC…
•Video Coding Techniques : Block-based hybrid video coding
–Interpicture prediction
Temporal statistical dependences
–Intrapicture prediction
Spatial statistical dependences
–Transform coding
Spatial statistical dependences
•Uses YCbCr color space with 4:2:0 subsampling
–Y component
Luminance (luma)
Represents brightness (gray level)
–Cb and Cr components
Chrominance (chroma).
Color difference from gray toward blue and red

28
HEVC…
•Video Coding Techniques : Block-based hybrid video coding
–Motion compensation
Quarter-sample precision is used for the MVs
7-tap or 8-tap filters are used for interpolation of fractional-sample
positions
–Intrapicture prediction
33 directional modes, planar (surface fitting), DC (flat)
Modes are encoded by deriving most probable modes (MPMs) based
on those of previously decoded neighboring PBs
–Quantization control
Uniform reconstruction quantization (URQ)
–Entropy coding
Context adaptive binary arithmetic coding (CABAC)
–In-loop deblocking filtering
Similar to the one in H.264 and more friendly to parallel processing
–Sample adaptive offset (SAO)
Nonlinear amplitude mapping
For better reconstruction of amplitude by histogram analysis

29
HEVC…
•Coding Tree Unit (CTU) - A picture is partitioned into CTUs
–The CTU is the basic processing unit, replacing macroblocks (MB)
–Contains luma CTBs and chroma CTBs
A luma CTB covers L × L samples
The two chroma CTBs each cover L/2 × L/2 samples
–HEVC supports variable-size CTBs
The value of L may be equal to 16, 32, or 64.
Selected according to the needs of encoders, in terms of memory and
computational requirements
Large CTB is beneficial when encoding high-resolution video content
–CTBs can be used as CBs or can be partitioned into multiple CBs using
quadtree structures
–The quadtree splitting process can be iterated until the size for a luma
CB reaches a minimum allowed luma CB size (8 × 8 or larger).
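The recursive quadtree split of a luma CTB into CBs can be sketched as follows. This is a minimal sketch: the `should_split` callback is a hypothetical stand-in for the encoder's decision (typically a rate-distortion choice), and the function name is illustrative.

```python
def split_ctb(x, y, size, min_cb, should_split):
    """Recursively partition a luma CTB into CBs; returns a list of (x, y, size)."""
    if size > min_cb and should_split(x, y, size):
        half = size // 2
        cbs = []
        for oy in (0, half):                  # visit the four quadrants in z-order
            for ox in (0, half):
                cbs += split_ctb(x + ox, y + oy, half, min_cb, should_split)
        return cbs
    return [(x, y, size)]                     # leaf: this block becomes one CB


# Hypothetical rule: keep splitting only the quadrant at the CTB origin.
cbs = split_ctb(0, 0, 64, min_cb=8, should_split=lambda x, y, s: x == 0 and y == 0)
print(cbs)
```

With a 64 × 64 CTB and an 8 × 8 minimum, this rule yields four 8 × 8 CBs, three 16 × 16 CBs and three 32 × 32 CBs, which together tile the CTB exactly.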

30
HEVC…
•Block Structure
–Coding Tree Units (CTU)
Corresponds to macroblocks in earlier coding standards (H.264, MPEG2, etc)
Luma and chroma Coding Tree Blocks (CTB)
Quadtree structure to split into Coding Units (CUs)
16x16, 32x32, or 64x64, signaled in SPS

31
HEVC…
•A new framework composed of three
new concepts
–Coding Units (CU)
–Prediction Units (PU)
–Transform Units (TU)
•The decision whether to code a
picture area using inter or intra
prediction is made at the CU level
Goal: To be as flexible as possible and to adapt the
compression-prediction to image peculiarities

32
HEVC…
•Block Structure
–Coding Units (CU)
Luma and chroma Coding Blocks (CB)
Rooted in CTU
Intra or inter coding mode
Split into Prediction Units (PUs) and Transform Units (TUs)

33
HEVC…
•Block Structure
–Prediction Units (PU)
Luma and chroma Prediction Blocks (PB)
Rooted in CU
Partition and motion info

34
HEVC…
•Block Structure
–Transform Units (TU)
Rooted in CU
4x4, 8x8, 16x16, 32x32 DCT, and 4x4 DST

35
HEVC…
•Relationship of CU, PU and TU

36
HEVC…
•Intra Prediction
–35 intra modes: 33 directional modes +
DC + planar
–For chroma, 5 intra modes: DC, planar,
vertical, horizontal, and luma derived
–Planar prediction (Intra_Planar)
Amplitude surface with a horizontal and
vertical slope derived from boundaries
–DC prediction (Intra_DC)
Flat surface with a value matching the
mean value of the boundary samples
–Directional prediction (Intra_Angular)
33 different directional predictions are
defined for square TB sizes from 4×4 up
to 32×32
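The DC and planar modes can be sketched as follows. This is a minimal sketch, assuming `top` and `left` each hold N + 1 reconstructed boundary samples (the extra entry being the top-right and bottom-left sample respectively) and that N is a power of two. The function names are illustrative, and the boundary smoothing applied after DC prediction is omitted.

```python
def intra_dc(top, left, n):
    """Flat block whose value is the rounded mean of the 2n boundary samples."""
    shift = n.bit_length()                    # equals log2(n) + 1 for power-of-two n
    dc = (sum(top[:n]) + sum(left[:n]) + n) >> shift
    return [[dc] * n for _ in range(n)]


def intra_planar(top, left, n):
    """Average of a horizontal and a vertical linear interpolation."""
    shift = n.bit_length()
    top_right, bottom_left = top[n], left[n]
    return [[((n - 1 - x) * left[y] + (x + 1) * top_right +
              (n - 1 - y) * top[x] + (y + 1) * bottom_left + n) >> shift
             for x in range(n)] for y in range(n)]


flat = intra_planar([100] * 5, [100] * 5, 4)          # flat boundary stays flat
slope = intra_planar([0, 0, 0, 0, 80], [0] * 5, 4)    # only the top-right sample is bright
print(flat[0], slope[0])
```

In the second example the predicted values rise from left to right along each row, which is the sloped surface the planar mode is meant to produce.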

37
HEVC…
•Intra Prediction
–Adaptive reference sample filtering
3-tap filter: [1 2 1]/4
Not performed for 4x4 blocks
For larger than 4x4 blocks, adaptively performed for a subset of modes
Modes except vertical/near-vertical, horizontal/near-horizontal, and DC
–Mode dependent adaptive scanning
4x4 and 8x8 intra blocks only
All other blocks use only diagonal upright scan (left-most scan pattern)

38
HEVC…
•Intra Prediction
–Boundary smoothing
Applied to DC, vertical, and horizontal modes, luma only
Reduces boundary discontinuity
–For DC mode, 1st column and row of samples in predicted block are
filtered
–For Hor/Ver mode, first column/row of pixels in predicted block are filtered

39
HEVC…
•Inter Prediction
–Fractional sample interpolation
¼ pixel precision for luma
–DCT based interpolation filters
8-/7-tap for luma
4-tap for chroma
Supports 16-bit implementation
with non-normative shift
–High precision interpolation and
biprediction
–DCT-IF design
Forward DCT, followed by
inverse DCT
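The luma fractional-sample interpolation can be sketched in one dimension. This is a minimal sketch: the coefficients are the HEVC luma half-sample (8-tap) and quarter-sample (7-tap, padded here with a zero) filters, while the edge clamping, the single-stage rounding and the function name are simplifying assumptions rather than the normative two-stage process.

```python
HALF = (-1, 4, -11, 40, 40, -11, 4, -1)       # 8-tap, coefficients sum to 64
QUARTER = (-1, 4, -10, 58, 17, -5, 1, 0)      # 7-tap plus a zero, sum is 64


def interpolate(row, pos, taps):
    """Fractional sample between row[pos] and row[pos + 1] for 8-bit input."""
    acc = 0
    for k, coeff in enumerate(taps):
        idx = min(max(pos - 3 + k, 0), len(row) - 1)   # clamp = repeat edge samples
        acc += coeff * row[idx]
    return min(max((acc + 32) >> 6, 0), 255)           # round, divide by 64, clip


ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
half = interpolate(ramp, 4, HALF)             # halfway between 40 and 50
quarter = interpolate(ramp, 4, QUARTER)       # a quarter of the way from 40 to 50
print(half, quarter)
```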

40
HEVC…
•Inter Prediction
–Asymmetric Motion Partition (AMP) for Inter PU
–Merge
Derive motion (MV and ref pic) from spatial and
temporal neighbors
The spatial/temporal neighbor to use is identified by
merge_idx
Number of merge candidates (≤ 5) signaled in slice
header
Skip mode = merge mode + no residual
–Advanced Motion Vector Prediction (AMVP)
Use spatial/temporal PUs to predict current MV

41
HEVC…
•Transforms
–Core transforms: DCT based
4x4, 8x8, 16x16, and 32x32
Square transforms only
Support partial factorization
Near-orthogonal
Nested transforms
–Alternative 4x4 DST
4x4 intra blocks, luma only
–Transform skipping mode
By-pass the transform stage
Most effective on “screen content”
4x4 TBs only
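The "near-orthogonal" property can be checked on the 4x4 core transform. This is a minimal sketch: the matrix is the integer 4x4 DCT approximation used by HEVC, and the helper name `gram` is illustrative.

```python
# Integer 4x4 core transform matrix (rows are the basis vectors).
M4 = [[64, 64, 64, 64],
      [83, 36, -36, -83],
      [64, -64, -64, 64],
      [36, -83, 83, -36]]


def gram(m):
    """Return m times its transpose; diagonal for mutually orthogonal rows."""
    n = len(m)
    return [[sum(m[i][k] * m[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


g = gram(M4)
for row in g:
    print(row)
```

The off-diagonal entries are all zero, so the rows are mutually orthogonal, while the row norms (16384 and 16370) differ slightly. That small mismatch is why the transform is described as near-orthogonal rather than exactly orthogonal.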

42
HEVC…
•Scaling and Quantization
–HEVC uses a uniform reconstruction quantization (URQ)
scheme controlled by a quantization parameter (QP).
–The range of the QP values is defined from 0 to 51
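The relation between QP and the quantizer step size can be sketched as follows. This is a minimal floating-point sketch, assuming the usual rule that the step size doubles for every increase of 6 in QP, with a step of 1 at QP 4. The standard itself works with integer scaling tables and shifts, and the function names are illustrative.

```python
def qstep(qp):
    """Approximate quantizer step size for a QP in the 0..51 range."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must be between 0 and 51")
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff, qp):
    """Uniform quantization: map a transform coefficient to an integer level."""
    return int(round(coeff / qstep(qp)))


def dequantize(level, qp):
    """Uniform reconstruction: every level maps back to level * step."""
    return level * qstep(qp)


level = quantize(100, 28)                     # step is 16 at QP 28
print(level, dequantize(level, 28))
```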

43
HEVC…
•Entropy Coding
–One entropy coder, CABAC
Reuse H.264 CABAC core algorithm
More friendly to software and hardware
implementations
Easier to parallelize, reduced HW area, increased
throughput
–Context modeling
Reduced # of contexts
Increased use of by-pass bins
Reduced data dependency
–Coefficient coding
Adaptive coefficient scanning for intra 4x4 and 8x8
▫Diagonal upright, horizontal, vertical
Processed in 4x4 blocks for all TU sizes
Sign data hiding:
▫Sign of first non-zero coefficient conditionally hidden in
the parity of the sum of the non-zero coefficient
magnitudes
▫Conditions: 2 or more non-zero coefficients, and
“distance” between first and last coefficient > 3
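The sign data hiding rule can be sketched for one 4x4 coefficient group. This is a minimal sketch: the list holds the levels of one group in scan order, the convention "even sum means positive, odd sum means negative" is stated here as an assumption, and the function names are illustrative.

```python
def sdh_applies(levels):
    """Conditions from the slide: at least 2 non-zero levels, spread over more than 3 positions."""
    nz = [i for i, v in enumerate(levels) if v != 0]
    return len(nz) >= 2 and nz[-1] - nz[0] > 3


def inferred_sign(levels):
    """Decoder side: derive the hidden sign from the parity of the magnitude sum."""
    parity = sum(abs(v) for v in levels) % 2
    return 1 if parity == 0 else -1           # assumed convention: even = positive


group = [3, 0, 0, -1, 0, 2]                   # illustrative levels in scan order
print(sdh_applies(group), inferred_sign(group))
```

On the encoder side, when the parity does not match the sign to be hidden, one coefficient magnitude is adjusted by 1 so that it does; choosing which one is an encoder decision and is not shown here.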

44
HEVC…
•Entropy Coding - CABAC
–Binarization: CABAC uses Binary Arithmetic Coding, which means that only binary decisions (1 or
0) are encoded. A non-binary-valued symbol (e.g. a transform coefficient or motion vector) is
"binarized" or converted into a binary code prior to arithmetic coding. This process is similar to the
process of converting a data symbol into a variable length code, but the binary code is further
encoded (by the arithmetic coder) prior to transmission.
–Stages are repeated for each bit (or "bin") of the binarized symbol.
–Context model selection: A "context model" is a probability model for one or more bins of the
binarized symbol. This model may be chosen from a selection of available models depending on
the statistics of recently coded data symbols. The context model stores the probability of each bin
being "1" or "0".
–Arithmetic encoding: An arithmetic coder encodes each bin according to the selected probability
model. Note that there are just two sub-ranges for each bin (corresponding to "0" and "1").
–Probability update: The selected context model is updated based on the actual coded value (e.g. if
the bin value was "1", the frequency count of "1"s is increased)
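The binarization, context modeling and probability update stages can be sketched as follows. This is a minimal sketch and not the CABAC engine itself: it assumes unary binarization, a single frequency-count context model, and it reports the ideal code length (minus log2 of the probability) in place of running an arithmetic coder. All names are illustrative.

```python
import math


def unary_binarize(value):
    """Map a non-negative integer to bins: value ones followed by a zero."""
    return [1] * value + [0]


class ContextModel:
    """Adaptive probability estimate for one bin, kept as frequency counts."""

    def __init__(self):
        self.counts = [1, 1]                  # start with equal counts for 0 and 1

    def prob(self, bin_value):
        return self.counts[bin_value] / sum(self.counts)

    def update(self, bin_value):
        self.counts[bin_value] += 1           # probability update after coding


def estimated_bits(symbols):
    """Ideal cost in bits of coding the symbols with one adaptive context."""
    ctx = ContextModel()
    bits = 0.0
    for s in symbols:
        for b in unary_binarize(s):
            bits += -math.log2(ctx.prob(b))   # cost an ideal arithmetic coder approaches
            ctx.update(b)
    return bits


print(unary_binarize(3), estimated_bits([0] * 20))
```

Twenty identical symbols cost far less than twenty bits here, because the model quickly learns that the bin "0" is very likely.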

45
HEVC…
•Parallel Processing Tools
–Slices
–Tiles
–Wavefront parallel processing (WPP)
–Dependent Slices
•Slices
–Slices are a sequence of CTUs that are processed in the order
of a raster scan. Slices are self-contained and independent
–Each slice is encapsulated in a separate packet

46
HEVC…
•Tile
–Self-contained and independently decodable rectangular regions
–Tiles provide parallelism at a coarse level of granularity
More tiles than cores: not efficient; tiles break dependencies

47
HEVC…
•WPP
–A slice is divided into rows of CTUs. Parallel processing of rows
–The decoding of each row can begin as soon as a few decisions have
been made in the preceding row for the adaptation of the entropy coder.
–Better compression than tiles. Parallel processing at a fine level of
granularity.
No WPP with tiles !!
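The wavefront dependency can be sketched as a schedule. This is a minimal sketch under stated assumptions: every CTU takes one time step, one thread is available per CTU row, and a CTU may start once its left neighbor and the CTU above and to the right of it are finished (the usual two-CTU lag between rows). The function name is illustrative.

```python
def wpp_schedule(rows, cols):
    """Earliest start step of each CTU under the wavefront dependencies."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            ready = []
            if c > 0:
                ready.append(start[r][c - 1] + 1)                   # left neighbor done
            if r > 0:
                above_right = min(c + 1, cols - 1)                  # clamp at the row end
                ready.append(start[r - 1][above_right] + 1)         # row above is ahead
            start[r][c] = max(ready, default=0)
    return start


schedule = wpp_schedule(3, 4)
for row in schedule:
    print(row)
```

For 3 rows of 4 CTUs the last CTU starts at step 7, so the picture finishes in 8 steps where a purely sequential pass takes 12.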

48
HEVC…
•Dependent Slices
–Separate NAL units but dependent (can only be decoded after part of
the previous slice)
–Dependent slices are mainly useful for ultra low delay applications
Remote surgery
–Error resiliency gets worse
–Low delay
–Good efficiency
Goes well with WPP

49
HEVC…
•Slice Vs Tile
–Tiles are a kind of zero-overhead slices
Slice header is sent at every slice but tile information once for a sequence
Slices have packet headers too
Each tile can contain a number of slices and vice versa
–Slices are for:
Controlling packet sizes
Error resiliency
–Tiles are for:
Controlling parallelism (multiple core architecture)
Defining ROI regions

50
HEVC…
•Tile Vs WPP
–WPP
Better compression than tiles
Parallel processing at a fine level of granularity
But…
Needs frequent communication between processing units
If high number of cores, can’t get full utilization
–Good for when
Relatively small number of nodes
Good inter-core communication
No need to match to MTU size
Big enough shared cache

51
HEVC…
•In-Loop Filters
–Two processing steps, a deblocking filter (DBF) followed by a
sample adaptive offset (SAO) filter, are applied to the
reconstructed samples
The DBF is intended to reduce the blocking artifacts due to
block-based coding
The DBF is only applied to the samples located at block
boundaries
The SAO filter is applied adaptively to all samples satisfying
certain conditions, e.g. based on gradient.

52
HEVC…
•Loop Filters: Deblocking
–Applied to all samples adjacent to a PU or TU boundary
Except the case when the boundary is also a picture boundary, or
when deblocking is disabled across slice or tile boundaries
–HEVC only applies the deblocking filter to the edges that are
aligned on an 8×8 sample grid
This restriction reduces the worst-case computational complexity
without noticeable degradation of the visual quality
It also improves parallel-processing operation
–The processing order of the deblocking filter is defined as
horizontal filtering for vertical edges for the entire picture first,
followed by vertical filtering for horizontal edges.

53
HEVC…
•Loop Filters: Deblocking
–Simpler deblocking filter in HEVC (vs H.264)
–Deblocking filter boundary strength is set according to
Block coding mode
Existence of nonzero coefficients
Motion vector difference
Reference picture difference

54
HEVC…
•Loop Filters: SAO
–A process that modifies the decoded
samples by conditionally adding an
offset value to each sample after the
application of the deblocking filter,
based on values in look-up tables
transmitted by the encoder.
–SAO: Sample Adaptive Offsets
New loop filter in HEVC
Non-linear filter
–For each CTB, signal SAO type and
parameters
–Encoder decides SAO type and
estimates SAO parameters
(rate-distortion opt.)
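The conditional offset can be sketched for the SAO edge offset type on a single row of samples. This is a minimal sketch: each sample is classified against its two horizontal neighbors, the offset table stands in for the values an encoder would transmit, and the function names and offsets are illustrative. Band offset and the other edge directions are not shown.

```python
def eo_category(a, c, b):
    """Classify sample c against neighbors a and b (0 means no offset)."""
    if c < a and c < b:
        return 1                              # local minimum
    if (c < a and c == b) or (c == a and c < b):
        return 2                              # concave corner
    if (c > a and c == b) or (c == a and c > b):
        return 3                              # convex corner
    if c > a and c > b:
        return 4                              # local maximum
    return 0


def sao_edge_offset_row(row, offsets):
    """Add the per-category offset to every interior sample of a deblocked row."""
    out = list(row)
    for i in range(1, len(row) - 1):
        cat = eo_category(row[i - 1], row[i], row[i + 1])   # classify on the input row
        out[i] = min(max(row[i] + offsets[cat], 0), 255)    # clip to the 8-bit range
    return out


offsets = {0: 0, 1: 2, 2: 1, 3: -1, 4: -2}    # hypothetical look-up table
filtered = sao_edge_offset_row([10, 8, 10, 10, 13, 10], offsets)
print(filtered)
```

Positive offsets for minima and negative offsets for maxima pull outliers back toward their neighbors, which is how SAO reduces ringing around edges.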

55
HEVC…
•Special Coding
–I_PCM mode
The prediction, transform, quantization and entropy coding are bypassed
The samples are directly represented by a pre-defined number of bits
Main purpose is to avoid excessive consumption of bits when the signal
characteristics are extremely unusual and cannot be properly handled by hybrid
coding
–Lossless mode
The transform, quantization, and other processing that affects the decoded picture
are bypassed
The residual signal from inter- or intrapicture prediction is directly fed into the
entropy coder
It allows mathematically lossless reconstruction
SAO and deblocking filtering are not applied to these regions
–Transform skipping mode
Only the transform is bypassed
Improves compression for certain types of video content such as
computer-generated images or graphics mixed with camera-view content
Can be applied to TBs of 4×4 size only

56
HEVC…
•High Level Parallelism
–Independently decodable packets
–Sequence of CTUs in raster scan
–Error resilience
–Parallelization
–Independently decodable (re-entry)
–Rectangular region of CTUs
–Parallelization (esp. encoder)
–1 slice = more tiles, or 1 tile = more slices
–Rows of CTUs
–Decoding of each row can be parallelized
–Shaded CTU can start when gray CTUs in
row above are finished
–Main profile does not allow tiles + WPP
combination

57
HEVC…
•Profiles, Levels and Tiers
–Historically, a profile defines a collection of coding
tools, whereas a level constrains decoder
processing load and memory requirements
–The first version of HEVC defined 3 profiles
Main Profile: 8-bit video in YUV 4:2:0 format
Main 10 Profile: same as Main, up to 10-bit
Main Still Picture Profile: same as Main, one
picture only
–Levels and Tiers
Levels: max sample rate, max picture size,
max bit rate, DPB and CPB size, etc
Tiers: “main tier” and “high tier” within one
level

58
HEVC…
•Complexity Analysis
–Software-based HEVC decoder capabilities
(published by NTT Docomo)
Single-threaded: 1080p@30 on ARMv7
(1.3GHz),1080p@60 decoding on i5
(2.53GHz)
Multi-threaded: 4Kx2K@60 on i7 (2.7GHz),
12Mbps, decoding speed up to 100fps
–Other independent software-based HEVC
real-time decoder implementations published
by Samsung and Qualcomm during HEVC
development
–Decoder complexity not substantially higher
More complex modules: MC, Transform, Intra
Pred, SAO
Simpler modules: CABAC and deblocking

59
HEVC…
•Quality Performance

60
THANK YOU