Digital Communication: Information Theory


About This Presentation

Presentation on Information Theory covering entropy, joint entropy, information rate, source encoding, channel capacity, and Shannon's channel capacity theorem.


Slide Content

Digital Communication
Dr. S. M. Gulhane
Professor & Head, Dept. of Electronics & Telecommunication Engineering,
Jawaharlal Darda Institute of Engineering & Technology, Yavatmal
Sant Gadge Baba Amravati University, Maharashtra, India
Unit-2 : Information Theory
1

Information Theory
•The discrete information source consists of a discrete set of letters or symbols.
•A message emitted by a source consists of a sequence of symbols.
•Every message coming out of the source contains some information.
•However, the information contained in a message varies from one message to another; some messages convey more information than others.
•In order to quantify the information contained in a message, it is necessary to define a measure of information.
2

Measure of information
•An event that occurs with low probability carries more information than an event with larger probability.
•The more uncertain the event is, the more information its occurrence conveys.
•Thus the amount of information received is inversely related to the probability of its occurrence.
•If p is the probability of a message and I is the information gained from the message, then
as p → 1, I → 0, and as p → 0, I → ∞
•This can be expressed as
3
Ik = log(1/pk) = -log pk

Ik = -log2 pk   bits
Ik = -log10 pk  hartleys
Ik = -loge pk   nats

Measure of information
Properties of the measure of information:
•Ik = 0 for pk = 1 (an absolutely certain event conveys no information)
•Ik ≥ 0 for 0 ≤ pk ≤ 1
•Ik > Ii for pk < pi
4
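
As an added illustration (not part of the original slides), the short Python sketch below evaluates this measure in bits, nats and hartleys; the probability value 0.25 is just an assumed example.

import math

def self_information(p, base=2.0):
    # I = log_base(1/p) = -log_base(p); undefined for p = 0
    return -math.log(p) / math.log(base)

p = 0.25  # assumed example probability
print(self_information(p, 2))       # bits     -> 2.0
print(self_information(p, math.e))  # nats     -> ~1.386
print(self_information(p, 10))      # hartleys -> ~0.602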

Entropy
•Let us assume an information source which emits K distinct symbols m1, m2, ..., mk, ..., mK with probabilities p1, p2, ..., pk, ..., pK respectively.
•Assume the source has delivered a statistically independent sequence of N symbols (N → ∞); then symbol mk occurs Npk times and each occurrence of mk conveys information Ik = -log pk.
•Therefore the total information due to symbol mk in the sequence is -Npk log pk.
•The total information due to all symbols in the sequence is
Itotal = -N ∑k pk log pk
•The entropy H of the source is defined as the average information per symbol
H = Itotal/N = -∑k pk log pk   (k = 1, ..., K)
5
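
A minimal Python sketch of this definition (added here for illustration); the probability list is an assumed example and must sum to 1.

import math

def entropy(probs):
    # H = -sum(pk * log2(pk)); terms with pk = 0 contribute nothing
    return -sum(p * math.log2(p) for p in probs if p > 0)

# assumed example: four symbols with probabilities 1/2, 1/4, 1/8, 1/8
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol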

6
Entropy
•A simple interpretation of the source entropy is that, on average, we can expect to get H bits of information per symbol from the source, even though we cannot say in advance which symbol sequence will occur.
•The above equation for entropy applies to a source that emits symbols in statistically independent sequences, i.e. each symbol emitted is independent of the previous symbols.
•The entropy is maximum when all the symbols are equiprobable, i.e. when the symbol probabilities are equal:
p1 = p2 = ... = pK = 1/K
•Then
Hmax = -∑ (1/K) log2(1/K) = -K (1/K) log2(1/K) = log2 K

Entropy
7
Consider the situation where you have just two symbols with probabilities p and (1-p).
The average information per symbol is
H = -p log2 p - (1-p) log2(1-p)
At p = 0, H = 0, and at p = 1, H = 0 again.
The maximum value of H is obtained at p = 1/2:
Hmax = ½ log2 2 + ½ log2 2 = log2 2 = 1
Hmax = 1 bit/message
A plot of H versus p shows this behaviour.
The above observation can be generalized for a source with an alphabet of M symbols.
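
To see numerically that H peaks at p = 1/2 with Hmax = 1 bit, here is a small added sketch (not from the slides) that evaluates the binary entropy over a grid of p values.

import math

def binary_entropy(p):
    # H(p) = -p log2 p - (1-p) log2(1-p); zero at the endpoints
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0]:
    print(f"p = {p:.2f}  H = {binary_entropy(p):.4f} bits")
# the maximum H = 1.0000 bit occurs at p = 0.5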

Information Rate
•The average rate at which information is transferred is called the information rate.
r – symbol rate (symbols/sec)
H – entropy (bits/symbol)
R = rH bits/sec
8
Example: A source puts out one of five possible messages during each message interval. The probabilities of these messages are
p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/16, p5 = 1/16.
What is the information content of these messages?

Problems on entropy and rate of information
•P1: A discrete source emits one of five symbols once every millisecond. The symbol probabilities are 1/2, 1/4, 1/8, 1/16 and 1/16. Find the source entropy and the information rate.
•P2: An event has 6 possible outcomes with probabilities p1 = 1/2, p2 = 1/4, p3 = 1/8, p4 = 1/16, p5 = 1/32, p6 = 1/32. Find the entropy and the rate of information if there are 16 outcomes/sec.
•P3: In a binary source, 0s occur three times as often as 1s. Find the information contained in the 0s and 1s. Express the information in bits, nats and hartleys. If the output data rate is 16,000 symbols/sec, find the rate of information. (P0 = 3/4, P1 = 1/4)
•P4: The output of an information source consists of 128 symbols, 16 of which occur with a probability of 1/32 and the remaining 112 with a probability of 1/224. The source emits 1000 symbols/sec. Assume that the symbols are chosen independently. Find a) the entropy and b) the rate of information. (H = 16×(1/32)×log 32 + 112×(1/224)×log 224)
9
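
As a cross-check on problems of this type, the added sketch below works P2 (six outcomes with probabilities 1/2, 1/4, 1/8, 1/16, 1/32, 1/32 and 16 outcomes/sec).

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/32]   # P2
r = 16                                      # outcomes per second
H = entropy(probs)
print(f"H = {H:.4f} bits/outcome")          # 1.9375
print(f"R = r*H = {r * H:.2f} bits/sec")    # 31.00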

Continued..
•P5: Consider a telegraph source having two symbols, dot (.) and dash (-). The dot duration is 0.2 sec and the dash duration is 3 times the dot duration. The probability of a dot occurring is twice that of a dash. The time between symbols is 0.2 sec. Calculate the information rate of the telegraph source.
(P(.) = 2/3, P(-) = 1/3. Consider 100 transmitted symbols: they contain 66.6 dots, 33.3 dashes and 50 spaces of 0.2 sec duration, so the 100 symbols are emitted in 13.3 + 20 + 10 sec and the symbol rate is 100/43.3.)
•P7: A high-resolution black and white TV picture consists of 2×10^6 picture elements and 16 different brightness levels. Pictures are repeated at a rate of 32 per sec. All picture elements are assumed to be independent and equally likely. Calculate the average rate of information conveyed by the TV picture source.
(The rate of occurrence of picture elements is 2×10^6 × 32 elements/sec; H = log K = log 16.)
10
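
A quick numerical check of P7 along the lines of the hint (added for illustration): with 16 equally likely brightness levels, H = log2 16 = 4 bits per picture element.

import math

levels = 16
elements_per_picture = 2e6
pictures_per_sec = 32

H = math.log2(levels)                        # 4 bits per picture element
r = elements_per_picture * pictures_per_sec  # picture elements per second
print(f"R = {r * H:.3e} bits/sec")           # 2.560e+08 bits/sec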

Joint entropy: Entropy of sequences generated from several sources
•To study the behaviour of a communication system, we must simultaneously study the behaviour of the transmitter and the receiver. This uses the concept of a two-dimensional probability scheme.
•Finding the entropy of the two-dimensional probability scheme gives the joint entropy.
•Consider two sources S1 and S2 delivering symbols X = (x1, x2, ..., xm) and Y = (y1, y2, ..., yn).
•Let P(xi) and P(yj) denote the probabilities of X = xi and Y = yj, where xi and yj are symbols generated from the two sources.
•Then the joint probability of X = xi and Y = yj is P(xi, yj).
•The joint entropy of X and Y is defined as
H(X,Y) = ∑i ∑j P(xi, yj) log(1/P(xi, yj))   (i = 1, ..., m; j = 1, ..., n)
11

Three probabilities and their entropies
•H(X) = -∑i P(xi) log P(xi)   (i = 1, ..., m)
•H(Y) = -∑j P(yj) log P(yj)   (j = 1, ..., n)
•H(X,Y) = -∑i ∑j P(xi, yj) log P(xi, yj)   (i = 1, ..., m; j = 1, ..., n)
•The joint probability in terms of the conditional probability is
P(xi, yj) = P(xi|yj) P(yj)
12

Conditional entropy
•The conditional probability is given by
P(xi|yj) = P(xi, yj) / P(yj)
•Hence, the conditional entropy of X given Y is
H(X|Y) = ∑i ∑j P(xi, yj) log(1/P(xi|yj))   (i = 1, ..., m; j = 1, ..., n)
•Using P(xi, yj) = P(xi|yj) P(yj), the joint entropy can be written as
H(X,Y) = -∑i ∑j P(xi, yj) log(P(xi|yj) P(yj))
13

Relationship between joint probability and conditional probability
•If we add all the joint probabilities for a fixed yj, we get
∑i P(xi, yj) = P(yj)   (i = 1, ..., m)
similarly,
∑j P(xi, yj) = P(xi)   (j = 1, ..., n)
Also
∑i P(xi) = 1
and
∑j P(yj) = 1
14

Relationship between joint probability and conditional probability
H(X,Y) = -∑i ∑j P(xi, yj) log(P(xi|yj) P(yj))
       = -∑i ∑j P(xi, yj) [log P(xi|yj) + log P(yj)]
       = -∑i ∑j P(xi, yj) log P(xi|yj) - ∑i ∑j P(xi, yj) log P(yj)
       = H(X|Y) + H(Y)     (since ∑i P(xi, yj) = P(yj))
Hence
H(X,Y) = H(Y) + H(X|Y) = H(X) + H(Y|X)
15

Continued..
•If xi and yj are independent events, we know that P(xi, yj) = P(xi) P(yj); then
H(X|Y) = H(X), H(Y|X) = H(Y)
and
H(X,Y) = H(X) + H(Y)
•If xi and yj are dependent events, then
H(X,Y) = H(Y) + H(X|Y)
H(X,Y) = H(X) + H(Y|X)
16

Problems on conditional and joint entropies
P7: The joint probability function of two random variables X and Y is tabulated below:
             0.2   0.12  0.05
P(X,Y) =
             0.08  0.15  0.4
Find a) the conditional probabilities P(X|Y) and P(Y|X), b) the conditional entropies.
P8: Two discrete random variables X and Y have binary outcomes defined as x1 = 0, x2 = 1, y1 = 0, y2 = 1. The joint probabilities of X and Y are given as
P(x1,y1) = 1/4, P(x1,y2) = 1/4, P(x2,y1) = 1/2, P(x2,y2) = 0
Find the self entropies H(X) and H(Y), the joint entropy H(X,Y), and the conditional entropy H(X|Y).
P10: Consider a channel with noise characteristics
             0.6  0.2  0.2
P(X|Y) =     0.2  0.6  0.2
             0.2  0.2  0.6
with P(x1) = 1/8, P(x2) = 1/8, P(x3) = 6/8.
Find (i) the entropy of the source H(X), (ii) the entropy of the receiver H(Y), (iii) the joint entropy and (iv) the conditional entropy.
17
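
For example, P8 can be checked numerically with the sketch below (an added illustration using the joint probabilities given in the problem).

import math

# joint probabilities P(xi, yj) from P8; rows = x1, x2 and columns = y1, y2
P = [[0.25, 0.25],
     [0.50, 0.00]]

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

Px = [sum(row) for row in P]                              # marginal of X
Py = [sum(P[i][j] for i in range(2)) for j in range(2)]   # marginal of Y
HX = H(Px)
HY = H(Py)
HXY = H([p for row in P for p in row])
print(f"H(X)   = {HX:.4f} bits")        # 1.0000
print(f"H(Y)   = {HY:.4f} bits")        # 0.8113
print(f"H(X,Y) = {HXY:.4f} bits")       # 1.5000
print(f"H(X|Y) = {HXY - HY:.4f} bits")  # 0.6887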

Five entropies
•H(X): average information per character at the transmitter, or entropy of the transmitter.
•H(Y): average information per character at the receiver, or entropy of the receiver.
•H(X,Y): average information per pair of transmitted and received characters, or average uncertainty of the communication system as a whole.
•H(X|Y): a received character yk may have resulted from the transmission of one of the xj with a given probability; H(X|Y) is a measure of the information about the transmitter when it is known that Y is received.
•H(Y|X): a transmitted character xj may result in the reception of one of the yk with a given probability; H(Y|X) is a measure of the information about the receiver when it is known that X is transmitted.
18

Analysis of entropies
•H(X) and H(Y) give indications of the probabilistic nature of the transmitter and the receiver respectively.
•H(X|Y) indicates how well one can recover the transmitted symbols from the received symbols, i.e. it gives the measure of equivocation.
•H(Y|X) indicates how well one can recover the received symbols from the transmitted symbols, i.e. it gives the measure of error, or noise.
19

Markoff source: Entropy of a source emitting dependent symbol sequences
•In such sequences, the occurrence of any symbol depends on the occurrence of the previous symbols in the sequence.
•All practical sources emit sequences of symbols that are statistically dependent. Such a source is said to have memory.
•This statistical dependency reduces the amount of information coming out of such a source compared with the amount of information coming from a source emitting the same set of symbols in statistically independent sequences.
•The entropy and information of a discrete source that emits dependent sequences of symbols can be defined and determined by developing a statistical model called a Markoff statistical model, and such a source is called a Markoff source.
20

Markoff source
•The Markoff sources are generally specified by a set of conditional probabilities p(xi | s1, s2, ..., sq-1).
•p(xi | s1, s2, ..., sq-1) indicates the probability of occurrence of symbol xi in the q-th position given that (s1, s2, ..., sq-1) has occurred. Thus p(xi) depends on the earlier q-1 symbols.
•The conditional probabilities of the Markoff source are termed transition probabilities.
21

Markoff source
•The interval between the occurrences of two successive symbols is the symbol interval.
•In a Markoff source, it is assumed that the source changes state during each symbol interval.
•The probability of transition from one state to another depends only on the state during the current symbol interval and not on the state during any of the preceding symbol intervals.
•When the state changes from i to j the source emits a symbol, and the particular symbol emitted depends on the initial state i and the state transition from i to j.
•The behaviour of the Markoff source can be predicted from its state diagram.
22

Markoff source
State diagram:
•A state diagram is a graph where the states are represented by nodes and a transition between states is represented by a directed line from the initial to the final state. The transition probabilities and the symbols emitted corresponding to the various transitions are usually marked along the lines of the graph.
•For example, a state diagram of a source emitting three symbols A, B and C is shown in the figure.
23
[State diagram: three states 1, 2 and 3 with transitions labelled by the emitted symbol (A, B or C) and its probability (values of ¼ and ½ as marked in the figure); initial state probabilities P1(1) = P2(1) = P3(1) = 1/3.]

Markoff source
Tree diagram:
•A tree diagram is a planar graph where the nodes correspond to states and the branches correspond to transitions. Transitions between states occur once every Ts seconds.
•Along the branches of the tree, the transition probabilities and the symbols emitted are indicated.
•The tree diagram can be used to obtain the probabilities of generating various symbol sequences.
•In general, the probability of the source emitting a particular symbol sequence can be computed by summing the products of probabilities along all the paths of the tree diagram that yield the particular sequence of interest.
24

25

26
P(AB) = P(S1=1, S2=1, S3=3)
or P(S1=2, S2=1, S3=3)
or P(S1=3, S2=1, S3=3)
P(AB) = P(S1=1, S2=1, S3=3)
      + P(S1=2, S2=1, S3=3)
      + P(S1=3, S2=1, S3=3)
P(S1=1, S2=1, S3=3)
  = P(S1=1) P(S2=1|S1=1) P(S3=3|S1=1, S2=1)
  = P(S1=1) P(S2=1|S1=1) P(S3=3|S2=1)
  = 1/3 × 1/2 × 1/4 = 1/24
and P(AB) = 1/12
Similarly the probabilities of occurrence of the other symbol sequences can be computed.

Entropy of Markoff Source
•The entropy of a Markoff source is the average of the entropies of its states:
H = ∑i Pi Hi   bits/symbol   (i = 1, ..., n)
where Pi is the probability that the source is in state i and n is the number of states.
•The entropy of the i-th state is defined as the average information content of the symbols emitted from that state:
Hi = ∑j Pij log2(1/Pij)   bits/symbol   (j = 1, ..., n)
where Pij is the probability of going from state i to state j.
27
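
A minimal sketch of this calculation (added here; the transition matrix and state probabilities are assumed example values, not taken from the state diagram above).

import math

# Pij: probability of going from state i to state j (assumed example; rows sum to 1)
P = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
# Pi: probability that the source is in each state (assumed example)
Pi = [1/3, 1/3, 1/3]

def state_entropy(row):
    # Hi = sum_j Pij * log2(1/Pij)
    return sum(p * math.log2(1 / p) for p in row if p > 0)

H = sum(Pi[i] * state_entropy(P[i]) for i in range(len(Pi)))
print(f"H = {H:.4f} bits/symbol")   # 1.5 bits/symbol for this example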

Source encoding
•The process of efficient representation of data generated by a discrete source is known as source encoding.
•The source encoder converts the symbol sequence into a binary sequence by assigning a codeword to each symbol in the input sequence.
•The codewords assigned may be either fixed length or variable length.
•In a fixed-length code, a binary codeword of fixed length is assigned to each symbol, whereas in a variable-length code more bits are assigned to rarely occurring symbols and fewer bits to frequently occurring symbols.
28

Source encoding
•The objective in designing a source encoder is to find a unique binary codeword Ci of length ni bits for each message mi, i = 1, 2, 3, ..., q, such that the average number of bits per symbol used in the coding scheme is as close as possible to the average information per symbol, i.e. GN.
•The performance of the encoder is usually measured in terms of the coding efficiency, which is given by
η = GN / L
where L represents the average number of bits per source symbol used in the source encoding process,
L = ∑i pi ni   (i = 1, ..., q)
•The source encoder is said to be efficient when η approaches unity.
29

Source encoding
•The objective in designing a source encoder is to find a unique binary codeword Ci of length ni bits for each message mi such that the average number of bits per symbol used in the coding scheme is as close as possible to the average information per symbol, i.e. GN.
•At the same time the code should be uniquely decipherable, to ensure that no source information is lost.
•A necessary and sufficient condition for a binary code to be uniquely decipherable is given by the Kraft inequality, which states that the codeword lengths should satisfy
K = ∑i 2^(-Ni) ≤ 1   (i = 1, ..., m)
where Ni is the length of the codeword assigned to message i and m is the number of distinct messages.
30

Problems
P1: Consider a source emitting messages A, B, C, D with probabilities 1/2, 1/4, 1/8, 1/8 respectively. The output is encoded using four different codes as shown in the table. Determine which code provides efficient and uniquely decipherable coding. (A worked check follows below.)

mi   pi    Code I   Code II   Code III   Code IV
A    1/2   0        0         0          00
B    1/4   1        01        10         01
C    1/8   10       011       110        10
D    1/8   11       0111      111        11
31
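
One way to work this problem is sketched below (an added illustration): compute the Kraft sum and the average codeword length L for each code and compare L with the source entropy H. A Kraft sum greater than 1 rules a code out; a sum of at most 1 by itself does not prove a particular code is uniquely decipherable, so the codewords still have to be inspected.

import math

p = [1/2, 1/4, 1/8, 1/8]
codes = {
    "Code I":   ["0", "1", "10", "11"],
    "Code II":  ["0", "01", "011", "0111"],
    "Code III": ["0", "10", "110", "111"],
    "Code IV":  ["00", "01", "10", "11"],
}

H = -sum(pi * math.log2(pi) for pi in p)           # source entropy, 1.75 bits
for name, cw in codes.items():
    K = sum(2 ** -len(c) for c in cw)              # Kraft sum
    L = sum(pi * len(c) for pi, c in zip(p, cw))   # average codeword length
    print(f"{name}: Kraft sum = {K:.3f}, L = {L:.3f}, H/L = {H / L:.3f}")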

Shannon-Fano encoding
•Shannon-Fano encoding is the first established and widely used encoding method. The method and the corresponding code were invented simultaneously and independently of each other by C. Shannon and R. Fano in 1948.
•It provides a procedure for efficiently assigning variable-length binary codes to DMS symbols, such that the resulting 1's and 0's appear independently and with almost equal probability.
•Shannon-Fano encoding constructs reasonably efficient separable binary codes for sources without memory. Separable codes are those codes which are uniquely decipherable.
•The Shannon-Fano algorithm provides a powerful and efficient yet simple algorithm for coding.
32

Shannon-Fano encoding
The steps in the Shannon-Fano encoding algorithm are:
1. Arrange the messages mi in order of decreasing probability Pi.
2. Find the number of bits ni in the codeword Ci to be assigned to message mi from the bounding equation
   log2(1/Pi) ≤ ni < log2(1/Pi) + 1
   where ni is an integer bounded by the above equation.
3. Find Fi = P1 + P2 + ... + Pi-1, with F1 = 0.
4. For the first message m1, assign the codeword given by the first n1 bits of F1, i.e. 000... (n1 zeros).
5. Find n2 and F2, find the binary equivalent of F2, truncate the binary expansion to n2 bits, and obtain the codeword for m2.
6. Repeat the steps to obtain Ci for all mi.
33

Shannon-Fano encoding
•In general, to obtain the codeword Ci for the i-th message mi:
find ni;
find Fi = ∑ Pk (sum over the preceding messages);
find the binary expansion of Fi;
truncate the binary expansion to ni bits to obtain the codeword Ci for message mi.
•Design a source encoder for an information source emitting the source alphabet A, B, C, ..., H with probabilities P(xi) given by 1/2, 1/4, 1/16, 1/16, 1/32, 1/32, 1/32, 1/32. (A sketch of this procedure is given below.)
34
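
A compact added sketch of the procedure just described, applied to the design exercise above; it assumes the probabilities are already listed in decreasing order and sum to 1.

import math

def shannon_fano(probs):
    # codeword lengths from log2(1/Pi) <= ni < log2(1/Pi) + 1
    n = [math.ceil(math.log2(1 / p)) for p in probs]
    codes, F = [], 0.0
    for length, p in zip(n, probs):
        # binary expansion of the cumulative probability F, truncated to 'length' bits
        bits, frac = "", F
        for _ in range(length):
            frac *= 2
            bit = int(frac)
            bits += str(bit)
            frac -= bit
        codes.append(bits)
        F += p
    return codes

# source from the design exercise above
probs = [1/2, 1/4, 1/16, 1/16, 1/32, 1/32, 1/32, 1/32]
for sym, code in zip("ABCDEFGH", shannon_fano(probs)):
    print(sym, code)
# A 0, B 10, C 1100, D 1101, E 11100, F 11101, G 11110, H 11111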

Shannon-Fano encoding
•Design a source encoder for the information source generating the sequences shown in the table. Compare the average output bit rate and the efficiency of the coder for N = 1 and N = 2.
•N = 1:
mi   Pi    ni   Ci
A    3/8   2    00
B    3/8   2    01
C    1/4   2    11
•N = 2:
mi   Pi     ni   Ci
AA   9/64
BB   9/64
AB   3/32
AC   3/32
BA   3/32
BC   3/32
CA   3/32
CB   3/32
CC   1/16
35

Huffman Encoding
•This encoding algorithm was proposed by David A. Huffman in 1952, and it is still a basic lossless-compression encoding algorithm.
•Huffman encoding ensures the construction of separable codes (the unique-decipherability property holds) with minimum redundancy for a set of discrete messages (letters).
•It provides a method for constructing codes with the shortest average code length.
•However, it is more difficult to implement.
36

Huffman Encoding
1. Arrange the messages in order of descending probability.
2. Combine the last two messages into one message with probability equal to the sum of their two probabilities.
3. Rearrange the messages in order of descending probability in the second column and repeat step 2.
4. Repeat this procedure until the number of messages is reduced to two.
5. Assign 0 and 1 to these two messages in the last column as the first digit of their codewords.
6. Go back one column and assign 0 and 1 as the second digit for the two messages that were combined in the previous step.
7. Keep progressing this way until the first column is reached.
8. The code so obtained in the first column is the final code.
37

Huffman Coding
•Construct the Huffman code for the following set of messages: x1, x2, x3, x4, x5 with probabilities 0.4, 0.2, 0.2, 0.1, 0.1 respectively.

mi   Pi    S1    S2    S3
x1   0.4   0.4   0.4   0.6
x2   0.2   0.2   0.4   0.4
x3   0.2   0.2   0.2
x4   0.1   0.2
x5   0.1

Reading the 0/1 assignments back through the columns gives the codewords
x1 = 00, x2 = 10, x3 = 11, x4 = 010, x5 = 011.
38
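
The steps above can be sketched in Python as follows (an added illustration; a heap keeps the messages ordered by probability). The exact 0/1 labels can differ from the table above depending on how ties are broken, but the codeword lengths (2, 2, 2, 3, 3) and the average length of 2.2 bits/message are the same.

import heapq
import itertools

def huffman(probs):
    # probs: dict symbol -> probability; returns dict symbol -> codeword
    counter = itertools.count()  # tie-breaker so the heap never compares lists
    heap = [(p, next(counter), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    codes = {sym: "" for sym in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two least probable groups
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1:
            codes[s] = "0" + codes[s]
        for s in syms2:
            codes[s] = "1" + codes[s]
        heapq.heappush(heap, (p1 + p2, next(counter), syms1 + syms2))
    return codes

print(huffman({"x1": 0.4, "x2": 0.2, "x3": 0.2, "x4": 0.1, "x5": 0.1}))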

Lossy Source Coding
•Huffman coding and Shannon-Fano coding are lossless source coding techniques, because the source information is recovered exactly, without any loss.
•In lossy source coding, some of the information is lost during coding and is never recovered.
•Examples: the MPEG and JPEG techniques used for video and images.
39

Communication Channels
•Depending on the terminal points and functionality, a communication channel can be characterized as an
analog channel, or
discrete channel.
•Analog channel
lies between points d and d';
provides the electrical connection between the transmitter and the receiver;
its input and output are analog electrical waveforms;
it is often called the analog channel, also referred to as the modulation channel.
40

Discrete Communication Channel
41
d’d
c c’

Analog Channel
•The analog channel is subject to several varieties of degradation due to
amplitude and frequency response variations of the channel,
variation of the channel characteristics with time, and
nonlinearities in the channel.
These cause degradation of the signal. The channel also corrupts the signal in a random manner through various types of additive and multiplicative noise and fades.
•All of these factors introduce errors in the data transmission and limit the maximum rate at which data can be transmitted over the channel.
42

Discrete Channel
•Discrete channel
The communication channel between points c and c' is discrete in nature and is called the discrete communication channel, since its input and output are discrete signals.
In general, the input to and output from such a channel are symbols belonging to a set of M symbols.
•Due to errors in the channel, the output symbol may differ from the input symbol during some symbol intervals.
•Errors are mainly due to signal distortion in the analog portion of the communication channel.
•One of the most important parameters of a discrete communication channel is the channel capacity, which represents the maximum rate at which data can be transferred between two points with a small probability of error.
43

Statistical Model for Discrete Communication Channel
•Channels designed to transmit and receive one of M possible symbols are called M-ary channels.
•If M = 2, the channel is called a binary channel.
•The discrete channel is characterized by a set of probabilities Pit and Pij, where Pit is the probability that the i-th symbol is transmitted and Pij is the probability that the j-th symbol is received when the i-th symbol is transmitted over the channel.
•The binary discrete communication channel can be modeled as shown below.
[Transition diagram: transmitted symbols X = 0, 1 and received symbols Y = 0, 1, with transition probabilities P00 and P11 for correct reception and P01 and P10 for errors.]
44

Binary Discrete Channel
•Consider a memoryless channel, i.e. the occurrence of an error during any bit interval does not affect the channel behaviour during other bit intervals.
•X and Y are binary-valued discrete random variables representing the input and output respectively.
•Let P0t – probability that 0 is transmitted = P(X=0)
•P1t – probability that 1 is transmitted = P(X=1)
•P0r – probability that 0 is received = P(Y=0)
•P1r – probability that 1 is received = P(Y=1)
•Then the probability of error is given by
Pe = P(X≠Y) = P(X=0, Y=1) + P(X=1, Y=0)
   = P(Y=1|X=0) P(X=0) + P(Y=0|X=1) P(X=1)
Pe = P01 P0t + P10 P1t
45

Binary Symmetric Channel (BSC)
•The probabilities P0r and P1r can be expressed as
P0r = P00 P0t + P10 P1t
P1r = P11 P1t + P01 P0t
Thus the discrete channel is characterized by a set of probabilities.
•In a binary channel, when P00 = P11 = p the channel is called a binary symmetric channel (BSC), where p is the probability of correct transmission.
•A BSC can be modeled as
[Transition diagram: X=0 → Y=0 and X=1 → Y=1 each with probability p; X=0 → Y=1 and X=1 → Y=0 each with probability 1-p.]
46
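
A sketch of the BSC relations above (added for illustration; the numerical values are assumed examples).

p   = 0.9            # probability of correct transmission, P00 = P11 = p
P0t = 0.6            # probability that 0 is transmitted
P1t = 1 - P0t        # probability that 1 is transmitted

P10 = P01 = 1 - p    # crossover (error) probabilities
P0r = p * P0t + P10 * P1t        # probability that 0 is received
P1r = p * P1t + P01 * P0t        # probability that 1 is received
Pe  = P01 * P0t + P10 * P1t      # overall probability of error

print(f"P0r = {P0r:.2f}, P1r = {P1r:.2f}, Pe = {Pe:.2f}")  # 0.58, 0.42, 0.10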

Rate of information transmission over a discrete channel
•Let us consider a discrete memoryless channel.
•Let the source emit symbols X = {x1, x2, ..., xM} and the receiver receive symbols Y = {y1, y2, ..., yM}.
•Then the average amount of information per symbol applied to the channel is given by the entropy
H(X) = -∑i P(xi) log P(xi)   bits/symbol   (i = 1, ..., M)
•Similarly, the entropy of the output Y, i.e. the average information per symbol received at the output of the channel, is given by
H(Y) = -∑i P(yi) log P(yi)   bits/symbol   (i = 1, ..., M)
47

Rate of information transmission over a discrete channel
•If the channel is noiseless, then the reception of some symbol yj uniquely determines the message transmitted.
•Because of noise, however, there is a certain amount of uncertainty regarding the transmitted symbol when yj is received.
•If P(xi|yj) represents the conditional probability that xi was transmitted when yj is received, then the conditional entropy H(X|Y) represents the average uncertainty about the transmitted symbol xi when symbol yj is received.
•In other words, H(X|Y) represents how uncertain we are about X, on average, when we know Y.
H(X|Y) = -∑i ∑j P(xi, yj) log P(xi|yj)
•If Y = X, there is no uncertainty about the input X when the output Y is known; the uncertainty is zero, i.e. H(X|Y) = 0, and no information is lost in the channel since the output is uniquely related to the input.
•Thus H(X|Y) represents the measure of the amount of information lost in the channel.
48

Rate of information transmission over a discrete channel
•Accordingly, we can define the amount of information transmitted over a channel by subtracting the information lost from the amount of information supplied to the channel.
•That is, the amount of information transmitted over a channel (or received by the receiver) is given by
I(X;Y) = H(X) - H(X|Y)
I(X;Y) is called the mutual information of X and Y.
•If rs is the symbol rate, then the average rate at which information is applied to the channel is given by
Rin = H(X) rs bits/sec
•and the rate of information transmission over the channel is given by
Rt = I(X;Y) rs = [H(X) - H(X|Y)] rs bits/sec
49

Channel matrix
•A channel is completely specified by the complete set of transition probabilities [P(Y|X)]:

               y1          y2        ...     yn
   x1      P(y1|x1)    P(y2|x1)     ...   P(yn|x1)
   x2      P(y1|x2)    P(y2|x2)     ...   P(yn|x2)
   :           :           :                  :
   xm      P(y1|xm)    P(y2|xm)     ...   P(yn|xm)
50

Example
1. If the source emits one symbol per message and provides the input to the channel given below, find the maximum rate of information transmission over the channel. Given rs = 1000 symbols/sec.

The channel diagram maps inputs A, B, C (x1, x2, x3) to outputs A, B, C (y1, y2, y3) with the channel matrix

                 y1    y2    y3
           x1   0.8    0     0.2
P(Y|X) =   x2   0.3    0.7   0
           x3   0.1    0     0.9

2. For the channel matrix given in the above problem, determine the rate of information when the symbol probabilities are p(x1) = 1/2, p(x2) = 1/4, p(x3) = 1/4. Also determine the maximum possible rate of information transmission. (A numerical sketch follows below.)
51
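
A numerical sketch of problem 2 (added here; it uses the channel matrix as reconstructed above and rs = 1000 symbols/sec from problem 1).

import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

Px  = [1/2, 1/4, 1/4]                    # p(x1), p(x2), p(x3)
PYX = [[0.8, 0.0, 0.2],                  # channel matrix P(Y|X), rows = x
       [0.3, 0.7, 0.0],
       [0.1, 0.0, 0.9]]
rs  = 1000                               # symbols/sec

# joint distribution and marginal of Y
Pxy = [[Px[i] * PYX[i][j] for j in range(3)] for i in range(3)]
Py  = [sum(Pxy[i][j] for i in range(3)) for j in range(3)]

HX   = H(Px)
HXY  = H([p for row in Pxy for p in row])
HX_Y = HXY - H(Py)                       # H(X|Y) = H(X,Y) - H(Y)
I    = HX - HX_Y                         # mutual information I(X;Y)
print(f"I(X;Y) = {I:.4f} bits/symbol, Rt = {I * rs:.1f} bits/sec")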

Channel Capacity
•The channel capacity C of a channel is defined as the maximum possible rate of information that can be transmitted through the channel.
•We have seen that the information transmitted is a function of the probabilities of transmission of the input symbols and of the channel matrix.
•For a given channel, I(X;Y) is maximum for some set of input probabilities P(xi). This maximum value is the channel capacity:
C = max I(X;Y)   bits/symbol
    P(xi)
Thus C represents the maximum information that can be transmitted per symbol over the channel.
•In a BSC, the maximum rate of transmission occurs when the input symbols are equiprobable, i.e. P(0) = P(1) = 0.5.
•In order to maximize the rate of information transfer it is required to optimize the source statistics through source coding.
52
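
To illustrate C = max over P(xi) of I(X;Y), here is a brute-force added sketch for a BSC; the correct-transmission probability p = 0.9 is an assumed example. The maximum turns up at P(0) = P(1) = 0.5, in line with the slide, and equals 1 - Hb(1-p).

import math

def Hb(q):
    # binary entropy in bits
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def bsc_mutual_information(q, p):
    # q = P(X=0), p = probability of correct transmission
    y0 = q * p + (1 - q) * (1 - p)       # P(Y=0)
    return Hb(y0) - Hb(1 - p)            # I(X;Y) = H(Y) - H(Y|X)

p = 0.9                                  # assumed example
best = max((bsc_mutual_information(q / 1000, p), q / 1000) for q in range(1001))
print(f"C = {best[0]:.4f} bits/symbol at P(0) = {best[1]:.2f}")
# about 0.5310 bits/symbol at P(0) = 0.50, i.e. 1 - Hb(0.1)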

Shanon’s Channel Capacity Theorem
•Let Cbe the capacity of a discrete memory less channel and
H be the entropy of discrete information source emitting r
s
symbols/ sec, then the shannon’scapacity theorem states
that if r
sH≤ C then there exist a coding scheme such that the
output of the source can be transmitted over the channel with
the arbitrarily small probability of errors.
•A converse of these theorem states that it is not possible to
transmit the message without error if r
sH> C
•Thus the capacity theorem predicts essentially error free
transmission over a noisy channel,
•Unfortunately, the theorem tells us only about the existence
of the codes and says nothing about how to construct them.
53

Shanon’s Hartley Channel Capacity Theorem
•The capacity of a channel in AWGN in terms of bandwidth B
and SNR is given by
C = B log
2(1+S/N) bits/sec
where S-Signal power
N-Noise power
•Theorem provide two important information
Gives the relation between bandwidth of the channel and SNR to achieve
the desired channel capacity.
Provide upper limit that can be reached for reliable transmission rate
over AWGN channel
Thus the system designer can try to optimize the system to have a data
rate as close to channel capacity C with an acceptable error rate
54
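
A quick numerical sketch of this formula (added; the bandwidth and SNR values are assumed examples, e.g. a telephone-type channel with B = 3100 Hz and SNR = 30 dB).

import math

def channel_capacity(bandwidth_hz, snr_db):
    snr_linear = 10 ** (snr_db / 10)                 # convert dB to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)  # C = B log2(1 + S/N)

print(f"{channel_capacity(3100, 30):.0f} bits/sec")  # about 30898 bits/sec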

Thank You…..