BSM with Software Package for Social Sciences

profgnagarajan 53 views 39 slides Sep 21, 2024

Slide Content

Unit-2 PROBABILITY AND RANDOM VARIABLES & PROBABILITY DISTRIBUTIONS

Basic Concepts of Probability

Probability Experiments
Probability experiment: An action, or trial, through which specific results (counts, measurements, or responses) are obtained.
Outcome: The result of a single trial in a probability experiment.
Sample space: The set of all possible outcomes of a probability experiment.
Event: One or more outcomes; a subset of the sample space.
Example: Probability experiment: roll a die. Outcome: {3}. Sample space: {1, 2, 3, 4, 5, 6}. Event: "the die is even" = {2, 4, 6}.

Example: Identifying the Sample Space
A probability experiment consists of tossing a coin and then rolling a six-sided die. Describe the sample space.
Solution: There are two possible outcomes when tossing a coin: a head (H) or a tail (T). For each of these, there are six possible outcomes when rolling a die: 1, 2, 3, 4, 5, or 6. One way to list outcomes for actions occurring in a sequence is to use a tree diagram.
Tree diagram branches: H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6
The sample space has 12 outcomes: {H1, H2, H3, H4, H5, H6, T1, T2, T3, T4, T5, T6}

Simple Events
A simple event is an event that consists of a single outcome, e.g. "tossing heads and rolling a 3" = {H3}. An event that consists of more than one outcome is not a simple event, e.g. "tossing heads and rolling an even number" = {H2, H4, H6}.
Example: Identifying Simple Events
Determine whether the event is simple. You roll a six-sided die; event B is rolling at least a 4.
Solution: Not simple (event B has three outcomes: rolling a 4, a 5, or a 6).

Some Basic Relationships of Probability
Some basic probability relationships can be used to compute the probability of an event without knowledge of all the sample-point probabilities: the complement of an event, the union of two events, the intersection of two events, and mutually exclusive events.
Complement of an Event
The complement of event A is defined to be the event consisting of all sample points that are not in A. The complement of A is denoted by A^c. (In a Venn diagram, A^c is the region of the sample space S outside event A.)

Intersection of Two Events
The intersection of events A and B is the set of all sample points that are in both A and B. It is denoted by A ∩ B. (In a Venn diagram, it is the overlap of A and B within the sample space S.)
Mutually Exclusive Events
If events A and B are mutually exclusive, P(A ∩ B) = 0, and the addition law reduces to:
P(A ∪ B) = P(A) + P(B)
There is no need to include the term −P(A ∩ B).

Formal Probability
1. Two requirements for a probability: A probability is a number between 0 and 1. For any event A, 0 ≤ P(A) ≤ 1.
2. Probability Assignment Rule: The probability of the set of all possible outcomes of a trial must be 1: P(S) = 1, where S represents the set of all possible outcomes.
3. Complement Rule: The set of outcomes that are not in the event A is called the complement of A, denoted A^c. The probability of an event occurring is 1 minus the probability that it doesn't occur: P(A) = 1 − P(A^c).
4. Addition Rule: For two disjoint events A and B, the probability that one or the other occurs is the sum of the probabilities of the two events: P(A or B) = P(A) + P(B), provided that A and B are disjoint.
5. Multiplication Rule: For two independent events A and B, the probability that both A and B occur is the product of the probabilities of the two events: P(A and B) = P(A) × P(B), provided that A and B are independent.

Addition Law
The addition law gives the probability that event A occurs, event B occurs, or both occur:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Example: A deck of 52 cards is shuffled. What is the probability of drawing:
a. a heart or a diamond?
b. a face card (King, Queen, Jack) or an Ace?
Solution (in both cases the events are mutually exclusive, so the intersection term is 0):
P(Heart) = 13/52 = 1/4; P(Diamond) = 13/52 = 1/4
P(Heart ∪ Diamond) = P(Heart) + P(Diamond) = 1/4 + 1/4 = 1/2
P(Face card) = 12/52 = 3/13; P(Ace) = 4/52 = 1/13
P(Face card ∪ Ace) = P(Face card) + P(Ace) = 3/13 + 1/13 = 4/13
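The card calculations above can be checked with exact rational arithmetic. This is an illustrative sketch (the helper name `p_union` is ours, not from the slides):

```python
from fractions import Fraction

def p_union(p_a, p_b, p_a_and_b):
    # Addition law: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
    return p_a + p_b - p_a_and_b

# Heart or diamond: mutually exclusive, so P(A ∩ B) = 0
p_heart = Fraction(13, 52)
p_diamond = Fraction(13, 52)
print(p_union(p_heart, p_diamond, Fraction(0)))  # 1/2

# Face card or ace: also mutually exclusive
p_face = Fraction(12, 52)
p_ace = Fraction(4, 52)
print(p_union(p_face, p_ace, Fraction(0)))       # 4/13
```

Using `Fraction` keeps the results exact (1/2, 4/13) rather than rounded decimals.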

Conditional Probabilities
To find the probability of the event B given the event A, we restrict our attention to the outcomes in A. We then find in what fraction of those outcomes B also occurred:
P(B | A) = P(A ∩ B) / P(A)
Note: P(A) cannot equal 0, since we know that A has occurred.
Example: A box contains 20 red balls, 30 blue balls, and 10 green balls. If a ball is drawn and it is not blue, what is the probability that it is red?
Solution: P(Red) = 20/60 and P(Not Blue) = 30/60. Since every red ball is also not blue, P(Red ∩ Not Blue) = P(Red), so
P(Red | Not Blue) = P(Red) / P(Not Blue) = (20/60) / (30/60) = 2/3
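The ball example can be computed directly from the counts; a minimal sketch (variable names are ours):

```python
from fractions import Fraction

red, blue, green = 20, 30, 10
total = red + blue + green

# P(Red | Not Blue) = P(Red ∩ Not Blue) / P(Not Blue)
# Every red ball is "not blue", so the intersection is just P(Red).
p_red_and_not_blue = Fraction(red, total)
p_not_blue = Fraction(red + green, total)

p_red_given_not_blue = p_red_and_not_blue / p_not_blue
print(p_red_given_not_blue)  # 2/3
```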

The General Multiplication Rule
We encountered the general multiplication rule in the form of conditional probability. Rearranging the equation in the definition for conditional probability, we get the general multiplication rule: for any two events A and B,
P(A and B) = P(A) × P(B | A), or equivalently P(A and B) = P(B) × P(A | B)
Example: A survey found that 60% of males smoke and 40% of females smoke; males constitute 55% of the population and females 45%. What is the probability that:
a. a randomly selected person is a male smoker?
b. a randomly selected smoker is female?
Solution:
P(Male ∩ Smoker) = P(Male) × P(Smoker | Male) = 0.55 × 0.6 = 0.33
P(Female ∩ Smoker) = P(Female) × P(Smoker | Female) = 0.45 × 0.4 = 0.18
For part b, P(Female | Smoker) = P(Female ∩ Smoker) / P(Smoker) = 0.18 / (0.33 + 0.18) ≈ 0.353
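The smoker survey can be worked through step by step; this sketch follows the multiplication rule and then finishes part b by dividing out the total smoker probability (variable names are ours):

```python
# General multiplication rule: P(A and B) = P(A) * P(B | A)
p_male, p_female = 0.55, 0.45
p_smoker_given_male, p_smoker_given_female = 0.6, 0.4

# a. P(male smoker) and, by the same rule, P(female smoker)
p_male_smoker = p_male * p_smoker_given_male          # 0.33
p_female_smoker = p_female * p_smoker_given_female    # 0.18

# b. P(female | smoker): female-smoker share divided by all smokers
p_smoker = p_male_smoker + p_female_smoker            # 0.51
p_female_given_smoker = p_female_smoker / p_smoker
print(round(p_female_given_smoker, 4))  # 0.3529
```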

Independent and Dependent Events
Independent events: Events A and B are independent if the occurrence or non-occurrence of one does not affect the probability of the other. Characteristics:
1. P(A ∩ B) = P(A) × P(B)
2. P(A | B) = P(A)
3. P(B | A) = P(B)
Dependent events: Events A and B are dependent if the occurrence or non-occurrence of one affects the probability of the other. Characteristics:
1. P(A ∩ B) ≠ P(A) × P(B)
2. P(A | B) ≠ P(A)
3. P(B | A) ≠ P(B)

Bayes' Theorem
Introduction: Bayes' theorem, also known as Bayes' rule, Bayes' law, or Bayesian reasoning, determines the probability of an event under uncertain knowledge. In probability theory, it relates the conditional probabilities and marginal probabilities of two random events. Bayes' theorem was named after the British mathematician Thomas Bayes. Bayesian inference is an application of Bayes' theorem and is fundamental to Bayesian statistics.
What is Bayes' theorem? Bayes' theorem is a method of calculating conditional probability. The traditional method of calculating conditional probability (the probability that one event occurs given the occurrence of a different event) is to use the conditional probability formula: calculate the joint probability of event one and event two occurring at the same time, then divide by the probability of event two occurring. However, conditional probability can also be calculated in a slightly different fashion by using Bayes' theorem.

When calculating conditional probability with Bayes' theorem, you use the following steps:
1. Determine the probability of evidence B being observed, assuming that hypothesis A is true: P(B | A).
2. Determine the probability of A being true: P(A).
3. Multiply the two probabilities together.
4. Divide by the probability of B occurring: P(B).
This means the formula for Bayes' theorem can be expressed as:
P(A | B) = P(B | A) × P(A) / P(B)
Calculating the conditional probability this way is especially useful when the reverse conditional probability can be easily calculated, or when calculating the joint probability would be too challenging. How? It is a way to calculate the value of P(A | B) with the knowledge of P(B | A). Bayes' theorem allows updating the probability prediction of an event by observing new information from the real world. Example: if cancer risk is related to one's age, then by using Bayes' theorem we can determine the probability of cancer more accurately with the help of age.

Bayes' Theorem: Derivation
Bayes' theorem can be derived using the product rule and the conditional probability of event A with known event B. From the product rule we can write:
P(A ∩ B) = P(A | B) P(B)
Similarly, the probability of event B with known event A:
P(A ∩ B) = P(B | A) P(A)
Equating the right-hand sides of both equations, we get:
P(A | B) = P(B | A) P(A) / P(B)    ... (a)
The above equation (a) is called Bayes' rule or Bayes' theorem. This equation is the basis of most modern AI systems for probabilistic inference.

It shows the simple relationship between joint and conditional probabilities. Here:
• P(A | B) is known as the posterior, which we need to calculate; it is read as the probability of hypothesis A given observed evidence B.
• P(B | A) is called the likelihood: assuming the hypothesis is true, the probability of observing the evidence.
• P(A) is called the prior probability: the probability of the hypothesis before considering the evidence.
• P(B) is called the marginal probability: the overall probability of the evidence.
In equation (a), we can in general write P(B) = Σᵢ P(Aᵢ) P(B | Aᵢ), hence Bayes' rule can be written as:
P(Aᵢ | B) = P(Aᵢ) P(B | Aᵢ) / Σⱼ P(Aⱼ) P(B | Aⱼ)
where A₁, A₂, A₃, …, Aₙ is a set of mutually exclusive and exhaustive events.

Bayes' Theorem Statement
Let E₁, E₂, …, Eₙ be a set of events associated with a sample space S, where all the events E₁, E₂, …, Eₙ have nonzero probability of occurrence and they form a partition of S. Let A be any event associated with S. Then, according to Bayes' theorem, for any k = 1, 2, 3, …, n:
P(Eₖ | A) = P(Eₖ) P(A | Eₖ) / Σᵢ P(Eᵢ) P(A | Eᵢ)
Proof: According to the conditional probability formula,
P(Eₖ | A) = P(Eₖ ∩ A) / P(A)    ... (1)
Using the multiplication rule of probability,
P(Eₖ ∩ A) = P(Eₖ) P(A | Eₖ)    ... (2)
Using the total probability theorem,
P(A) = Σᵢ P(Eᵢ) P(A | Eᵢ)    ... (3)

Putting the values from equations (2) and (3) into equation (1), we get:
P(Eₖ | A) = P(Eₖ) P(A | Eₖ) / Σᵢ P(Eᵢ) P(A | Eᵢ)
Example scenario: You receive an email, and you want to determine the probability that it is spam based on the presence of the word "discount" in the email.
Given data:
• Prior probability of spam: P(Spam) = 0.3 (30% of emails are spam).
• Prior probability of not spam: P(Not Spam) = 0.7 (70% of emails are not spam).
• Probability of "discount" given spam: P("discount" | Spam) = 0.8 (80% of spam emails contain the word "discount").
• Probability of "discount" given not spam: P("discount" | Not Spam) = 0.1 (10% of non-spam emails contain the word "discount").

Objective: We want to find P(Spam | "discount"), the probability that an email is spam given that it contains the word "discount".
Step 1: Calculate P("discount") using the law of total probability:
P("discount") = P("discount" | Spam) · P(Spam) + P("discount" | Not Spam) · P(Not Spam)
= (0.8 · 0.3) + (0.1 · 0.7) = 0.24 + 0.07 = 0.31
Step 2: Apply Bayes' theorem:
P(Spam | "discount") = P("discount" | Spam) · P(Spam) / P("discount") = (0.8 · 0.3) / 0.31 = 0.24 / 0.31 ≈ 0.7742
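The two steps above translate directly into code; this is a sketch of the spam calculation (the helper name `bayes` is ours):

```python
def bayes(p_b_given_a, p_a, p_b):
    # Bayes' theorem: P(A | B) = P(B | A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

p_spam, p_not_spam = 0.3, 0.7
p_word_given_spam, p_word_given_not_spam = 0.8, 0.1

# Step 1: law of total probability for P("discount")
p_word = p_word_given_spam * p_spam + p_word_given_not_spam * p_not_spam
print(round(p_word, 2))             # 0.31

# Step 2: Bayes' theorem for P(Spam | "discount")
p_spam_given_word = bayes(p_word_given_spam, p_spam, p_word)
print(round(p_spam_given_word, 4))  # 0.7742
```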

Random Variables
We can associate each single outcome of an experiment with a real number; we refer to the outcomes of such experiments as a "random variable". Why is it called a "random variable"?
Definition: For a given sample space S of some experiment, a random variable (r.v.) is a rule that associates a number with each outcome in the sample space S. In mathematical language, a random variable is a function whose domain is the sample space and whose range is the set of real numbers:
X : S → ℝ
So, for any outcome s, X(s) = x is a real number.

Types of Random Variables
Discrete random variables take on a countable number of values. Example: the number of heads when flipping three coins; the possible values are 0, 1, 2, or 3.
Continuous random variables can take on any value within a given range. Example: the height of students in a classroom; the possible values can be any real number within a certain range (e.g., 150 cm to 200 cm).
Probability Distributions for Discrete Random Variables
Probabilities assigned to the various outcomes in the sample space S in turn determine probabilities associated with the values of any particular random variable defined on S. The probability mass function (pmf) of X, p(x), describes how the total probability is distributed among all the possible values of the r.v. X: p(X = x) for each value x in the range of X. Often p(X = x) is simply written as p(x), and by definition
p(X = x) = P({s ∈ S | X(s) = x}) = P(X⁻¹(x))
Note that the domain and range of p(x) are real numbers.

Example: A lab has 6 computers. Let X denote the number of these computers that are in use during lunch hour, so X takes values in {0, 1, 2, …, 6}. Suppose that the probability distribution of X is as given in the following table. From the table, we can find many things:
• the probability that at most 2 computers are in use
• the probability that at least half of the computers are in use
• the probability that there are 3 or 4 computers free

Expectation of a Random Variable: Mean, Variance, and Standard Deviation
The expectation (mean), variance, and standard deviation are fundamental concepts in probability and statistics related to random variables.
Expectation (mean): The expectation, or mean, of a random variable provides a measure of central tendency. It indicates where the values of the random variable are concentrated.
For a discrete random variable X, the expected value E(X) is calculated as:
E(X) = Σₓ x · P(X = x)
where P(X = x) is the probability of X taking the value x.
For a continuous random variable X, the expected value is calculated as:
E(X) = ∫ x f(x) dx
where f(x) is the probability density function (PDF) of X.

Variance: Variance measures the dispersion or spread of the random variable's values around the mean. A higher variance indicates that the values are more spread out.
For a discrete random variable X, the variance Var(X) is calculated as:
Var(X) = Σₓ (x − E(X))² · P(X = x)
Alternatively, it can also be expressed as:
Var(X) = E(X²) − [E(X)]²
For a continuous random variable X, the variance is given by:
Var(X) = ∫ (x − E(X))² f(x) dx
Standard deviation: The standard deviation is the square root of the variance, SD(X) = √Var(X). It provides a measure of the average distance of the random variable's values from the mean, making it easier to interpret than variance.
Formula summary:
• Mean (expectation) E(X): the average value of a random variable.
• Variance Var(X): the measure of how much the values vary around the mean.
• Standard deviation SD(X): the average distance of values from the mean, providing context for the variance.
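The discrete formulas above can be sketched in a few lines. The pmf below is an illustrative example of ours (not from the slides), chosen only to exercise the formulas:

```python
from math import sqrt

# Hypothetical pmf: values a discrete r.v. X can take, with probabilities
values = [0, 1, 2, 3]
probs  = [0.1, 0.3, 0.4, 0.2]
assert abs(sum(probs) - 1.0) < 1e-9  # a pmf must sum to 1

# E(X) = sum of x * P(X = x)
mean = sum(x * p for x, p in zip(values, probs))
# Var(X) = E(X^2) - [E(X)]^2
var = sum(x**2 * p for x, p in zip(values, probs)) - mean**2
sd = sqrt(var)

print(round(mean, 2))  # 1.7
print(round(var, 2))   # 0.81
print(round(sd, 2))    # 0.9
```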

The Binomial Probability Distribution
The binomial distribution was introduced by the Swiss mathematician James (Jacob) Bernoulli (1654–1705) around 1700 and was first published in 1713. It is also known as the Bernoulli distribution.
Definition: The binomial random variable X associated with a binomial experiment consisting of n trials is defined as X = the number of S's among the n trials. This is identical to defining X as the sum of n independent and identically distributed Bernoulli random variables, where S is coded as 1 and F as 0.
Imagine a simple trial with only two possible outcomes: success (S) or failure (F). Examples: the toss of a coin (heads or tails), the sex of a newborn (male or female), the survival of an organism in a region (live or die). Suppose that the probability of success is p. What is the probability of failure? q = 1 − p.

Examples:
• Toss of a coin (S = head): p = 0.5, q = 0.5
• Roll of a die (S = 1): p = 0.1667, q = 0.8333
• Fertility of a chicken egg (S = fertile): p = 0.8, q = 0.2
Imagine that a trial is repeated n times. Examples: a coin is tossed 5 times; a die is rolled 25 times; 50 chicken eggs are examined. Assume p remains constant from trial to trial and that the trials are statistically independent of each other. What is the probability of obtaining x successes in n trials?
Example: What is the probability of obtaining 2 heads from a coin that is tossed 5 times? For one particular sequence, P(HHTTT) = (1/2)⁵ = 1/32.

But there are more possibilities: HHTTT, HTHTT, HTTHT, HTTTH, THHTT, THTHT, THTTH, TTHHT, TTHTH, TTTHH. So P(2 heads) = 10 × 1/32 = 10/32. In general, when order is not important,
P(x) = [n! / (x!(n − x)!)] · pˣ · qⁿ⁻ˣ
where n!/(x!(n − x)!) is the number of ways to obtain x successes in n trials, and i! = i · (i − 1) · (i − 2) · … · 2 · 1.
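The binomial formula above can be sketched directly using Python's built-in binomial coefficient:

```python
from math import comb

def binom_pmf(x, n, p):
    # P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 2 heads in 5 tosses of a fair coin
print(binom_pmf(2, 5, 0.5))  # 0.3125  (= 10/32)
```

`comb(5, 2)` counts the 10 orderings listed above, so the result matches the hand calculation.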

CHARACTERISTICS OF THE BINOMIAL DISTRIBUTION
• It is a discrete distribution which gives the theoretical probabilities.
• It depends on the parameter p or q (the probability of success or failure) and n (the number of trials).
• The parameter n is always a positive integer.
• The distribution is symmetrical if p = q, and skewed (asymmetric) if p ≠ q.
• The statistics of the binomial distribution are: mean = np, variance = npq, and standard deviation = √(npq).
• The mode of the binomial distribution is the value of X with the highest frequency.

What is the Poisson Distribution?
The Poisson distribution is a limiting form of the binomial distribution in which n, the number of trials, becomes very large and p, the probability of success of the event, is very small. When there is a large number of trials but a small probability of success, the binomial calculation becomes impractical. Example: the number of deaths from horse kicks in the Army in different years. The mean number of successes from n trials is µ = np. Example: 64 deaths in 20 years from thousands of soldiers. If we substitute µ/n for p and let n tend to infinity, the binomial distribution becomes the Poisson distribution:
P(x) = e⁻µ µˣ / x!
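The limiting relationship can be illustrated numerically: with n large, p small, and µ = np held fixed, the binomial and Poisson probabilities nearly coincide. The n and p below are illustrative values of ours:

```python
from math import comb, exp, factorial

def poisson_pmf(x, mu):
    # P(x) = e^(-mu) * mu^x / x!
    return exp(-mu) * mu**x / factorial(x)

# Poisson as a limit of the binomial: large n, small p, mu = n*p fixed
n, p = 10000, 0.0004  # mu = 4
binom = comb(n, 2) * p**2 * (1 - p)**(n - 2)
poisson = poisson_pmf(2, n * p)

# Both round to 0.1465, agreeing to about four decimal places
print(round(binom, 4), round(poisson, 4))
```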

The Poisson distribution is applied where random events in space or time are expected to occur. Deviation from a Poisson distribution may indicate some degree of non-randomness in the events under study, and investigation of the cause may be of interest.
Emission of α-particles: Rutherford, Geiger, and Bateman (1910) counted the number of α-particles emitted by a film of polonium in 2608 successive intervals of one-eighth of a minute. What is n? What is p? Do their data follow a Poisson distribution?
Calculation of µ: µ = number of particles per interval = 10097/2608 = 3.87
Expected values: 2608 × P(x) = 2608 × e⁻³·⁸⁷ (3.87)ˣ / x!
Observed counts per interval:
No. α-particles: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, over 14
Observed: 57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1, 0 (total 2608)

Spatial patterns of events: random, regular, and clumped.

The Expected Value of a Discrete Random Variable: E(X) = Σₓ x · p(x)
The Variance of a Discrete Random Variable: Var(X) = Σₓ (x − µ)² · p(x) = E(X²) − µ², where µ = E(X)

Example: Eggs are packed into boxes of 500. On average, 0.7% of the eggs are found to be broken when the eggs are unpacked. Find the probability that in a box of 500 eggs:
a. exactly three are broken;
b. at least two are broken.
The mean number of broken eggs per box is µ = 500 × 0.007 = 3.5, so the number of broken eggs is approximately Poisson with mean 3.5.
a. P(X = 3) = e⁻³·⁵ (3.5)³ / 3! ≈ 0.2158
b. P(X ≥ 2) = 1 − P(X = 0) − P(X = 1) = 1 − e⁻³·⁵(1 + 3.5) ≈ 0.8641
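The egg-box answers can be checked with the Poisson pmf from the previous slide (function and variable names are ours):

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    # P(x) = e^(-mu) * mu^x / x!
    return exp(-mu) * mu**x / factorial(x)

mu = 500 * 0.007  # mean number of broken eggs per box = 3.5

# a. exactly three broken
p_exactly_3 = poisson_pmf(3, mu)
# b. at least two broken: complement of {0 or 1 broken}
p_at_least_2 = 1 - poisson_pmf(0, mu) - poisson_pmf(1, mu)

print(round(p_exactly_3, 4))   # 0.2158
print(round(p_at_least_2, 4))  # 0.8641
```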

The Normal Distribution
Discovered in 1733 by de Moivre as an approximation to the binomial distribution when the number of trials is large, and derived in 1809 by Gauss. Its importance lies in the Central Limit Theorem, which states that the sum of a large number of independent random variables (binomial, Poisson, etc.) will approximate a normal distribution. Example: human height is determined by a large number of factors, both genetic and environmental, which are additive in their effects; thus, it follows a normal distribution.
A continuous random variable is said to be normally distributed with mean µ and variance σ² if its probability density function is
f(x) = (1 / (σ√(2π))) e^(−(x − µ)² / (2σ²))

The shape and position of the normal distribution curve depend on two parameters: the mean and the standard deviation. Each normally distributed variable has its own normal distribution curve, which depends on the values of the variable's mean and standard deviation.

Normal Distribution Properties
• The normal distribution curve is bell-shaped.
• The mean, median, and mode are equal and located at the center of the distribution.
• The curve is unimodal (it has only one mode).
• The curve is symmetrical about the mean: its shape is the same on both sides of a vertical line passing through the center.
• The curve is continuous; there are no gaps or holes. For each value of X, there is a corresponding value of Y.
• The curve never touches the x-axis. Theoretically, no matter how far in either direction the curve extends, it never meets the x-axis, but it gets increasingly close.
• The total area under the normal distribution curve is equal to 1.00, or 100%.
• The area under the normal curve within one standard deviation of the mean is approximately 0.68 (68%); within two standard deviations, approximately 0.95 (95%); within three standard deviations, approximately 0.997 (99.7%).

Standard Normal Distribution
Since each normally distributed variable has its own mean and standard deviation, the shape and location of these curves will vary. In practical applications, one would have to have a table of areas under the curve for each variable. To simplify this, statisticians use the standard normal distribution: a normal distribution with a mean of 0 and a standard deviation of 1. The z value is the number of standard deviations that a particular X value is away from the mean. The formula for finding the z value is:
z = (X − µ) / σ

Area under the Standard Normal Distribution Curve
• To the left of any z value: look up the z value in the table and use the area given.
• To the right of any z value: look up the z value and subtract the area from 1.
• Between two z values: look up both z values and subtract the corresponding areas.
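In place of a printed z-table, the same areas can be computed with the error function from the standard library; a minimal sketch (function names `z_score` and `phi` are ours):

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    # z = (X - mu) / sigma: standard deviations from the mean
    return (x - mu) / sigma

def phi(z):
    # Standard normal CDF: area under the curve to the left of z
    return 0.5 * (1 + erf(z / sqrt(2)))

print(round(phi(1.0), 4))              # 0.8413: area to the left of z = 1
print(round(1 - phi(1.0), 4))          # 0.1587: area to the right of z = 1
print(round(phi(2.0) - phi(-2.0), 4))  # 0.9545: area between z = -2 and z = 2
```

The last line reproduces the "approximately 95% within two standard deviations" rule from the properties slide.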