PROBABILITY AND PROBABILITY DISTRIBUTIONS Dambi Dollo University 4/10/2024 1
Learning Objectives. At the end of this session, the student will be able to: (1) understand the concepts and characteristics of probabilities and probability distributions; (2) compute probabilities of events; (3) differentiate between the binomial and normal distributions; (4) understand the concepts and uses of the standard normal distribution.
Brainstorming. What is probability? Why do we need probability in statistics? When can we talk about probability? What is the importance of probability in medicine?
Probability. Probability is the chance of an outcome of an experiment; it measures how likely an outcome is to occur. Under the empirical (relative frequency) definition, the probability that something occurs is the proportion of times it occurs when the same experiment is repeated a very large (ideally infinite) number of times.
Probability (continued). Under the classical concept of probability, if there are n equally likely possibilities, of which one must occur and m are regarded as favorable (a "success"), then the probability of a success is m/n. Example: What is the probability of rolling a 6 with a well-balanced die? The possible outcomes are {1, 2, 3, 4, 5, 6}, so m = 1 and n = 6, and the probability is 1/6 ≈ 0.167.
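The classical value 1/6 can be compared with the relative-frequency definition by simulating die rolls. This is only a minimal sketch and not part of the original slides: the function name `relative_frequency_of_six`, the numbers of rolls, and the fixed seed are illustrative assumptions.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Estimate P(rolling a 6) as the proportion of sixes in n_rolls simulated rolls."""
    rng = random.Random(seed)  # fixed seed (an assumption) so that runs are repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

classical = 1 / 6  # m/n with m = 1 favorable outcome and n = 6 equally likely outcomes

# The relative frequency should settle near the classical value as n_rolls grows.
for n in (100, 10_000, 100_000):
    print(n, round(relative_frequency_of_six(n), 4), round(classical, 4))
```

With few rolls the estimate can be noticeably off; with many rolls it should lie close to 0.167, which is the idea behind the "very large number of repetitions" in the empirical definition.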
Probability (continued). Probability can also be viewed as a measure of the strength of belief in the occurrence of an uncertain event. It is a mathematical construction that quantifies the likelihood of events that are subject to chance. An event is subject to chance when its outcome is in doubt and there are at least two possible outcomes.
Why Probability in Statistics? Results are not certain, and uncertainty is often high. Probability lets us evaluate how accurate our results are: given how the data were collected, are the results accurate? Given the level of accuracy needed, how many observations need to be collected (the sample size issue)? It is also used to understand sampling and sampling distributions, estimation, hypothesis testing, and advanced statistical analysis.
When can we talk about probability? When dealing with a process that has an uncertain outcome, for example: the birth of a male or female child, or whether a patient taking a certain drug is cured or not.
Why Probability in Medicine? Because medicine is a science of probability and uncertainty, physicians and public health professionals can seldom predict an outcome with absolute certainty. For example, to formulate a diagnosis, a physician must rely on the available diagnostic information about a patient: history and physical examination, laboratory investigations, X-ray findings, ECG, etc.
Why Probability in Medicine? (continued) Although no test result is absolutely accurate, it does affect the probability of the presence (or absence) of a disease; this is described by the sensitivity and specificity of the test. An understanding of probability is fundamental for quantifying the uncertainty that is inherent in the decision-making process. Probability theory is the foundation of statistical inference: it allows us to draw conclusions about a population based on information obtained from a sample drawn from that population.
Definitions and Basic Concepts. Experiment: a process of obtaining outcomes for uncertain events. Examples: tossing a coin and observing the face showing up; observing the sex (male or female) of a newborn in the delivery room; measuring social awareness among mentally disturbed children; measuring blood pressure among a group of students.
Definitions (continued). Probability experiment: an experiment that can be repeated any number of times under similar conditions, and for which the possible outcomes can be enumerated without being able to predict an individual outcome. Trial: a single performance of a random experiment. Outcome: a result of an experiment or trial. For example, when two coins are tossed the possible outcomes are HH, HT, TH, and TT.
Definitions (continued). Event: an outcome or a combination of outcomes of a random experiment. E.g., if the experiment is to flip one fair coin, event A might be getting at most one head. The probability of an event A is written P(A). An event either occurs or it does not occur. Sample space: the set of all possible outcomes of a random experiment. It is a collection of unique, non-overlapping possible outcomes of a random circumstance.
Examples 4/10/2024 14
Definitions (continued). Equally likely events: events that have the same probability of occurring. Mutually exclusive events: events for which the occurrence of any one excludes the occurrence of the other; mutually exclusive events cannot occur simultaneously. Independent events: two events are independent if the occurrence of one does not affect the probability of occurrence of the other.
Definitions (continued). Complement: Sometimes we want to know the probability that an event will not happen. The event opposite to the event of interest is called its complementary event. If A is an event, its complement is written A^c or Ā, and P(A) + P(A^c) = 1. Example: for the sex of a newborn, the complement of the event "male" is the event "female".
Properties of Probability. The numerical value of a probability always lies between 0 and 1, inclusive: 0 ≤ P(E) ≤ 1. A value of 0 means the event cannot occur. A value of 1 means the event will definitely occur. A value of 0.5 means that the probability that the event will occur is the same as the probability that it will not occur.
Properties (continued). The sum of the probabilities of all mutually exclusive and exhaustive outcomes is equal to 1: P(E_1) + P(E_2) + ... + P(E_n) = 1. For two mutually exclusive events A and B, P(A or B) = P(A ∪ B) = P(A) + P(B), and P(A ∩ B) = 0. The complement of an event A, denoted by Ā or A^c, is the event that A does not occur; it consists of all the outcomes in which event A does NOT occur. P(Ā) = P(not A) = 1 − P(A). Ā occurs only when A does not occur; A and Ā are complementary events.
Example. Let A be the event that a newborn has low birth weight (LBW), that is, the child weighs less than 2500 grams at birth. The complement of A is the event that a newborn is not LBW. If P(A) = 0.076, what is P(A^c)? P(A^c) = 1 − P(A), so P(not LBW) = 1 − P(LBW) = 1 − 0.076 = 0.924.
Basic Probability Rules. 1. Addition rule. If events A and B are mutually exclusive, then P(A or B) = P(A) + P(B) and P(A and B) = 0. More generally, if A and B are any events, then P(A or B) = P(A) + P(B) − P(A and B), where P(A or B) is the probability that event A occurs, or event B occurs, or both occur.
Example: The probabilities below represent years of schooling completed by mothers of newborn infants 4/10/2024 21
Example (continued). a. What is the probability that a mother has completed fewer than 12 years of schooling? b. What is the probability that a mother has completed 12 or more years of schooling? Solution to a: P(≤ 8 years) = 0.056 and P(9-11 years) = 0.159. Since these two events are mutually exclusive, P(≤ 8 or 9-11) = P(≤ 8 ∪ 9-11) = P(≤ 8) + P(9-11) = 0.056 + 0.159 = 0.215.
Example (continued). Solution to b: P(≥ 12) = P(12 or 13-15 or ≥ 16) = P(12 ∪ 13-15 ∪ ≥ 16) = P(12) + P(13-15) + P(≥ 16) = 0.321 + 0.218 + 0.230 = 0.769.
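The two solutions above apply the addition rule for mutually exclusive events, which can be sketched in a few lines. The dictionary keys and the helper name `prob_union` are illustrative assumptions; the probabilities are the ones quoted in the solutions.

```python
# Probabilities quoted in the example; the schooling categories are mutually exclusive.
schooling = {
    "<=8": 0.056,
    "9-11": 0.159,
    "12": 0.321,
    "13-15": 0.218,
    ">=16": 0.230,
}

def prob_union(categories, table):
    """Addition rule for mutually exclusive events: P(A or B or ...) = P(A) + P(B) + ..."""
    return sum(table[c] for c in categories)

p_less_than_12 = prob_union(["<=8", "9-11"], schooling)
p_12_or_more = prob_union(["12", "13-15", ">=16"], schooling)

print(round(p_less_than_12, 3))  # 0.215
print(round(p_12_or_more, 3))    # 0.769
```

The two results add up to 0.984 rather than 1. Presumably the original table contains a further category (for example, schooling not reported) that the worked solutions do not use; that is a guess, not something stated in the text.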
2. Multiplication rule. If A and B are independent events, then P(A ∩ B) = P(A) × P(B). Two events A and B are called dependent if P(A ∩ B) ≠ P(A) × P(B). More generally, if A and B are any events, then P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B). Here P(A ∩ B), also written P(A and B), denotes the probability that A and B both occur, and P(B|A) is the conditional probability of B given that A has occurred.
Example (hypertension, genetics). Consider all possible diastolic blood pressure (DBP) measurements from a mother and her first-born child. Let A = {mother's DBP ≥ 95} and B = {first-born child's DBP ≥ 80}. Suppose Pr(A ∩ B) = 0.05, Pr(A) = 0.1, and Pr(B) = 0.2. Are A and B independent events?
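One way to answer the question is to compare Pr(A ∩ B) with Pr(A) × Pr(B), as the multiplication rule suggests. The sketch below does this; the function name `are_independent` and the tolerance are assumptions made for illustration.

```python
import math

def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True when P(A and B) equals P(A) * P(B) within a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

p_a, p_b, p_a_and_b = 0.1, 0.2, 0.05

print(p_a * p_b)                             # value P(A and B) would take under independence
print(are_independent(p_a, p_b, p_a_and_b))  # False for these numbers
print(p_a_and_b / p_a)                       # conditional probability P(B | A)
```

Here Pr(A) × Pr(B) = 0.02, which differs from Pr(A ∩ B) = 0.05, so the events are dependent. Consistent with that, P(B|A) is about 0.5, well above P(B) = 0.2.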
Probability Distribution. A probability distribution is a device (table, graph, or mathematical formula) used to describe the distribution of a random variable. It summarizes the relationship between the values of a random variable and the probabilities of their occurrence. It provides a model for the sample data that can be used to make predictions and draw conclusions about an entire population: once we find the appropriate distribution, we can use it to make inferences and predictions. A random variable is a numerical quantity that takes on different values depending on chance.
Probability Distribution… 3/14/2022 27
A. Discrete Probability Distribution. Discrete random variables have a finite or countable set of possible values. Discrete distributions are used to model uncertain events for variables with countable possible outcomes, e.g., number of children per family, or episodes of diarrhea among children under 5 years. With categorical variables, we obtain the frequency distribution of each variable. With numeric variables, the aim is to determine whether or not normality may be assumed.
Discrete Probability Distribution (continued). We represent a potential outcome of the random variable X by x. Let P(X = x) denote the probability that the random variable X equals x; then 0 ≤ P(X = x) ≤ 1 and Σ P(X = x) = 1, where the sum runs over all possible values x.
Example: 4/10/2024 30 a. What is the probability that a patient receives exactly 3 diagnostic services? b. What is the probability that a patient receives at most one diagnostic service? c. What is the probability that a patient receives at least four diagnostic services?
Example (continued). a. What is the probability that a patient receives exactly 3 diagnostic services? P(X = 3) = 0.031. b. What is the probability that a patient receives at most one diagnostic service? P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.671 + 0.229 = 0.900. c. What is the probability that a patient receives at least four diagnostic services? P(X ≥ 4) = P(X = 4) + P(X = 5) = 0.010 + 0.006 = 0.016.
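The same three answers can be obtained by storing the distribution as a table and summing over the values that make up each event. This is a sketch under a stated assumption: the table itself is not reproduced in the text, so P(X = 2) is derived here by assuming X only takes the values 0 to 5. The helper name `prob` is also hypothetical.

```python
# P(X = x) for the number of diagnostic services, using the values quoted above.
# P(X = 2) is NOT quoted in the text; it is derived under the ASSUMPTION that
# X only takes the values 0..5, so that all probabilities sum to 1.
pmf = {0: 0.671, 1: 0.229, 3: 0.031, 4: 0.010, 5: 0.006}
pmf[2] = round(1 - sum(pmf.values()), 3)

def prob(event, pmf):
    """Sum P(X = x) over the values x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

print(prob(lambda x: x == 3, pmf))            # exactly 3 services
print(round(prob(lambda x: x <= 1, pmf), 3))  # at most 1 service
print(round(prob(lambda x: x >= 4, pmf), 3))  # at least 4 services
```

Writing events as predicates keeps "exactly", "at most", and "at least" questions in one form: each is just a different subset of values whose probabilities are added.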
Discrete Probability… 4/10/2024 32 Examples of discrete probability distributions are : Binomial Distribution and Poisson Distribution
The Binomial Distribution. It is one of the most widely encountered discrete probability distributions. It is built on the Bernoulli trial: a random event characterized by two mutually exclusive outcomes (success or failure; dead or alive; sick or well; male or female), e.g., survival (yes or no). Suppose an event can have only two outcomes, A and B. Let the probability of A be p and that of B be 1 − p. The probability p stays the same each time the event occurs. "Success" can mean whichever outcome you are interested in, whether it is positive or negative.
Binomial Distribution. A family of distributions identified by two parameters: n, the number of trials, and p, the probability of success for each trial. Notation: X ~ b(n, p), where X is the random variable, "~" means "distributed as", and b(n, p) denotes a binomial random variable with parameters n and p. Example: if we treat 4 patients with a treatment that has a 75% success rate, the number of successes X is random and takes the value 0, 1, 2, 3, or 4 according to the binomial distribution X ~ b(4, 0.75).
Assumptions. There is a fixed number of trials, n, each of which results in one of two mutually exclusive outcomes (success or failure). The outcomes of the n trials are independent (the outcome of one trial does not affect the probability of the outcome of another). The probability of success, p, remains constant from trial to trial; the probability of failure is q = 1 − p.
The Binomial Formula. $\Pr(X = x) = {}^{n}C_{x}\, p^{x} q^{n-x}$, for $x = 0, 1, \dots, n$, where ${}^{n}C_{x}$ is the binomial coefficient (defined on the next slide), $p$ is the probability of success for each trial, and $q = 1 - p$ is the probability of failure. The mean is $\mu = E(X) = np$ and the variance is $\sigma^{2} = \operatorname{Var}(X) = npq$.
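Binomial Coefficient ("Choose Function"). ${}^{n}C_{x} = \dfrac{n!}{x!\,(n-x)!}$, where ! denotes the factorial function, e.g. $4! = 4 \times 3 \times 2 \times 1 = 24$; by definition $1! = 1$ and $0! = 1$. ${}^{n}C_{x}$ is the number of ways to choose x items out of n. Example ("4 choose 2"): ${}^{4}C_{2} = \dfrac{4!}{2!\,(4-2)!} = \dfrac{24}{2 \times 2} = 6$.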
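Example. Suppose that in a certain population, 52% of all recorded births are males. If we randomly select 10 birth records, what is the probability that (a) exactly 5 will be males, and (b) 3 or more will be females? (a) Let X be the number of males, so n = 10, x = 5, and p = 0.52. Then $\Pr(X = 5) = {}^{10}C_{5}\,(0.52)^{5}(0.48)^{5} \approx 0.244$. (b) Let Y be the number of females, so Y ~ b(10, 0.48). Then Pr(Y ≥ 3) = 1 − Pr(Y < 3) = 1 − [Pr(Y = 0) + Pr(Y = 1) + Pr(Y = 2)] = 1 − [0.001 + 0.013 + 0.055] = 1 − 0.069 = 0.931 (about 0.930 if the terms are not rounded before summing). The expected number of males is E(X) = np = 10 × 0.52 = 5.2.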
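The binomial formula can be evaluated directly to check this example. The sketch below uses only the Python standard library; the function name `binomial_pmf` and the variable names are illustrative assumptions.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ b(n, p): nCx * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n = 10
p_male = 0.52
p_female = 1 - p_male

# Exactly 5 males among the 10 birth records.
p_five_males = binomial_pmf(5, n, p_male)

# 3 or more females: complement of observing 0, 1, or 2 females.
p_three_or_more_females = 1 - sum(binomial_pmf(k, n, p_female) for k in range(3))

print(round(p_five_males, 3))
print(round(p_three_or_more_females, 3))
print(n * p_male, n * p_male * p_female)  # mean np and variance npq for the number of males
```

Note that the second question counts females, so it uses p = 0.48 rather than 0.52. Small differences from hand calculations are expected when intermediate terms are rounded to three decimals.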
Exercise. Suppose that in a certain malarious area, past experience indicates that the probability that a person with a high fever will test positive for malaria is 0.7. Consider 3 randomly selected patients (with high fever) in that same area. What is the probability that: (a) no patient will be positive for malaria? (b) exactly one patient will be positive for malaria? (c) exactly two of the patients will be positive for malaria? (d) all three patients will be positive for malaria?
B. Continuous Probability Distribution. Deals with continuous variables. A continuous random variable X can take on any value within a specified interval or range. Instead of assigning probabilities to specific outcomes of the random variable X, probabilities are assigned to ranges of values. Thus, the probability that a continuous random variable X assumes a value between a and b is denoted by P(a < X < b).
Continuous Probability Distributions (continued). [Figure: graph of a continuous distribution showing the area between a and b.] The total area under the smooth curve is equal to 1. The area under the curve between any two points a and b is the probability that X takes a value between a and b.
Continuous Probability Distributions (continued). The probability associated with any one particular value is equal to zero, e.g., P(X = 2) = 0. Consequently, P(X ≥ x) = P(X > x). [Figure: the normal probability density function (pdf).] The most common and most useful continuous distribution is the normal distribution.
The Normal Distribution. The normal distribution (ND) is the most important probability distribution in statistics. It is frequently called the "Gaussian distribution" or the bell-shaped curve. The distributions of many medical measurements in populations follow a normal distribution (e.g., serum uric acid levels, cholesterol levels, blood pressure, height, and weight). The real importance of the ND will be seen in the areas of estimation and hypothesis testing.
The Normal Distribution (continued). The normal distribution is a family of bell-shaped, symmetric distributions: one-half (0.50 or 50%) of the area lies on either side of the mean. The concept of "probability of X = x" in a discrete probability distribution is replaced by the "probability density function" f(x).
The Normal Distribution (continued). A random variable X is said to follow a normal distribution if and only if its probability density function is $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$, for $-\infty < x < +\infty$, where π ≈ 3.14159, e ≈ 2.71828, x is a value of X (range of possible values: −∞ to +∞), μ is the expected value of X ("the long-run average"), and σ² is the variance of X. μ and σ are the parameters of the normal distribution; together they completely define its shape.
Characteristics of the Normal Distribution. It is a probability distribution of a continuous variable. It extends from minus infinity (−∞) to plus infinity (+∞). It is unimodal, bell-shaped, and symmetrical about x = μ. The mean, the median, and the mode are all equal. The total area under the curve and above the x-axis is one square unit.
The Normal Distribution (continued). The curve never touches the x-axis. It is determined by two quantities: its mean (μ) and standard deviation (σ). An observation from a normal distribution can be related to the standard normal distribution (SND), which has a published table. [Figure: graph of a normal distribution.]
68-95-99.7 Rule for Normal Random Variables. About 68% of the values fall within 1 standard deviation of the mean (µ ± 1σ), about 95% fall within 2 standard deviations (µ ± 2σ), and about 99.7% fall within 3 standard deviations (µ ± 3σ).
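The three percentages can be checked numerically. For any normal variable, the probability of falling within k standard deviations of the mean equals erf(k/√2), where erf is the error function available in Python's `math` module. The function name `prob_within_k_sd` is an illustrative assumption.

```python
from math import erf, sqrt

def prob_within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X, which equals erf(k / sqrt(2))."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(prob_within_k_sd(k), 4))
```

The values come out close to 0.6827, 0.9545, and 0.9973, so the rule is a rounded summary; the interval that contains exactly 95% of the values is µ ± 1.96σ rather than µ ± 2σ.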
Standard Normal Distribution. Since the values of μ and σ depend on the particular problem in hand, and tables of the normal distribution cannot be published for all values of μ and σ, calculations are made by referring to the standard normal distribution, which has μ = 0 and σ = 1. Thus an observation x from a normal distribution with mean μ and standard deviation σ can be related to the standard normal distribution (SND) by calculating Z = (x − μ)/σ.
The SND (continued). Properties of the standard normal distribution: the same as any normal distribution, but in addition the mean is zero, the variance is one, and the standard deviation is one. Areas under the standard normal curve have been tabulated in various ways; a common form gives the area between Z = 0 and a positive value of Z.
The SND (continued). [Figure: standard normal density f(z) plotted for z from −5 to 5, with μ = 0 and σ = 1.] The standard normal random variable, Z, is the normal random variable with mean μ = 0 and standard deviation σ = 1: Z ~ N(0, 1).
The SND (continued). Given a normally distributed random variable X with mean μ and standard deviation σ, the transformation Z = (X − μ)/σ gives a standard normal random variable, Z ~ N(0, 1).
Finding Normal Curve Areas. The table gives areas between −∞ and the value of Z. Find the z value to the tenths place in the column at the left margin and locate its row. Find the hundredths place in the appropriate column heading. Read the value of the area (P) from the body of the table where the row and column intersect. Values of P are given as decimals to four places.
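Example: Find P(Z < 1.96). Locate the row for 1.9 and the column for 0.06; the entry at their intersection gives P(Z < 1.96) = 0.9750.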
Examples. Find the area under the standard normal distribution that lies: a. between Z = 0 and Z = 0.96; b. between Z = −1.45 and Z = 0; c. to the right of Z = −0.35; d. between Z = −0.67 and Z = 0.75.
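Solutions. a. P(0 < Z < 0.96) = P(Z < 0.96) − P(Z < 0) = 0.8315 − 0.5000 = 0.3315. b. P(−1.45 < Z < 0) = P(Z < 0) − P(Z < −1.45) = 0.5000 − 0.0735 = 0.4265. c. P(Z > −0.35) = 1 − P(Z < −0.35) = 1 − 0.3632 = 0.6368. d. P(−0.67 < Z < 0.75) = P(Z < 0.75) − P(Z < −0.67) = 0.7734 − 0.2514 = 0.5220.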
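Areas like these can also be computed without a printed table. The sketch below builds the standard normal cumulative distribution function from the error function in Python's `math` module; the names `phi` and `areas` are illustrative assumptions.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF: area under the curve from -infinity to z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

areas = {
    "a": phi(0.96) - phi(0.0),    # between Z = 0 and Z = 0.96
    "b": phi(0.0) - phi(-1.45),   # between Z = -1.45 and Z = 0
    "c": 1 - phi(-0.35),          # to the right of Z = -0.35
    "d": phi(0.75) - phi(-0.67),  # between Z = -0.67 and Z = 0.75
}

for label, area in areas.items():
    print(label, round(area, 4))
```

Each area is a difference of two "left of z" areas, which is the same operation performed with a table that tabulates areas from −∞. Results may differ from table-based answers in the fourth decimal place because table entries are themselves rounded.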
Applications of the Normal Distribution. The normal distribution (ND) is used as a model to study many different variables and can be used to answer probability questions about continuous random variables. To use the ND model, a given value of x must first be converted to a z score before it can be looked up in the z table.
Exercise. A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is the probability that it will take a value: a. less than 87.2; b. greater than 76.4; c. between 81.2 and 86.0?
Solution. Given: mean μ = 80, SD σ = 4.8. a. For x = 87.2, Z = (x − μ)/σ = (87.2 − 80)/4.8 = 1.5. The area to the left of Z = 1.5 from the table is 0.9332, so P(X < 87.2) = 0.9332. Parts b and c are solved similarly.
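All three parts of the exercise can be checked with `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch rather than the method used in the slides, which rely on the z table; the helper name `z_score` and the variable names are assumptions.

```python
from statistics import NormalDist

X = NormalDist(mu=80, sigma=4.8)

def z_score(x, dist):
    """Standardize x: Z = (x - mu) / sigma."""
    return (x - dist.mean) / dist.stdev

# a. P(X < 87.2): area to the left of z = 1.5
p_less_87_2 = X.cdf(87.2)

# b. P(X > 76.4): complement of the area to the left of z = -0.75
p_greater_76_4 = 1 - X.cdf(76.4)

# c. P(81.2 < X < 86.0): difference of two left-hand areas (z = 0.25 and z = 1.25)
p_between = X.cdf(86.0) - X.cdf(81.2)

print(round(z_score(87.2, X), 2), round(p_less_87_2, 4))
print(round(z_score(76.4, X), 2), round(p_greater_76_4, 4))
print(round(p_between, 4))
```

`NormalDist.cdf` works directly on the original scale, so the standardization step happens inside the library; `z_score` is shown only to connect the result to the table lookup.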
Example 1. Suppose a borderline hypertensive is defined as a person whose DBP is between 90 and 95 mm Hg inclusive, and the subjects are 35-44-year-old males whose DBP is normally distributed with a mean of 80 and a variance of 144. What is the probability that a randomly selected person from this population will be borderline hypertensive?
Example 1 (continued). Solution: Given mean = 80 and variance = 144, the standard deviation is σ = √144 = 12. Let X be DBP, X ~ N(80, 144). Then P(90 < X < 95) = P[(90 − 80)/12 < Z < (95 − 80)/12] = P(0.83 < Z < 1.25) = P(Z < 1.25) − P(Z < 0.83) = 0.8944 − 0.7967 = 0.0977 ≈ 0.098. Thus, approximately 9.8% of this population will be borderline hypertensive.