Probability
Probability quantifies the relative possibility (likelihood, or chance) that an event occurs. A probability is a numerical value ranging from 0 to 1; the higher the probability, the more likely the outcome. A probability of 0 indicates that the event will not occur (an impossible event). A probability of 1 indicates that the event will occur with certainty (a certain event). Suppose we assign the probability $P(A) = 0.10$ to the event $A$: an item produced in a production process is defective. Then $P(A) = 0.10$ is interpreted as “if we inspect a large number of items produced by the process, approximately 10% of them will be defective”.
Terminologies in Probability
Experiment: An experiment is any process that produces an observation or outcome and that can be repeated under given conditions. Usually, the exact result of the experiment cannot be predicted with certainty. Experiments are of two types: deterministic experiments and random experiments.
Random Experiment: A process that can result in different outcomes even though it is repeated in the same manner every time. The following are some examples of random experiments: flipping a coin; rolling a die; counting the number of defective items in a lot containing 50 items; counting the number of accidents per day on a highway; measuring the waiting time of customers in a queue at an ATM booth.
Random Variables and Their Probability Distributions
Random Variable (Definition): A quantitative variable whose values depend on chance, that is, on the outcome of a random experiment, is called a random variable.
Example: Consider an experiment where we toss a fair coin twice. The sample space consists of four possible outcomes: $S = \{HH, HT, TH, TT\}$. Let $X$ be the number of Heads. This is a random variable with possible values 0, 1, and 2. Viewed as a function, $X$ assigns the value 2 to the outcome $HH$, 1 to the outcomes $HT$ and $TH$, and 0 to the outcome $TT$. That is, $X(HH) = 2$, $X(HT) = X(TH) = 1$, $X(TT) = 0$.
Probability Distributions
Probability distribution: Most commonly in applications, the support of a discrete r.v. is a set of integers. In contrast, a continuous r.v. can take on any real value in an interval (possibly even the entire real line). Given a random variable, we would like to describe its behavior using the language of probability by means of a probability distribution. For a discrete r.v., the most natural way to do so is through a function called the probability mass function, which we now define.
Probability mass function: The probability mass function (pmf) of a discrete r.v. $X$ is the function $p_X$ given by $p_X(x) = P(X = x)$. Note that this is positive if $x$ is in the support of $X$, and 0 otherwise.
Probability Distributions (cont…)
Example: In the example of tossing a fair coin twice, we can find the pmf of the random variable $X$, the number of Heads. Since $X$ equals 0 if $TT$ occurs, 1 if $HT$ or $TH$ occurs, and 2 if $HH$ occurs, the pmf of $X$ is the function $p_X$ given by $p_X(0) = P(X = 0) = 1/4$, $p_X(1) = P(X = 1) = 1/2$, $p_X(2) = P(X = 2) = 1/4$, and $p_X(x) = 0$ for all other values of $x$.
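The pmf of the number of Heads in two tosses can also be obtained by brute-force enumeration of the sample space. The following is a minimal sketch in Python; the names `sample_space`, `number_of_heads`, and `pmf` are illustrative choices, not notation from the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin: HH, HT, TH, TT (equally likely).
sample_space = ["".join(tosses) for tosses in product("HT", repeat=2)]

def number_of_heads(outcome):
    """The random variable X: maps an outcome such as 'HT' to its number of Heads."""
    return outcome.count("H")

# P(X = x) = (number of outcomes with X = x) / (total number of outcomes).
counts = Counter(number_of_heads(outcome) for outcome in sample_space)
pmf = {x: Fraction(count, len(sample_space)) for x, count in sorted(counts.items())}

for x, probability in pmf.items():
    print(f"P(X = {x}) = {probability}")
```

Exact fractions are used so that the probabilities sum to exactly 1, which is a quick sanity check for any pmf.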
Probability Distributions (cont…)
Example: A small e-commerce company records the number of product returns by individual customers ($X$) in a given quarter. The estimated distribution is as follows:
(a) Estimate the average number of returns per customer.
(b) Find the probability that a customer returns at least one product.
Here the average is $E(X) = \sum_x x\,P(X = x)$, and $P(X \ge 1) = 1 - P(X = 0)$.
Probability Distributions (cont…)
Continuous Probability Distribution: A continuous probability distribution shows the possible values of a continuous random variable and the density at each of these values.
Probability density function: The function $f(x)$ that gives these densities is called the probability density function (pdf). Using the densities we can calculate the probability (proportion) of values of the variable within a specific range.
Probability from pdf: The probability that $X$ lies between the points $a$ and $b$, i.e., $P(a < X < b) = \int_a^b f(x)\,dx$, is the area bounded by the curve, the $x$-axis, and the perpendiculars erected at $a$ and $b$.
Probability Distributions (cont…)
Let us consider a random variable $X$ whose probability density function (pdf) $f(x)$ is as follows:
The probability that $X$ lies between 0 and 0.25 is shown in the following figure.
[Figure: area under $f(x)$ between 0 and 0.25]
Therefore, the probability is nothing but the area of that region, and we can write $P(0 < X < 0.25) = \int_0^{0.25} f(x)\,dx$.
Note: The above distribution is known as the continuous uniform distribution. A random variable $X$ is said to have a continuous uniform distribution if its pdf is $f(x) = \frac{1}{b - a}$ for $a \le x \le b$ (and 0 otherwise).
Probability Distributions (cont…)
Let us consider another pdf of $X$, as follows:
The probability that $X$ lies between 0 and 0.25 is shown in the following figure.
[Figure: area under $f(x)$ between 0 and 0.25]
Therefore, the probability is nothing but the area of that region.
Cumulative distribution functions
Another function that describes the distribution of an r.v. is the cumulative distribution function (cdf). Unlike the pmf, which only discrete r.v.s possess, the cdf is defined for all r.v.s, i.e., for both discrete and continuous r.v.s.
Definition: The cumulative distribution function (cdf) of an r.v. $X$ is the function $F_X$ given by $F_X(x) = P(X \le x)$; for a continuous r.v., this is the area under the pdf curve at or below $x$.
Probability from cdf: By the definition of the cdf and the fundamental theorem of calculus, $P(a < X \le b) = F_X(b) - F_X(a) = \int_a^b f(x)\,dx$.
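The relation between the cdf and the area under the pdf can be checked numerically. The sketch below assumes a hypothetical density $f(x) = 2x$ on $[0, 1]$, with cdf $F(x) = x^2$. This density is not one of the examples from the slides; it is chosen only because its cdf has a simple closed form. The interval endpoints 0.2 and 0.7 are also arbitrary.

```python
def pdf(x):
    # Hypothetical density chosen for illustration: f(x) = 2x on [0, 1], 0 elsewhere.
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x):
    # Closed-form cdf of the density above: F(x) = x^2 on [0, 1].
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

def area_under_pdf(a, b, steps=10_000):
    # Midpoint Riemann sum approximating the integral of pdf from a to b.
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

a, b = 0.2, 0.7  # arbitrary interval inside the support
prob_from_cdf = cdf(b) - cdf(a)
prob_from_area = area_under_pdf(a, b)

print(f"F(b) - F(a)        = {prob_from_cdf:.6f}")
print(f"area under the pdf = {prob_from_area:.6f}")
```

Both routes give the same probability, which is the content of the identity $P(a < X \le b) = F(b) - F(a)$.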
PDF (cont…)
Example: Let $X$ be the length (in minutes) of a customer service call. Assume the call length follows the pdf:
(a) Find the probability that a call is longer than 6 minutes.
(b) Estimate the average call handling time.
(c) Inform resource planning for call centers.
[(1) Agent Scheduling: On average, each call takes 6 minutes, so one agent can handle about 10 calls per hour. This helps estimate how many agents are needed for the forecasted call volume. (2) Peak Load Readiness: About 22% of calls are longer than 6 minutes. This helps plan buffer time for agents to avoid a backlog during busy hours.]
PDF (cont…)
Example: Let the random variable $X$ represent the time (in hours) a visitor takes to make a purchase after visiting a website. The given probability density function (pdf) is:
(a) Estimate when purchases peak.
(b) Determine the probability that a customer buys within 30 minutes.
(c) Calculate the expected waiting time to purchase.
The Binomial Distribution
The binomial distribution is one of the most widely used probability distributions in applied statistics. The distribution is derived from a process known as a Bernoulli trial. When a random process or experiment, called a trial, can result in only one of two mutually exclusive outcomes, success or failure (dead or alive, sick or well, full-term or premature, heads or tails, defective item or good item, or many other possible pairs), the trial is called a Bernoulli trial.
The Binomial Distribution (cont…)
Binomial Random Variable and Its pmf: If a Bernoulli trial can result in a success with probability $p$, then the probability distribution of the random variable $X$, the number of successes in $n$ independent trials, is called the binomial distribution, with probability function $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, $x = 0, 1, \ldots, n$.
The requirements for using the binomial distribution are as follows: (1) There are only two possible outcomes, success or failure, in a single trial. (2) The number of trials, $n$, must be fixed, regardless of the outcome of each trial. (3) The trials are independent. (4) The probability of success, $p$, is constant in each trial.
Suppose that $X$ follows a binomial distribution with number of trials $n$ and success probability $p$. Notationally, this is represented by $X \sim \text{Bin}(n, p)$.
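The Binomial Distribution (cont…)
If $X \sim \text{Bin}(n, p)$, it can be shown that the mean and variance of the binomial random variable are
Mean: $E(X) = np$; Variance: $V(X) = np(1 - p)$.
Note: If $X \sim \text{Bin}(n, p)$, then
Note: If we put $n = 1$, the resulting distribution is the Bernoulli distribution. The mean and variance of a Bernoulli random variable are
Mean: $E(X) = p$; Variance: $V(X) = p(1 - p)$.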
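The binomial pmf and the formulas $np$ and $np(1 - p)$ can be checked numerically. The sketch below is a minimal illustration using only the Python standard library; the values $n = 10$ and $p = 0.3$ are arbitrary choices, not taken from the slides, and `binomial_pmf` is a helper name introduced here.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3  # illustrative values only

probabilities = [binomial_pmf(x, n, p) for x in range(n + 1)]

# A pmf must sum to 1 over its support.
total = sum(probabilities)

# Mean and variance computed directly from the definition of expectation.
mean = sum(x * px for x, px in enumerate(probabilities))
variance = sum((x - mean) ** 2 * px for x, px in enumerate(probabilities))

print(f"sum of pmf = {total:.6f}")
print(f"mean       = {mean:.6f}   (np       = {n * p:.6f})")
print(f"variance   = {variance:.6f}   (np(1-p)  = {n * p * (1 - p):.6f})")
```

Setting `n = 1` in the same code gives the Bernoulli case, with mean $p$ and variance $p(1 - p)$.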
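The Binomial Distribution (cont…)
Problem: Suppose that 10 percent of a certain population is color blind. If a random sample of 10 people is drawn from this population, find the probability that (a) two or fewer will be color blind; (b) two or more will be color blind; (c) between three and five, inclusive, will be color blind.
Solution: Total number of trials: $n = 10$, and probability of success: $p = 0.10$. Let $X$ be the number of color-blind people in the sample; then $X$ has a binomial distribution with pmf $P(X = x) = \binom{10}{x} (0.10)^x (0.90)^{10 - x}$, $x = 0, 1, \ldots, 10$.
(a) We have to find $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2)$. We have $P(X = 0) = 0.3487$ (from the binomial table). Similarly, $P(X = 1) = 0.3874$ and $P(X = 2) = 0.1937$. Therefore, $P(X \le 2) = 0.3487 + 0.3874 + 0.1937 = 0.9298$.
(b) We have to find $P(X \ge 2) = 1 - P(X \le 1) = 1 - (0.3487 + 0.3874) = 0.2639$.
(c) We have to find $P(3 \le X \le 5) = P(X = 3) + P(X = 4) + P(X = 5) \approx 0.0700$.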
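Instead of a binomial table, the three probabilities in the color-blindness problem can be computed directly from the pmf. This is a minimal sketch using the Python standard library; the helper `binomial_pmf` and the variable names are introduced here for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.10  # sample of 10 people, 10% of the population color blind

# (a) two or fewer: P(X <= 2)
at_most_two = sum(binomial_pmf(x, n, p) for x in range(0, 3))

# (b) two or more: P(X >= 2) = 1 - P(X <= 1)
two_or_more = 1 - sum(binomial_pmf(x, n, p) for x in range(0, 2))

# (c) between three and five inclusive: P(3 <= X <= 5)
three_to_five = sum(binomial_pmf(x, n, p) for x in range(3, 6))

print(f"(a) P(X <= 2)      = {at_most_two:.4f}")
print(f"(b) P(X >= 2)      = {two_or_more:.4f}")
print(f"(c) P(3 <= X <= 5) = {three_to_five:.4f}")
```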
Binomial Distribution (cont…)
Problem: Batches that consist of 20 manufactured items from a production process are checked for conformance to customer requirements. The probability of getting a nonconforming item in that process is 0.05. What is the probability that the number of nonconforming items in a batch is at least 3?
Solution: Total number of trials: $n = 20$. Probability of success: $p = 0.05$. We have to find $P(X \ge 3) = 1 - P(X \le 2) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]$. We have $P(X = 0) = 0.3585$ (from the binomial table). Similarly, $P(X = 1) = 0.3774$ and $P(X = 2) = 0.1887$. Therefore, $P(X \ge 3) = 1 - 0.9245 = 0.0755$.
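The complement rule used in this problem, $P(X \ge 3) = 1 - P(X \le 2)$, can be sketched in a few lines of Python. The helper `binomial_pmf` is a name introduced here, not part of the slides.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.05  # batch of 20 items, each nonconforming with probability 0.05

# P(X <= 2) is a short sum, so the complement is cheaper than summing x = 3..20.
p_at_most_two = sum(binomial_pmf(x, n, p) for x in range(0, 3))
p_at_least_three = 1 - p_at_most_two

# Cross-check by summing the upper tail directly.
p_upper_tail = sum(binomial_pmf(x, n, p) for x in range(3, n + 1))

print(f"P(X <= 2) = {p_at_most_two:.4f}")
print(f"P(X >= 3) = {p_at_least_three:.4f}")
```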
Binomial Distribution (cont…)
Problem: Batches that consist of 24 manufactured items from a production process are checked for conformance to customer requirements. The mean number of nonconforming items in a batch is 3. Assume that the number of nonconforming items in a batch, denoted as $X$, is a binomial random variable. (i) What are $n$ and $p$? (ii) What is ?
Solution: (i) Total number of trials: $n = 24$. Since the mean is $np = 3$, the probability of success is $p = 3/24 = 0.125$. (ii) The required probability is obtained from the pmf $P(X = x) = \binom{24}{x} (0.125)^x (0.875)^{24 - x}$, $x = 0, 1, \ldots, 24$.
Poisson Distribution
Poisson Distribution: Situations often arise where the variable of interest is the number of occurrences of a particular event in a given interval of space or time. The events are often defects, accidents, or unusual natural happenings such as earthquakes, where in theory there is no upper limit on the number of events. Some examples are: the number of cars passing a point on a road in a time interval of 1 minute; the number of telephone calls received at a switchboard per minute; the number of misprints on each page of a book; the number of radioactive particles emitted by a radioactive source in a time interval of 1 second; the number of plants infected with a particular disease in a plot of a field.
Poisson Distribution
The Poisson distribution is often known as the distribution of rare events. Given an interval of real numbers, assume counts occur at random throughout the interval. Suppose the interval can be partitioned into subintervals of small enough length such that: (1) for a small interval, the probability of the event occurring is proportional to the size of the interval; (2) the probability of more than one occurrence in the small interval is negligible (i.e., they are rare events); (3) events do not occur simultaneously; (4) each occurrence is independent of the others and occurs at random. Then the random experiment is called a Poisson process.
The Poisson Distribution (cont…)
Poisson distribution (definition): A random variable $X$ is said to have a Poisson distribution with parameter $\lambda > 0$ if its pmf is given by $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, $x = 0, 1, 2, \ldots$ The Poisson distribution was first derived in 1837 by the French mathematician and physicist Siméon-Denis Poisson (1781–1840). Suppose that $X$ follows a Poisson distribution with average number of occurrences (mean) $\lambda$. Notationally, this is represented by $X \sim \text{Poi}(\lambda)$.
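The Poisson Distribution (cont…)
If $X \sim \text{Poi}(\lambda)$, it can be shown that the mean and variance of the Poisson random variable are
Mean: $E(X) = \lambda$; Variance: $V(X) = \lambda$.
Poisson approximation to the binomial distribution: If $X \sim \text{Bin}(n, p)$, and $n \to \infty$ and $p \to 0$ such that $np = \lambda$ (a constant), then it can be shown that $\binom{n}{x} p^x (1 - p)^{n - x} \to \frac{e^{-\lambda} \lambda^x}{x!}$, i.e., $X$ is approximately $\text{Poi}(\lambda)$.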
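Poisson Distribution (cont…)
Problem: The average number of spots in 1 square meter of metal sheet is 3.4. What is the probability that there will be (a) 2 spots in the next square meter? (b) at most 2 spots in the next square meter? (c) at least 3 spots in the next square meter?
Solution: Given that $\lambda = 3.4$, we have $P(X = x) = \frac{e^{-3.4} (3.4)^x}{x!}$, $x = 0, 1, 2, \ldots$
(a) $P(X = 2) = \frac{e^{-3.4} (3.4)^2}{2!} = 0.1929$
(b) $P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.3397$
(c) $P(X \ge 3) = 1 - P(X \le 2) = 1 - 0.3397 = 0.6603$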
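The three probabilities for the metal-sheet problem follow directly from the Poisson pmf. The sketch below uses only the Python standard library; `poisson_pmf` is a helper name introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poi(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)

lam = 3.4  # average number of spots per square meter

# (a) exactly 2 spots
p_exactly_two = poisson_pmf(2, lam)

# (b) at most 2 spots: P(X <= 2)
p_at_most_two = sum(poisson_pmf(x, lam) for x in range(0, 3))

# (c) at least 3 spots: complement of (b)
p_at_least_three = 1 - p_at_most_two

print(f"(a) P(X = 2)  = {p_exactly_two:.4f}")
print(f"(b) P(X <= 2) = {p_at_most_two:.4f}")
print(f"(c) P(X >= 3) = {p_at_least_three:.4f}")
```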
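Poisson Distribution (cont…)
Problem: The average number of meteors found by a radar system in any one-minute interval under specified conditions is 3.62. Assume the meteors appear randomly and independently. (a) What is the probability that no meteors are found in a 30-second interval? (b) What is the probability of observing at least four but not more than six meteors in two minutes of observation? (c) Find the mean and standard deviation of the number of meteors found in a three-minute interval.
Solution: Given that the mean number of meteors per minute is 3.62, the Poisson parameter scales with the length of the interval.
(a) For a 30-second interval we have $\lambda = 3.62 \times 0.5 = 1.81$, and $P(X = 0) = e^{-1.81} \approx 0.1637$.
(b) Here $\lambda = 3.62 \times 2 = 7.24$, so $P(4 \le X \le 6) = \sum_{x=4}^{6} \frac{e^{-7.24} (7.24)^x}{x!} \approx 0.3445$.
(c) Here $\lambda = 3.62 \times 3 = 10.86$, so the mean is $E(X) = 10.86$ and the standard deviation is $\sqrt{10.86} \approx 3.295$.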
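The key step in the meteor problem is rescaling the Poisson parameter to the length of the observation interval. The following is a minimal sketch in Python; the helper `poisson_pmf` and the variable names are introduced here for illustration.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poi(lam)."""
    return exp(-lam) * lam**x / factorial(x)

rate_per_minute = 3.62  # average number of meteors per one-minute interval

# (a) 30 seconds = 0.5 minute, so lambda = 3.62 * 0.5
lam_half_minute = rate_per_minute * 0.5
p_none_in_30s = poisson_pmf(0, lam_half_minute)

# (b) 2 minutes, so lambda = 3.62 * 2; sum the pmf over x = 4, 5, 6
lam_two_minutes = rate_per_minute * 2
p_four_to_six = sum(poisson_pmf(x, lam_two_minutes) for x in range(4, 7))

# (c) 3 minutes: a Poisson variable has mean lambda and standard deviation sqrt(lambda)
lam_three_minutes = rate_per_minute * 3
mean_three_minutes = lam_three_minutes
sd_three_minutes = sqrt(lam_three_minutes)

print(f"(a) P(X = 0), 30 s        = {p_none_in_30s:.4f}")
print(f"(b) P(4 <= X <= 6), 2 min = {p_four_to_six:.4f}")
print(f"(c) mean = {mean_three_minutes:.2f}, sd = {sd_three_minutes:.4f}")
```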
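Poisson Distribution (cont…)
Problem: An automobile battery of a particular brand malfunctions with probability 0.001. Use two different distributions to calculate the probability that at least one out of 1000 batteries will malfunction.
Binomial distribution: Given success probability $p = 0.001$ and $n = 1000$, let $X$ be the number of batteries out of 1000 that will malfunction. Then $X \sim \text{Bin}(1000, 0.001)$. Now we have $P(X \ge 1) = 1 - P(X = 0) = 1 - (0.999)^{1000} \approx 0.6323$.
Poisson distribution: As $n$ is large and $p$ is small such that $np = 1$, $\text{Bin}(1000, 0.001)$ can be approximated by $\text{Poi}(1)$. Now we have $P(X \ge 1) = 1 - P(X = 0) = 1 - e^{-1} \approx 0.6321$.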
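The closeness of the binomial and Poisson answers in the battery problem can be seen by computing both. This is a minimal sketch with the Python standard library; the variable names are illustrative.

```python
from math import exp

n, p = 1000, 0.001  # 1000 batteries, each malfunctioning with probability 0.001

# Exact binomial: P(X >= 1) = 1 - P(X = 0) = 1 - (1 - p)^n
binomial_answer = 1 - (1 - p) ** n

# Poisson approximation with lambda = n * p: P(X >= 1) = 1 - e^(-lambda)
lam = n * p
poisson_answer = 1 - exp(-lam)

print(f"Binomial: {binomial_answer:.4f}")
print(f"Poisson : {poisson_answer:.4f}")
print(f"Absolute difference: {abs(binomial_answer - poisson_answer):.6f}")
```

The two results agree to about three decimal places, which is typical when $n$ is large and $p$ is small.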
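Continuous Uniform Distribution
A continuous random variable $X$ is said to have a continuous uniform distribution if its pdf is $f(x) = \frac{1}{b - a}$, $a \le x \le b$. Suppose that $X$ follows a continuous uniform distribution on the interval between $a$ and $b$; this is represented by $X \sim U(a, b)$. The cdf of $X$ is $F(x) = 0$ for $x < a$, $F(x) = \frac{x - a}{b - a}$ for $a \le x \le b$, and $F(x) = 1$ for $x > b$. If $X \sim U(a, b)$, it can be shown that the mean and variance are
Mean: $E(X) = \frac{a + b}{2}$; Variance: $V(X) = \frac{(b - a)^2}{12}$.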
Continuous Uniform Distribution (cont…)
Problem: According to the Insurance Institute of America, a family of four spends between $400 and $3,800 per year on all types of insurance. Suppose the money spent is uniformly distributed between these amounts. (a) What is the mean amount spent on insurance? (b) What is the standard deviation of the amount spent? (c) If we select a family at random, what is the probability they spend less than $2,000 per year on insurance? (d) What is the probability a family spends more than $3,000 per year?
Solution: Given that $X \sim U(a, b)$, where $a = 400$ and $b = 3800$.
(a) We have $E(X) = \frac{400 + 3800}{2} = 2100$.
(b) $\sqrt{V(X)} = \sqrt{\frac{(3800 - 400)^2}{12}} \approx 981.50$.
(c) We have $F(x) = \frac{x - 400}{3400}$, so $P(X < 2000) = F(2000) = \frac{1600}{3400} \approx 0.4706$.
(d) $P(X > 3000) = 1 - F(3000) = 1 - \frac{2600}{3400} \approx 0.2353$.
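The insurance problem uses only the mean, variance, and cdf of a continuous uniform distribution, so it can be sketched in a few lines of Python. The helper `uniform_cdf` and the variable names are introduced here for illustration.

```python
from math import sqrt

def uniform_cdf(x, a, b):
    """F(x) = P(X <= x) for X ~ U(a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

a, b = 400.0, 3800.0  # yearly insurance spending range in dollars

mean_spent = (a + b) / 2                      # (a) mean
sd_spent = (b - a) / sqrt(12)                 # (b) standard deviation
p_less_than_2000 = uniform_cdf(2000, a, b)    # (c) P(X < 2000)
p_more_than_3000 = 1 - uniform_cdf(3000, a, b)  # (d) P(X > 3000)

print(f"(a) mean        = {mean_spent:.2f}")
print(f"(b) sd          = {sd_spent:.2f}")
print(f"(c) P(X < 2000) = {p_less_than_2000:.4f}")
print(f"(d) P(X > 3000) = {p_more_than_3000:.4f}")
```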
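Exponential Distribution
A continuous random variable $X$ is said to have an exponential distribution if its pdf is $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$, where $\lambda > 0$ is the rate parameter. Suppose that $X$ follows an exponential distribution; this is represented by $X \sim \text{Exp}(\lambda)$. The cdf of $X$ is $F(x) = 1 - e^{-\lambda x}$, $x \ge 0$. If $X \sim \text{Exp}(\lambda)$, it can be shown that the mean and variance are
Mean: $E(X) = \frac{1}{\lambda}$; Variance: $V(X) = \frac{1}{\lambda^2}$.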
Exponential Distribution (cont…)
Problem: Waiting times to receive food after placing an order at a local sandwich shop follow an exponential distribution with a mean of 60 seconds. Calculate the probability that a customer waits: (a) less than 30 seconds; (b) more than 120 seconds; (c) between 45 and 75 seconds. (d) Fifty percent of the patrons wait less than how many seconds? What is the median?
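One way to work this problem is sketched below in Python. It assumes the rate parameterization of the exponential distribution, so a mean of 60 seconds corresponds to a rate of 1/60 per second and the cdf is $F(x) = 1 - e^{-x/60}$. The helper `exponential_cdf` and the variable names are introduced here for illustration.

```python
from math import exp, log

def exponential_cdf(x, rate):
    """F(x) = P(X <= x) = 1 - e^(-rate * x) for x >= 0 (rate parameterization)."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0

mean_wait = 60.0        # seconds
rate = 1 / mean_wait    # assumption: mean = 1 / rate

# (a) less than 30 seconds
p_less_than_30 = exponential_cdf(30, rate)

# (b) more than 120 seconds
p_more_than_120 = 1 - exponential_cdf(120, rate)

# (c) between 45 and 75 seconds
p_45_to_75 = exponential_cdf(75, rate) - exponential_cdf(45, rate)

# (d) median m solves F(m) = 0.5, giving m = ln(2) / rate
median_wait = log(2) / rate

print(f"(a) P(X < 30)      = {p_less_than_30:.4f}")
print(f"(b) P(X > 120)     = {p_more_than_120:.4f}")
print(f"(c) P(45 < X < 75) = {p_45_to_75:.4f}")
print(f"(d) median         = {median_wait:.2f} seconds")
```

The median comes out below the mean of 60 seconds because the exponential distribution is skewed to the right.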
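Normal Distribution
A continuous random variable $X$ is said to have a normal distribution if its pdf is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$, where $\mu$ is the mean of the distribution and $\sigma^2$ is the variance of $X$. Suppose that $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$; this is represented by $X \sim N(\mu, \sigma^2)$. The normal distribution is symmetrical about the point $x = \mu$. The normal distribution with mean 100 and variance 4 is shown in the following figure.
[Figure: pdf of the normal distribution with mean 100 and variance 4]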
Normal Distribution (cont…)
Difficulties in calculating probabilities: Computing probabilities from a normal distribution requires numerical methods, so in practice a table of probabilities is constructed. However, the number of normal distributions in the normal family is unlimited, each having a different mean $\mu$, variance $\sigma^2$, or both. Providing tables for this infinite number of normal distributions is therefore impossible. Fortunately, one member of the normal family can be used to determine the probabilities for all normal distributions. It is called the standard normal distribution.
Standard Normal Distribution
If $X$ is a normal random variable with mean $\mu$ and standard deviation $\sigma$, then the random variable $Z = \frac{X - \mu}{\sigma}$ is called the standard normal variable and its probability distribution is called the standard normal distribution. The mean and variance of $Z$ are 0 and 1, respectively. The probability density function $f(z)$, for all values of $z$, is computed as $f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$, $-\infty < z < \infty$.
Computing Probability from Normal Distribution
Step-1: Express $P(a < X < b)$ as $P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$.
Step-2: These probabilities for different intervals of a standard normal random variable ($Z$) can be calculated by using a standard normal table.
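The two steps can be sketched in Python, with the standard normal table replaced by the identity $\Phi(z) = \frac{1}{2}\left[1 + \operatorname{erf}(z/\sqrt{2})\right]$ and the standard library function `math.erf`. The function names below are introduced here for illustration, and the example values (mean 100, standard deviation 2, interval 98 to 102) are arbitrary choices.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for the standard normal variable Z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_interval_probability(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), computed by standardizing."""
    # Step 1: convert the endpoints to z-scores.
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    # Step 2: look up the standard normal probabilities (here via erf, not a table).
    return standard_normal_cdf(z_b) - standard_normal_cdf(z_a)

# Illustrative values: X ~ N(100, 4), so sigma = 2, and the interval is mu +/- one sigma.
probability = normal_interval_probability(98, 102, mu=100, sigma=2)
print(f"P(98 < X < 102) = {probability:.4f}")
```

Because the interval is one standard deviation either side of the mean, the result matches the familiar figure of roughly 68%.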
Problem on Normal Distribution
Example: Suppose a random variable $X$ has a normal distribution with mean 283 and standard deviation 1.65. Then $Z = (X - 283)/1.65$, so an event of the form $X \le x$ is equivalent to $Z \le (x - 283)/1.65$, and $X \ge x$ is equivalent to $Z \ge (x - 283)/1.65$.
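Problem on Normal Distribution
Problem: The weight of a sophisticated running shoe is normally distributed with a mean of 12 ounces and a standard deviation of 0.45 ounce. (a) What is the probability that a shoe weighs at most 12.2 ounces? (b) What is the probability that a shoe weighs more than 13.2 ounces? (c) What is the probability that a shoe weighs between 11.4 ounces and 12.9 ounces?
Solution: Given that $\mu = 12$ and $\sigma = 0.45$, with $z$ rounded to two decimals for the standard normal table:
(a) $P(X \le 12.2) = P\left(Z \le \frac{12.2 - 12}{0.45}\right) = P(Z \le 0.44) = 0.6700$
(b) $P(X > 13.2) = P\left(Z > \frac{13.2 - 12}{0.45}\right) = P(Z > 2.67) = 1 - 0.9962 = 0.0038$
(c) $P(11.4 < X < 12.9) = P\left(\frac{11.4 - 12}{0.45} < Z < \frac{12.9 - 12}{0.45}\right) = P(-1.33 < Z < 2.00) = 0.9772 - 0.0918 = 0.8854$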
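The running-shoe probabilities can also be computed without a table. The sketch below evaluates the standard normal cdf through `math.erf` and does not round the z-scores, so its results may differ in the third decimal place from table lookups that round $z$ to two decimals. The helper name `standard_normal_cdf` is introduced here for illustration.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """Phi(z) = P(Z <= z) for the standard normal variable Z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

mu, sigma = 12.0, 0.45  # shoe weight in ounces

def z_score(x):
    """Standardize a weight x to Z = (x - mu) / sigma."""
    return (x - mu) / sigma

# (a) at most 12.2 ounces
p_at_most_12_2 = standard_normal_cdf(z_score(12.2))

# (b) more than 13.2 ounces
p_more_than_13_2 = 1 - standard_normal_cdf(z_score(13.2))

# (c) between 11.4 and 12.9 ounces
p_between = standard_normal_cdf(z_score(12.9)) - standard_normal_cdf(z_score(11.4))

print(f"(a) P(X <= 12.2)       = {p_at_most_12_2:.4f}")
print(f"(b) P(X > 13.2)        = {p_more_than_13_2:.4f}")
print(f"(c) P(11.4 < X < 12.9) = {p_between:.4f}")
```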