Population Unknown: we would like to make inferences (statements) about it. Take a sample. Use the sample to say something about the population.
Sampling: A Pictorial View [Figure labels: Sample; Sampled (Study) Population; Target Population]
Sampling variation: We need to distinguish between a population and a sample of a population. Data from a sample: the mean of a variable (x̄) is considered an estimate of the true population mean (µ). The standard deviation of the variable (s) estimates the population standard deviation (σ). However, if another sample is drawn, the second sample mean will differ from the first sample mean. This is called sampling variation.
[Figure: frequency distributions of the population and of two samples. µ = population mean; x̄1, x̄2 are sample means.]
How variable are the sample means? The Standard Deviation (SD): the sample standard deviation (s) estimates the variability of the individual data in the population (σ). The Standard Error (SE): represents the variability of the sample means. It can be estimated from the standard deviation as s/√n, so the standard error decreases as the sample size increases. In short: the standard deviation represents the variability in the individual data; the standard error represents the variability in the sample means.
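The relationship SE = s/√n can be checked with a small simulation. The sketch below is only illustrative: the population (mean 50, SD 10), the sample sizes, and the helper name `standard_error` are assumptions, not part of the slides. It compares the observed spread of many sample means with the s/√n estimate from a single sample.

```python
import random
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample SD (n - 1 denominator)
    return s / n ** 0.5

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 values with mean 50 and SD 10.
population = [random.gauss(50, 10) for _ in range(100_000)]

for n in (10, 100, 1000):
    # Draw 500 samples of size n and record each sample mean.
    means = [statistics.fmean(random.sample(population, n)) for _ in range(500)]
    sd_of_means = statistics.stdev(means)  # observed variability of the means
    one_sample = random.sample(population, n)
    print(f"n={n}: SD of sample means={sd_of_means:.2f}, "
          f"s/sqrt(n) from one sample={standard_error(one_sample):.2f}")
```

Both columns should shrink roughly in proportion to 1/√n as the sample size grows, while the standard deviation of the individual data stays near 10.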
Normal Distribution
Definitions: Continuous data are data such as age, weight, height, haemoglobin. In descriptive statistics we use the mean and standard deviation to describe these data. Assumption: the data are approximately Normal. If not, we could use the median and interquartile range (IQR) to describe the data.
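As a rough illustration of the two ways of describing continuous data, the sketch below computes both summaries with Python's standard `statistics` module. The function name `describe` and the data values are made up for this example; note that `statistics.quantiles` uses its default "exclusive" method, and other software may compute quartiles slightly differently.

```python
import statistics

def describe(values):
    """Return mean/SD and median/IQR summaries for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartiles (exclusive method)
    return {
        "mean": statistics.fmean(values),
        "sd": statistics.stdev(values),
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

# Hypothetical, roughly symmetric data: mean and median are close.
symmetric = [48, 50, 51, 49, 52, 50, 47, 53, 50, 50]
# Hypothetical skewed data: one large value pulls the mean and SD upwards.
skewed = [1, 2, 2, 3, 3, 3, 4, 4, 5, 60]

print(describe(symmetric))
print(describe(skewed))
```

For the skewed list the mean and standard deviation are dominated by the single extreme value, which is why the median and IQR are the safer description when the data are not approximately Normal.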
WHY THE NORMAL DISTRIBUTION IS IMPORTANT: It is a good empirical description of the distribution of many variables. The sampling distribution of a mean is approximately Normal, even when the individual observations are not Normally distributed, provided that the sample size is not too small. It occupies a central role in statistical analysis: CIs, P-values, proportions and rates.
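The claim about the sampling distribution of a mean can be illustrated by simulation. The sketch below is an assumption-laden example, not part of the slides: it draws individual observations from a right-skewed exponential distribution, then compares their skewness with the skewness of the means of samples of size 50. The sample size, number of repetitions, and helper name `skewness` are all chosen for illustration.

```python
import random
import statistics

def skewness(values):
    """Moment-based skewness: 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

random.seed(2)  # fixed seed so the sketch is repeatable

# Individual observations from a right-skewed (exponential) distribution.
individuals = [random.expovariate(1.0) for _ in range(5000)]

# Means of 5000 samples, each of size 50, from the same distribution.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(50)])
    for _ in range(5000)
]

print(f"skewness of individual observations: {skewness(individuals):.2f}")
print(f"skewness of sample means (n=50):     {skewness(sample_means):.2f}")
```

The individual observations are strongly skewed, while the sample means are much closer to symmetric, which is the behaviour the slide describes.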
What is the Distribution? It gives us a picture of the variability and central tendency of the data. It can also show the amount of skewness.
Normal Distributions (aka Bell-Shaped Curves, Gaussian Distributions)
Examples of Normal Distributions
Characteristics of Normal Distributions: Mean = Median = Mode. Bilaterally symmetrical. Tails never touch the x-axis. Total area under the curve = 1.
Notation: A random sample of size n is taken from the population of interest. The mean and standard deviation of the quantitative variable x are denoted by µ and σ in the population, and by x̄ and s in the sample.
In a Normal distribution: ~68% of observations lie between –1 and 1 standard deviations from the mean. ~95% of observations lie between –2 and 2 standard deviations from the mean (more precisely, –1.96 to +1.96). ~99.7% of observations lie between –3 and 3 standard deviations from the mean.
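These percentages can be reproduced from the standard Normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` from the Python standard library (the helper name `coverage` is introduced here for illustration):

```python
from statistics import NormalDist

def coverage(k):
    """Proportion of a Normal distribution within k SDs of the mean."""
    z = NormalDist()  # standard Normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"within ±{k} SD: {coverage(k):.4f}")
```

This gives about 0.6827 for ±1 SD, 0.9500 for ±1.96 SD, 0.9545 for ±2 SD, and 0.9973 for ±3 SD, so "±2 SD" is a convenient rounding of the exact 95% limits.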
Example of normal distribution
Relationship between the mean, standard deviation and the Normal distribution: For symmetric, approximately Normal distributions, approximately 95% of all observations lie in the interval x̄ ± 2s. The limits x̄ – 2s and x̄ + 2s are referred to as the 95% tolerance limits (or 95% spread limits). Values contained in the interval are commonly termed “THE NORMAL VALUES”.
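The tolerance limits are straightforward to compute from a sample. In the sketch below the function name `tolerance_limits` and the haemoglobin values are hypothetical, made up purely for illustration, and the result is only meaningful if the data are approximately Normal.

```python
import statistics

def tolerance_limits(values, k=2):
    """Return (mean - k*s, mean + k*s); k=2 gives the ~95% tolerance limits."""
    mean = statistics.fmean(values)
    s = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return mean - k * s, mean + k * s

# Hypothetical haemoglobin values (g/dL), invented for this example.
haemoglobin = [13.1, 14.2, 12.8, 15.0, 13.7, 14.5, 13.9, 12.5, 14.8, 13.4]

low, high = tolerance_limits(haemoglobin)
inside = sum(low <= v <= high for v in haemoglobin)
print(f"95% tolerance limits: {low:.2f} to {high:.2f} g/dL")
print(f"{inside} of {len(haemoglobin)} values fall inside the limits")
```

Note that these limits describe the spread of individual values; they use the standard deviation s, not the standard error.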
In Summary: For many continuous variables, the data points are approximately Normally distributed around the mean, with a spread measured by the standard deviation. From any sample of continuous data, the mean is calculated, and it will vary from sample to sample. The sample mean has a distribution which is centred on the population mean, with a spread called the standard error. The standard error of the sample mean is related to the standard deviation and the sample size: SE = s/√n.