Presentation 1 data types and distributions1.pptx

tawandachivese3 21 views 42 slides Sep 09, 2024

About This Presentation

Data types


Slide Content

Data Types and Distributions Tonya Esterhuizen Biostatistics Unit, Centre for Evidence Based Health Care

Introduction to Statistics
“There are three kinds of lies: lies, damn lies and statistics” (Benjamin Disraeli)
“Statistics are like bikinis. What they reveal is suggestive, but what they conceal is vital.” (Aaron Levenstein)

Statistics - What Does It Mean? Data: numerical observations; quantitative information.

Statistics - What Does It Mean? The number of doctors in the different provinces of South Africa; the birth weight of babies born at a given hospital in a given year.

Statistics - What Does It Mean? The number of diabetics in Cape Town who have had an amputation; the prevalence rate of HIV per 1000 population in the Western Cape; the creatine concentration in mg per litre in a 24 h urine sample.

Statistics - What Does It Mean? The discipline or science of managing uncertainties in decision processes,

Statistics - What Does It Mean? using the scientific methods of collecting, processing, reducing, presenting, analysing and interpreting data,

Statistics - What Does It Mean? and of making inferences and drawing conclusions from numerical data.

Main Uses of Statistical Methods: collecting data in the best possible way; describing the characteristics of a group or situation; analysing data and drawing conclusions.

Collection of data in the best way: using a suitable method for selecting subjects for a study, to minimise the role of uncertainty; designing valid data collection instruments such as questionnaires and schedules.

Collection of data in the best way: organising data collection procedures for clinical, laboratory and epidemiological research to minimise the chance of errors, e.g. standardising definitions and equipment and training data gatherers.

Description of the characteristics of a group or situation: presenting data in tables or graphs; calculating summary measures such as averages, which can adequately represent the structure of the data set.

Analysing data and drawing conclusions: this involves analytical techniques and the use of probability concepts in drawing conclusions.

Uses of statistical concepts and methods in health science: handling of variation; diagnosis of patients' ailments and communities' health problems; prediction of the likely outcomes of an intervention programme in a community or of the treatment of individual patients.

Uses of statistical concepts and methods in health science: selection of an appropriate intervention for a patient or a community; public health, health administration and planning; planning, conducting, analysing, interpreting and reporting medical research.

Handling of variation: variation in a characteristic occurs when its value changes from subject to subject or, within the same subject, from time to time, from instrument to instrument or from observer to observer.

Handling of variation requires appropriate methods to: summarise a characteristic for a group of patients or for a community; decide on the normal or average value of a characteristic; compare two groups of patients with respect to a particular characteristic.

Diagnosis of patients' ailments: explicit statistical methods are available for ordering disease categories according to their probabilities of being the correct diagnosis. Changing medicine from an art to a science?

Prediction of likely outcome of treatment (prognosis): an outcome is predicted when the chances of its occurrence are high and the associated uncertainty is low. This is achieved by keeping records of the characteristics prior to treatment, the treatment and its outcome, and analysing them.

Selection of appropriate intervention: this is based on experience gained with similar patients who received the intervention; reports of clinical trials or experiments on the efficacy of different drugs or treatments (Rx); objective assessment of previous experience.

Public Health, health planning: use of data relating to health and illness in the population to make a community diagnosis.

Public Health, health planning requires knowledge of: population characteristics (age, sex); the health profile of the population in terms of disease risk factors; factors affecting population dynamics; data on births, deaths and migration.

Descriptive Statistics: get a feel for the data; assess the quality of the data; types of variables; summary statistics; distribution; graphical representation.

Types of Variables. Quantitative: continuous (temperature, height, weight) or discrete (number of headaches per week). Categorical: ordinal (severity of pain), nominal (sex, blood group) or binary (no or yes).

Types of Variables: the type of variable influences the type of analysis that is possible with the data. It is therefore important to be able to define your variable types so that the most appropriate statistical tests are chosen.

Types of Variables

Types of Data Distributions. Two of the most common in medical statistics: the Normal (Z) distribution (continuous data) and the Binomial distribution (binary categorical data).

Normal Distribution: Frequency Distribution, Tabulated Limits

Birth weight (kg)    No. of births
1.76 - 2.00                4
2.01 - 2.25                3
2.26 - 2.50               12
2.51 - 2.75               34
2.76 - 3.00              115
3.01 - 3.25              175
3.26 - 3.50              281
3.51 - 3.75              261
3.76 - 4.00              212
4.01 - 4.25               94
4.26 - 4.50               47
4.51 - 4.75               14
4.76 - 5.00                6
5.01 - 5.25                2
Total births            1260
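As a rough illustration of how a tabulated frequency distribution like the one above can be summarised, the following Python sketch computes relative frequencies, the modal class and an approximate mean. The class limits and counts are copied from the table; the function names and the midpoint approximation for the mean are assumptions made for this example and are not part of the original slides.

```python
# Class limits (kg) and number of births, copied from the frequency table.
classes = [
    (1.76, 2.00, 4), (2.01, 2.25, 3), (2.26, 2.50, 12), (2.51, 2.75, 34),
    (2.76, 3.00, 115), (3.01, 3.25, 175), (3.26, 3.50, 281),
    (3.51, 3.75, 261), (3.76, 4.00, 212), (4.01, 4.25, 94),
    (4.26, 4.50, 47), (4.51, 4.75, 14), (4.76, 5.00, 6), (5.01, 5.25, 2),
]

# Total number of births across all classes.
total = sum(count for _, _, count in classes)


def relative_frequencies(rows):
    """Return the proportion of all births that falls in each class."""
    n = sum(count for _, _, count in rows)
    return [count / n for _, _, count in rows]


def grouped_mean(rows):
    """Approximate the mean by placing every birth at its class midpoint.

    This is an assumption of the sketch: the raw birth weights are not
    available, so the result is only an estimate of the true mean.
    """
    n = sum(count for _, _, count in rows)
    return sum(((low + high) / 2) * count for low, high, count in rows) / n


# The modal class is the class with the largest count.
modal_class = max(classes, key=lambda row: row[2])

# A crude text histogram: one '#' per 10 births (rounded down).
for low, high, count in classes:
    print(f"{low:.2f} - {high:.2f}  {'#' * (count // 10)}")

print("Total births:", total)
print("Modal class:", modal_class[0], "-", modal_class[1])
print("Approximate mean (kg):", round(grouped_mean(classes), 2))
```

The text histogram gives a quick impression of the roughly symmetrical, bell-shaped pattern that the histogram and frequency polygon slides display graphically.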

Histogram

Histogram and Frequency Polygon

Frequency Polygon

Frequency Polygon

Normal and Skewed Distributions: the Normal distribution is symmetrical and unimodal, and its mean, mode and median coincide.

Normal and Skewed Distributions: positively skewed (tail to the right); negatively skewed (tail to the left).

Normal and Skewed Distributions Bimodal

Normal and Skewed Distributions: positions of the mode, median and mean. More on these summary measures in the next session.
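To make the relationship between mode, median and mean in skewed data a little more concrete, here is a small Python sketch using the standard `statistics` module. The two data sets are invented for illustration and do not come from the slides. The ordering shown (mode < median < mean for a long right tail, and the reverse for a long left tail) is a common pattern, not a guarantee for every data set.

```python
from statistics import mean, median, mode

# Hypothetical sample with a long right tail (positively skewed).
positively_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

# Mirror image of the same sample, giving a long left tail
# (negatively skewed).
negatively_skewed = [16 - x for x in positively_skewed]

for name, data in [("positive skew", positively_skewed),
                   ("negative skew", negatively_skewed)]:
    # The mean is pulled furthest towards the tail, the mode least.
    print(name, "mode:", mode(data), "median:", median(data),
          "mean:", mean(data))
```

In the first sample the few large values drag the mean above the median, while the mode stays at the peak of the distribution.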

Binomial Distribution: Percentages and proportions. In a survey of attitudes to statistics, 6 out of 100 people say they enjoy the subject. The percentage enjoying statistics is (6/100) x 100 = 6%. The proportion enjoying statistics is 6/100 = 0.06. In this session we will sometimes use percentages and sometimes proportions.

Binomial distribution: the number of individuals in a sample with a given characteristic follows the binomial distribution. The shape of this distribution varies with the population proportion p and the size of the sample. With small samples, the distribution is symmetrical only if p is 0.5.

[Figure: binomial distributions for p = 0.2, p = 0.5 and p = 0.9, each shown for N = 5 and N = 15]
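The way the shape of the binomial distribution depends on p and N can be explored numerically. The sketch below uses the standard binomial probability formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k), and tabulates it for the six combinations of p and N shown in the figure. The function names are chosen for this example only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k individuals with the characteristic
    in a sample of size n, when the population proportion is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_distribution(n, p):
    """Probabilities for k = 0, 1, ..., n."""
    return [binomial_pmf(k, n, p) for k in range(n + 1)]


# The six combinations of p and N from the figure.
for n in (5, 15):
    for p in (0.2, 0.5, 0.9):
        probs = binomial_distribution(n, p)
        print(f"p={p}, N={n}")
        for k, prob in enumerate(probs):
            # One '*' per 2 percentage points of probability (rounded down).
            print(f"  {k:2d} {'*' * int(prob * 50)}")
```

With N = 5 only the p = 0.5 panel is symmetrical; the p = 0.2 and p = 0.9 panels are skewed in opposite directions.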

Normal approximation to the Binomial: as the sample size (n) becomes larger, the shape of the distribution becomes roughly Normal, whatever the value of p. A rule of thumb is that you can use the Normal approximation if both np and n(1-p) are greater than 5. E.g. if n = 20 and p = 0.3, then np = 6 and n(1-p) = 14. Since both exceed 5, we can use the Normal approximation.

Normal approximation to the Binomial: the Normal approximation can be used for both confidence intervals and hypothesis tests (covered in session 3).
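The rule of thumb for the Normal approximation is easy to check in code. The following Python sketch tests whether both np and n(1-p) exceed 5, and returns the mean np and standard deviation sqrt(np(1-p)) of the approximating Normal distribution. The function names and the `threshold` parameter are assumptions made for this illustration; the slides only state the rule itself.

```python
from math import sqrt


def normal_approximation_ok(n, p, threshold=5):
    """Rule of thumb: both np and n(1-p) must be greater than the threshold."""
    return n * p > threshold and n * (1 - p) > threshold


def approximating_normal(n, p):
    """Mean and standard deviation of the Normal distribution that
    approximates a binomial count with sample size n and proportion p."""
    return n * p, sqrt(n * p * (1 - p))


# The worked example from the slide: n = 20, p = 0.3.
print(normal_approximation_ok(20, 0.3))

# A small sample where the rule of thumb is not met: n = 5, p = 0.9.
print(normal_approximation_ok(5, 0.9))

mean_count, sd_count = approximating_normal(20, 0.3)
print(round(mean_count, 2), round(sd_count, 2))
```

For n = 20 and p = 0.3 the check passes because np = 6 and n(1-p) = 14, matching the slide's example; for n = 5 and p = 0.9 it fails because n(1-p) is only 0.5.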

Thank you!