Types of data and graphical representation

ReenaTitoria 14,386 views 55 slides May 09, 2018

About This Presentation

Types of data and its representation: graphs and tables


Slide Content

TYPES OF DATA AND GRAPHICAL / TABULAR REPRESENTATION DR. REENA TITORIA

INTRODUCTION
Statistics may be defined as the science that deals with the collection, presentation, analysis and interpretation of numerical data.
• DESCRIPTIVE STATISTICS
• INFERENTIAL STATISTICS

Collected data should be
• Accurate (i.e. measures the true value of what is under study)
• Valid (i.e. measures only what it is supposed to measure)
• Precise (i.e. gives adequate detail of the measurement)
• Reliable (i.e. should be repeatable)

Types of DATA
• Qualitative / Quantitative
• Discrete / Continuous / Interval / Ratio
• Primary / Secondary
• Nominal / Ordinal

Quantitative data:
• Also called measurement data
• Can be expressed as a number with or without a unit of measurement
• Eg: Height in cm, Hb in gm%, BP in mm of Hg, Weight in kg
Qualitative data:
• Represents a particular quality or attribute
• Expressed as numbers (counts) without a unit of measurement
• Eg: Religion, Sex, Blood group etc.

Discrete data: Here we always get a whole number.
Ex: Number of beds in a hospital, number of malaria cases
Continuous data: It can take any value that is possible to measure, including fractions.
Ex: Hb level, Ht, Wt.
WHAT IS IMPORTANT???

Interval: Has values at equal intervals that mean something. For example, a thermometer might have intervals of ten degrees.
Ex: Celsius temperature, IQ (intelligence scale)
Ratio: Exactly the same as the interval scale, except that zero on the scale means the quantity does not exist.
Ex: Age, Weight, Height
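
To see why the meaning of zero matters, the short sketch below compares two temperatures on the Celsius scale (interval) and on the Kelvin scale (ratio). The function name and the two sample readings are illustrative assumptions, not taken from the slides.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading to Kelvin, a scale with a true zero."""
    return c + 273.15

t1_c, t2_c = 10.0, 20.0  # illustrative readings, not from the slides

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = t2_c - t1_c
diff_k = celsius_to_kelvin(t2_c) - celsius_to_kelvin(t1_c)

# Ratios are not: 20 degrees Celsius is not "twice as hot" as 10.
naive_ratio = t2_c / t1_c                                        # 2.0
true_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)   # about 1.035

print(diff_c, round(diff_k, 2), naive_ratio, round(true_ratio, 3))
```

Age, weight and height need no such conversion, because their zero already means "none", so a ratio such as "twice as heavy" is valid.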

Primary data: Data collected by the investigator himself/herself for a specific purpose
Ex: Data collected by a student for his/her thesis or research project
Advantages:
• The investigator collects data specific to the problem under study.
• There is no doubt about the quality of the data collected (for the investigator).
• If required, it may be possible to obtain additional data during the study period.

Secondary data: Data collected by someone else for some other purpose (but being utilized by the investigator for another purpose)
Ex: Census data being used to analyze the impact of education on career choice and earnings
Advantages of using secondary data:
• The data is already there: no hassles of data collection
• It is less expensive
• The investigator is not personally responsible for the quality of data (“I didn’t do it”)

Nominal data: The information or data fits into one of the categories, but the categories cannot be ordered (categories without order)
Ex: Colour of eyes, Race, Gender
Ordinal data: A rank or order. Here the categories can be ordered, but the space or class interval between two categories may not be the same
Ex: Ranking in the class or exam, socioeconomic status (SES)

QUESTION
A person's highest educational level is which type of variable?
• Continuous
• Discrete
• Ordinal
• Nominal
The number of motor-vehicle accidents on a particular stretch of the national highway in a week is which type of variable?
• Continuous
• Discrete
• Nominal
• Ordinal

REPRESENTATION OF DATA
• Tabular
• Graphic
• Numeric

When to use Tables
• When you wish to show how a single category of information varies when measured at different points
• When the dataset contains relatively few numbers
• When the precise value is crucial to your argument and a graph would not convey the same level of precision. For example: when it is important that the reader knows that the result was 2.48 and not 2.45
• When you don’t wish the presence of one or two very high or low numbers to detract from the message contained in the rest of the dataset

Tabular Presentation
1. The table must be numbered
2. A brief and self-explanatory title must be given to each table
3. The headings of columns and rows must be clear, sufficient, concise and fully defined
4. The data must be presented according to size or importance, chronologically, alphabetically or geographically
5. The table should not be too large
6. The classes should be fully defined and should not lead to any ambiguity
7. The classes should be exhaustive, i.e. should include all the given values
8. The classes should be mutually exclusive and non-overlapping
9. The classes should be of equal width, i.e. the class interval should be the same
10. The number of classes should be neither too large nor too small
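
Rules 7 to 9 can be checked mechanically. The following is a minimal sketch, assuming classes are given as (lower, upper) limits with both ends included; the function name `check_classes` and the sample readings are hypothetical and only serve the illustration.

```python
def check_classes(classes, values):
    """Check a list of (lower, upper) class limits, both ends inclusive.

    Returns three booleans: equal_width, mutually_exclusive and
    exhaustive (every value falls into at least one class).
    """
    ordered = sorted(classes)
    widths = {upper - lower for lower, upper in ordered}
    equal_width = len(widths) == 1
    # Each class must end before the next one starts.
    mutually_exclusive = all(
        prev_upper < next_lower
        for (_, prev_upper), (next_lower, _) in zip(ordered, ordered[1:])
    )
    exhaustive = all(
        any(lower <= v <= upper for lower, upper in ordered) for v in values
    )
    return {
        "equal_width": equal_width,
        "mutually_exclusive": mutually_exclusive,
        "exhaustive": exhaustive,
    }

# Classes 120-129, 130-139, ..., 180-189 with hypothetical whole-number readings.
good_classes = [(low, low + 9) for low in range(120, 190, 10)]
good_result = check_classes(good_classes, [120, 134, 158, 189])

# Classes that share a limit overlap when both ends are inclusive.
shared_limit = [(175, 200), (200, 225)]
bad_result = check_classes(shared_limit, [200])

print(good_result)
print(bad_result)
```

Classes that share a limit are only mutually exclusive if one end is read as excluded, in the style of 18.5 ≤ x < 25.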

Normal Range  18.5 ≤ x < 25

Frequency distribution table with quantitative data:
Table 1: Fasting blood glucose level in diabetics at the time of diagnosis (n=78)

Fasting Glucose     n
120-129            12
130-139             8
140-149            10
150-159            10
160-169            15
170-179            18
180-189             5
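
As a quick check on Table 1, the sketch below sums the class frequencies and adds a percentage column. The counts are those in the table; the variable names and the percentage column are additions for illustration.

```python
# Class labels and frequencies copied from Table 1.
glucose_table = [
    ("120-129", 12),
    ("130-139", 8),
    ("140-149", 10),
    ("150-159", 10),
    ("160-169", 15),
    ("170-179", 18),
    ("180-189", 5),
]

total = sum(n for _, n in glucose_table)
percentages = [100 * n / total for _, n in glucose_table]

for (label, n), pct in zip(glucose_table, percentages):
    print(f"{label:<10}{n:>4}{pct:>7.1f}%")
print(f"{'Total':<10}{total:>4}")
```

The total should match the n stated in the table title, which is a simple way to catch transcription errors.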

Cross- Tabulation Table 2: Fasting blood glucose level in diabetics at the time of diagnosis (n=78)

Frequency distribution table with qualitative data: Table 1: Cases of malaria in adults and children in the months of June and July 2010 in Nair Hospital (n=389)

EXAMPLE
This is a poor example because:
• The table lacks a title
• The source of the information is not provided
• Row titles overlap two lines
• The alphabetical listing of regions results in a non-numerical ordering of data down the columns

EXAMPLE
This is a better example because:
• The table has a title
• The source of the information is provided
• Row titles are not split over two lines
• The alphabetical listing of regions results in a numerical ordering of data down the columns
• Numbers are aligned

Graphical Presentation
• A graphical representation is a visual display of data and statistical results. It is often more effective than presenting data in tabular form
• Graphical representation helps to quantify, sort and present data in a way that is understandable to a wide variety of audiences
• Graphs also help in studying both time series and frequency distributions, as they give a clear account and a precise picture of the problem
• Graphs are also easy to understand and eye-catching

General Principles of Graphic Presentation
• In a graph there are two lines called coordinate axes
• One is vertical, known as the Y axis, and the other is horizontal, called the X axis
• These two lines are perpendicular to each other. The point where they intersect is called ‘0’ or the Origin
• On the X axis, distances to the right of the origin have a positive value and distances to the left of the origin have a negative value
• On the Y axis, distances above the origin have a positive value and distances below the origin have a negative value
• The graph should have a title, legend and labelling

VARIOUS CHARTS AND DIAGRAMS
• Bar Diagram
• Histogram
• Frequency polygon
• Cumulative frequency curve / Ogive
• Scatter diagram
• Line diagram
• Pie diagram
• Pictogram
• Stem and Leaf Plot

BAR DIAGRAM
• Bar charts are used for qualitative variables. The categories of the variable are plotted as bars along the X-axis (horizontal), and the height of each bar equals the percentage or frequency, plotted along the Y-axis (vertical).
• The width of the bars is kept constant for all the categories
• The space between the bars also remains constant throughout
• The number of subjects, with the percentage in brackets, is written on top of each bar
Types:
• Simple
• Compound
• Component
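
The sketch below prints a simple text bar chart following these rules, with the count and the percentage in brackets written next to each bar. The blood-group counts are hypothetical, the function name `text_bar_chart` is an assumption, and the bars are drawn horizontally only because that is easier in plain text.

```python
def text_bar_chart(counts):
    """Return one line per category: label, bar, count and (percentage)."""
    total = sum(counts.values())
    lines = []
    for category, n in counts.items():
        pct = 100 * n / total
        bar = "#" * n  # one symbol per subject, same thickness for every bar
        lines.append(f"{category:<3} {bar} {n} ({pct:.0f}%)")
    return lines

# Hypothetical counts for a qualitative variable (blood group).
blood_groups = {"A": 6, "B": 8, "AB": 2, "O": 4}

chart = text_bar_chart(blood_groups)
for line in chart:
    print(line)
```

Each bar has the same thickness and the same spacing; only its length changes with the frequency.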

SIMPLE BAR CHART
When we draw a bar chart with only one variable or a single group, it is called a simple bar chart

COMPOUND BAR CHART
• When two variables or two groups are considered, it is called a multiple/compound bar chart
• In a multiple bar chart the two bars representing the two variables are drawn adjacent to each other, and equal width of the bars is maintained

COMPONENT BAR CHART
• A bar chart in which two qualitative variables are further segregated into different categories or components is called a component bar chart
• Here the total height of the bar corresponding to one variable is sub-divided into the different components or categories of the other variable

HISTOGRAM
• A histogram is used for quantitative continuous data, with the class intervals plotted on the X-axis and the frequencies on the Y-axis
• It is very similar to the bar chart, with the difference that the rectangles or bars are adjacent (without gaps)
• It is used for presenting a class frequency table (continuous data)
• The diagram consists of rectangles whose area is proportional to the frequency of a variable and whose width is equal to the class interval

Distribution of the subjects by Cholesterol level

Serum Cholesterol (mg/dl)    No. of Subjects    Percentage (%)
175-200                       3                  30
200-225                       3                  30
225-250                       2                  20
250-275                       1                  10
275-300                       1                  10
Total                        10                 100

EXERCISE
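
The percentage column and the histogram bar heights for this table can be derived as in the sketch below. It assumes the height of each rectangle is taken as frequency divided by class width, so that area equals frequency; the variable names are illustrative.

```python
# (lower limit, upper limit, number of subjects) copied from the table.
cholesterol = [
    (175, 200, 3),
    (200, 225, 3),
    (225, 250, 2),
    (250, 275, 1),
    (275, 300, 1),
]

total = sum(n for _, _, n in cholesterol)
percentages = [100 * n / total for _, _, n in cholesterol]

# Height chosen so that area (height x width) equals the frequency.
widths = [upper - lower for lower, upper, _ in cholesterol]
heights = [n / w for (_, _, n), w in zip(cholesterol, widths)]
areas = [h * w for h, w in zip(heights, widths)]

print(total, percentages)
print(widths, heights)
```

Because every class here is 25 mg/dl wide, plotting the raw frequencies gives a histogram of the same shape; dividing by the width only matters when class widths differ.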

EXERCISE

FREQUENCY POLYGON AND CURVE
• Plot the variable along the X-axis and the frequencies along the Y-axis
• It is derived from a histogram by connecting the midpoints of the tops of the rectangles in the histogram
• The line connecting the centres of the tops of the histogram rectangles is called a frequency polygon
• If we draw a smooth freehand curve passing through these points, the curve is known as a frequency curve
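
The points of a frequency polygon are the class midpoints paired with the class frequencies. The sketch below computes them for the serum cholesterol classes shown earlier (175-200 through 275-300). Closing the polygon with a zero-frequency point one class width beyond each end is a common convention and is an assumption here, not something stated in the slides.

```python
# (lower limit, upper limit, number of subjects) for the cholesterol classes.
cholesterol = [
    (175, 200, 3),
    (200, 225, 3),
    (225, 250, 2),
    (250, 275, 1),
    (275, 300, 1),
]

# Midpoint of each class paired with its frequency.
polygon_points = [((lower + upper) / 2, n) for lower, upper, n in cholesterol]

# Assumed convention: anchor the polygon on the X-axis at both ends.
width = cholesterol[0][1] - cholesterol[0][0]
closed_polygon = (
    [(polygon_points[0][0] - width, 0)]
    + polygon_points
    + [(polygon_points[-1][0] + width, 0)]
)

print(polygon_points)
print(closed_polygon)
```

Joining these points with straight lines gives the polygon; smoothing the line by hand gives the frequency curve.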

(n=37)

CUMULATIVE FREQUENCY DIAGRAM

One can tell the number of patients that lie above or below a certain level
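
As a worked example, the sketch below builds "less than or equal to" cumulative frequencies from the fasting blood glucose table (n=78) and reads off how many patients lie at or below, and above, a chosen level. The function name `count_at_or_below` is hypothetical, and the lookup is exact only at class upper limits.

```python
from itertools import accumulate

# (upper class limit, frequency) from the fasting glucose table, n=78.
glucose_table = [
    (129, 12), (139, 8), (149, 10), (159, 10),
    (169, 15), (179, 18), (189, 5),
]

upper_limits = [upper for upper, _ in glucose_table]
cumulative = list(accumulate(n for _, n in glucose_table))
ogive_points = list(zip(upper_limits, cumulative))

def count_at_or_below(level):
    """Patients in all classes whose upper limit does not exceed level."""
    count = 0
    for upper, cum in ogive_points:
        if upper <= level:
            count = cum
    return count

total = cumulative[-1]
below = count_at_or_below(159)
above = total - below

print(ogive_points)
print(below, above)
```

Plotting `ogive_points` and joining them gives the cumulative frequency curve (ogive); for levels inside a class, the value is read off the curve by interpolation.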

Exercise