Stastistics in Physical Education - SMK.pptx

819 views 85 slides Sep 04, 2023

About This Presentation

• It is a specific branch of mathematics that deals with analysis of data collected on various population groups
• Statistics involves mathematical abilities more than addition, subtraction, division and multiplication which are repeated many times in a logical fashion.
• for fuller details ...


Slide Content

Statistics in Physical Education
“Every Moment is a Golden One for him who has the Vision to Recognize it as such!”
Prof. Shatrunjay Mrityunjay Kote, Ph.D.
Assistant Professor, M. S. M’s College of Physical Education, Khadkeshwar, Aurangabad

Introduction to Statistical Tests
Although calculations can easily be performed using computers, the right interpretation of the calculated results needs clarity of statistical concepts.

Face of Statistics
• Statistics is a specific branch of mathematics that deals with the analysis of data collected on various population groups.
• Statistics involves mathematical abilities more than addition, subtraction, division and multiplication, which are repeated many times in a logical fashion.
• For fuller details of statistical tests, refer to Chandha (1992); Vincent (1995); Hopkin et al. (1996); Sincrich et al. (2002); Triola (2002).

Statistical Concepts
• An understanding of basic statistics is indispensable for dealing with the process of evaluation in test and measurement.
• Statistical concepts facilitate the proper and effective interpretation of test scores or measurements taken by the coach or physical educator.
• While a computer saves the teacher or coach the huge amount of time needed for enormous calculations, the meaning of the results is made clear only through an understanding of the relevant statistical concepts.
• Just as tests act as the seed of measurements, statistical tests act as the seed for the construction of all other types of tests, and they are also essential for testing the validity, reliability and objectivity of all tests.

Functions of Statistical Tests
The information we can deduce from test and measurement depends on our statistical ability. Statistical tools enable us to perform the following important functions:
• Organize and tabulate data (presentation of facts in a definite form)
• Analyse data
• Synthesize data (classification / combination of facts)
• Compare groups of data
• Simplify unwieldy and complex data
• Interpret data properly
• Test hypotheses
• Understand the relationship and association between different parameters, make predictions and take decisions
• Construct physical, psychomotor and written tests
• Evaluate individual measurements
• Select sportspersons
• Monitor training and teaching effects and test the need for individualization of training and teaching

Statistics: Meaning
• The word “statistics” is the plural form of “statistic”. The term statistic is so uncommon that many students of statistics may be unaware of this singular form.
• The word statistics is taken from the German word ‘statistik’, meaning a political state. In olden days, facts and figures were required mainly by kings for their administration, so in the beginning it was also known as the ‘Science of Kings’ (Chadha, 1992).
• Subsequently its scope has greatly widened, and statistics now refers to a huge body of methods, symbols and formulae dealing with phenomena that can be described numerically, providing quantitative arrays of information.
• A statistic is a numerical value which characterizes a group of scores. For example, the average height characterizes the entire sample whose subjects’ heights have all been measured to calculate it. A number of such characterizing values make up the plural form, and thus give rise to the more commonly used term, statistics.
• According to Webster’s dictionary, statistics means “The classified facts representing the condition of people in a state, especially those facts which can be stated in numbers or in tables or in any diagrammatic or classified arrangements.”

Statistics: Definition
• Statistics may be defined as “summarized figures of numerical facts such as percentages, averages (means, medians, modes), standard deviations etc. and methods of dealing with numerical facts”.
• Tate (1955) described statistics with the following interesting sentence: “You compute statistics from statistics by statistical methods.”
• Mangal (1987) defines statistics as “A subject or branch of knowledge that helps us in the scientific collection, presentation, analysis and interpretation of numerical facts”.
• Hastad and Lacy (1994) define statistics as “the science of collecting, classifying, presenting and interpreting numerical data”.
• In general, we can define statistics as “the study of collection, organization, summarizing, analysis and interpretation of numerical data”.

Meaning of Important Statistical Terms
1. Data: Any information collected about facts is called data. The word data is the plural form of datum (fact).
2. Population: A defined group of all subjects or all objects with at least one common characteristic is known as a population or universe.
3. Sample: A subgroup drawn for the study of selected characteristics of the population is known as a sample. Hence a sample may be defined as a much smaller part of the population selected to represent the whole population.
4. Sampling: The scientific procedure of selecting a sample, i.e., only a few items / subjects from the entire population of items or subjects.

Meaning of Important Statistical Terms
Types of sampling:
• Simple Random Sampling: Each member of the specified population has an equal chance of being included in the sample. All the subjects are assigned numbers, and the required number of subjects is then selected with the help of a random number table given in standard statistics books and on the internet (www.graphpad.com).
• Systematic Sampling: After listing all the subjects, sampling is done at such systematic intervals as will provide the required number of subjects.
• Stratified Sampling: A random sample containing equal representation of each stratum (sub-group) of the population. In this type of sampling, the population is first divided into homogeneous groups and then the procedure of systematic sampling is applied to each group.
• Cluster Sampling: A random sample collected from randomly selected clusters. Clusters differ from sub-groups because the subjects of a cluster are widely dissimilar, while the subjects of a sub-group are quite similar.
• Proportional Sampling: It can be used with all of the methods mentioned above. It involves determining the percentage representation of each category of a population.
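
The first three sampling procedures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the subject numbers 1 to 100, the two strata and the helper function names are hypothetical, and the systematic sampler always starts from the first listed subject.

```python
import random


def simple_random_sample(population, n, seed=0):
    """Simple random sampling: every member has an equal chance of selection."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(population, n)


def systematic_sample(population, n):
    """Systematic sampling: take every k-th listed subject, with k = len // n."""
    k = len(population) // n
    return population[::k][:n]


def stratified_sample(strata, n_per_stratum):
    """Stratified sampling: apply systematic sampling within each stratum,
    giving every stratum equal representation."""
    return {name: systematic_sample(members, n_per_stratum)
            for name, members in strata.items()}


subjects = list(range(1, 101))                 # hypothetical subject numbers
strata = {"group_a": subjects[:60],            # hypothetical homogeneous groups
          "group_b": subjects[60:]}

random_sample = simple_random_sample(subjects, 10)
systematic = systematic_sample(subjects, 10)
stratified = stratified_sample(strata, 5)

print(random_sample)
print(systematic)
print(stratified)
```

In practice a random starting point within the first interval is often used for systematic sampling; the fixed start here just keeps the example short.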

Meaning of Important Statistical Terms
5. Statistics: Any characteristic of a population studied from a sample is known as a statistic or a variable, whereas a number of characteristics of varied population groups are known as statistics.
6. Parameter: The investigator is interested in studying various characteristics of a population. Each characteristic, when studied on the entire population, is known as a particular parameter. In other words, a parameter is a measurable characteristic of the entire defined population.
7. Hypothesis: A supposed answer to the problem at hand, made by the investigator before conducting the research and stated in a testable form, which is later tested by scientific research.
Features of a hypothesis:
(a) The hypothesis provides the basis for the planning.
(b) It is a supposition about some phenomenon.
(c) A hypothesis is to be tested by gathering actual facts about the phenomenon.

Meaning of Important Statistical Terms
8. Norms: Norms are naturally occurring normal values, and the normal ranges cannot be modified by experts. They are statistical tools which evaluate the status of an individual with respect to recently collected data on a large sample of the population group of the individual being examined.
9. Standards: Standards are limits of physical, physiological or psychomotor levels set by the investigator for scoring or selecting individuals for a particular competition, degree / prize award or item selection etc.
10. Degree of freedom: The degree of freedom may be defined as the number of individual values in a distribution which are independent of each other and cannot be deduced from each other. Generally the degree of freedom is given by the formula:
df = Ns - Np
Where df = degree of freedom; Ns = number of scores (observations); Np = number of parameters being estimated.

Classification of Statistics
• Descriptive Statistics: The branch of statistics which describes the statistical constants of the data, such as averages, measures of variability, standard scores etc., is known as descriptive statistics. For instance, the frequency distribution, mean, median, mode, standard deviation, coefficient of variation, standard ‘Z’ score, ‘T’ score, sigma score, Hull score, percentiles, normal curve etc. come under descriptive statistics. In other words, descriptive statistics summarize the achievements and the structural and functional characteristics of champions and groups in numerical figures, usually in tabular and diagrammatic forms.
• Inferential Statistics: The statistical tools used to draw conclusions or inferences about differences, relationships or predictions are known as inferential statistics or statistical tests. Examples are the ‘t’ test, ‘F’ test, chi-square test, regression coefficients etc. In other words, inferential statistics provide mathematical tools to compare groups, to examine the level of group differences, to predict one variable from another and to test various types of hypotheses.

Importance of Statistics
• Statistics, that is, statistical tests, are an essential part of test construction for any type of physical performance evaluation.
• They help to find the central tendency (mean value), which represents the average performance of the group (class, team etc.).
• They help to compare one group of sportsmen with another.
• Statistics enables the classification of players into different grades / levels.
• They help the physical educator to validate his / her methods of teaching and training and to study the effects of different teaching / training methods.
• Statistics are essential to understand research literature, conduct research and report the results of tests and measurements.
• No scientific evaluation of tests is possible without applying statistical tests.
• With the help of statistics, a physical educationist is able to interpret the scores and grades awarded to the students in quantitative and qualitative data.
• It is statistics that helps the physical educationist to select sports talent at a young age and to predict the performance potential of his / her trainees.

Data and its Distribution
Meaning of Data: The term ‘data’ is the plural form of datum, which means ‘fact’. Thus, data refers to a number of facts collected by the investigator. In other words, data refers to the raw scores, especially the numerical records, obtained in a statistical enquiry. In measurement and evaluation, data is used for numerical scores about facts such as measured height, weight, performance in athletic tests, and achievement scores for qualitative tests (intelligence, attitude, sportsmanship behaviour, interest etc.).
Definition of Data: Any information collected systematically about facts or measures is called data.

Kinds of Data
Data may be divided into various kinds according to the basis of its classification. The bases of classification of data are given below:
(a) Sources of data collection
(b) Type of distribution of data
(c) Organisation of data

(a) Kinds of Data Based on Sources of its Collection
(i) Primary data: Data collected by actual measurement, testing or observation with the help of scientific equipment or a questionnaire is known as primary data, or data collected from a primary source. For example, measuring height, weight, standing broad jump or intelligence quotient directly on a selected sample of subjects.
(ii) Secondary data: Data previously collected / recorded by another investigator or agency and used in a different manner by the present investigator is known as secondary data, or data from a secondary source. For example, the height, age, weight etc. collected at the time of recruitment of soldiers, admission of children to schools or selection of athletes, when used by the investigator for his / her research work, is secondary data.

(b) Kinds of Data Based on Types of its Distribution
(i) Continuous data: Data is known as continuous when it may be subdivided or classified in a continuous series without any gap. When the values of the raw scores differ from one another by indefinitely small amounts, the data belongs to the continuous type. The majority of the variables in which physical education teachers / coaches are interested have a continuous (normal) distribution, and thus measurement of these variables gives rise to continuous or normally distributed data. Examples of this kind of data are numerous, such as data on height, weight, lengths, temperatures, intelligence levels, attitudes, performances etc.
(ii) Discrete data: Data is known as discrete or discontinuous when it cannot be divided into a continuous series and instead falls into a discontinuous, that is unconnected, discrete series. Variables which are discontinuous, with real gaps between one value and the next, give rise to discrete data, which can be expressed only in whole units (numbers). Examples are the number of eggs laid by a hen, blood groups, ethnic groups (human races), sex etc.

(c) Kinds of Data Based on its Organisation
Statistical data may be divided into the following two kinds:
(i) Raw data (ungrouped data): Data presented as the raw scores as they were recorded, without any attempt to arrange them into a more meaningful or convenient form, is called ungrouped or raw data.
(ii) Organized data (grouped data): Data organized or arranged in some more meaningful manner, such as from high to low value or grouped into categories or classes in order to facilitate further computation, is known as organized or grouped data. Grouped data is usually represented by a frequency distribution.

Frequency Distribution
Definition: A frequency distribution may be defined as a ‘method of grouping the raw scores of data’. It is usually represented in tabular form, listing the raw scores or intervals of scores and the frequencies with which the raw scores occur in each class interval. Such tables are known as frequency tables.
Before the invention of the calculator, calculations based on grouped data were faster, even though the procedures were quite indirect and complicated. However, with the recent widespread availability of inexpensive pocket calculators, the use of frequency distributions has greatly reduced. The computation of statistical constants based on ungrouped data is known as the computer method of calculation.
For frequency distributions, the raw scores are arranged in an ascending or descending series, giving the rank or merit position of each individual’s score. While doing so, the frequency of each score also becomes apparent. The frequency of a score is the number of times the score is repeated in a series of data.
In physical education the data, being of a continuous nature (normally distributed), contains a large number of rank orders with a very low frequency for each score, resulting in a long series of data. In order to summarize such data more adequately, the rank orders are divided into a few classes, which results in grouping of the data with increased frequencies. This organization of data is known in practice as a frequency distribution with grouped data, and it requires the construction of a frequency distribution table.

Construction of a Frequency Distribution Table
The frequency distribution table may be constructed for both discrete and continuous data and is thus of two types:
(a) Discrete frequency distribution table: Since discrete data is indivisible into fractional units and may have large gaps, it has fewer categories with larger frequencies in each category. Constructing a discrete frequency distribution table is very simple and does not involve a number of steps; rather, it is a simple listing of tallies against each discrete category / class. An example of a discrete frequency distribution table is given below for the distribution of subjects belonging to different blood groups:
Type of Blood Group   Tallies                                                                   Frequency
A                     IIIII IIIII IIIII IIIII III                                               23
B                     IIIII IIIII IIIII II                                                      17
O                     IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII IIIII   60
AB                    IIIII IIIII                                                               10

Construction of a Frequency Distribution Table
(b) Continuous frequency distribution table: As mentioned earlier, continuous frequency distribution series are very lengthy and are usually grouped before the construction of a continuous frequency distribution table. The continuous frequency table consists of five columns instead of the three columns of the discrete frequency table. The two additional columns are cumulative frequency and percentage cumulative frequency. Consequently the process of construction also becomes somewhat lengthy, and it is therefore divided into the steps given below.
Step 1: Finding the range of scores: To find the total size of the range of the distribution, the lowest score is subtracted from the highest score plus one. Mathematically, Range = (Highest score + 1) - Lowest score. For example, consider the distribution of the heights of 40 male students given in the following table.
Step 2: Deciding the number and size of class intervals: The range is divided into a number of class intervals (groups) by deciding the size (sub-range) of each group. While deciding the number of groups, the size of the sample is generally taken into consideration. If the sample size is less than 50, the number of groups is usually kept below 5, whereas in samples of 50 to 100 the number of groups is usually kept between 5 and 10. In samples containing more than 100 items, the number of groups is taken between 10 and 20 (Tate, 1955).

Construction of a Frequency Distribution Table
S. No.   Height (cm)   S. No.   Height (cm)   S. No.   Height (cm)
1        165.3         16       166.5         31       170.5
2        172.5         17       164.7         32       171.1
3        168.2         18       169.5         33       170.7
4        171.9         19       171.0         34       168.8
5        174.5         20       171.1         35       174.1
6        176.0         21       170.3         36       167.3
7        166.0         22       167.5         37       165.9
8        167.3         23       168.8         38       170.4
9        169.5         24       173.5         39       166.9
10       175.0         25       174.4         40       172.2
11       177.5         26       170.2
12       173.5         27       169.7
13       164.3         28       168.9
14       170.2         29       167.6
15       168.5         30       170.6

Construction of a Frequency Distribution Table
Highest value = 177.5; Lowest value = 164.3; Range = (177.5 + 1.0) - 164.3 = 14.2
After deciding the number of groups, the size of the class interval may be calculated by dividing the total range of the sample by the number of groups:
Desired class interval = Range / No. of groups (Rule 1)
There is an alternative rule for grouping raw data into class intervals. When the class interval is decided first instead of the number of groups, a value of 2, 3, 4, 5 or 10 units is used as the class interval. In this case the number of groups is calculated by dividing the range by the class interval:
No. of groups = Range / Class interval (Rule 2)
For example, if an investigator desires to have 3 cm as the class interval for the distribution of the heights of the 40 male students given in the above table, then the number of groups may be calculated as below:
No. of groups = 14.2 / 3 = 4.7 (rounded off to 5)
The columns of the frequency distribution table are:
Class   Tallies   f   cf   % cf
Where f = frequency, cf = cumulative frequency and % cf = cumulative frequency in percentage.

Construction of a Frequency Distribution Table
Step 3: Setting up the contents of the frequency distribution table: (i) First, the five columns are labelled as given in the table on the earlier slide. (ii) Second, the classes of the distribution are written down, that is, the first column of the frequency distribution table is filled in.
Step 4: Checking the frequency distribution: As a check, the frequencies (column 3) are summed. The sum should be equal to the total number of subjects measured. For instance, the total of column 3 in the above example is 40, which is equal to the total number of subjects in the sample.
Step 5: Finding the midpoint of each class interval: The midpoint of each class interval helps to discuss and explain the results by giving a single representative value to each class (group) instead of the total range of each class. For instance, the midpoints of the classes of the above frequency distribution are presented in the tables. To compute the midpoint, the smallest unit of accuracy is subtracted from the lower limit of the class. The difference between the upper limit and this adjusted lower limit is then divided by 2, and the result is added to the adjusted lower limit.
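
The five steps can be traced on the 40 height scores with a short Python sketch. It is a sketch under assumptions: a class interval of 2 cm starting at 164.1 cm is used (the classes that appear in the deck's later table for the mean) rather than the 3 cm interval of the worked example, and all variable names are hypothetical.

```python
heights = [
    165.3, 172.5, 168.2, 171.9, 174.5, 176.0, 166.0, 167.3, 169.5, 175.0,
    177.5, 173.5, 164.3, 170.2, 168.5, 166.5, 164.7, 169.5, 171.0, 171.1,
    170.3, 167.5, 168.8, 173.5, 174.4, 170.2, 169.7, 168.9, 167.6, 170.6,
    170.5, 171.1, 170.7, 168.8, 174.1, 167.3, 165.9, 170.4, 166.9, 172.2,
]

UNIT = 0.1           # smallest unit of accuracy of the measurements
CLASS_WIDTH = 2.0    # assumed class interval (cm)
FIRST_LOWER = 164.1  # assumed lower limit of the first class (cm)

# Step 1: range = (highest score + 1) - lowest score
score_range = (max(heights) + 1) - min(heights)


def class_index(score):
    """Return the 0-based class of a score, working in whole tenths of a cm
    so that boundary values such as 166.0 are not misplaced by float error."""
    tenths = round(score * 10)
    return (tenths - round(FIRST_LOWER * 10)) // round(CLASS_WIDTH * 10)


# Steps 2 and 3: set up the classes and count the frequency of each one
n_classes = max(class_index(h) for h in heights) + 1
freq = [0] * n_classes
for h in heights:
    freq[class_index(h)] += 1

# Steps 4 and 5: cumulative frequency, its percentage, and class midpoints
rows = []
cumulative = 0
for k, f in enumerate(freq):
    lower = round(FIRST_LOWER + k * CLASS_WIDTH, 1)
    upper = round(lower + CLASS_WIDTH - UNIT, 1)
    midpoint = round((lower - UNIT) + CLASS_WIDTH / 2, 1)
    cumulative += f
    pct_cf = 100 * cumulative / len(heights)
    rows.append((lower, upper, midpoint, f, cumulative, pct_cf))

print(f"Range = {score_range:.1f}")
for lower, upper, midpoint, f, cf, pct_cf in rows:
    print(f"{lower:.1f}-{upper:.1f}  {'|' * f:<11}  f={f:<2}  cf={cf:<2}  %cf={pct_cf:5.1f}  mid={midpoint}")
```

The last cumulative frequency equals the number of subjects, which is the check described in Step 4.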

Tests of Central Tendency
Meaning and definition: We invariably come across average values in newspapers every day. These average values are the most characteristic of a given group. In other words, the greatest concentration of scores in any group is towards the middle value of the scores. The average value, or the value having the greatest concentration around it, is statistically known as the central tendency.
When a group of students is assembled and arranged in files (rows) according to their height or their achievement score in a fitness test, the bigger rows will be towards the middle, and these midpoints of continuous variables are referred to as the central tendency. In other words, the central tendency represents the typical average score of a normal variable.
According to Rothstein (1985), “Central tendency represents a point in the set of distribution around which scores seem to center”. The three basic measures of central tendency are the mean, median and mode, which are by far the most widely used statistics, both among researchers and in the general population. We talk about average values daily, for example average income, average rate of inflation, average size of family, average speed, average run rate etc. The measures of central tendency are used to condense information.
Tate (1955) defined it thus: “Central tendency is a sort of average or typical value of the items in the series and its function is to summarize the series in terms of this average value”.

Mean
The mean may be defined as the sum of all the individual scores or values of the items in a series divided by the number of items. The mean is usually designated by the symbol capital ‘M’ or X bar. It is the same as the arithmetic mean. For teachers of physical education and sports trainers, the mean is the most frequently used and relevant measure of central tendency, because almost all physical, physiological, psychomotor and achievement scores are normally distributed variables and the mean is the best measure of central tendency for such variables.
Calculation of Mean: The mean represents the general average which is most commonly talked about, and it is computed by simple arithmetic using the following formulae.
(a) For ungrouped data:
M = ∑X / N
Where M stands for the mean, ∑X for the sum of the individual values or scores of the items and N for the total number of items in the series or group. For example, to compute the mean of the ungrouped height data measured on the 40 male students presented in the table, we first add up all 40 measured heights and then divide the sum by 40 to find the mean (average) height of the male students of that group. Mathematically, it may be written as:
M = (X1 + X2 + X3 + X4 + ... + X40) / 40

Mean
(b) For grouped data: For grouped data presented in the form of a frequency distribution table, the mean is calculated by the following formula:
M = ∑fX / N
Where ‘X’ represents the midpoint of the sub-group (or class), ‘f’ its respective frequency and ‘N’ the total number of subjects in all the sub-groups, that is, the total sample being studied. For example, to calculate the mean height of the above-mentioned 40 male students presented in the form of a frequency distribution, the steps for computing the mean are presented in the table below:
Class (height in cm)   X = Midpoint (cm)   f = Frequency   fX
164.1-166.0            165                 5               825
166.1-168.0            167                 6               1002
168.1-170.0            169                 8               1352
170.1-172.0            171                 11              1881
172.1-174.0            173                 4               692
174.1-176.0            175                 5               875
176.1-178.0            177                 1               177
164.1-178.0            171                 N = 40          ∑fX = 6804
M = ∑fX / N = 6804 / 40 = 170.1
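
As a quick check of the grouped-data formula, the same midpoints and frequencies can be run through a few lines of Python. The variable names are hypothetical; the numbers are those of the table above.

```python
# Class midpoints X (cm) and frequencies f from the grouped height table
midpoints = [165, 167, 169, 171, 173, 175, 177]
frequencies = [5, 6, 8, 11, 4, 5, 1]

n = sum(frequencies)                                        # N, total subjects
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))  # sum of f * X
grouped_mean = sum_fx / n                                   # M = sum(fX) / N

print(f"N = {n}, sum fX = {sum_fx}, M = {grouped_mean:.1f} cm")
```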

Mean
Still another method of computing the mean of grouped data is based on an assumed mean value, using the following formula:
M = AM + (∑fX’ / N) × i
Where:
AM = assumed mean
f = frequency of a class
N = total number of subjects
i = class interval
X’ = (X - AM) / i
X = midpoint of the class
∑ = sum over all classes
For example, this short-cut method of computing the mean of the data used in the table is illustrated in the table that follows. The assumed mean is usually the midpoint of the middle class of the distribution. Hence, let the assumed mean AM = 171 and i = 2. Then:
M = AM + (∑fX’ / N) × i
=> M = 171.0 + (-18 / 40) × 2
=> M = 171 + (-36 / 40) = 171 - 0.9 = 170.1

Mean
Class (height in cm)   X = Midpoint (cm)   f    X’ = (X - AM) / i   fX’
164.1-166.0            165                 5    -3                  -15
166.1-168.0            167                 6    -2                  -12
168.1-170.0            169                 8    -1                  -8
170.1-172.0            171                 11   0                   0
172.1-174.0            173                 4    +1                  +4
174.1-176.0            175                 5    +2                  +10
176.1-178.0            177                 1    +3                  +3
164.1-178.0            171                 40                       ∑fX’ = -18
The mean is the balance point of the distribution, which can be tested by finding the sums of the differences on the plus and minus sides of the mean. The sum of the differences on the plus side should equal the sum of the differences on the minus side. In other words, the sum of the deviations from the mean (X - X bar) is always 0. This is illustrated in Table 9.3.

Mean
Sum of deviations from the mean is always zero:
X     X - X bar
4     -6
5     -5
6     -4
7     -3
8     -2
9     -1
10    0    (X bar = 10)
11    +1
12    +2
13    +3
14    +4
15    +5
16    +6
Minus deviations from the mean = 21; plus deviations from the mean = 21 (sum of deviations = zero).
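
Both ideas, the assumed-mean short cut and the zero sum of deviations, can be verified with a small Python sketch. The variable names are hypothetical; the figures are the ones used on the slides (AM = 171, i = 2, and the scores 4 to 16).

```python
# Short-cut (assumed mean) method: M = AM + (sum(f * X') / N) * i
midpoints = [165, 167, 169, 171, 173, 175, 177]
frequencies = [5, 6, 8, 11, 4, 5, 1]
assumed_mean = 171     # midpoint of the middle class
class_interval = 2

# X' = (X - AM) / i; exact integer division because midpoints are i apart
step_deviations = [(x - assumed_mean) // class_interval for x in midpoints]
sum_f_step = sum(f * d for f, d in zip(frequencies, step_deviations))
n = sum(frequencies)
shortcut_mean = assumed_mean + (sum_f_step / n) * class_interval

# Balance-point property: deviations from the mean always sum to zero
scores = list(range(4, 17))                 # 4, 5, ..., 16
mean_score = sum(scores) / len(scores)
deviations = [x - mean_score for x in scores]
minus_total = sum(d for d in deviations if d < 0)
plus_total = sum(d for d in deviations if d > 0)

print(f"sum fX' = {sum_f_step}, M = {shortcut_mean:.1f}")
print(f"mean = {mean_score}, minus = {minus_total}, plus = {plus_total}")
```

The short-cut result agrees with the direct grouped-data calculation, which is why the method was popular for hand computation.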

Median
The median is the middle value among all the scores. When the available scores or values are arranged in ascending or descending order of magnitude, the score or value of the central item in the series is known as the median. According to Mangal (1987), “the median of a distribution is the point on the score scale below which one half or 50% of the scores fall”. The median divides the ascending or descending series of items into two equal parts. It should be clearly understood that it is not the serial number of the central item which is the median, but the value or measure of the central item. In other words, the value of the 50th percentile is known as the median. The median is obtained by arranging the scores in sequence and finding the value of the middle score in the distribution.
Calculation of Median: The procedure for calculating the median for ungrouped and grouped data is described below.
(a) Ungrouped data: Mathematically, when the number of items is odd:
Median = ((N + 1) / 2)th value
Where N = total number of items in the series.
For example, if we have a series of seven values, the median will be the (7 + 1) / 2 = 4th value in the series. For instance, if the body weights of seven boys are arranged in ascending order, the weight of the 4th boy is the median weight, as illustrated in the table below.

Median
Median of a sample having an odd number of subjects:
S. No. (in ascending order)   Body weight (kg)
1                             30
2                             32
3                             35
4                             38   (Median)
5                             40
6                             45
7                             47
Median for an even number of values: If the number in the sample is even (say 8 boys, so that the median would be the 4.5th value, which does not exist), then the average of the middle two scores is taken as the median. In this case, the average weight of the 4th and 5th boys in the series of eight boys is taken as the median, as illustrated in the table below.

Median
Median of a sample having an even number of subjects:
Median = ((N + 1) / 2)th value = ((8 + 1) / 2)th value = 4.5th value
Median = (38 + 39) / 2 = 38.5 kg
That is, mathematically, the median of an even number of items may be calculated by the formula: Median = Sum of middle two scores / 2
S. No. (in ascending order)   Body weight (kg)
1                             30
2                             32
3                             35
4                             38
5                             39
6                             40
7                             45
8                             47
Median = 38.5 (midway between the 4th and 5th values)
(b) Grouped data: When the data is from a larger sample and arranged in the form of a frequency distribution, calculating the median requires locating the class (group) in which the middle value lies. If N = 50, i.e., an even number, then the position of the median is determined by the following:

Median
Median = [(N/2)th item + (N/2 + 1)th item] / 2
If N = 50 and its frequency distribution is as given in the table, then:
Median = [(50/2)th + (50/2 + 1)th] / 2 = (25 + 26) / 2 = 25.5th value
Median of a large sample organized in the form of a frequency table:
S. No.   Body height classes   f    cf
1        162.6-165.5           4    4
2        165.6-168.5           11   15
3        168.6-171.5           21   36
4        171.6-174.5           10   46
5        174.6-177.5           4    50
         162.6-177.5           N = 50
The cumulative frequency column indicates that the 25.5th value is in the class interval 168.6-171.5.
Median = L + ((N/2 - F) / f) × i, when N is an odd number
Median = L + (((N + 1)/2 - F) / f) × i, when N is an even number
Where:
L = lower limit of the class containing the median
F = cumulative frequency below the median class
f = frequency of the class containing the median
i = class interval
N = total number (sum of all frequencies)

Median
For example, applying the above formula to the data given in the table, we get L = 168.6; N = 50; F = 15; f = 21; i = 3.
Median = L + (((N + 1)/2 - F) / f) × i
= 168.6 + ((25.5 - 15) / 21) × 3
= 168.6 + (10.5 / 21) × 3
= 168.6 + 1.5
=> Median = 170.1 cm
The median may also be calculated by the formula for calculating percentiles from grouped data, because the 50th percentile is the median. The formula for the 50th percentile, or median, is given below:
Median = lrl + ((0.5 × N - ∑fb) / fw) × i
Where:
lrl = lower real limit of the interval containing the middle value of the distribution
0.5 = fractional value out of one, i.e., 50/100 = 0.5
N = number of subjects
∑fb = cumulative frequency below the interval containing the median value
fw = frequency within the median interval
i = size of the class interval
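
The ungrouped and grouped calculations above can be reproduced with a short Python sketch. The function names are hypothetical. The grouped version takes the position of the median as a parameter so that the slides' convention of (N + 1)/2 can be passed in directly; passing N/2 instead gives the other variant of the formula.

```python
def median_ungrouped(scores):
    """Middle value for an odd count, mean of the two middle values for an even count."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(lower_limit, position, cum_freq_below, class_freq, interval):
    """Median = L + ((position - F) / f) * i for the class containing the median."""
    return lower_limit + (position - cum_freq_below) / class_freq * interval


seven_boys = [30, 32, 35, 38, 40, 45, 47]        # body weights in kg
eight_boys = [30, 32, 35, 38, 39, 40, 45, 47]

median_odd = median_ungrouped(seven_boys)
median_even = median_ungrouped(eight_boys)

# Grouped heights: N = 50, median class 168.6-171.5, F = 15, f = 21, i = 3
n_subjects = 50
median_height = median_grouped(168.6, (n_subjects + 1) / 2, 15, 21, 3)

print(median_odd, median_even, round(median_height, 1))
```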

Mode
“The most frequently occurring value in a distribution is known as the mode”. In other words, it is the most common value of a distribution; it is usually found to lie towards the centre of the distribution and is known as a measure of central tendency. It is sometimes symbolized as Mo, but mostly it is not symbolized and the complete word ‘mode’ is used. It indicates the centre of concentration of frequency in and around a given value. The mode is most important in making choices of sizes, say for footwear or ready-made garments, where the manufacturer is interested in producing items according to the frequency of sizes / choices of customers.
Calculation of Mode: The procedure to compute the mode is described below for both ungrouped and grouped data.
(a) For ungrouped data: The value of the datum which occurs most frequently gives the mode. Hence, the mode of ungrouped data is found by constructing a frequency table. The value which has the highest frequency is the mode.
(b) For grouped data: For data in a frequency distribution table, the mode may be defined as the midpoint of the class interval containing the largest number of items, i.e., the maximum frequency, or it may be calculated as:
Mode = L + [(fm - f(m-1)) / ((fm - f(m-1)) + (fm - f(m+1)))] × i
Where:
L = lower limit of the modal class
fm = frequency of the modal class
f(m-1) = frequency of the class preceding the modal class
f(m+1) = frequency of the class succeeding the modal class
i = class interval
For example, for the data of the table:
Mode = 168.6 + [(21 - 11) / ((21 - 11) + (21 - 10))] × 3
= 168.6 + (10 / 21) × 3
= 168.6 + 10/7
= 168.6 + 1.43
Mode = 170.03 cm
Another formula for computing the mode is:
Mode = 3 Median - 2 M
Where Median is the median and M is the mean of the given data. For example, considering the data of the table, with median = 170 cm and mean = 169.88 cm:
Mode = 3 × 170 - 2 × 169.88 = 510 - 339.76 = 170.24 cm
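
The three ways of obtaining the mode can be sketched in Python as follows. The function names and the list of shoe sizes are hypothetical illustrations; the grouped and median-based examples use the figures from the slide.

```python
from collections import Counter


def mode_ungrouped(scores):
    """Value with the highest frequency (the first one seen wins a tie)."""
    return Counter(scores).most_common(1)[0][0]


def mode_grouped(lower_limit, f_modal, f_prev, f_next, interval):
    """Mode = L + [(fm - f(m-1)) / ((fm - f(m-1)) + (fm - f(m+1)))] * i"""
    d_prev = f_modal - f_prev
    d_next = f_modal - f_next
    return lower_limit + d_prev / (d_prev + d_next) * interval


def mode_from_median_and_mean(median, mean):
    """Mode = 3 * Median - 2 * Mean"""
    return 3 * median - 2 * mean


shoe_sizes = [7, 8, 8, 9, 8, 10, 9]     # hypothetical ungrouped data
most_common_size = mode_ungrouped(shoe_sizes)

# Modal class 168.6-171.5: fm = 21, preceding f = 11, succeeding f = 10, i = 3
grouped_mode = mode_grouped(168.6, 21, 11, 10, 3)
approx_mode = mode_from_median_and_mean(170, 169.88)

print(most_common_size, round(grouped_mode, 2), round(approx_mode, 2))
```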

Properties
(A) Properties of Mean: (1) It is the balance point of the distribution, where the sum of the negative deviations equals the sum of the positive deviations. (2) It is responsive to each score and is more sensitive to extreme scores than are the median and mode. (3) It is the measure which best reflects the total of all the scores. (4) It is widely used, implicitly or explicitly, in advanced statistical tests. (5) It is least affected by sampling fluctuations: the mean fluctuates least when based on different samples drawn randomly from a population.
(B) Properties of Median: (1) It is not affected by extreme scores. (2) It is responsive only to the number of scores above or below its value. (3) It is subject to greater sampling fluctuations than the mean: the value of the median fluctuates considerably when based on different samples of the same population. (4) It is the only suitable and stable measure of central tendency for open-ended distributions. Where there are no specific values at the beginning and/or at the end of the distribution, the median, unlike the mean, can still be easily computed.
(C) Properties of Mode: (1) Determined by the most frequently occurring value, the mode is the measure most affected by sampling fluctuations. (2) It is affected more by the choice of class interval than are the mean and median. (3) It is the measure best suited to discrete (categorical) variables.

Merits and Demerits of Mean
Merits: a) It is a common average, easily understood by all. b) It gives weight to each and every value in proportion to its quantity. c) It is accurately defined. d) It can be calculated without knowing the details of the data. e) It is not affected by the position in the distribution. f) Many further statistical constants may be obtained from the mean value.
Demerits: a) It is greatly affected by extreme values. b) It cannot be located with the help of a diagrammatic presentation of the data. c) It cannot be used in the case of qualitative data.

Merits and Demerits of Median
Merits: a) It is not affected by the size of extreme values. b) It is an important measure of central tendency for qualitative data. c) It can be located on a graphical presentation of the data.
Demerits: a) It is affected by sampling fluctuations. b) The data must be arranged in ascending or descending order to compute the median value. c) It does not take every value of the group into consideration.

Merits and Demerits of Mode
Merits: a) It is the most easily calculated. b) It gives the best representative value of the sample. c) It can be determined graphically. d) It gives the actual value of an important item of the series. e) It is not affected by extreme values.
Demerits: a) The choice of class intervals has a great influence on the value of the mode. b) It does not consider all the values. c) No further statistical manipulation can be carried out with it. d) It is greatly affected by a change in the sample.

Comparative Roles of Mean, Median and Mode The above comparison of the merits, demerits and properties of the mean, median and mode gives an idea of the comparative role of each. Calculation of any one of the three provides a measure of central tendency. Their comparative roles in different situations are described briefly below.
1. Relative role and use of Mean: (i) It is the most accurate measure of central tendency. Its role is the most significant among the three because it is the most stable form of average, having the least fluctuation in its value when based on different samples drawn from the same population. (ii) The mean is best suited to elaborate statistical treatment, i.e., the computation of other statistical constants like the standard deviation and coefficients of correlation.
2. Relative role and use of Median: (i) When the exact midpoint is needed, the median is helpful and is used to divide the group into two equal parts. (ii) If a distribution has extreme scores with great variability around the central tendency, the role of the median is most important because it is not affected by the extreme values. For example, suppose a physical education teacher wants to find the central tendency of hammer throw in a group of 20 students and finds that two or three students throw the hammer only a very small distance because of inexperience and non-cooperation. Using the median, the teacher will get the same typical value as would be obtained once these two or three students also learn and co-operate in improving their performance. (iii) The role of the median is also important when the data has an incompletely defined grouping, for instance players from greatly different grades of a high school. It is impossible to calculate the mean for such data; however, there is neither any difficulty nor any loss of reliability if the central tendency is computed as the median.
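
The hammer throw example can be illustrated with a small Python sketch. The distances below are hypothetical (invented for illustration, in metres, for 9 rather than 20 students): two very short throws later improve but stay below the middle of the group, so the median is unchanged while the mean shifts.

```python
import statistics

# Hypothetical throws: the first two students throw very short distances.
before = [4, 5, 21, 22, 23, 24, 25, 26, 27]
# The same group after those two students learn and co-operate.
after = [18, 19, 21, 22, 23, 24, 25, 26, 27]

median_before = statistics.median(before)
median_after = statistics.median(after)
mean_before = statistics.mean(before)
mean_after = statistics.mean(after)

print(median_before, median_after)                   # median is unchanged
print(round(mean_before, 2), round(mean_after, 2))   # mean moves upward
```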

Comparative Roles of Mean, Median and Mode 3. Relative role of Mode: (i) One of the significant roles of the mode is to provide a quick measure of central tendency. A cursory look over the data is enough to find the most repeatedly occurring value in a group of sample values, which is the mode. In ungrouped data it is the value repeated the maximum number of times, and in grouped data it is the value with the highest frequency; this is evidently the quickest way of finding a measure of central tendency. (ii) The role of the mode is most significant in the large-scale manufacture of consumer goods, where the most frequently occurring size (the mode) is to be manufactured in the maximum quantity. The median and mean are of only limited value in finding the sizes of shoes and ready-made garments that will be most in demand because they fit most men and women of the region, so the manufacturer has to depend upon the central tendency computed in the form of the mode. In summary, the comparative roles of the three measures of central tendency and the preference for using each are given in the table on the next slide.

Preferential use of mean, median and mode in different situations
Mean is computed: (i) To get an accurate average value. (ii) When other statistical constants like standard deviation, coefficient of variation etc. are needed. (iii) When the distribution has no extreme values. (iv) When the data is normally distributed and has more than one typical value interrupted discretely (e.g. bimodal distribution).
Median is computed: (i) To know the exact middle value. (ii) When the frequency distribution is open-ended at the beginning and/or at the end. (iii) When the data has extreme values with a wide range which would greatly affect the mean value. (iv) When a graphical representation of the data is available.
Mode is computed: (i) To know the most frequently occurring value quickly. (ii) When the investigator is interested in the most frequently occurring value, especially for manufacturing consumer goods etc. (iii) When a graphical representation of the data is available. (iv) When the data is not normally distributed.

Relationship among Mean, Median and Mode The relationship between the mean, median and mode depends on the type of distribution of the scores or measurements of the group of individuals. When the data is normally distributed or symmetrical, the values of the mean, median and mode are exactly equal. When the data is asymmetrical or skewed, the values of the mean, median and mode are different, and their relative positions are indicated in the figure. It is the shape of the frequency distribution which enables the investigator to decide whether computing the mean, median or mode is appropriate for a particular set of data. If the investigator does not need any further details of the data, like the standard deviation, the calculation of the mode and median is easier. When the distribution is skewed, the median is most suitable, and when the investigator is working on the manufacture of consumer products, the mode is most suitable. For teachers of physical education and sports sciences, the mean is most frequently needed, the median is used to eliminate the effect of extreme scores, and the mode is seldom needed. However, students of physical education and sports must be clear about the relationship between the three measures of central tendency. In a slightly skewed distribution, the difference between the mean and the median is usually about one third of the difference between the mean and the mode, as represented by the following equivalent formulae:
$\text{Mean} - \text{Median} = \frac{1}{3}(\text{Mean} - \text{Mode})$
or $\text{Mode} = 3\,\text{Median} - 2\,\text{Mean}$
or $\text{Mode} = \text{Median} + 2(\text{Median} - \text{Mean})$
or $\text{Median} = \text{Mean} + \frac{1}{3}(\text{Mode} - \text{Mean})$
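
To see how the three measures separate in a skewed distribution, here is a minimal Python sketch on hypothetical, positively skewed scores (the data are invented for illustration). It also evaluates the empirical relation Mode = 3 Median − 2 Mean, which should be read as a rough approximation only.

```python
import statistics

# Hypothetical scores with a long tail towards high values (positive skew).
scores = [2, 3, 3, 3, 4, 4, 5, 6, 9, 11]

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Empirical estimate of the mode from the other two measures.
estimated_mode = 3 * median - 2 * mean

print(mean, median, mode)  # in positive skew: mean > median > mode
print(estimated_mode)      # only roughly close to the observed mode
```

In a small sample such as this one the estimate can be off by a whole score unit, which is why the relation is stated only for slightly skewed distributions.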

Figure: Perfect normal (unskewed) distribution

Positively Skewed Distribution

Negatively Skewed Distribution

Test of Variability Meaning and Definition of Variability: A look at the three figures of normal curves shows that two random samples may have a common mean but still differ in the scatter of their individual values. Two groups may also have widely different mean values but a comparable spread of their individual scores. Most frequently, two groups differ both in their mean values and in their scatter. Figure: Two groups with different variability but equal mean values

Meaning and Definition of Variability The spread of a group’s individual scores from the lowest value to the highest value, or their dispersion about the mean value, is known as the variability of the group under study. Just as the mean provides a single representative value which describes the level of performance of the entire group, the measure of variability summarizes the spread or dispersion of the performance of the individuals in the group. Figure: Different mean values but similar variability

Meaning and Definition of Variability Figure: Different mean values and different variability

For example, consider the following body heights of 10 athletes and 10 basketball players to understand more clearly the meaning and importance of variability. The table below records the heights, along with the mean, range and standard deviation, of 10 male athletes and 10 male basketball players:
Category | Individual heights measured (cm) | Mean (cm) | Range (cm) | S. D. ± (cm)
Athletes (N = 10) | 155, 159, 160, 163, 170, 174, 176, 178, 180 and 185 | 170 | 31 | 10.2
Basketball players (N = 10) | 166, 167, 168, 169, 170, 170, 171, 172, 173 and 174 | 170 | 09 | 2.6
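
The table values can be recomputed with a short Python sketch. Two assumptions are made here and are not stated on the slide: the seventh basketball height is read as 171 cm, and the range is taken as the inclusive range (highest − lowest + 1), which is the convention that matches the tabulated 31 and 09. The standard deviation uses the N − 1 divisor, which matches the tabulated 10.2 and 2.6.

```python
import statistics

athletes = [155, 159, 160, 163, 170, 174, 176, 178, 180, 185]
basketball = [166, 167, 168, 169, 170, 170, 171, 172, 173, 174]


def summarize(heights):
    """Return (mean, inclusive range, sample SD rounded to 1 decimal)."""
    mean = statistics.mean(heights)
    inclusive_range = max(heights) - min(heights) + 1  # assumed convention
    sd = statistics.stdev(heights)                     # divisor N - 1
    return mean, inclusive_range, round(sd, 1)


athletes_summary = summarize(athletes)
basketball_summary = summarize(basketball)
print(athletes_summary)
print(basketball_summary)
```

Both groups share the same mean height, yet the athletes are spread roughly four times as widely as the basketball players.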

Variance It is also a reliable and useful measure of variability and may be defined as “the mean of the squares of the deviations of scores from their mean”. In other words, it is equal to the square of the standard deviation. It is the last-but-one step in the calculation of the standard deviation and is used in inferential statistics. Mathematically,
$\text{Variance} = \frac{\sum (X - \bar{X})^2}{N}$ or $\frac{\sum (X - \bar{X})^2}{N - 1}$, or $\frac{\sum (X - \mu)^2}{N}$
or, in computational form,
$\text{Variance} = \frac{\sum X^2 - (\sum X)^2 / N}{N}$ (for N > 30) or $\frac{\sum X^2 - (\sum X)^2 / N}{N - 1}$
where X = individual score; $\bar{X}$ = sample mean; $\mu$ = population mean.
Coefficient of Variation: This measure of variability is used to compare the variation of two or more variables having different units of measurement, because it eliminates the units by converting the standard deviation into a coefficient. It may be defined as “the standard deviation in percentage of the mean value”. It is computed by the following equation:
$\text{CV} = \frac{\text{S. D.}}{\text{Mean}} \times 100$
For example, observe the computation of CV for the assumed data given in the table on the next slide.
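
The deviation form and the computational form of the variance should give the same number. The sketch below checks this in Python on the ten athletes' heights quoted earlier in the deck; the function names are hypothetical.

```python
import math

heights = [155, 159, 160, 163, 170, 174, 176, 178, 180, 185]


def variance_from_deviations(xs, sample=True):
    """Sum of squared deviations from the mean, divided by N - 1 or N."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum((x - mean) ** 2 for x in xs)
    return sum_sq / (n - 1 if sample else n)


def variance_computational(xs, sample=True):
    """(sum of X^2 - (sum of X)^2 / N), divided by N - 1 or N."""
    n = len(xs)
    sum_sq = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return sum_sq / (n - 1 if sample else n)


var_dev = variance_from_deviations(heights)
var_comp = variance_computational(heights)
sd = math.sqrt(var_dev)  # the standard deviation is the square root

print(var_dev, var_comp, round(sd, 1))
```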

Coefficient of Variation Now observe the comparative values of the coefficient of variation for the 100 M, SBJ and triceps skinfold scores presented in the histogram in the next graph. The CV values clearly indicate that triceps skinfold thickness is many times more variable than SBJ or 100 M performance, and that SBJ is more variable than 100 M performance. Such a clear picture of relative variability is not visible if we examine only the SD values of 100 M running, standing broad jump and triceps skinfold thickness (i.e., SD = 0.8 sec, SD = 20 cm and SD = 3.2 mm).
Figure: Histogram of relative variability from the values of CV
Statistics | 100 M (sec) | SBJ (cm) | Triceps skinfold (mm)
Mean | 14.0 sec | 210 cm | 8.4 mm
S. D. | ± 0.8 sec | ± 20 cm | ± 3.2 mm
C. V. (S. D./mean x 100) | 0.8 x 100/14.0 = 40/7 = 5.71 | 20 x 100/210 = 200/21 = 9.52 | 3.2 x 100/8.4 = 800/21 = 38.10
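
The CV row of the table can be reproduced with a few lines of Python. This is a minimal sketch using the means and standard deviations quoted above; the dictionary layout and the function name are assumptions made for illustration.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


# (mean, standard deviation) for each measurement, as quoted on the slide
measures = {
    "100 M (sec)": (14.0, 0.8),
    "SBJ (cm)": (210, 20),
    "Triceps skinfold (mm)": (8.4, 3.2),
}

cv = {
    name: round(coefficient_of_variation(sd, mean), 2)
    for name, (mean, sd) in measures.items()
}

# Rank the measurements from least to most relatively variable.
ranking = sorted(cv, key=cv.get)
print(cv)
print(ranking)
```

Because the units cancel, the three values can be ranked directly, which is not possible with the raw standard deviations in seconds, centimetres and millimetres.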

Coefficient of Variation The fundamental property of the coefficient of variation is that it is a measure of relative variation. In other words, it makes it possible to compare the variability of different variables having widely differing mean values (or mean values expressed in different units of measurement). Thus, CV makes it possible to compare the variability of different measurements of one group on the one hand, and the variability of different groups with respect to one variable on the other. Absolute and Relative Variability: After studying the various measures of variability, the reader should now be ready to understand the difference between absolute and relative variability. Each has its own separate meaning and use. (a) Absolute Variability: The range, A. D., Q. D., S. D. and variance represent the absolute variability of a parameter. In other words, absolute variability is the quantity of variation expressed directly in the units of measurement. (b) Relative Variability: The CV represents the relative variability of different variables irrespective of their units of measurement.

Relationship between average deviation, quartile deviation and standard deviation There is a definite relationship between the three above-mentioned measures of variability. For normally distributed variables, the quartile deviation is the smallest, the average deviation comes next, and the standard deviation is numerically the largest. Symbolically, for a normal distribution, QD < AD < SD. Further, approximately:
QD = 2 SD/3 or QD = 5 AD/6
AD = 4 SD/5 or AD = 6 QD/5
SD = 5 AD/4 or SD = 3 QD/2
The percentage of individual scores covered within one unit of QD, AD or SD on either side of the mean is given below:
1. X bar ± 1.0 QD = 50% of the total number of scores
2. X bar ± 1.0 AD = 57.51% of the total number of scores
3. X bar ± 1.0 SD = 68.27%, that is, about 2/3 of the total number of values or scores of a normal distribution
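
These proportions can be checked against the standard normal distribution. The Python sketch below assumes a perfectly normal variable with SD = 1, for which QD is the distance from the mean to the third quartile and AD (the mean absolute deviation) equals √(2/π) times the SD; the helper name `coverage` is hypothetical.

```python
import math
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1


def coverage(k):
    """Proportion of a normal distribution lying within mean ± k SD units."""
    return z.cdf(k) - z.cdf(-k)


sd = 1.0
qd = z.inv_cdf(0.75)          # quartile deviation in SD units, about 0.6745
ad = math.sqrt(2 / math.pi)   # average deviation in SD units, about 0.7979

print(round(qd / sd, 4), round(ad / sd, 4), round(qd / ad, 4))
print(round(coverage(qd) * 100, 2))  # within ± 1 QD
print(round(coverage(ad) * 100, 2))  # within ± 1 AD
print(round(coverage(sd) * 100, 2))  # within ± 1 SD
```

The exact ratios (about 0.6745, 0.7979 and 0.8453) are close to, but not identical with, the rounded fractions 2/3, 4/5 and 5/6 used on the slide.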

Normal Probability Curve The word normal is commonly used in everyday life to describe a value of a variable that lies in a desirable range around the average. In other words, a person having height, weight, intelligence or motor ability close to the average value is known as normal. Individuals having values lower than the average are called below average, and those having values higher than the average are called above average. Most biological, psychological and motor performance variables are distributed in such a way that the majority of individuals possess average or nearly average values, and only a few individuals possess values which are far from the average. In other words, the frequency on either side of the average goes on decreasing as the difference between the average and the individual values goes on increasing. Such a distribution, when plotted with frequency on the vertical axis (i.e., Y axis) and size of the variable on the horizontal axis (or X axis), gives rise to a bell-shaped curve. A perfectly bell-shaped curve is known as the normal probability curve or simply the ‘normal curve’, and the distribution of scores or values giving rise to a normal curve is known as the normal probability distribution or simply the ‘normal distribution’.

Normal Probability Curve Figure: A perfect normal curve. Statistically, the areas (frequencies) under a perfect normal curve are constant and perfectly defined by statistical constants, as percentages of the values of a distribution in proportion to standard deviations. However, an actual distribution based on samples is seldom perfect and only approaches perfection, and this is why the word probability is used. Probability values are illustrated in the exact distribution of a perfect normal curve shown in the above figure. The accuracy of a sample distribution is found by comparing the probable values with the exact experimental observations.

Meaning and Definition of Probability In order to understand properly the implication of probable values, it is important to explain the meaning and definition of probability. Probability is quite a common term in everyday life. The majority of statistical inferences and conclusions are based on the comparison of experimentally observed values with statistical probabilities. We routinely use the term probability accompanied by a percentage value, as in the following sentences:
While tossing a coin, there is a 50% probability that it will come up heads and a 50% probability that it will come up tails.
There is an 80% probability that this goalkeeper will block all shots on his goal.
From a deck of playing cards, there is a 50% probability that the first card drawn is red. There is a 25% probability that it is a diamond.
Thus, from the above description of events, probability may be defined as “the natural chance of occurrence of an event in a single trial expressed in terms of its percentage value with respect to total possible outcomes”. Minium (1978) defines probability in the following alternative ways: “Given a population of possible outcomes, each of which is equally likely to occur, the probability of occurrence on a single trial of an outcome characterized by A is equal to the number of outcomes yielding A, divided by the total number of possible outcomes”

Meaning and Definition of Probability “The probability of the occurrence of any event or any situation (say A) on a single trial is the proportion of trials characterized by A in an infinite series of trials, when each trial is conducted in a like manner” “The probability of occurrence of any one of several particular outcomes is the sum of their individual probabilities, provided that they are mutually exclusive” “The probability of several particular outcomes occurring jointly is the product of their separate probabilities, provided that the events that generate these outcomes are independent” For example, in a deck of 52 cards there are 4 Aces, so on drawing a card at random there are only four chances out of 52 that the drawn card will be an Ace. Hence, the probability of drawing an Ace from a pack of 52 cards is 4/52 x 100 = 7.6923%. Similarly, there are 13 cards each of clubs, diamonds, spades and hearts. Hence, the probability of drawing a diamond from a pack of cards is 13/52 x 100 = 25%, and so on. With a clear concept of probability as explained above, it is pertinent to describe the difference between the above-mentioned theoretical probability and empirical (practical, i.e., observed or experimental) probability.
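
The card calculations, together with the addition and multiplication rules quoted from Minium, can be sketched with exact fractions in Python. The “Ace or King” and “two heads in two tosses” cases are hypothetical illustrations added here; they do not appear on the slide.

```python
from fractions import Fraction

DECK = 52

# Equally likely outcomes: favourable outcomes / total outcomes
p_ace = Fraction(4, DECK)
p_diamond = Fraction(13, DECK)

# Addition rule (mutually exclusive outcomes): Ace or King on one draw
p_ace_or_king = Fraction(4, DECK) + Fraction(4, DECK)

# Multiplication rule (independent events): heads on two separate tosses
p_two_heads = Fraction(1, 2) * Fraction(1, 2)

print(p_ace, round(float(p_ace) * 100, 4))  # as a fraction and a percentage
print(p_diamond, float(p_diamond) * 100)
print(p_ace_or_king, p_two_heads)
```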

Meaning and Definition of Probability Theoretical probability is calculated on logical grounds, as explained in the definition of probability. Empirical probability is not calculated in this way but is based on the observations actually obtained while experimenting. Inferences in statistics are mainly based upon the comparison of empirical probability with theoretical probability. It may be mentioned here that, because common human attributes like beauty, wealth, health, height, weight and intelligence are normally distributed, the probability is that the majority of human beings possess average beauty, health, wealth, height, weight, intelligence etc. According to the pattern of the normal distribution, only a very few persons are likely to show extreme (marked) deviations from the average values of these attributes.
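
The difference between theoretical and empirical probability can be illustrated by simulating coin tosses. This Python sketch is an assumption-laden illustration: the tosses are generated by a pseudo-random number generator rather than a real coin, and the function name and seed are hypothetical.

```python
import random

THEORETICAL_P_HEADS = 0.5  # from logic: 1 favourable outcome out of 2


def empirical_probability_of_heads(n_tosses, seed=0):
    """Proportion of heads observed in n_tosses simulated fair tosses."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


for n in (10, 100, 10_000, 100_000):
    observed = empirical_probability_of_heads(n)
    print(n, observed, abs(observed - THEORETICAL_P_HEADS))
```

With few tosses the observed proportion can differ noticeably from 0.5; as the number of tosses grows, it tends to settle close to the theoretical value.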

Normal Probability Curve As mentioned above, the majority of biological, psycho-motor and physiological variables are distributed in a fashion characterized by the normal probability curve (more commonly known as the normal curve). It is therefore important to describe, illustrate and briefly explain the normal probability curve along with its principles and applications. The details of variability, along with the meaning and definition of standard deviation, have already been dealt with. It is sufficient to mention here that the standard deviation is a measure of variability and is used to describe the theoretical probability of the normal distribution curve. It may be noted that many of the procedures in inferential statistics are based on the assumption that the variables concerned are normally distributed. Hence, it is relevant to discuss here: What is a normal curve? What are its principles? What are its uses, especially in answering statistical questions?

Normal Probability Curve Meaning of the Normal Curve: The normal curve is a graphical representation of the normal distribution, while the normal distribution is a set of scores which, when represented graphically, gives the normal curve. Thus, the normal distribution and the normal curve are one and the same. The normal curve is also a mathematical abstraction defined by the following equation (Minium 1978):
$Y = \frac{N}{\sigma \sqrt{2\pi}} \, e^{-x^2 / (2\sigma^2)}$
where Y = frequency of a given x; N = total number of cases; $x = X - \bar{X}$; X = raw scores (individual values); $\bar{X}$ = mean of raw scores; $\sigma$ = standard deviation of the distribution of raw scores; $\pi$ = 3.1416 (the ratio of the circumference of a circle to its diameter); e = 2.7183 (base of the Napierian system of logarithms). Thus, $\pi$ and e are mathematical constants.
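
The equation of the normal curve can be evaluated directly. The Python sketch below uses hypothetical values (mean height 170 cm, SD 10 cm, N = 100 cases) chosen only for illustration, and cross-checks the result against the density provided by Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def normal_curve_frequency(X, mean, sd, N):
    """Y = N / (sd * sqrt(2*pi)) * e^(-x^2 / (2*sd^2)), with x = X - mean."""
    x = X - mean  # deviation of the raw score from the mean
    return N / (sd * math.sqrt(2 * math.pi)) * math.exp(-x ** 2 / (2 * sd ** 2))


# Hypothetical distribution: mean 170 cm, SD 10 cm, 100 cases
MEAN, SD, N = 170, 10, 100

for height in (150, 160, 170, 180, 190):
    print(height, round(normal_curve_frequency(height, MEAN, SD, N), 3))

# Cross-check against the library density scaled by the number of cases
library_value = N * NormalDist(MEAN, SD).pdf(180)
```

The frequencies are highest at the mean and fall off symmetrically on either side, which is the bell shape described in the text.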

Normal Probability Curve The variables which determine the value of Y are X, X bar, SD (σ) and N. The above equation gives an ideal theoretical normal curve which is completely bell-shaped, as against the empirical normal curve obtained by plotting actual observed data. Since the data from coin tossing or dice throwing, which involve chance or probability values, give rise to a bell-shaped curve with all the properties of a normal curve when plotted on graph paper, this normal curve is known as the Normal Probability Curve. In the early nineteenth century, Gauss derived the normal curve while working on experimental error in his astronomy experiments. The normal probability curve is therefore also popularly known as the Gaussian curve, and the normal distribution as the Gaussian distribution, after Gauss.