BASIC STATISTICS ON RESEARCH DESIGN
Grace P. Principe
TYPES OF DATA IN STATISTICS: 1. Continuous Data 2. Discrete Data 3. Nominal Data 4. Interval Data 5. Categorical Data
Continuous Data are data which come from an interval of possible outcomes. Examples of continuous data include: the amount of rain, in inches, that falls in a randomly selected storm; the weight, in pounds, of a randomly selected student; and the square footage of a randomly selected three-bedroom house.
Discrete Data are data with a finite or countably infinite number of possible outcomes. Examples of discrete data include: the number of siblings a randomly selected person has; the total on the faces of a pair of six-sided dice; and the number of students you need to ask before you find one who loves statistics.
Nominal Data are values grouped into categories that have no meaningful order. For example, gender and political affiliation are nominal-level variables. Members of a group are assigned a label, and there is no hierarchy among the groups. Typical descriptive statistics associated with nominal data are frequencies and percentages.
Interval Data is a type of data measured along a scale in which each point is placed at an equal distance (interval) from the others. Interval data is one of the two main types of numerical (quantitative) data, the other being ratio data. An example of interval data is the data collected on a thermometer; its gradations or markings are equidistant.
Categorical Data Categorical variables represent types of data which may be divided into groups. Examples of categorical variables are race, sex, age group, and educational level. While the latter two variables may also be considered in a numerical manner by using exact values for age and highest grade completed, it is often more informative to categorize such variables into a relatively small number of groups.
STATISTICS is the science concerned with developing and studying methods for collecting, analyzing, interpreting, and presenting empirical data. Statistics is a highly interdisciplinary field; research in statistics finds applicability in virtually all scientific fields, and research questions in the various scientific fields motivate the development of new statistical methods and theory.
Statistics and Its Types Statistics is a collection of methods for planning experiments, obtaining data, and analyzing, interpreting, and drawing conclusions from the data (Alferes & Duro, 2010). It is divided into two main areas: Descriptive and Inferential.
Descriptive Statistics summarizes or describes the essential characteristics of a known set of data. Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent either an entire population or a sample of it. For example, the Department of Health conducts a tally to determine the number of COVID-19 cases per day in the Philippines.
Inferential Statistics uses sample data to make inferences about a population. It consists of generalizing from samples to populations, performing hypothesis testing, determining relationships among variables, and making predictions. For example, suppose you want to find out whether Filipinos are willing to receive the COVID-19 vaccine. In such a case, a smaller sample of the population is considered; the results are drawn, and the analysis is extended to the larger population.
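As a minimal sketch of this idea of generalizing from a sample to a population, the following Python example estimates the proportion of respondents willing to be vaccinated, together with a 95% confidence interval. The sample size and counts are hypothetical and not taken from the slides.

```python
# Minimal sketch (hypothetical numbers): estimating a population proportion
# from a sample. Suppose 620 of a random sample of 1,000 respondents say
# they are willing to be vaccinated.
import math

n = 1000   # sample size (hypothetical)
x = 620    # respondents answering "yes" (hypothetical)

p_hat = x / n                              # sample proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)    # standard error of the proportion
z = 1.96                                   # z-value for a 95% confidence level

lower, upper = p_hat - z * se, p_hat + z * se
print(f"Estimated proportion: {p_hat:.3f}")
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```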
Tools in Descriptive Statistics: Frequency Distribution. A frequency distribution is a collection of observations produced by sorting them into classes and showing their frequency, or number of occurrences, in each class. For example, twenty-five students were given a blood test to determine their blood types.
From the given data, here is how to organize them using a frequency distribution. Data set: blood types of the twenty-five students.
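The blood-type table itself is not reproduced here, so the sketch below uses a hypothetical set of 25 blood types to show how a frequency distribution (classes, frequencies, and percentages) can be built in Python.

```python
# Minimal sketch (hypothetical data): building a frequency distribution of
# blood types for 25 students using collections.Counter.
from collections import Counter

# Hypothetical blood types of 25 students (the slide's actual table is not shown)
blood_types = ["O", "A", "B", "O", "AB", "O", "A", "A", "O", "B",
               "O", "A", "AB", "O", "B", "A", "O", "O", "A", "B",
               "O", "A", "AB", "O", "B"]

freq = Counter(blood_types)
total = len(blood_types)

print(f"{'Class':<6}{'Frequency':>10}{'Percent':>10}")
for blood_type, count in freq.most_common():
    print(f"{blood_type:<6}{count:>10}{count / total:>9.0%}")
```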
Measures of Central Tendency (or Position, or Average) When scores and other measures have been tabulated into a frequency distribution, the next task is to calculate a measure of central tendency or central position. A measure of central tendency is synonymous with the word "average". An average is a typical value that tends to describe the set of data.
Mean The mean, or simply the average, is the most frequently used measure and can be described as the arithmetic average of all scores or groups of scores in a distribution. It is computed by adding all the scores or data and then dividing by the total number of cases.
Median The median is the middle-most value in a list of items arranged in increasing or decreasing order. If the number of items is odd, there is exactly one item in the middle. If the number of items is even, the midpoint is determined by getting the average of the two middle items.
Mode The mode is the score or group of scores that occurs most frequently. Some distributions have no mode at all; others may have more than one mode. When a distribution has two modes, it is called bimodal.
Laboratory tests reveal the incubation period (measured in days) of a virus among the 30 infected residents of Brgy. Malinis. To deal with this, arrange the given data from highest to lowest or vice versa, then compute the mean, median, and mode (see the sketch below).
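Since the slide's table of incubation periods is not reproduced here, the sketch below uses hypothetical values to show how the three measures of central tendency can be computed with Python's built-in statistics module.

```python
# Minimal sketch (hypothetical data): computing the mean, median, and mode
# with Python's built-in statistics module. The incubation periods below are
# made up for illustration; the slide's actual data set is not shown.
import statistics

incubation_days = [5, 7, 4, 6, 5, 8, 5, 6, 7, 4,
                   5, 9, 6, 5, 7, 6, 4, 5, 8, 6,
                   5, 7, 6, 5, 4, 6, 7, 5, 6, 8]   # 30 hypothetical values

sorted_days = sorted(incubation_days, reverse=True)  # arrange from highest to lowest
print("Sorted:", sorted_days)

print("Mean:  ", statistics.mean(incubation_days))    # sum of values / number of values
print("Median:", statistics.median(incubation_days))  # middle value (average of the two middle values if n is even)
print("Mode:  ", statistics.mode(incubation_days))    # most frequent value
```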
Measures of Variation/Dispersion The previous section focused on averages, or measures of central tendency. The averages are supposed to be the central scores of a given set of data. However, not all features of a given data set are reflected by the averages. Suppose two different groups of 5 students are given identical 20-item quizzes in Science, with the results shown below.
Measures of Variation/Dispersion The averages of each group are as follows. As the second table shows, the two averages do not differ. But the two groups show an obvious difference: Group 2 has more widely scattered data than Group 1. This characteristic, called variability or dispersion, is not reflected by averages. The three basic measures of dispersion are range, variance, and standard deviation.
Range is the simplest measure of dispersion to calculate. It is obtained by getting the difference between the highest/largest value and the lowest/smallest value in a set of data. A larger range suggests greater variation or dispersion; a smaller range suggests less variation or dispersion.
Variance measures how far a data set is spread out. It is mathematically defined as the average of the squared differences from the mean.
Standard Deviation is the most commonly used measure of dispersion. It indicates how closely the values of the given data set are clustered around the mean. It is computed by getting the positive square root of the variance. A lower standard deviation means that the values of the given data set are spread over a smaller range around the mean; a greater value means that the values are spread over a larger range around the mean.
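The following minimal sketch uses hypothetical quiz scores for the two groups described above (the slide's own tables are not reproduced here) to compute the range, variance, and standard deviation, and to show how two groups with the same mean can differ in spread.

```python
# Minimal sketch (hypothetical scores): two groups with the same mean but
# different spread, illustrating range, variance, and standard deviation.
import statistics

group_1 = [14, 15, 15, 16, 15]   # hypothetical 20-item quiz scores
group_2 = [5, 20, 18, 12, 20]    # hypothetical scores: same mean, wider spread

for name, scores in [("Group 1", group_1), ("Group 2", group_2)]:
    mean = statistics.mean(scores)
    data_range = max(scores) - min(scores)     # highest value minus lowest value
    variance = statistics.pvariance(scores)    # average of squared differences from the mean
    std_dev = statistics.pstdev(scores)        # positive square root of the variance
    print(f"{name}: mean={mean}, range={data_range}, "
          f"variance={variance:.2f}, std dev={std_dev:.2f}")
```

Both groups have a mean of 15, yet Group 2 has a much larger range, variance, and standard deviation, which is exactly the difference the averages alone fail to reveal.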
Uses in hypothesis testing: to determine whether a predictor variable has a statistically significant relationship with an outcome variable and to estimate the difference between two or more groups; to determine what type of statistical tool is appropriate; and to choose the test that fits the types of predictor/independent variables and outcome/dependent variables you have collected.
Tools in Inferential Statistics Statistical tests are used to derive a generalization about the population from the sample. A statistical test is a formal technique that relies on a probability distribution to draw a conclusion about the reasonableness of a hypothesis. These hypothesis tests related to differences are classified as parametric and non-parametric tests. A parametric test is one that has information about the population parameter. On the other hand, a non-parametric test is one where the researcher has no idea regarding the population parameter.
Parametric Tests usually have stricter requirements than non-parametric tests and can make more robust inferences from the data. They can only be conducted with data that adhere to the standard assumptions of statistical tests. The most common types of parametric tests include regression tests, comparison tests, and correlation tests.
Parametric Tests The flowchart below helps determine the appropriate statistical tool among the parametric tests.
Example: The Effect of the Amount of Chlorine on the Color of Algae. First identify your independent and dependent variables, how many there are, and their type, whether qualitative/categorical or quantitative/numeric. After identifying these, look at the diagram above to find the right statistical tool among the parametric tests. In the given problem, the amount of chlorine is the independent variable; it is numeric (quantitative), and two or more amounts of chlorine may be used in the experiment. The dependent variable is the color of the algae; it is categorical, and the color may vary. So, looking at the diagram above, logistic regression is the appropriate tool (a sketch of such a model follows below).
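To make the choice concrete, here is a minimal sketch, assuming scikit-learn is available, that fits a logistic regression with a numeric predictor (amount of chlorine) and a categorical outcome (algae color). The chlorine levels and colors below are invented for illustration and are not taken from the slides.

```python
# Minimal sketch (hypothetical data; scikit-learn assumed available):
# logistic regression with a numeric predictor (amount of chlorine) and a
# categorical outcome (algae color).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical observations: chlorine in mg/L and the resulting algae color
chlorine = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
color = np.array(["green", "green", "green", "green",
                  "brown", "brown", "brown", "brown"])

model = LogisticRegression()
model.fit(chlorine, color)

# Predict the likely algae color at new chlorine levels
print(model.predict([[1.2]]))   # likely 'green' for this made-up data
print(model.predict([[3.8]]))   # likely 'brown' for this made-up data
print(model.classes_)           # the categories the model learned
```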
Non-Parametric Tests These tests don't make as many assumptions about the data and are useful when one or more common statistical assumptions are violated. However, the inferences they make aren't as strong as those of parametric tests.
Non-Parametric Tests The table below shows how to determine the appropriate non-parametric tool to use.
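As one concrete example of a non-parametric tool, here is a minimal sketch, assuming SciPy is available, of the Mann-Whitney U test, a common non-parametric alternative to the independent-samples t-test when normality cannot be assumed. The scores are hypothetical.

```python
# Minimal sketch (hypothetical data; SciPy assumed available): the
# Mann-Whitney U test compares two independent groups without assuming
# normally distributed data.
from scipy.stats import mannwhitneyu

# Hypothetical scores from two independent groups
group_a = [12, 15, 14, 10, 13, 17, 11, 16]
group_b = [22, 19, 24, 18, 21, 25, 20, 23]

stat, p_value = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U statistic: {stat}, p-value: {p_value:.4f}")
# A p-value below 0.05 suggests the two groups differ significantly.
```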
Statistical tools can be complex, especially for beginners. However, according to Grobman (2017), the most commonly used in science investigatory projects are chi-square tests, t-tests, and correlations. In determining whether there is a statistically significant relationship between the independent and dependent variables, we follow the standard rule of thumb: if the p-value is lower than 0.05, we reject the null hypothesis and accept the alternative hypothesis.
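To illustrate the p < 0.05 rule of thumb, here is a minimal sketch, assuming SciPy is available, of an independent-samples t-test on hypothetical quiz scores for two groups.

```python
# Minimal sketch (hypothetical data; SciPy assumed available): an
# independent-samples t-test and the p < 0.05 decision rule described above.
from scipy.stats import ttest_ind

# Hypothetical 20-item quiz scores for two groups of students
control = [12, 14, 11, 13, 15, 12, 14, 13]
treatment = [16, 18, 15, 17, 19, 16, 18, 17]

t_stat, p_value = ttest_ind(control, treatment)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis: the difference is statistically significant.")
else:
    print("Fail to reject the null hypothesis.")
```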
Licensed statisticians play a vital role in computing and interpreting the results of the data gathered. In any investigation, it is important to consult them to ensure that your results are statistically correct. SPSS and Stata are among the most common software packages they use.