1. What is Data Analysis?
2. Statistics
3. Types of Analysis
4. Levels of Measurement
5. Measures of Central Tendencies
6. Major Areas of Statistics: Descriptive and Inferential Statistics
7. Population and Sample
8. Methods of Data Collection
9. Probability Sampling Collection
10. Nonprobability Sampling Collection
11. Hypothesis
12. Level of Significance
13. Errors in Hypothesis Testing
14. Reliability and Validity
Data analysis is the process of
systematically applying
statistical and/or logical
techniques to describe and
illustrate, condense and recap,
and evaluate data.
Dr. D. Ibanez
Data are any recorded information derived from counts, measurements, observations, interviews, experiments and other techniques. The data originally measured are referred to as raw data.
Statistics is a branch of mathematics dealing with the collection, analysis, interpretation and presentation of masses of numerical data.
A. As a body of knowledge or science
The study of data
The study of populations
The study of variation
The study of distributions
B. As a mass of data
STATISTICS
(s.) a branch of knowledge (a science)
(pl.) data: nominal, ordinal, interval, ratio
collection: interview, questionnaire, observation, records
presentation: textual, tabular, graphical
analysis: univariate, bivariate, multivariate
interpretation of data: narrow, broad
Univariate analysis - a technique for analyzing the distribution of a single variable. Examples: measures of central location like mean, mode and median; frequency distributions, graphs, tables, etc.
Bivariate analysis - a technique for analyzing two variables. Examples: t-test or test of difference, test of relationship, etc.
Multivariate analysis - a technique for analyzing more than two variables. Examples: 3-Way ANOVA, MANCOVA, Multiple Regression Analysis.
Levels of Measurement refer to the
amount of information implied by the
numbers that represent the categories of a
variable. There are four levels of
measurement, namely:
1. Nominal
2. Ordinal
3. Interval - Scale
4. Ratio - Scale
Nominal
The basic level of measurement
Also known as categorical or qualitative
There is no sense of order
Categories can be given a numeric code, but the code is only a label and does not imply order
For example: sex, color, preferred type of chocolate, blood type, race, eye color
To summarize nominal data, we use frequency and percentage
We cannot calculate a mean or average for this kind of data
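As a minimal sketch of summarizing nominal data by frequency and percentage, the snippet below counts categories with Python's standard library. The blood-type values and the helper name `frequency_table` are hypothetical and only serve as an illustration.

```python
from collections import Counter

# Hypothetical blood types for ten respondents (illustrative data only).
blood_types = ["O", "A", "O", "B", "AB", "O", "A", "A", "O", "B"]

def frequency_table(values):
    """Return {category: (frequency, percentage)} for nominal data."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100 * n / total) for cat, n in counts.items()}

table = frequency_table(blood_types)
for category, (freq, pct) in sorted(table.items()):
    print(f"{category:<3} {freq:>3} {pct:6.1f}%")
```

Note that no mean is computed: the categories are labels, so only counts and shares are meaningful.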
Ordinal
The data have a meaningful order, but the intervals between values may not be equal
For example: rank, socio-economic status, educational level, satisfaction rating, income level
To summarize ordinal data, we use frequency, percentage, and sometimes the mean
Interval and Ratio
Also known as SCALE
The most precise level of measurement
The data can be measured rather than merely classified or ordered
For example: number of customers, weight, age, size, length, temperature, grades
Can be discrete (whole numbers, e.g. 5 customers, 5 points) or continuous (e.g. 4.2 miles, 32 degrees, 2.5 minutes)
Data measured at a higher level can be recoded to a lower level, but not the reverse:
-Interval/ratio to nominal
-Interval/ratio to ordinal
-Ordinal to nominal
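One of the conversions listed above, interval/ratio to ordinal, might be sketched as follows. The cut-off values, the level names, and the function `income_level` are assumptions made up for this illustration, not part of the lecture.

```python
from bisect import bisect_right

# Hypothetical cut-offs (in pesos) separating three income levels.
CUTOFFS = [20_000, 60_000]
LEVELS = ["low", "middle", "high"]  # ordered categories

def income_level(monthly_income):
    """Recode a ratio-level income into an ordinal income level."""
    return LEVELS[bisect_right(CUTOFFS, monthly_income)]

incomes = [12_500, 20_000, 45_000, 98_000]
levels = [income_level(x) for x in incomes]
print(levels)  # ['low', 'middle', 'middle', 'high']
```

The recoding discards information: the original incomes cannot be recovered from the levels, which is why the conversion only works in one direction.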
Mean
Median
Mode
Mean is the sum of the values divided by the number of values.
Example: War on Drugs
The number of illegal drug (ID) suspects killed per week in incidents to which the Philippine National Police (PNP) responded, for a sample of 17 weeks, is shown. Find the mean.
Solution:
\[
\bar{X} = \frac{\sum X}{n} = \frac{2+6+16+20+19+61+32+90+120+139+136+157+159+129+119+102+58}{17} = \frac{1365}{17} \approx 80.29
\]
Hence, the mean number of ID suspects killed per week to which the police responded is approximately 80.29.
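The arithmetic in this example can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative; the 17 weekly values are the ones listed in the solution.

```python
from statistics import mean

# Weekly counts from the example (17 weeks).
weekly_counts = [2, 6, 16, 20, 19, 61, 32, 90, 120, 139,
                 136, 157, 159, 129, 119, 102, 58]

total = sum(weekly_counts)
average = mean(weekly_counts)
print(f"n = {len(weekly_counts)}, sum = {total}, mean = {average:.2f}")
# n = 17, sum = 1365, mean = 80.29
```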
Standard deviation is a statistic that measures the dispersion of a dataset relative to its mean; it is calculated as the square root of the variance.
The farther the data points are from the mean, the more spread out the data and the higher the standard deviation.
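A small sketch can make the link between spread, variance and standard deviation concrete. It reuses the 17 weekly counts from the mean example and assumes the sample formulas (division by n - 1), since the data are described as a sample; the `tight` list is hypothetical data added only for contrast.

```python
from math import isclose, sqrt
from statistics import stdev, variance

# Weekly counts from the mean example (a sample of 17 weeks).
weekly_counts = [2, 6, 16, 20, 19, 61, 32, 90, 120, 139,
                 136, 157, 159, 129, 119, 102, 58]
# Hypothetical values clustered closely around their mean.
tight = [79, 80, 80, 81, 80]

sample_variance = variance(weekly_counts)  # sample variance (n - 1 denominator)
sample_sd = stdev(weekly_counts)           # square root of the sample variance

print(f"spread-out data: sd = {sample_sd:.2f}")
print(f"clustered data:  sd = {stdev(tight):.2f}")
```

The widely scattered weekly counts give a much larger standard deviation than the clustered values, even though both sets have a mean near 80.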
Median is the midpoint of the data array.
Example: Police Officers Killed
The number of police officers killed in the line of duty over the last 11 years is shown. Find the median.
177 153 122 141 189 155 162 165 149 157 240
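As a sketch of how this example could be worked out, the values are first arranged in order and the middle one is taken. With 11 values, the midpoint is the 6th value of the sorted array. Variable names are illustrative.

```python
from statistics import median

# Police officers killed per year, as listed in the example (11 years).
officers_killed = [177, 153, 122, 141, 189, 155, 162, 165, 149, 157, 240]

ordered = sorted(officers_killed)
middle_value = ordered[len(ordered) // 2]  # 6th of 11 ordered values

print(ordered)
print(median(officers_killed))  # 157
```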
Mode is the value that occurs most often in the data set.
Example: Find the mode of the signing bonuses of 8 PBA players for a specific year. The bonuses in millions of pesos are
2.8, 2.0, 3.4, 4.0, 5.3, 4.0, 4.5, 4.0
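A possible way to check this example is to count how often each bonus occurs. The sketch uses `statistics.multimode`, which returns every value tied for the highest frequency, so it also covers data sets with more than one mode.

```python
from collections import Counter
from statistics import multimode

# Signing bonuses in millions of pesos, as listed in the example.
bonuses = [2.8, 2.0, 3.4, 4.0, 5.3, 4.0, 4.5, 4.0]

counts = Counter(bonuses)
modes = multimode(bonuses)

print(counts[4.0])  # 3
print(modes)        # [4.0]
```

Here 4.0 occurs three times and every other bonus occurs once, so the mode is 4.0 million pesos.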
Descriptive Statistics
Concerned with the methods for collecting, organizing and describing a set of data so as to yield meaningful information.
For example: frequency distribution, measures of central tendency, measures of variation, normality test, identification of outliers
Inferential Statistics
Deals with the analysis and interpretation of sample data in order to draw conclusions about the population.
For example: test of difference, test of relationship and test of association
Parametric tests – tests that require a normal distribution and data measured at the interval or ratio level. These tests use parameters.
Nonparametric tests – tests that do not require a normal distribution and can be used with nominal and ordinal data. These tests do not need parameters.
Number of Groups/Variables | Parametric tests | Non-parametric tests
2 independent groups | t-test for independent samples | Mann-Whitney U; Wilcoxon rank-sum test
Correlated sample / one-sample group | Paired t-test | Wilcoxon Signed Rank Test; Fisher sign test; McNemar’s test for correlated proportions
3 or more independent groups | ANOVA (F-test) | Kruskal-Wallis test; Friedman test
Number of Groups/Variables | Parametric tests | Non-parametric tests
Relationship: one dependent and one independent variable | Pearson Product Moment Coefficient of Correlation | Chi-square test of independence; Chi-square test of homogeneity; Spearman Rank-Order Coefficient of Correlation
Association: one dependent and one independent variable | Simple linear regression | Kendall’s Coefficient of Concordance W
Association: one dependent and 2 or more independent variables | Multiple linear regression | Kendall’s Coefficient of Concordance W
1. What is the demographic profile in terms of: (DESCRIPTIVE)
1.1 Age
1.2 Sex
1.3 Monthly Family Income
2. What is the level of burnout in terms of the following: (DESCRIPTIVE)
2.1 personal burnout;
2.2 work-related burnout;
3. What is the level of safety outcome measures in terms of event reporting? (DESCRIPTIVE)
4. Is there a significant relationship between the level of burnout and the level of safety outcome measures? (INFERENTIAL)
5. Does burnout significantly predict the safety outcome measures? (INFERENTIAL)
A population is the entire group that you
want to draw conclusions about.
A sample is the specific group that you will
collect data from.
The size of the sample is always less than the
total size of the population.
In research, a population doesn't always refer
to people. It can mean a group containing
elements of anything you want to study, such
as objects, events, organizations, countries,
species, organisms, etc.
Variable - any trait or attribute that varies from person to person or case to case.
Measurement - the assignment of numbers to attributes of persons or objects based on an assigned rule.
Administration of Tests, Scales, and
Questionnaires
Interviews
Focus Group Discussions
Observations
Records
Sampling is the process of selecting the sample or the study units from a previously defined population.
The ways of selecting a part of the population
to enable researchers to make reliable
inferences about the nature of the
population.
The list of units from which we draw the sample in any sampling procedure is called the sampling frame.
Simple random sampling
Systematic random sampling
Stratified random sampling
Cluster random sampling
A simple random sample is a subset of a statistical population in which each member has an equal probability of being chosen. An example of simple random sampling would be the names of 25 employees being drawn out of a hat from a company of 250 employees.
Systematic random sampling is a probability sampling method where researchers select members of the population at a regular interval, for example by selecting every 15th person on a list of the population.
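The two methods just described might be sketched as below, using the same figures (25 out of 250 employees, every 15th person). The employee IDs, the function names, and the random starting point for the systematic sample are assumptions made for this illustration.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, interval, rng):
    """Pick a random start within the first interval, then every interval-th unit."""
    start = rng.randrange(interval)
    return frame[start::interval]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
employees = [f"EMP{i:03d}" for i in range(1, 251)]  # hypothetical sampling frame

srs = simple_random_sample(employees, 25, rng)
systematic = systematic_sample(employees, 15, rng)
print(len(srs), len(systematic))
```

With a frame of 250 and an interval of 15, the systematic sample contains 16 or 17 employees depending on the starting point.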
Stratified random sampling is a sampling method in which the total population is divided into smaller groups, or strata, and a random sample is then drawn from each stratum. The strata are formed based on some common characteristic in the population data.
Cluster random sampling is when the researcher divides the population into separate groups called clusters. Then a simple random sample of clusters is selected from the population. The researcher conducts the analysis on the sampled clusters.
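The difference between the two designs can be illustrated with a rough sketch: stratified sampling draws units from every group, while cluster sampling draws whole groups. The student population, the choice of year level as strata and section as clusters, and both function names are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, rng):
    """Group units into strata, then draw a simple random sample from each."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(7)  # fixed seed so the sketch is repeatable
# Hypothetical population: 60 students tagged with a year level and a section.
students = [{"id": i, "year": 1 + i % 3, "section": "ABCDEF"[i % 6]}
            for i in range(60)]

by_year = stratified_sample(students, lambda s: s["year"], 5, rng)

sections = defaultdict(list)
for s in students:
    sections[s["section"]].append(s)
by_section = cluster_sample(sections, 2, rng)

print({year: len(sample) for year, sample in by_year.items()})
print({name: len(members) for name, members in by_section.items()})
```

Every year level is represented in the stratified sample, whereas the cluster sample contains all students from only two sections.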
Convenience Sampling
Voluntary Response Sampling
Quota Sampling
Purposive or Judgmental Sampling
Snowball Sampling
Convenience Sampling – or accidental
sampling, simply includes the individuals who
happen to be most accessible to the
researcher.
This is an easy and inexpensive way to gather
initial data, but there is no way to tell if the
sample is representative of the population, so
it can’t produce generalizable results.
Voluntary Response Sampling - similar to a
convenience sample, a voluntary response sample is
mainly based on ease of access.
Instead of the researcher choosing participants and
directly contacting them, people volunteer themselves
(e.g. by responding to a public online survey).
Voluntary response samples are always at least
somewhat biased, as some people will inherently be
more likely to volunteer than others.
Quota Sampling - is a type of non-probability sampling in which researchers form a sample of individuals who are representative of a larger population.
Researchers assign quotas to groups of people in order to create subgroups of individuals that represent characteristics of the target population as a whole.
Some examples of these characteristics are gender, age, sex, residency, education level, or income. Once the subgroups are formed, the researchers use their own judgment to select the subjects from each segment to produce the final sample.
Purposive or Judgmental Sampling - is a form of non-probability sampling in which the researcher uses his or her own judgment about which respondents to choose, and picks those who best meet the purposes of the study.
It is often used in qualitative research, where the
researcher wants to gain detailed knowledge about
a specific phenomenon rather than make statistical
inferences, or where the population is very small
and specific. An effective purposive sample must
have clear criteria and rationale for inclusion.
Snowball Sampling - also called chain referral or referral sampling.
If the population is hard to access, snowball
sampling can be used to recruit participants
via other participants. The number of people
you have access to “snowballs” as you get in
contact with more people.
Hypothesis testing is a method used in inferential statistics to arrive at conclusions about a certain population under study through the use of a sample and parameters.
Null hypothesis – serves to deny what is
explicitly indicated in a given research
hypothesis
Research hypothesis (alternative
hypothesis) – is the hypothesis derived from
the researcher’s theory about some social
phenomenon
The level of significance is a decision rule used to support or refute the hypothesis; it brings objectivity into the interpretation of observations (p-value).
Significance at the .05 level (p < .05) means that, if chance alone were at work, the probability of the outcome occurring is less than 5 percent. This suggests that something other than chance has affected the outcome.
The value α = .05 is the significance level, the maximum level of risk that we are willing to accept in making an inference about a population based on the generated sample.
Generate the p-value of a certain test
Compare the p-value to the level of significance (usually 0.05)
To say that a result is statistically significant at the alpha level just means that the p-value is less than alpha. For instance, for a value of alpha = 0.05, if the p-value is greater than 0.05, then we fail to reject the null hypothesis.
Decide on H0
◦If the p-value is less than or equal to 0.05, the result of the test is significant; reject H0
◦Otherwise, do not reject H0
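The decision step can be written as a tiny function, shown here as a sketch only. The function name `decide` is hypothetical, and the sketch assumes that a p-value exactly equal to the level of significance counts as significant; some texts use a strict "less than" instead.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when the p-value is at most alpha.

    Assumption: the boundary case p_value == alpha is treated as significant.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "do not reject H0"

print(decide(0.03))              # reject H0
print(decide(0.20))              # do not reject H0
print(decide(0.03, alpha=0.01))  # do not reject H0
```

The last call shows that the same p-value can lead to a different decision when a stricter level of significance is chosen.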
Type I error: we reject the null hypothesis when it is true and should not be rejected.
The lower we set the level of significance, the lower the likelihood of a Type I error and the higher the likelihood of a Type II error.
Type II error: we fail to reject the null hypothesis when it is actually false.
The higher we set the level of significance, the higher the likelihood of a Type I error and the lower the likelihood of a Type II error.
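The trade-off between the two kinds of error can be illustrated with a small simulation. This is a sketch under stated assumptions, not a method from the lecture: it uses a two-sided one-sample z-test with a known standard deviation of 1, samples of 25, and a hypothetical true mean of 0.4 for the case where H0 is false. All function and variable names are invented for the example.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value of a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mean, alpha, rng, trials=2000, n=25, mu0=0.0, sigma=1.0):
    """Share of simulated samples in which H0: mean = mu0 is rejected."""
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if z_test_p_value(sample, mu0, sigma) <= alpha:
            rejections += 1
    return rejections / trials

results = {}
for alpha in (0.05, 0.01):
    # H0 is true (mean really is 0): every rejection is a Type I error.
    type_1 = rejection_rate(0.0, alpha, random.Random(1))
    # H0 is false (mean is 0.4): every failure to reject is a Type II error.
    type_2 = 1 - rejection_rate(0.4, alpha, random.Random(1))
    results[alpha] = {"type_1": type_1, "type_2": type_2}
    print(f"alpha={alpha}: Type I rate ~ {type_1:.3f}, Type II rate ~ {type_2:.3f}")
```

Because the same seed is reused, both levels of significance are judged on identical simulated samples, so lowering alpha can only reduce the Type I rate and raise the Type II rate.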
QUANTITATIVE
•Experimental designs
•Non-experimental designs such as surveys
QUALITATIVE
•Narrative research
•Phenomenology
•Grounded theory
•Ethnographies
•Case study
MIXED METHODS
•Convergent
•Explanatory sequential
•Exploratory sequential
•Transformative, embedded, or multiphase
Diagnose the problem
Specify statistical test
Retrieve analysis results
Interpret analysis results
What are the questions to be answered?
What is the analysis required by the question?
What is the nature of the data?
For comparative analysis, how many means are to be compared?