These slides show how to run univariate and bivariate tests in SPSS, and how to enter and analyze multiple-response data.
Size: 5.26 MB
Language: en
Added: May 19, 2020
Slides: 91 pages
Slide Content
Univariate and bivariate analysis in SPSS Subodh Khanal Asst. professor Pakihawa Campus Institute of Agriculture and Animal Science Email: [email protected]
Type of analysis we will be doing Univariate analysis: only one variable is taken for analysis Bivariate analysis: when two variables are used Multivariate analysis: when more than 2 variables are used.
What is descriptive statistics? Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample and the measures. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data. Descriptive statistics are used to present quantitative descriptions in a manageable form.
Univariate Analysis/Descriptive Statistics Descriptive Statistics The Range Min/Max Average Median Mode Variance Standard Deviation Histograms and Normal Distributions Frequencies/percent
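To make the measures listed on this slide concrete, here is a minimal sketch outside SPSS using Python's standard `statistics` module. The `describe` helper and the `scores` data are hypothetical and only illustrate what each measure computes.

```python
import statistics

def describe(values):
    """Return the basic descriptive statistics named on the slide."""
    return {
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "std_dev": statistics.stdev(values),      # square root of the sample variance
    }

scores = [2, 4, 4, 5, 7, 9]  # hypothetical data, not from the slides
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The sample (n - 1) versions of variance and standard deviation are used here because that is the usual choice when describing a sample rather than a whole population.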
When to use Descriptives? For open-ended continuous variables (interval, ratio). Use frequencies/percent for categorical variables. Go to Analyze > Descriptive Statistics > choose either Descriptives or Frequencies
Univariate analysis
Information: Analysis
Nominal: Frequency, Percent
Ordinal: Frequency, Percent
Interval: Mean, Mode, Median, Range, Standard deviation
Ratio: Mean, Mode, Median, Range, Standard deviation
For closed-ended questions: use frequency, percent
Univariate analysis: go to Analyze > Descriptive Statistics, then choose 1. Descriptives (mean, std. deviation, range) or 2. Frequencies
Frequencies Click frequencies
Select the categorical variable and click this arrow If you want to see charts click charts and select suitable charts Click continue and click OK
Now look at the output. Read only the Valid Percent column, because the Frequency and Percent columns may also have counted missing values. This table is also very useful for checking data discrepancies (e.g. eligibility, wild codes) and system-missing values.
Presentation: I recommend using Excel for making graphs. Label tables above and figures below.
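The difference between percent and valid percent can be sketched in a few lines of Python. The `frequency_table` helper and the `gender` responses are hypothetical, with `None` standing in for a missing value.

```python
from collections import Counter

def frequency_table(values):
    """Map each category to (count, percent of all rows, percent of valid rows)."""
    n_total = len(values)
    valid = [v for v in values if v is not None]  # drop missing values
    n_valid = len(valid)
    table = {}
    for category, count in Counter(valid).items():
        percent = 100 * count / n_total        # denominator includes missing rows
        valid_percent = 100 * count / n_valid  # denominator excludes missing rows
        table[category] = (count, percent, valid_percent)
    return table

gender = ["M", "F", None, "F", "M", "F"]  # hypothetical responses, one missing
table = frequency_table(gender)
for category, row in sorted(table.items()):
    print(category, row)
```

With one missing response out of six, the two percentages differ only in their denominator, which is why the valid percent is the one to report.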
Descriptives (for numerical values): click Descriptives. Move the variables from the left-hand side to the right-hand side and click Options. Select the options you need > click Continue > click OK
See the Z value of skewness and kurtosis (statistic / S.E.)
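The Z value mentioned here is simply the statistic divided by its standard error. The sketch below uses the commonly cited small-sample formulas for adjusted skewness, excess kurtosis, and their standard errors; that these match SPSS output exactly is an assumption, and the `skew_kurtosis_z` helper and the data are hypothetical.

```python
import math

def skew_kurtosis_z(values):
    """Return skewness, kurtosis, their standard errors, and Z = statistic / S.E."""
    n = len(values)  # needs n >= 4 for the kurtosis formula
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    g1 = m3 / m2 ** 1.5      # population-style skewness
    g2 = m4 / m2 ** 2 - 3    # population-style excess kurtosis
    # Small-sample adjusted statistics
    skew = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    # Standard errors depend only on the sample size
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return {
        "skewness": skew, "se_skewness": se_skew, "z_skewness": skew / se_skew,
        "kurtosis": kurt, "se_kurtosis": se_kurt, "z_kurtosis": kurt / se_kurt,
    }

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # hypothetical, perfectly symmetric
result = skew_kurtosis_z(data)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```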
What is inferential statistics? Technique used to draw conclusion about a population by testing the data taken from the sample of a population. Includes testing hypothesis and deriving estimates. It focuses on making statements about population.
What is hypothesis testing? Hypothesis = assumption about a population parameter (say, the population mean). In hypothesis testing, the first step is to state the assumed or hypothesized value of the population parameter. The assumption we want to test is the null hypothesis (H0), also called the hypothesis of no difference.
The null hypothesis State the assumption to be tested. e.g. the average weight of vet intern students is 58 kg. Begin with assumption that the null hypothesis is true.
The alternate hypothesis (H1) Is opposite of null hypothesis. The average weight of students is not equal to 58 kg.
Procedure of hypothesis testing Step 1: Set up a hypothesis. Step 2: Set up a suitable significance level. This is the probability of rejecting the null hypothesis when it is actually true, denoted by alpha. In practice we use the 5%, 1% and 0.1% levels of significance.
e.g. at a 5% level of significance (alpha = 0.05), there are 5 chances out of 100 that we would reject the null hypothesis when it is actually true. In other words, if the null hypothesis is true, there is a 95% chance that it will not be rejected. We accept a 5% risk of wrongly rejecting a true null hypothesis.
Step 3: Determine a suitable test statistic (t test, F test, chi-square). Step 4: Check the critical value and compare it with the sample result. Step 5: Make a decision.
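The five steps can be sketched with a one-sample t test of the 58 kg example. The `weights` sample is hypothetical, the `one_sample_t` helper is illustrative, and the critical value 2.776 is the two-tailed t value for alpha = 0.05 at 4 degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, hypothesized_mean):
    """t = (sample mean - hypothesized mean) / standard error of the mean."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - hypothesized_mean) / standard_error

# Step 1: H0: mean weight = 58 kg, H1: mean weight != 58 kg
H0_MEAN = 58
# Step 2: significance level alpha = 0.05 (two-tailed)
# Step 3: the test statistic is the one-sample t
weights = [55, 60, 62, 57, 61]  # hypothetical sample of 5 students, df = 4
t_stat = one_sample_t(weights, H0_MEAN)
# Step 4: compare against the critical value for alpha = 0.05, df = 4
T_CRITICAL = 2.776
# Step 5: decide
reject_h0 = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```

With this made-up sample the statistic falls well inside the critical value, so the null hypothesis is not rejected.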
Type 1 and type 2 errors: there are four possible results.
1. The null hypothesis is true and our test accepts it.
2. The null hypothesis is false and our test rejects it.
3. The null hypothesis is true but our test rejects it.
4. The null hypothesis is false but our test accepts it.
The last two possibilities are errors. The third is a type 1 error: rejecting a true null hypothesis. The fourth is a type 2 error: accepting (failing to reject) a false null hypothesis.
One tailed and two tailed test
Bivariate analysis
When to use chi square test
You will be presented with the following Crosstabs dialogue box:
Kappa = degree of agreement. It ranges from -1 to 1. Values ≤ 0 indicate no agreement, 0.01–0.20 none to slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, and 0.81–1.00 almost perfect agreement.
Example of using kappa: two policemen judging normal or suspicious behavior in CCTV footage. Many situations in the healthcare industry rely on multiple people to collect research or clinical laboratory data. The question of consistency, or agreement among the individuals collecting data, immediately arises due to the variability among human observers. Well-designed research studies must therefore include procedures that measure agreement among the various data collectors.
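As an illustration of what kappa measures, here is a minimal sketch of Cohen's kappa for two raters: observed agreement corrected for the agreement expected by chance. The `cohens_kappa` helper and the ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa = (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: both raters pick the same category independently
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of six clips by two observers
officer_1 = ["normal", "normal", "suspicious", "suspicious", "normal", "suspicious"]
officer_2 = ["normal", "normal", "suspicious", "normal", "normal", "suspicious"]
kappa = cohens_kappa(officer_1, officer_2)
print(f"kappa = {kappa:.3f}")
```

Note that the function is undefined when chance agreement equals 1 (both raters always use one and the same category), so a real analysis would need to guard that case.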
Click column
The output will be seen as follows. This table allows us to see that both males and females prefer to learn using online materials rather than books.
When reading this table we are interested in the results of the "Pearson Chi-Square" row. We can see here that χ²(1) = 0.487, p = .485. This tells us that there is no statistically significant association between Gender and Preferred Learning Medium; that is, the preference for online learning versus books does not differ significantly between males and females. How?
Phi and Cramer's V are both measures of the strength of association. We can see that the strength of association between the variables is very weak. Interpretation: >0.25 very strong; >0.15 strong; >0.10 moderate; >0.05 weak; >0 no or very weak.
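The Pearson chi-square statistic, its p-value, and Cramer's V can be sketched by hand as below. The `chi_square` and `p_value_df1` helpers are illustrative, the 2×2 counts are hypothetical (the slides do not give the underlying crosstab), and the p-value shortcut holds only for 1 degree of freedom. No continuity correction is applied.

```python
import math

def chi_square(table):
    """Return (chi-square statistic, degrees of freedom, Cramer's V) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    n_rows, n_cols = len(row_totals), len(col_totals)
    df = (n_rows - 1) * (n_cols - 1)
    cramers_v = math.sqrt(statistic / (n * min(n_rows - 1, n_cols - 1)))
    return statistic, df, cramers_v

def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts: rows = gender, columns = online / books
counts = [[20, 5], [18, 7]]
stat, df, v = chi_square(counts)
print(f"chi2({df}) = {stat:.3f}, p = {p_value_df1(stat):.3f}, V = {v:.3f}")

# The statistic reported on the slide, chi2(1) = 0.487, gives p of about .485
print(f"p for 0.487: {p_value_df1(0.487):.3f}")
```

For a 2×2 table, Cramer's V equals the absolute value of Phi, which is why the two are read together.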
HOW TO PRESENT
T-test The t-test assesses whether the means of two groups are statistically different from each other. This analysis is appropriate whenever you want to compare the means of two groups
Statistical Analysis of the t-test The formula for the t-test is a ratio. The top part of the ratio is just the difference between the two means or averages. The bottom part is a measure of the variability or dispersion of the scores.
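The ratio described above can be written out directly. This sketch assumes the pooled-variance (equal variances) form of the independent-samples t test; the `pooled_t` helper and both groups are hypothetical.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (t, df): difference of means over the pooled standard error."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    # Pooled variance weights each group's variance by its degrees of freedom
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, n_a + n_b - 2

group_1 = [1, 2, 3]  # hypothetical scores
group_2 = [2, 3, 4]
t_value, dof = pooled_t(group_1, group_2)
print(f"t({dof}) = {t_value:.3f}")
```

The top of the ratio is the difference between the two means and the bottom is the standard error built from the variability of the scores, as the slide states.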
When both variables are continuous (independent = gold standard value)
Click define groups Put the designated code numbers. Click continue and click OK
Categorical variable has more than 2 options.
Click post hoc
Generally LSD is selected for fewer than 5 treatments and Duncan's for more than 5. Click Continue. Click Options. Click Descriptive. Click Continue and click OK
There is no significant difference
Significant difference was noted on weekly data of feed intake. (F=245.74, P<0.001).
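The F value reported by one-way ANOVA is the ratio of the between-groups mean square to the within-groups mean square. A minimal sketch follows, with a hypothetical `one_way_anova` helper and made-up treatment data (not the feed intake data behind the slide).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        ss_between += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in group)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within  # this is the MSE used later for the LSD
    return ms_between / ms_within, df_between, df_within, ms_within

treatments = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical data
f_value, df_b, df_w, mse = one_way_anova(treatments)
print(f"F({df_b}, {df_w}) = {f_value:.2f}, MSE = {mse:.2f}")
```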
Calculate the LSD value: see the t-distribution critical value table. For f1 the df is 4 (see ANOVA, Within Groups). The critical value for α = 0.05 at df 4 is 2.776. Take the square root of the MSE, i.e. of 5.6, giving s = 2.366. Number of observations = number of replicates, i.e. n = 2. So LSD = t × s × √2 / √n = 2.776 × 2.366 × 1.414 / 1.414 ≈ 6.57
There is a statistically significant difference between treatments 1 and 3 and between 1 and 4. Feed intake in treatment 1 is significantly higher than in 3 and 4 but at par with 2.
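The same LSD arithmetic can be checked in code using the slide's own inputs (t = 2.776, MSE = 5.6, n = 2 replicates). The `least_significant_difference` helper name is hypothetical.

```python
import math

def least_significant_difference(t_critical, mse, n_replicates):
    """LSD = t * sqrt(MSE) * sqrt(2) / sqrt(n), i.e. t * sqrt(2 * MSE / n)."""
    return t_critical * math.sqrt(2 * mse / n_replicates)

lsd = least_significant_difference(t_critical=2.776, mse=5.6, n_replicates=2)
print(f"LSD = {lsd:.3f}")

def differs(mean_a, mean_b, lsd_value):
    """Two treatment means differ significantly when their gap exceeds the LSD."""
    return abs(mean_a - mean_b) > lsd_value
```

Carried through without intermediate rounding, the value comes out near 6.57. Two treatment means are then declared different when their absolute difference is larger than this.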
Treatment: Weekly feed intake
1: mean±se a
2: mean±se a
3: mean±se b
4: mean±se c
Grand mean: mean±se
CV: calculate manually
LSD: 245.47***
Note: *** = p<0.001; different letters indicate that the values are significantly different.
The previous output was for LSD. This one is for Duncan's test. The lowest feed intake was seen in t3 (significantly lower than the other treatments). As 1 and 2 are in the same subset, they are not significantly different. 3 = a, 4 = b, 1 and 2 = c
How to report?
How to make a scatter diagram in Excel: 1. Choose your two variables and select them. 2. Click Insert. 3. In Charts, select the scatter diagram. You will see blue dots. 4. Right-click on a dot. 5. Click Add Trendline. 6. Tick the options to display the R-squared value and the equation on the chart.
F1 and w1 are selected Click insert In charts select scatter diagram and click the first one
You will obtain such graph. Right click on it Click add trendline
Click Linear, Display Equation on chart, and Display R-squared value on chart. Each unit increase in X increases Y by 0.868 units, and 91.5% of the variation in Y is explained by X.
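The trendline equation and R² that Excel displays come from ordinary least squares. Here is a minimal sketch with a hypothetical `linear_fit` helper and made-up data (not the F1 and w1 columns from the slides).

```python
def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # change in Y per unit increase in X
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)   # share of the variation in Y explained by X
    return slope, intercept, r_squared

x_values = [1, 2, 3, 4, 5]            # hypothetical data
y_values = [2.1, 2.9, 4.2, 4.8, 6.1]
slope, intercept, r2 = linear_fit(x_values, y_values)
print(f"y = {slope:.3f}x + {intercept:.3f}, R^2 = {r2:.3f}")
```

The slope is read exactly as on the slide: the change in Y for each unit increase in X, with R² as the proportion of variation explained.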
How to enter multiple responses? First of all, make a dichotomous variable for each response: ticked (yes = 1) and crossed (no = 0).
What are the potential 5 NUS you are growing? Make each response a dichotomy, i.e. treat each response as a variable: Grow niguro: yes (1), no (0). Grow sisno: yes (1), no (0). Grow allo: yes (1), no (0). Grow bamboo: yes (1), no (0). Grow koiralo: yes (1), no (0). Enter the values as shown in the next slide.
Import the data into SPSS, or simply recode it in SPSS. First run frequencies of those newly created variables.
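The dichotomous coding can be sketched as follows, using the five species named on the slide. The `to_dichotomies` helper and the respondents' answers are hypothetical.

```python
SPECIES = ["niguro", "sisno", "allo", "bamboo", "koiralo"]

def to_dichotomies(ticked, options=SPECIES):
    """Turn one respondent's ticked options into a yes (1) / no (0) value per option."""
    return {f"grow_{option}": 1 if option in ticked else 0 for option in options}

# Hypothetical answers from three respondents
answers = [
    {"niguro", "sisno"},
    {"bamboo"},
    {"niguro", "allo", "koiralo"},
]
rows = [to_dichotomies(ticked) for ticked in answers]
for row in rows:
    print(row)
```

Each respondent becomes one row with five 0/1 columns, which is the layout the multiple response set in SPSS expects.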
In the Set Definition list, select each variable you want to include in your new multiple response set, and then click the arrow to move the selections to the Variables in Set list. Select Dichotomies. In Counted value, enter the numeric value that you used for yes, e.g. 1. Name the set and write its label. Click the Add button. Click Close.
Choose Analyze→Multiple Response→Frequencies. The new special variable should appear. Now click Frequencies.
Move $NUS (the new variable) from Multiple Response Sets to Table(s) for, then click OK.
Read the percent of responses and the percent of cases, and report whichever fits your purpose.
This may look confusing at first, but it's really pretty easy. Ten people bought 24 pieces of fruit. Nine pieces of fruit were apples: 37.5% of the fruit. Nine out of ten people bought apples: 90% of the people. So, the difference is the denominator. What makes this table special is that what you usually care about is the people with multiple responses.
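The two percentages in the fruit example differ only in the denominator, which a short sketch makes explicit. The apple count, the 24 pieces, and the 10 people come from the slide; the split of the other 15 pieces between bananas and oranges is a made-up assumption, as is the `multiple_response_percents` helper.

```python
def multiple_response_percents(counts, n_cases):
    """Map each option to (percent of responses, percent of cases)."""
    n_responses = sum(counts.values())
    return {
        option: (100 * count / n_responses, 100 * count / n_cases)
        for option, count in counts.items()
    }

# 9 apples out of 24 pieces bought by 10 people (banana/orange split is hypothetical)
fruit_counts = {"apple": 9, "banana": 8, "orange": 7}
percents = multiple_response_percents(fruit_counts, n_cases=10)
for fruit, (pct_responses, pct_cases) in percents.items():
    print(f"{fruit}: {pct_responses:.1f}% of responses, {pct_cases:.1f}% of cases")
```

Percent of responses always sums to 100, while percent of cases can exceed 100 in total because one person can give several responses.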