SPSS in Research. Dr. Mohd. Mamur Ali, Assistant Professor, Dept. of TT&NFE (IASE), Jamia Millia Islamia
Purpose of Applying Statistics: to describe data (descriptive statistics) and to draw better inferences (inferential statistics).
Descriptive Statistics: statistical procedures used to describe the properties of samples, or of populations where complete population data are available. Example: measuring the IQ of every student in a university and computing the mean IQ describes a characteristic of the complete population.
Descriptive Statistics: how can a class's performance on an exam be explained? Report individual scores; represent the data graphically (bar chart, frequency polygon, etc.); report the central tendency of the scores (mean, median, mode); report the spread of the scores around the centre (range, standard deviation).
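As a rough illustration of the measures listed above, the following sketch uses Python's standard `statistics` module. The exam scores are hypothetical values chosen for the example; they do not come from the slides.

```python
import statistics

# Hypothetical exam scores for one class (illustrative values only).
scores = [55, 60, 60, 70, 75, 80, 90]

# Central tendency of the scores
mean_score = statistics.mean(scores)
median_score = statistics.median(scores)
mode_score = statistics.mode(scores)

# Spread of the scores around the centre
score_range = max(scores) - min(scores)
std_dev = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

print("mean:", mean_score, "median:", median_score, "mode:", mode_score)
print("range:", score_range, "standard deviation:", round(std_dev, 2))
```

In SPSS the same figures would normally be requested through the Analyze menu rather than computed by hand.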
Inferential Statistics: in descriptive statistics, we talk about the data we have. In inferential statistics, we talk about the data we do not have.
Inferential Statistics: statistical procedures used to draw inferences about the properties of a population from sample data. Example: estimating the IQ of the complete population of students in a university by measuring the IQ of a sample and then generalizing the result to the whole population.
Educational Variables. Quantitative variables: attributes in which individuals differ in amount, degree, or frequency, e.g., score on an exam, number of students, frequency of absence from the classroom. Qualitative variables: attributes in which individuals differ in kind or sort and which have no inherent ordering, e.g., nationality, religion, types of books read.
Variable. Nominal variables (also called categorical variables) simply classify persons or objects into two or more categories, where the members of a category have at least one characteristic in common. Example: gender (male, female), employment status (part-time, full-time, unemployed). Ordinal variables rank persons or objects in terms of the degree to which they possess a characteristic. Example: people ranked from 1 to 20 on their height.
Variable. Interval variables have all the properties of nominal and ordinal variables but also have equal intervals. An interval variable does not have a "true" zero point (what would a zero score on an achievement test mean?). Example: achievement, aptitude, motivation, and attitude tests are treated as interval variables.
Variable. A ratio variable is a property defined by an operation that permits statements about the equality of ratios. This means that one value may be spoken of as double or triple another, and so on. Example: length, weight, and numerosity of aggregates.
Ramesh is tall and Ahmad is short (nominal scale).
Ramesh is taller than Ahmad (ordinal scale).
Ramesh is 6 feet tall and Ahmad is 5 feet tall (interval scale).
Ramesh is six-fifths as tall as Ahmad (ratio scale).
Steps in using inferential statistics:
1. Select the test of significance.
2. Determine whether the significance test will be two-tailed or one-tailed.
3. Select α (alpha), the probability level.
4. Compute the test of significance.
5. Consult the table to determine the significance of the results.
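The last two steps can be sketched as a simple comparison between a computed statistic and a tabled critical value. The sketch below assumes a t-test with 8 degrees of freedom and α = 0.05; the critical values are the ones printed in a standard t table, while the computed t of 2.0 and the function name `is_significant` are hypothetical.

```python
# Critical values of t for df = 8 at alpha = 0.05, as listed in a standard t table.
CRITICAL_T_DF8_ALPHA05 = {"one-tailed": 1.860, "two-tailed": 2.306}


def is_significant(t_value, tails):
    """Step 5: compare the computed statistic with the tabled critical value.

    For the one-tailed case this assumes the difference lies in the
    predicted direction, so only the size of t is compared.
    """
    return abs(t_value) >= CRITICAL_T_DF8_ALPHA05[tails]


computed_t = 2.0  # hypothetical result of step 4

one_tailed_result = is_significant(computed_t, "one-tailed")
two_tailed_result = is_significant(computed_t, "two-tailed")

print("significant (one-tailed):", one_tailed_result)
print("significant (two-tailed):", two_tailed_result)
```

The same computed value can be significant under a one-tailed test and not under a two-tailed test, which is why step 2 has to be settled before the table is consulted.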
Tests of significance: statistical formulas that enable the researcher to determine whether an observed difference between sample means is real or likely due to chance.
Different tests of significance account for different factors, including the scale of measurement represented by the data, the method of participant selection, the number of groups being compared, and the number of independent variables.
The most common tests of significance: t-test, ANOVA, Chi Square.
t-test: used to determine whether two means are significantly different at a selected probability level. It adjusts for the fact that the distribution of scores for small samples departs increasingly from the normal distribution as the sample size decreases.
There are two types of t-tests: the t-test for independent samples (randomly formed) and the t-test for nonindependent samples (nonrandomly formed, e.g., matching, performance on a pre-/post-test, different treatments).
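To make the two types concrete, here is a minimal sketch of both t statistics computed by hand. It assumes the pooled-variance (equal variances) form for independent samples and the difference-score form for nonindependent samples; all data values and function names are hypothetical.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """t for two independent samples, using the pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t, n_a + n_b - 2  # t value and degrees of freedom


def paired_t(before, after):
    """t for nonindependent samples, based on the difference scores."""
    diffs = [post - pre for pre, post in zip(before, after)]
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / std_error, len(diffs) - 1


# Hypothetical scores for two randomly formed groups
group_a = [12, 14, 16, 18, 20]
group_b = [8, 10, 12, 14, 16]
t_ind, df_ind = independent_t(group_a, group_b)

# Hypothetical pre-test and post-test scores for the same five students
pre_test = [10, 12, 14, 16, 18]
post_test = [12, 15, 15, 19, 20]
t_paired, df_paired = paired_t(pre_test, post_test)

print("independent samples: t =", round(t_ind, 3), "df =", df_ind)
print("paired samples: t =", round(t_paired, 3), "df =", df_paired)
```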
ANOVA: used to determine whether two or more means are significantly different at a selected probability level. It avoids the need to compute multiple t-tests to compare groups.
The strategy of ANOVA is that total variation, or variance, can be divided into two sources: a) treatment variance ("between groups" variance, caused by the treatment groups) and b) error variance ("within groups" variance).
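The split into between-groups and within-groups variation can be sketched for a one-way design as follows. The three groups of scores are hypothetical, and the function name `one_way_anova` is chosen for this example only.

```python
def one_way_anova(groups):
    """Return (SS between, SS within, F) for a one-way design."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)

    # Treatment ("between groups") variation: group means around the grand mean
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    # Error ("within groups") variation: scores around their own group mean
    ss_within = sum(
        (x - sum(group) / len(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_ratio


# Hypothetical scores for three treatment groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ss_between, ss_within, f_ratio = one_way_anova(groups)

print("SS between:", round(ss_between, 3))
print("SS within:", round(ss_within, 3))
print("F:", round(f_ratio, 3))
```

The two sums of squares add up to the total variation of all scores around the grand mean, which is the partition the slide describes.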
Correlation: measures the relative strength of the linear relationship between two variables. It is unit-less and ranges between -1 and 1. The closer to -1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker any linear relationship.
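A minimal sketch of the Pearson correlation coefficient, computed from deviations around the means, shows the range described above. The data values and variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


hours_studied = [1, 2, 3, 4, 5]

r_positive = pearson_r(hours_studied, [2, 4, 6, 8, 10])   # perfect positive line
r_negative = pearson_r(hours_studied, [10, 8, 6, 4, 2])   # perfect negative line
r_moderate = pearson_r(hours_studied, [2, 4, 5, 4, 5])    # scattered around a line

print(round(r_positive, 3), round(r_negative, 3), round(r_moderate, 3))
```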
REGRESSION: the problem of predicting one variable from knowledge of another variable, or possibly several other variables. The linear regression of Y on X: $Y' = b_{yx} X + a_{yx}$. The linear regression of X on Y: $X' = b_{xy} Y + a_{xy}$. Here b is the slope of the regression line and a is the intercept: $a_{yx}$ is the point where the line of Y on X crosses the Y axis, and $a_{xy}$ is the point where the line of X on Y crosses the X axis.
Linear regression: in correlation, the two variables are treated as equals. In regression, one variable is considered the independent (predictor) variable X and the other the dependent (outcome) variable Y.
Regression Analysis: linear regression; multiple regression (using more than one predictor variable).
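The asymmetry between predictor and outcome can be seen by fitting both regression lines to the same data with ordinary least squares. This is a sketch with hypothetical data; the function name `fit_line` is an assumption made for the example.

```python
def fit_line(predictor, outcome):
    """Least-squares slope (b) and intercept (a) for outcome' = b * predictor + a."""
    n = len(predictor)
    mean_p, mean_o = sum(predictor) / n, sum(outcome) / n
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predictor, outcome))
    s_pp = sum((p - mean_p) ** 2 for p in predictor)
    slope = s_po / s_pp
    intercept = mean_o - slope * mean_p
    return slope, intercept


# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b_yx, a_yx = fit_line(x, y)  # regression of Y on X: Y' = b_yx * X + a_yx
b_xy, a_xy = fit_line(y, x)  # regression of X on Y: X' = b_xy * Y + a_xy

print("Y on X: slope", round(b_yx, 3), "intercept", round(a_yx, 3))
print("X on Y: slope", round(b_xy, 3), "intercept", round(a_xy, 3))
```

The two lines differ unless the correlation is perfect, so which variable is treated as the predictor has to be decided before the analysis.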
Nonparametric Tests: Chi-Square; Sign Test (Median Test) for two samples; Sign Test (Median Test) for two correlated samples; Rank Test for two independent samples (Wilcoxon Rank Sum test, Mann-Whitney U test); Rank Test for two correlated samples.
Chi Square (χ²): a nonparametric test of significance appropriate for nominal or ordinal data that can be converted to frequencies. It compares the proportions actually observed (O) to the proportions expected (E) to see whether they are significantly different.
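As one example from this list, the Mann-Whitney U statistic for two independent samples can be sketched by counting, over every cross-sample pair, how often a score in one sample exceeds a score in the other (ties count as half). This gives the same U as the rank-sum formula. The data and the function name are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples by pair counting."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0      # sample A wins this pair
            elif a == b:
                u_a += 0.5      # a tie is shared between the samples
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical scores for two independent groups
group_a = [3, 5, 8]
group_b = [4, 9, 10, 12]

u_a, u_b = mann_whitney_u(group_a, group_b)
u_statistic = min(u_a, u_b)  # the smaller U is the one compared with the table

print("U_a:", u_a, "U_b:", u_b, "U:", u_statistic)
```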
The Chi Square Test The test we use to measure the differences between what is observed and what is expected according to an assumed hypothesis is called the chi-square test .
Formula:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where χ² is the value of chi square, O is the observed value, and E is the expected value. The ∑ means that each (O - E)² is divided by its E and the results are then added together.
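The formula can be checked with a small worked case. The sketch below assumes two categories with hypothetical observed counts of 30 and 20 against expected counts of 25 each.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 50 cases expected to split evenly across two categories
observed = [30, 20]
expected = [25, 25]

# (30 - 25)^2 / 25 + (20 - 25)^2 / 25 = 1 + 1
chi2 = chi_square(observed, expected)
print("chi square:", chi2)
```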
ANCOVA: used to remove the effects of extraneous variables from the dependent variable. It may be used to adjust for initial differences among the groups on certain known variables. It combines regression and ANOVA. Regression is used to remove the influence of the covariate (the uncontrolled variable, denoted by X) from the dependent variable Y. ANOVA is then applied to the residuals, i.e., to the portion of Y left unaccounted for by X.
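The two-stage idea described on this slide can be sketched as follows: regress the outcome on the covariate, keep the residuals, and run a one-way ANOVA on those residuals. This is a simplified illustration only. A full ANCOVA, as run in a statistics package, estimates the slope and the degrees of freedom differently, so its F value will generally not match this one. All data and names are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y' = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return slope, mean_y - slope * mean_x


def f_ratio(groups):
    """One-way ANOVA F ratio for a list of groups of scores."""
    scores = [v for g in groups for v in g]
    grand_mean = sum(scores) / len(scores)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (len(scores) - len(groups)))


# Hypothetical cases: (group, covariate X such as a pre-test, outcome Y such as a post-test)
cases = [("A", 10, 14), ("A", 12, 15), ("A", 14, 19),
         ("B", 11, 18), ("B", 13, 21), ("B", 15, 22)]

# Stage 1: regression removes the influence of the covariate from Y
slope, intercept = fit_line([c[1] for c in cases], [c[2] for c in cases])
residuals = [(group, y - (slope * x + intercept)) for group, x, y in cases]

# Stage 2: ANOVA on the residuals, grouped by treatment
residual_groups = [[r for g, r in residuals if g == name] for name in ("A", "B")]
f_value = f_ratio(residual_groups)

print("slope:", round(slope, 3), "F on residuals:", round(f_value, 3))
```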
SPSS: Statistical Package for the Social Sciences
HOW TO START SPSS?
The steps for starting the program Click the Start button, point to All Programs , point to SPSS Inc , point to Statistics 19 , and select Statistics 19 . Click the Cancel button to create a new data file.
Opening a Data File To open a data file: From the menus choose: File > Open > Data...
Data Entry into SPSS
Data Entry into SPSS: there are two ways to enter data into SPSS. (1) Type the data directly into SPSS in Data View. (2) Enter the data into other software such as Excel, then import it into SPSS.
Entering String Data
Reading Data from Spreadsheets From the menus choose: File > Open > Data... Select Excel (*.xls) as the file type you want to view. Open demo.xls .
Reading Data from a Database From the menus choose: File > Open Database > New Query...
Click Browse. Open demo.mdb. Click OK in the login dialog box.
Reading Data from a Text File From the menus choose: File > Read Text Data... Select Text (*.txt) as the file type you want to view. Open demo.txt .
In Step 1, you can choose a predefined format or create a new format in the wizard. Select No to indicate that a new format should be created. Click Next to continue.
To change a data type: Under Data preview, select the variable you want to change, which is Income in this case.
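Outside SPSS, the same idea (read delimited text, then change a variable's type) can be sketched with Python's `csv` module. The inline text below is a hypothetical stand-in, not the contents of demo.txt, and converting income to a floating-point number is an assumption about the intended target type.

```python
import csv
import io

# Hypothetical tab-delimited text standing in for a data file (not demo.txt itself).
raw_text = "age\tmarital\tincome\n55\t1\t72000\n56\t0\t153000\n28\t1\t28000\n"

# Read the text with the first line as variable names
rows = list(csv.DictReader(io.StringIO(raw_text), delimiter="\t"))

# Every field arrives as a string, so the data type is changed explicitly
for row in rows:
    row["age"] = int(row["age"])
    row["income"] = float(row["income"])  # assumed target type for income

for row in rows:
    print(row)
```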
Using the Data Editor: Entering Numeric Data. Data can be entered into the Data Editor, which may be useful for small data files or for making minor edits to larger data files. Click the Variable View tab at the bottom of the Data Editor window. You need to define the variables that will be used. In this case, only three variables are needed: age, marital status, and income.
Entering String Data
DATA ANALYSIS in SPSS
DESCRIPTIVE STATISTICS
Case summaries: Analyze > Reports > Case Summaries. Select the target variable(s), select the grouping variable(s), and include additional statistics.
[Screenshot of the case summaries dialog. Callouts: option to limit or display cases; button for choosing the statistics you need; list for the variable(s) you are interested in; list for the grouping variables.]
Running an Analysis From the menus choose: Analyze > Descriptive Statistics > Frequencies...
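For comparison, the kind of table a frequencies analysis produces (each value with its count and percentage) can be sketched with `collections.Counter`. The marital status values are hypothetical.

```python
from collections import Counter

# Hypothetical values of one categorical variable
marital_status = ["married", "single", "married", "single", "married", "widowed"]

counts = Counter(marital_status)
total = sum(counts.values())

# One row per value: (value, frequency, percent), most frequent first
frequency_table = [
    (value, count, round(100 * count / total, 1))
    for value, count in counts.most_common()
]

for value, count, percent in frequency_table:
    print(f"{value:10s} {count:3d} {percent:6.1f}%")
```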
GRAPHICAL PRESENTATION
Creating Charts From the menus choose: Graphs > Chart Builder... Click the Gallery tab (if it is not selected). Click Bar (if it is not selected). Drag the Clustered Bar icon onto the canvas, which is the large area above the Gallery.
Crosstabs
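A crosstab counts the cases in each combination of two categorical variables. A minimal sketch, using hypothetical cases with gender and employment status as the two variables:

```python
from collections import Counter

# Hypothetical cases: (gender, employment status)
cases = [
    ("male", "full-time"), ("female", "part-time"), ("female", "full-time"),
    ("male", "full-time"), ("female", "unemployed"), ("male", "part-time"),
]

cell_counts = Counter(cases)
row_labels = sorted({gender for gender, _ in cases})
col_labels = sorted({status for _, status in cases})

# Nested dictionary: table[row][column] = number of cases in that cell
table = {
    row: {col: cell_counts[(row, col)] for col in col_labels}
    for row in row_labels
}

print("".ljust(8) + "".join(col.ljust(12) for col in col_labels))
for row in row_labels:
    print(row.ljust(8) + "".join(str(table[row][col]).ljust(12) for col in col_labels))
```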
INFERENTIAL STATISTICS PARAMETRIC TESTS
Scale: Difference between group(s)
Scale: Correlation
Ordinal: Difference between group(s)
Ordinal: Correlation
Nominal: Difference between group(s)
Nominal: Correlation
Regression
The slope and the y-intercept as seen in Figure 8 should be substituted in the following linear equation to predict this year's sales: Y = aX + b. In this case, the values of a, b, X, and Y will be as follows:
a = 1954.658
b = 440.987
X = Years of experience (values of independent variable)
Y = Last year sales (values of dependent variable)
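Substituting the values into Y = aX + b can be sketched as a small function. The slope and intercept are the ones given on the slide; the function name and the choice of 5 years of experience as an input are hypothetical.

```python
SLOPE = 1954.658      # a, as given on the slide
INTERCEPT = 440.987   # b, as given on the slide


def predict_sales(years_of_experience):
    """Apply Y = aX + b for a given number of years of experience."""
    return SLOPE * years_of_experience + INTERCEPT


# Hypothetical input: a salesperson with 5 years of experience
predicted = predict_sales(5)
print("predicted sales:", round(predicted, 3))
```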