Quantitative Data Analysis with SPSS

1. First Steps with SPSS
2. Further Steps with SPSS
3. Summarizing Data: Descriptive Statistics
4. Exploring Differences between Variables
5. Exploring Relationships between Variables
6. Aggregating Variables: Factor Analysis
firstname.lastname@example.org

1. First Steps with SPSS
1.1 Introduction to SPSS
1.2 Data Analysis and the Research Process
1.3 Starting SPSS
1.4 Data View and Variable View
1.5 Naming Variables in Data Editor
1.6 Saving Data
1.7 SPSS Viewer

1.1 Introduction to SPSS
SPSS stands for Statistical Package for the Social Sciences.
It enables you to score and analyze quantitative data very quickly and in many different ways.
More complicated and more appropriate statistical techniques can be applied with a mouse click.

1.2 Data Analysis and the Research Process
Theory
Operationalisation of concepts
Selection of respondents or participants
Survey/correlation design Experimental design
Conduct interviews or administer Create experimental or control
Carry out observations and/or administer tests
Findings

1.3 Starting SPSS
Select Start to show the menu.
Select Program for menu of programs.
Select SPSS for Windows which produces a dialog box.
Select a data file or close the dialog box with OK.
Now you can work with the Data Editor, which is underneath the dialog box.

1.4 Data View and Variable View
The Data View window of the Data Editor consists of a matrix of columns and numbered rows, with the columns representing the variables in the data file and the rows the cases (people).
The Variable View window of the Data Editor lists the variables as rows and 10 different aspects of the variables as columns.

1.5 Naming Variables in Data Editor
Select Variable View near the bottom of the window.
Select the appropriate row under the Name column and type in the name (not more than 8 letters, no capitals, no blank spaces).
Under Type select Numeric.
Under Width you can leave 8, but under Decimals you normally choose 0.
Under Label you can write the question or item, e.g. “What is your level of education?” or “I had trouble staying asleep”.
The Value defines which answer of the respondent is translated into which number, e.g. “0” = “not at all”, “1” = “a little bit”, “2” = “moderately”, “3” = “quite a bit”, and “4” = “severely”.
Defining Missing Values: choose Discrete missing values and type in the appropriate value (e.g. 0 or 99) in the first box and then select OK.
Columns, Align, and Measure can be left as default.

1.6 Saving Data
When we want to leave SPSS, we need to save our data as a file.
To be able to retrieve it later on, we need to give it a name; the default file extension for Data Editor files is .sav.
As with a Word file, we select File, then Save As, choose where we want to save it, and select Save.

1.7 SPSS Viewer
After entering the data set in the Data Editor, we are ready to analyze it. This can be done with a number of SPSS commands, e.g. calculating the mean age of a sample with Descriptive Statistics.
The output for any such procedure is displayed in the Viewer window.
Outputs can be named and saved as such, or copied and pasted into a Word file. However, diagrams can only be edited in the Viewer itself.

2. Further Steps with SPSS
2.1 Selecting Cases
2.2 Recoding the Values of Variables
2.3 Computing a New Variable

2.1 Selecting Cases
To select cases with certain characteristics, e.g. all men over 25 years, select Data and then Select Cases…, which opens the Select Cases dialog box.
Select If condition is satisfied and then If… which opens the Select Cases: If subdialog box.
In the empty box you enter the condition(s) cases must satisfy. In our example, select in sequence Gender = 1 & Age > 25. Then select Continue to close the Select Cases: If subdialog box and OK to close the Select Cases dialog box.
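Outside SPSS, the same selection logic can be sketched in a few lines of Python. This is only an illustration: the records, the field names gender and age, and the coding 1 = male are assumptions chosen to mirror the example condition Gender = 1 & Age > 25.

```python
# Hypothetical cases; gender is assumed to be coded 1 = male, 2 = female.
cases = [
    {"id": 1, "gender": 1, "age": 31},
    {"id": 2, "gender": 2, "age": 40},
    {"id": 3, "gender": 1, "age": 22},
    {"id": 4, "gender": 1, "age": 26},
]


def select_cases(rows):
    """Keep only the rows that satisfy Gender = 1 & Age > 25."""
    return [row for row in rows if row["gender"] == 1 and row["age"] > 25]


selected = select_cases(cases)
selected_ids = [row["id"] for row in selected]
```

Both parts of the condition must hold, so a 22-year-old man and a 40-year-old woman are both excluded.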
Now we can analyze the data of only the men over 25 years of age.

2.2 Recoding the Values of Variables
Sometimes we need to recode the answers of negatively worded items.
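To make the idea of reverse-scoring a negatively worded item concrete, here is a minimal Python sketch. It assumes a 7-point response scale running from 1 to 7 and treats None as a missing answer; both choices are illustrative assumptions, not something SPSS prescribes.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # assumed 7-point response scale


def reverse_score(value, low=SCALE_MIN, high=SCALE_MAX):
    """Map low -> high, ..., high -> low; leave missing answers (None) untouched."""
    if value is None:
        return None
    return low + high - value


answers = [1, 4, 7, None, 2]
recoded = [reverse_score(v) for v in answers]
```

The formula low + high - value replaces a table of individual old and new values: on this scale 1 becomes 7, 2 becomes 6, and the midpoint 4 stays 4.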
Select Transform, then Recode. We can either keep the same variable names (into Same Variables) or use different ones (into Different Variables), depending on whether we want to compare the original values with the recoded ones.
When we open the Recode into Same Variables dialog box, we select the variables to be recoded and put them in the empty box (e.g. the variables SOCS1, SOCS2, SOCS3, SOCS7, and SOCS10 of the Sense of Coherence Scale).
Then we select Old and New Values…, which opens the Recode into Same Variables: Old and New Values subdialog box.
Type the old value (e.g. 1) in the Value box under Old Value. Then type the new value (e.g. 7) in the Value box under New Value. Then select Add to enter the change in the box under Old --> New. We do this consecutively for all the old values.
We can specify a range of old values by selecting the Range: option and typing the lowest value in the first box and the highest value in the second box. This is very useful for creating e.g. age groups.

2.3 Computing a New Variable
Sometimes we want to combine all the items of a scale into one index, so we can determine the sum score of a respondent on that particular scale.
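The sum score itself is simple arithmetic, as the following Python sketch shows. The item names item1 to item3 and the answers are hypothetical. Returning None when any item is missing is a design choice of this sketch: a sum built with + cannot be completed when one of its terms is absent.

```python
def sum_score(respondent, items):
    """Add up the item scores; return None if any item is missing."""
    values = [respondent.get(item) for item in items]
    if any(v is None for v in values):
        return None
    return sum(values)


items = ["item1", "item2", "item3"]  # hypothetical item names
respondents = [
    {"item1": 0, "item2": 3, "item3": 4},
    {"item1": 2, "item2": None, "item3": 1},
]
scores = [sum_score(r, items) for r in respondents]
```

The second respondent skipped one item and therefore gets no sum score rather than a misleadingly low one.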
Select Transform, then Compute Variable….
As a Target Variable, we can write Sum_IES.
Then we add all items of the Impact of Event Scale by putting them in the empty box under Numeric Expression and putting a + between them.
Select OK to close the dialog box.

3. Summarizing Data: Descriptive Statistics
3.1 Basic Concepts
3.2 Frequencies
3.3 Measures of Central Tendency
3.4 Measures of Dispersion

3.1 Basic Concepts
Descriptive statistics are numerical and graphical methods used to summarize and present data in a meaningful way.

3.2 Frequencies
Frequency tables are used for variables on the nominal and ordinal levels.
They show the absolute frequency (number of cases in each category) and the relative frequency (percent).
When a variable is at the interval/ratio level, the data will have to be grouped (e.g. age groups).
Frequencies can also be presented as bar charts, pie charts, or histograms.

Using SPSS to Produce Frequency Tables and Histograms
Group the data with the Recode procedure.
Select Analyze, then Descriptive Statistics, and then Frequencies.
Select variables for which frequency tables are to be produced.
Select Charts to produce a bar chart, pie chart, or histogram.

3.3 Measures of Central Tendency (for variables on the interval/ratio level)
The arithmetic mean, or average, is obtained by adding up all of the values and dividing by the number of values.
The median is the mid-point in a distribution of values. It splits a distribution of values in half.
The mode is the most frequently occurring value in a data set.

3.4 Measures of Dispersion (for variables on the interval/ratio level)
The range is the difference between the largest and the smallest value.
The variance is the sum of the squared deviations of each value from the mean, divided by the number of values.
The standard deviation expresses the typical amount of deviation from the mean. It is the positive square root of the variance.

More Graphs with SPSS (figures not included)

4. Exploring Differences between Variables
Chi-Square
Analysis of Variance

4.1 Parametric vs. Non-Parametric Tests
Parametric tests can be used when the data fulfill the following three conditions:
Variables are on the interval level
The distribution of the population scores is normal
The variances of both variables are equal or homogeneous

Parametric Tests and Non-Parametric Tests
Parametric tests:
T-test
One-way analysis of variance
Two-way analysis of variance
Non-parametric tests:
Binomial
Chi-square for one variable
Chi-square for two or more variables
Kruskal-Wallis H

4.2 Binomial Test for one Dichotomous Variable
Compares the frequency of cases actually found in the two categories of a dichotomous variable with those that are expected.
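The comparison behind the binomial test can be sketched in plain Python. The counts (8 of 10 cases in the first category) are hypothetical, and the two-sided rule used here, which sums the probabilities of all outcomes that are no more likely than the observed one, is one common convention; SPSS may report its p-value differently.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k cases in the first category out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_test_two_sided(k, n, p=0.5):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    observed = binomial_pmf(k, n, p)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        binomial_pmf(i, n, p)
        for i in range(n + 1)
        if binomial_pmf(i, n, p) <= observed * tolerance
    )


# Hypothetical sample: 8 of 10 cases fall in the first category.
p_value = binomial_test_two_sided(8, 10, p=0.5)
```

With an expected proportion of .50 the result is about .109, so a split of 8 to 2 in a sample of only 10 cases would not be significant at the .05 level.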
Select Analyze, then Nonparametric Tests, then Binomial (opens Binomial Test dialog box).
Select the variable (e.g. gender) and change the Test Proportion if necessary (e.g. from .50 to .75).

4.3 Chi-Square Test for one Variable
If we want to compare the observed frequencies with those expected in a nominal variable with more than two categories, then we use a χ² test.
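The χ² statistic for one variable is the sum of (observed - expected)² / expected over all categories. The Python sketch below computes it for hypothetical counts in four categories, assuming equal expected frequencies unless other proportions are passed in. It returns the statistic and the degrees of freedom only; the p-value would still have to come from a χ² table or from statistical software.

```python
def chi_square_one_variable(observed, expected_props=None):
    """Return the chi-square statistic and its degrees of freedom."""
    n = sum(observed)
    k = len(observed)
    if expected_props is None:
        expected_props = [1 / k] * k  # equal expected frequencies
    expected = [n * p for p in expected_props]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, k - 1


# Hypothetical counts for a nominal variable with four categories.
statistic, df = chi_square_one_variable([30, 20, 25, 25])
```

Here the statistic is 2.0 with 3 degrees of freedom, well below the .05 critical value of 7.815, so these counts would not differ significantly from an equal split.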
Select Analyze, then Nonparametric Tests, then Chi-Square… (opens Chi-Square Test dialog box).
Choose the test variable (e.g. ethnicg) and, if necessary, specify the lower (e.g. 1) and upper (e.g. 4) values of the groups to be compared.
If the expected frequencies are not equal, then specify the expected value for each group (e.g. 95, 2, 2, and 1).

4.4 Chi-Square Test for two or more Variables
If we want to compare the observed frequencies with those expected in two variables with two or more categories, we would also use the χ² test (e.g. gender and ethnic group).
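For two variables, the expected count of each cell is the row total multiplied by the column total, divided by the number of cases. The following Python sketch applies this to a hypothetical 2 x 2 table; the counts are invented for illustration and the p-value lookup is again left out.

```python
def chi_square_two_variables(table):
    """Return expected counts, the chi-square statistic, and degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Hypothetical observed counts (rows: one variable, columns: the other).
observed = [[10, 20], [30, 40]]
expected, statistic, df = chi_square_two_variables(observed)
```

The differences between observed and expected counts (here 10 versus 12 in the first cell) are what the statistic summarizes.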
Select Analyze, then Descriptive Statistics, and then Crosstabs… (opens Crosstabs dialog box).
Select the row variable (e.g. ethnicg) and the column variable (e.g. gender).
Select Statistics to ask for Chi-square.
Select Cells… to ask for observed and expected counts and for unstandardized residuals.

4.5 T-Test
One-Sample T-Test: determines if the mean of a sample is similar to that of the population.
Independent-Samples T-Test: determines if the means of two unrelated samples differ.
Paired-Samples T-Test: determines if the mean of the same sample differs at two different points in time.

Independent-Samples T-Test
The means of two groups are compared.
There is one dependent variable (interval level) and one independent variable (dichotomous variable).
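As a rough illustration of what this test computes, the Python sketch below derives the t statistic for two unrelated groups using the pooled-variance (equal variances assumed) formula. The scores are hypothetical, and the sketch stops at t and the degrees of freedom without looking up a significance level.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    t = (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical scores on the dependent variable for groups coded 1 and 2.
t_value, df = independent_t([4, 5, 6], [1, 2, 3])
```

The larger the difference between the two means relative to the spread within the groups, the larger the absolute value of t.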
Select Analyze, then Compare Means, and then Independent-Samples T Test.
Select Test Variable (dependent variable) (e.g. satis) and Grouping Variable (e.g. gender).
Select Define Groups, then type in 1 and 2 respectively.

4.6 One-Way Analysis of Variance
The means of more than two groups are compared.
There is one dependent variable (interval level) and one independent variable (nominal or ordinal level), which is also known as a factor.

Producing One-Way Analysis of Variance with SPSS
Select Analyze, then Compare Means, and then One-Way ANOVA…
Then put e.g. satis under Dependent List:
Put ethnicg under Factor.
Select Post Hoc… to request a test for multiple comparisons.

4.7 Two-Way Analysis of Variance
The means of more than two groups are compared.
There is one dependent variable (interval level) and two independent variables (nominal or ordinal level), which are also known as factors.
Studying the effect of two variables on a third is also known as a factorial design.

Producing Two-Way Analysis of Variance with SPSS
Select Analyze, then General Linear Model, and then Univariate…
Select Dependent Variable (e.g. satis), then the independent variables or factors as Fixed Factors (e.g. gender and ethnicg).
Under Options select Descriptive statistics and Homogeneity tests, then select Continue to return to the initial dialog box.
Under Plots select one factor to put in the Horizontal Axis: box and another factor to put in the Separate Lines: box, and then select Add to put the two factors in the Plots: box.

5. Exploring Relationships between Variables
5.1 Crosstabs
5.2 Correlation
5.3 Regression

5.1 Crosstabs
There is a relationship between two variables when the distribution of values for one variable is associated with the distribution exhibited by the other variable (e.g. job satisfaction and absenteeism).
The crosstabulation of variables can be presented in a contingency table.
The figures to the right are the row marginals and the figures at the bottom of the table the column marginals.

Contingency table of absenteeism by job satisfaction (cell counts not recoverable):
                    Job satisfaction
                    Yes    No     Row marginals
Absenteeism Yes                   15
Absenteeism No                    15
Column marginals    14     16

Crosstabs with SPSS
For example, we want to explore the relationship between skill and gender.
The dependent variable, skill, should go down the table (rows) and the independent variable, gender, should go across (columns).
Select Analyze, then Descriptive Statistics, then Crosstabs…
Put skill in Rows and gender in Columns.
Open Statistics… box and choose Chi-square.
Open the Cells… box and choose Observed beneath Counts and Row and Column beneath Percentages.

5.2 Correlation
Measures of correlation indicate both the strength and the direction of the relationship between a pair of variables.
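How a single coefficient carries both strength and direction can be seen in a small Python sketch of the product-moment formula: the sum of the cross-products of deviations divided by the square root of the product of the two sums of squares. The data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 5, 4, 5])       # both rise together
r_negative = pearson_r(x, [-value for value in x])  # one falls as the other rises
```

The sign gives the direction (positive or negative) and the absolute value, between 0 and 1, gives the strength: the first pair yields roughly .77, the second exactly mirrors x and yields -1.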
There are two types:
measures of linear correlation using interval variables (Pearson’s Product Moment Correlation Coefficient r)
measures of rank correlation using ordinal variables (Spearman’s rho and Kendall’s tau)

Producing Correlation with SPSS
Select Analyze, then Correlate, then Bivariate… (opens Bivariate Correlations dialog box)
Select all the variables to be correlated with each other (e.g. autonom, routine, and satis) and place them in the Variables: box.
Select Pearson, if variables are interval.
Select OK.

Overview of Methods for Measuring Bivariate Relationships
Nominal – nominal: crosstabs
Nominal – ordinal: crosstabs
Nominal – interval: crosstabs (collapse interval variable into groups) or compare means
Ordinal – ordinal: rank correlation
Interval – ordinal: rank correlation, crosstabs, or compare means
Interval – interval: linear correlation

Using Compare Means to Measure Relationships
When the dependent variable is interval and the independent variable is either nominal, ordinal, or dichotomous, the SPSS Means… procedure is used.
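The kind of summary such a comparison produces can be sketched in Python: a mean, standard deviation, and number of cases per group, plus eta squared, the share of the total variation in the dependent variable that lies between the groups. The scores and group codes below are hypothetical, and each group is assumed to contain at least two cases so that a standard deviation can be computed.

```python
from statistics import mean, stdev


def compare_means(values, groups):
    """Per-group mean, SD and n, plus eta squared (SS between / SS total)."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    grand_mean = mean(values)
    table = {
        g: {"mean": mean(vs), "sd": stdev(vs), "n": len(vs)}
        for g, vs in by_group.items()
    }
    ss_between = sum(len(vs) * (mean(vs) - grand_mean) ** 2 for vs in by_group.values())
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    return table, ss_between / ss_total


# Hypothetical interval scores and a two-category grouping variable.
scores = [2, 4, 6, 8]
group_codes = [1, 1, 2, 2]
table, eta_squared = compare_means(scores, group_codes)
```

In this toy data set the group means are 3 and 7 and eta squared is 0.8, meaning that 80 percent of the variation in the scores is associated with group membership.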
Select Analyze, then Compare Means, then Means… (opens Means dialog box).
Put e.g. satis in Dependent List: and skill in Independent List:
Under Options put Mean, Standard Deviation, and Number of Cases in the Cell Statistics: box, then select Anova table and eta.
Select Continue and OK.

5.3 Regression
Regression enables us to make predictions of likely values of the dependent variable for particular values of the independent variable (y = a + bx + e).
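A minimal least-squares sketch in Python shows where a and b in y = a + bx + e come from and how a prediction is made. The x and y values are hypothetical; the function also returns the squared correlation between x and y as a measure of fit.

```python
def simple_regression(x, y):
    """Least-squares intercept a, slope b, and squared correlation."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b = sxy / sxx            # slope (regression coefficient)
    a = mean_y - b * mean_x  # intercept (constant)
    r_squared = sxy**2 / (sxx * syy)
    return a, b, r_squared


# Hypothetical values of the independent (x) and dependent (y) variable.
a, b, r_squared = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
predicted = a + b * 6  # predicted y for x = 6
```

For these data a is 2.2 and b is 0.6, so each one-unit increase in x raises the predicted y by 0.6; the error term e is whatever the line leaves unexplained for an individual case.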
The relationship between two variables is summarized by producing a line which fits the data closely (line of best fit).
The r² value is used to indicate how well the predictions can be made; that is, r² reflects the proportion of the variation of the dependent variable explained by the independent variable.

Generating Regression Analysis with SPSS
Select Analyze, then Regression, and then Linear… (opens Linear Regression dialog box)
Put e.g. satis in Dependent: box and routine in Independent(s): box.
Select OK.
In the output, R Square shows the amount of the variation in satis explained by routine.
The ANOVA table provides an F test for the equation; the significance level associated with the F test is reported as .000 (that is, p < .001), which is highly significant.
In the column under B, the top value is the constant (intercept a) and the bottom value is the regression coefficient (b) in the regression equation for the line of best fit.
The last two columns give t tests with significance levels for the constant (a) and the regression coefficient (b); in both cases the result is highly statistically significant.

Producing Scatterplots with SPSS
Select Graphs, then Interactive, then Scatterplot…
Put the dependent variable (e.g. satis) on the y axis and the independent variable (e.g. autonomy) on the x axis.
Select Fit and beneath Method select Regression (this will draw the line of best fit; if you select None, then you will only get the correlation scatterplot).
Select OK.

Exercise 1
(a) Using SPSS, how would you create a contingency table for the relationship between gender and prody, with the former variable going across, along with column percentages (Job Survey data)?
(b) How would you assess the statistical significance of the relationship with SPSS?
(c) Is the relationship statistically significant?
(d) What is the percentage of women who are described as exhibiting ‘good’ productivity?

Exercise 2
(a) Using SPSS, how would you generate a matrix of Pearson’s r correlation coefficients for income, years, satis, and age (Job Survey data)?
(b) Which pair of variables exhibits the largest correlation?
(c) Taking this pair of variables, how much of the variance in one variable is explained by the other?

6. Aggregating Variables: Factor Analysis
6.1 Uses of Factor Analysis
6.2 SPSS Factor Analysis Procedure

6.1 Uses of Factor Analysis
Assessing the factorial validity of the questions which make up our scales, to see whether they measure the same concept.
Reducing the number of variables to a smaller set.
Grouping characteristics of social behavior into a limited set of factors.

6.2 SPSS Factor Analysis Procedure
Select Analyze, then Data Reduction, then Factor… (opens Factor Analysis dialog box)
Select Variables for analysis and put them in the Variables: box.
Under Descriptives, choose Initial Solution, Coefficients, and Significance levels.
Under Extraction, choose Principal axis factoring as Method:, as well as Correlation matrix, Scree plot, and Eigenvalues over: 1.
Under Rotation, choose Varimax and Rotated solution.
Under Options, choose Exclude cases listwise and Suppress absolute values less than .40.
Select OK.

References
Bryman, A. & Cramer, D. (2005). Quantitative Data Analysis with SPSS 12 and 13: A Guide for Social Scientists. London: Routledge.
Gaur, A.S. & Gaur, S.S. (2006). Statistical Methods for Practice and Research: A Guide to Data Analysis Using SPSS. New Delhi: Response Books.