Developed by Naomi Schalken, Lion Behrens and Rens van de Schoot
This tutorial expects:
- Basic knowledge of frequentist and Bayesian correlation and regression
- An installed version of SPSS 25 or later on your electronic device
This tutorial provides the reader with a basic introduction to the Bayesian investigation of data relations using SPSS. Throughout this tutorial, the reader will be guided through importing data files, exploring summary statistics, and relational analysis using correlation and regression analyses in the Bayesian framework. Here, we will focus exclusively on Bayesian statistics; frequentist analyses in SPSS are covered in a separate tutorial.
Throughout this tutorial we will use a dataset from Van de Schoot, van der Velden, Boom & Brugman (2010). Using multiple regression, we will predict adolescents' socially desirable answering patterns (sd) from overt (overt) and covert (covert) antisocial behaviour. For more information on the sample, instruments, methodology and research context, we refer the interested reader to the paper (see references). Here we will focus on data analysis only. The data set and syntax file can be found in the subfolders titled 'Assignment Files' and 'Solutions'.
Note: In many other "How to get started" exercises you will be asked to compare the results from here with results you can obtain e.g. in R or lavaan. Make sure to save or write down the results you found in this exercise.
Preparation - Importing and Exploring Data
You can find the data in the file popular_regr_1.xlsx, which contains all variables that you need for this analysis. Although it is an .xlsx file, SPSS can open it directly (File -> Open -> Data, with the file type set to Excel).
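For a reproducible import, the same step can be sketched in syntax. This is a minimal example under assumptions: the file path is hypothetical, the data are assumed to sit on the first worksheet, and the first row is assumed to hold the variable names.

```spss
* Hypothetical path: replace it with the location of the file on your machine.
* Assumes the data are on the first worksheet with variable names in row 1.
GET DATA
  /TYPE=XLSX
  /FILE='C:\Users\yourname\Documents\popular_regr_1.xlsx'
  /SHEET=INDEX 1
  /CELLRANGE=FULL
  /READNAMES=ON.
EXECUTE.
```

If the worksheet layout differs from these assumptions, adjust the SHEET and READNAMES settings accordingly.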
Once you have loaded your data, it is advisable to check whether the import worked well. First, have a look at the summary statistics of your data. You can do so by clicking Analyze -> Descriptive Statistics -> Descriptives. Alternatively, to construct a reproducible analysis, you can open a new syntax file by clicking File -> New -> Syntax and executing the following code:
DESCRIPTIVES VARIABLES=respnr Dutch gender sd covert overt
/STATISTICS=MEAN STDDEV MIN MAX.
Question: Have all your data been loaded in correctly? That is, do all data points substantively make sense? If you are unsure, go back to the .xlsx-file to inspect the raw data.
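One way to follow up on this question is to tabulate the variables that look categorical and check for unexpected codes. The sketch below assumes that Dutch and gender are categorical indicators; this is inferred from their names only and is not stated in the data description.

```spss
* Assumes Dutch and gender are categorical (inferred from the variable names).
* Unexpected or out-of-range codes would point to a problem with the import.
FREQUENCIES VARIABLES=Dutch gender
  /ORDER=ANALYSIS.
```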
Exercise 1 - Correlation Analysis
In this exercise you will run a regression model with sd as outcome variable and overt and covert as predictors. But first, let's have a look at the bivariate (that is: pairwise) correlations between the variables of interest.
Exercise 1a. Analysis using default priors
From SPSS version 25 onwards, all Bayesian analysis commands are grouped under the category "Bayesian Statistics" in the user interface. Click on Analyze -> Bayesian Statistics -> Pearson Correlation. First, choose sd, overt and covert as your test variables. You can investigate your correlations either by estimating their posterior distributions or by testing a null correlation against its alternative hypothesis. Since we are going to look at both options, change the default setting to Use both methods.
Question: Before inspecting correlations, click on the Priors button on the right. You can specify different prior distributions for your correlation, which ranges between -1 and 1. What is the default distribution and what does it look like? What influence will it have on your results?
Now, inspect your Bayesian correlations using the default prior setting. You can do so by clicking OK or by pasting the following code into your syntax file:
BAYES CORRELATION
/CRITERIA CILEVEL=95 SEED=RANDOM MCSAMPLES=1000000 TOL=0.0001 MAXITER=2000 POSTSAMPLES=1000000
/INFERENCE VARIABLES=sd covert overt ANALYSIS=BOTH MAXPLOTS=10 CVALUE=0.
Simply copy and paste these lines into the new syntax file, select all text (Ctrl+A) and run the commands via Run -> Selection.
Question: Your output is split into three parts: Bayes Factor Inference, Posterior Distribution Characterizations and Plots. Go through each part of the output. What does the Bayes factor test and what do you infer from it? Inspecting the posteriors, in which interval will the correlations probably lie? Which point estimate would you choose? Have a look at the prior distributions that were specified by inspecting the plots.
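As a notational reminder for this question, the quantities in the output can be written as follows. This is a sketch that assumes the prior parameterisation described in the SPSS documentation, in which the CVALUE setting in the syntax corresponds to the exponent c; the symbols rho (the correlation), D (the data), H0 and H1 are introduced here for illustration only.

```latex
% Prior family on the correlation (assumed SPSS parameterisation; CVALUE = c)
p(\rho) \propto (1 - \rho^{2})^{c}, \qquad -1 < \rho < 1

% Posterior: the prior updated by the likelihood of the data D
p(\rho \mid D) \propto p(D \mid \rho)\, p(\rho)

% Bayes factor of the null hypothesis against the alternative
BF_{01} = \frac{p(D \mid H_0 : \rho = 0)}{p(D \mid H_1 : \rho \neq 0)}
```

Under this definition, values of BF01 above 1 favour the null hypothesis and values below 1 favour the alternative. Check the footnote of the Bayes factor table to see which direction SPSS reports before interpreting the number.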
Exercise 2 - Regression Analysis
Now, let's run a multiple regression model predicting socially desirable answering patterns (sd) from overt (overt) and covert (covert) antisocial behaviour. You can do so by clicking Analyze -> Bayesian Statistics -> Linear Regression. Again, choose to report both posterior distributions and Bayes factors. Under Plots, choose to plot both covariates, overt and covert. Alternatively, execute the following code:
BAYES REGRESSION sd WITH covert overt
/CRITERIA CILEVEL=95 TOL=0.000001 MAXITER=2000
/DESIGN COVARIATES=covert overt
/ESTBF COMPUTATION=JZS COMPARE=NULL
/PLOT COVARIATES=overt covert INTERCEPT=FALSE ERRORVAR=FALSE BAYESPRED=FALSE.
Question: First, scroll down to the plots and inspect which prior distributions were used for each regression parameter. Do you agree with this default choice?
Question: Is the model relevant? Include the Bayes factor, the explained variance (R²) and the coefficients' posterior distributions in your answer. What do you substantively conclude from the regression coefficients?
Question: Have a look at the Mean and Mode of the regression coefficients' posterior distributions. Are they different from the correlation results that you obtained in Exercise 1? If so, explain why!
Van de Schoot, R., van der Velden, F., Boom, J. & Brugman, D. (2010). Can at Risk Young Adolescents be Popular and Antisocial? Sociometric Status Groups, AntiSocial Behavior, Gender and Ethnic Background. Journal of Adolescence, 33, 583-592.