Example Exercise: Regression (Bayes)

Developed by Naomi Schalken, Lion Behrens and Rens van de Schoot

 


This tutorial expects:

  • Basic knowledge of frequentist and Bayesian correlation and regression
  • An installed version of SPSS 25 or later on your electronic device

 

This tutorial provides the reader with a basic introduction to the Bayesian investigation of data relations using SPSS. Throughout this tutorial, the reader will be guided through importing data files, exploring summary statistics, and analysing relations between variables using correlation and regression analyses in the Bayesian framework. Here, we will focus exclusively on Bayesian statistics; frequentist analyses in SPSS are covered in a separate tutorial.

Throughout this tutorial we will use a dataset from Van de Schoot, van der Velden, Boom & Brugman (2010). Using multiple regression, we will predict adolescents’ socially desirable answering patterns (sd) from overt (overt) and covert (covert) antisocial behaviour. For more information on the sample, instruments, methodology and research context we refer the interested reader to the paper (see references). Here we will focus on data analysis only. The data set and syntax file can be found in the subfolders titled 'Assignment Files' and 'Solutions'.

Note: In many other "How to get started" exercises you will be asked to compare the results from here with results you can obtain e.g. in R or lavaan. Make sure to save or write down the results you found in this exercise.

 

Preparation - Importing and Exploring Data

You can find the data in the file popular_regr_1.xlsx, which contains all variables that you need for this analysis. Although it is an .xlsx file, you can load it directly into SPSS using the following settings.
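
The import can also be written as syntax, which keeps the analysis reproducible. The lines below are a minimal sketch of an Excel import, not the exact settings used by the authors: the folder path and the sheet name 'Sheet1' are assumptions and must be adapted to your own copy of the file.

```spss
* Import the Excel file. The path and the sheet name are assumptions, so adjust both.
GET DATA
/TYPE=XLSX
/FILE='C:\Users\yourname\Downloads\popular_regr_1.xlsx'
/SHEET=NAME 'Sheet1'
/CELLRANGE=FULL
/READNAMES=ON.
EXECUTE.
```

READNAMES=ON tells SPSS to take the variable names from the first row of the sheet, and CELLRANGE=FULL reads the whole sheet rather than a selected range.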

 

 

Once you have loaded your data, it is advisable to check whether the import worked well. To do so, first have a look at the summary statistics of your data by clicking Analyze -> Descriptive Statistics -> Descriptives. Alternatively, to construct a reproducible analysis, you can open a new syntax file by clicking File -> New -> Syntax and executing the following code:

 

DESCRIPTIVES VARIABLES=respnr Dutch gender sd covert overt
/STATISTICS=MEAN STDDEV MIN MAX.

 

Question: Have all your data been loaded in correctly? That is, do all data points substantively make sense? If you are unsure, go back to the .xlsx file to inspect the raw data.
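
Should this inspection reveal a value that is clearly a missing-data code rather than a real score, it can be declared as user-missing so that SPSS excludes it from later analyses. The sketch below only illustrates the mechanics: the code -999 is purely hypothetical and should be replaced by whatever you actually find in the raw data, if anything.

```spss
* Hypothetical example: declare -999 as user-missing on the three analysis variables.
MISSING VALUES sd covert overt (-999).
* Re-run the descriptives to check that minima and maxima are now plausible.
DESCRIPTIVES VARIABLES=sd covert overt
/STATISTICS=MEAN STDDEV MIN MAX.
```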

Exercise 1 - Correlation Analysis

The goal of this tutorial is to run a regression model with sd as outcome variable and overt and covert as predictors. But first, let's have a look at the bivariate (that is: pairwise) correlations between the variables of interest.

 

Exercise 1a. Analysis using default priors

As of SPSS version 25, all Bayesian analysis commands are grouped under the category "Bayesian Statistics" in the user interface. Click on Analyze -> Bayesian Statistics -> Pearson Correlation. First, choose sd, overt and covert as your test variables. You can investigate the correlations either by estimating their posterior distributions or by testing a null correlation against its alternative hypothesis. Since we are going to look at both options, change the default setting to Use both methods.

Question: Before inspecting correlations, click on the Priors button on the right. You can specify different prior distributions for your correlation, which ranges between -1 and 1. What is the default distribution and what does it look like? What influence will it have on your results?

 

Now, inspect your Bayesian correlations using the default prior setting. You can do so by clicking OK or by pasting the following code into your syntax file:

 

BAYES CORRELATION
/MISSING SCOPE=PAIRWISE
/CRITERIA CILEVEL=95 SEED=RANDOM MCSAMPLES=1000000 TOL=0.0001 MAXITER=2000 POSTSAMPLES=1000000
/INFERENCE VARIABLES=sd covert overt ANALYSIS=BOTH MAXPLOTS=10 CVALUE=0
/ESTBF TYPE=JZS.

 

Simply copy-paste these five lines into the new syntax file, select all text with Ctrl+A and run the commands with Ctrl+R.
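
Because the command uses SEED=RANDOM, the sampling-based parts of the output may differ slightly from one run to the next. A minimal sketch of a reproducible variant follows, assuming that the SEED keyword accepts a positive integer in place of RANDOM; the value 12345 is arbitrary.

```spss
* Same analysis as above, with a fixed seed (arbitrary value) for the sampling steps.
BAYES CORRELATION
/MISSING SCOPE=PAIRWISE
/CRITERIA CILEVEL=95 SEED=12345 MCSAMPLES=1000000 TOL=0.0001 MAXITER=2000 POSTSAMPLES=1000000
/INFERENCE VARIABLES=sd covert overt ANALYSIS=BOTH MAXPLOTS=10 CVALUE=0
/ESTBF TYPE=JZS.
```

Everything except the seed is unchanged, so the substantive conclusions should not depend on which version you run.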

 

Question: Your output is split into three parts: Bayes Factor Inference, Posterior Distribution Characterizations and Plots. Go through each part of the output. What does the Bayes factor test and what do you infer from it? Inspecting the posteriors, in which interval will the correlations probably lie? Which point estimate would you choose? Have a look at the prior distributions that were specified by inspecting the plots.
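
Since other exercises ask you to compare these results with those from other software, it can help to save the Viewer output as you go. The line below is a sketch of how this might be done from syntax; the file path and file name are assumptions.

```spss
* Save the current output document. The path and file name are assumptions.
OUTPUT SAVE OUTFILE='C:\Users\yourname\Documents\bayes_regression_output.spv'.
```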

 

 

Exercise 2 - Regression Analysis

Now, let's run a multiple regression model predicting socially desirable answering patterns (sd) from overt (overt) and covert (covert) antisocial behaviour. You can do so by clicking Analyze -> Bayesian Statistics -> Linear Regression. Again, choose to report both posterior distributions and Bayes factors. Under Plots, choose to plot both covariates, overt and covert. Alternatively, execute the following code:

BAYES REGRESSION sd WITH covert overt
/CRITERIA CILEVEL=95 TOL=0.000001 MAXITER=2000
/DESIGN COVARIATES=covert overt
/INFERENCE ANALYSIS=BOTH
/PRIOR TYPE=REFERENCE
/ESTBF COMPUTATION=JZS COMPARE=NULL
/PLOT COVARIATES=overt covert INTERCEPT=FALSE ERRORVAR=FALSE BAYESPRED=FALSE.

 

Question: First, scroll down to the plots and inspect which prior distributions were used for each regression parameter. Do you agree with this default choice?

Question: Is the model relevant? Include the Bayes factor, the explained variance (R²) and the coefficients' posterior distributions in your answer. What do you substantively conclude from the regression coefficients?

Question: Have a look at the Mean and Mode of the regression coefficients' posterior distributions. Are they different from the correlation results that you obtained in Exercise 1? If so, explain why!
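
One way to explore this last question empirically is to put all three variables on the same scale before re-running the model. The sketch below is an optional follow-up, not part of the original exercise. It relies on DESCRIPTIVES with the SAVE subcommand, which adds z-scored copies of the variables to the dataset (named Zsd, Zcovert and Zovert under the default naming), and then repeats the regression on those copies.

```spss
* Save standardized (z-score) versions of the analysis variables.
DESCRIPTIVES VARIABLES=sd covert overt
/SAVE.
* Repeat the Bayesian regression on the standardized variables.
BAYES REGRESSION Zsd WITH Zcovert Zovert
/CRITERIA CILEVEL=95 TOL=0.000001 MAXITER=2000
/DESIGN COVARIATES=Zcovert Zovert
/INFERENCE ANALYSIS=BOTH
/PRIOR TYPE=REFERENCE
/ESTBF COMPUTATION=JZS COMPARE=NULL.
```

The resulting coefficients are expressed in standard deviations, which makes them easier to set side by side with the correlations from Exercise 1. They still need not coincide, because each regression coefficient is adjusted for the other predictor in the model.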

References

Van de Schoot, R., van der Velden, F., Boom, J. & Brugman, D. (2010). Can at Risk Young Adolescents be Popular and Antisocial? Sociometric Status Groups, AntiSocial Behavior, Gender and Ethnic Background. Journal of Adolescence, 33, 583-592.