Developed by Naomi Schalken, Lion Behrens and Rens van de Schoot
This tutorial expects:
- Basic knowledge of hypothesis testing
- Basic knowledge of Bayesian statistics
- An installed version of SPSS 25 or later on your electronic device
This tutorial provides the reader with a basic introduction to the Bayesian investigation of data relations using SPSS. Throughout this tutorial, the reader will be guided through importing data files, exploring summary statistics and comparing two group means using a T-test. Here, we will exclusively focus on Bayesian inference. Classical frequentist analyses in SPSS are covered in a separate tutorial.
Throughout this tutorial we will use a dataset from Van de Schoot, van der Velden, Boom & Brugman (2010). We will compare Dutch and foreign adolescents in their socially desirable answering patterns (sd). For more information on the sample, instruments, methodology and research context we refer the interested reader to the paper (see references). Here we will focus on data analysis only. The data set and syntax file can be found in the subfolders titled 'Assignment Files' and 'Solutions'.
Note: In many other "How to get started" exercises you will be asked to compare the results from here with results you can obtain e.g. in R or lavaan. Make sure to save or write down the results you found in this exercise.
Preparation - Importing and Exploring Data
You can find the data in the file popular_regr_1.xlsx, which contains all variables that you need for this analysis. Although it is a .xlsx-file, you can directly load it into SPSS using the following settings.
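If you prefer syntax over the import dialog, the import can be sketched roughly as follows. This is a minimal sketch and not necessarily the tutorial's original settings: the file path is a placeholder, and it assumes that the data sit on the first worksheet with the variable names in the first row.

```spss
* Hypothetical path, replace it with the location of the file on your machine.
* Assumes the data are on the first sheet and that row 1 holds the variable names.
GET DATA
  /TYPE=XLSX
  /FILE='C:\path\to\popular_regr_1.xlsx'
  /SHEET=INDEX 1
  /CELLRANGE=FULL
  /READNAMES=ON.
EXECUTE.
```

After running it, the Variable View should list the variables respnr, Dutch, gender, sd, covert and overt.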
Once you have loaded your data, it is advisable to check whether the import worked well. To do so, first have a look at the summary statistics of your data by clicking
Analyze -> Descriptive Statistics -> Descriptives. Alternatively, to construct a reproducible analysis, you can open a new syntax file by clicking
File -> New -> Syntax and executing the following code:
DESCRIPTIVES VARIABLES=respnr Dutch gender sd covert overt
/STATISTICS=MEAN STDDEV MIN MAX.
Question: Have all your data been loaded in correctly? That is, do all data points substantively make sense? If you are unsure, go back to the .xlsx-file to inspect the raw data.
Exercise 1 - T-Test
In this exercise you will use an independent samples T-test to compare Dutch and foreign adolescents (0=foreign, 1=Dutch) in their socially desirable answering patterns (sd), which serves as the outcome variable.
Exercise 1a. Prior Knowledge
As you know, Bayesian inference consists of merging a prior distribution for your effect with the likelihood obtained from your data. Specifying your prior distribution is thus one of the crucial steps in Bayesian inference and deserves your close attention (for a quick refresher see e.g. Depaoli et al. 2017).
Question: Do you have any prior knowledge on how Dutch and foreign adolescents differ in socially desirable answering patterns? Think about a reasonable prior distribution. What would your prior distribution for the mean difference look like? Give reasons for your choice.
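In symbols, merging a prior with the likelihood is Bayes' theorem applied to the mean difference. The notation below is introduced here for illustration and does not come from the SPSS output: δ denotes the population mean difference in sd between the two groups (the sign convention is an assumption) and y the observed data.

```latex
\[
  \underbrace{p(\delta \mid y)}_{\text{posterior}}
  \;\propto\;
  \underbrace{p(y \mid \delta)}_{\text{likelihood}}
  \times
  \underbrace{p(\delta)}_{\text{prior}},
  \qquad
  \delta = \mu_{\text{Dutch}} - \mu_{\text{foreign}}
\]
```

A very flat prior lets the likelihood dominate the posterior, whereas a narrow prior pulls the posterior towards the values the prior favours.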
Exercise 1b. Conducting the T-Test
Now, let's conduct the test. From SPSS version 25, all Bayesian analysis commands are subsumed under the category "Bayesian Statistics" in the User Interface. Click on
Analyze -> Bayesian Statistics -> Independent Samples Normal if you are using the interface. When conducting a T-test, you have the option to estimate the mean difference's posterior distribution and to test the null hypothesis of zero difference versus its alternative using a Bayes factor. Since we are going to look at both options, change the default setting to
Use both methods. Alternatively, paste the following code in your syntax file:
BAYES INDEPENDENT
/CRITERIA CILEVEL=95 TOL=0.000001 MAXITER=2000
/INFERENCE DISTRIBUTION=NORMAL VARIABLES=sd ANALYSIS=BOTH GROUP=Dutch SELECT=LEVEL(0 1)
/PRIOR EQUALDATAVAR=FALSE VARDIST=DIFFUSE.
Question: Before interpreting your results, scroll down to the plots that are reported. What do the prior distributions that SPSS specified by default look like? Do you agree with these distributions? What influence will they have on the results?
Question: Secondly, look at the Group Statistics table. Do both groups have variances that are likely to be equal in the underlying population? Can you assume equal variances? What did SPSS assume as a default?
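To see the group variances directly, one option is the MEANS procedure. This is a sketch that goes beyond the tutorial's own syntax; it only assumes the variable names sd and Dutch used above.

```spss
* Sample size, mean, standard deviation and variance of sd per group.
MEANS TABLES=sd BY Dutch
  /CELLS=COUNT MEAN STDDEV VARIANCE.
```

Comparing the two variances in the resulting table gives a first impression of whether the equal-variance assumption is plausible.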
Question: Next, focus on the Bayes factor that is reported. How do you interpret its value? Which hypothesis is more likely?
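As a reminder of what a Bayes factor measures, a common definition is sketched below, with y denoting the observed data and H0 the hypothesis of zero mean difference. The direction (null over alternative) is an assumption here; check the footnote of the SPSS table to see which hypothesis is in the numerator.

```latex
\[
  BF_{01} = \frac{p(y \mid H_0)}{p(y \mid H_1)},
  \qquad
  \frac{p(H_0 \mid y)}{p(H_1 \mid y)} = BF_{01} \times \frac{p(H_0)}{p(H_1)}
\]
```

Under this convention, values above 1 mean that the data are more probable under the null hypothesis, and values below 1 favour the alternative. The Bayes factor equals the posterior odds only when both hypotheses are considered equally likely a priori.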
Question: Lastly, focus on the effect size (mean difference) itself, both by inspecting tables and plots. Which group scores higher, which scores lower? Where does this difference probably lie in the population? Is the difference substantial? The first Descriptive Statistics table that you obtained after importing the data might help you get an impression of the scale of the outcome variable.
References
Van de Schoot, R., van der Velden, F., Boom, J. & Brugman, D. (2010). Can at Risk Young Adolescents be Popular and Antisocial? Sociometric Status Groups, AntiSocial Behavior, Gender and Ethnic Background. Journal of Adolescence, 33, 583-592.