Developed by Naomi Schalken and Rens van de Schoot
Exercise 1 - Simple regression analysis
a. In the data file Regression.txt there are three variables (y1, x1, x2). For this exercise you will analyze a simple regression model where Y1 is predicted by X1 and X2. Suppose that Y1 measures depression, X1 measures age, and X2 measures anxiety. The research question is whether depression can be predicted by age and anxiety level. First analyze this regression in SPSS/SAS/STATA/EXCEL. Note that you must interpret the results according to the appropriate definition of probability, Frequentist or Bayesian.
Question: What are the results in statistical terms (regression coefficients, confidence intervals and significance levels)?
Question: What is the answer to the research question?
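For readers who prefer to stay in R for part a, the frequentist analysis can be sketched with base R's lm(). This is only an illustration: the data below are simulated stand-ins, and the commented read.table() call assumes that Regression.txt has a header row naming the columns y1, x1 and x2, which the exercise does not confirm.

```r
# Simulated stand-in data; replace with the real file, e.g.
# data.regression <- read.table("Regression.txt", header = TRUE)  # header row assumed
set.seed(123)
n  <- 200
x1 <- rnorm(n, mean = 40, sd = 10)   # hypothetical "age"
x2 <- rnorm(n, mean = 5, sd = 2)     # hypothetical "anxiety"
y1 <- 2 + 0.1 * x1 + 0.5 * x2 + rnorm(n)
data.regression <- data.frame(y1, x1, x2)

# Ordinary least squares regression of y1 on x1 and x2
fit.lm <- lm(y1 ~ x1 + x2, data = data.regression)

# Coefficients, standard errors and p-values
print(summary(fit.lm))

# 95% confidence intervals for the intercept and both slopes
ci <- confint(fit.lm, level = 0.95)
print(ci)
```

Under the Frequentist definition, a 95% confidence interval comes from a procedure that covers the true parameter value in 95% of repeated samples; it is not a statement about the probability of the parameter itself.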
b. Analyze the same regression model with maximum likelihood (ML) estimation using lavaan (see the R file Exercisesbayes1.r). Follow the steps in the R file.
c. Analyze the same regression model again, but now with Bayesian regression, using the blavaan() function. Plots can be obtained using the plot() function with the specified arguments.
Question: What are the results for the Bayesian analysis?
Question: Are there any differences with the ML output? (You should answer ‘yes’ to this question, since the values, e.g. regression coefficients, intervals, etc., have different interpretations.) Please describe the differences.
Question: Which plots belong to which parameter estimates? Find out which plots belong to the intercept, the regression coefficients and the variance.
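Step c can be sketched roughly as follows. The sketch makes several assumptions: blavaan and a working MCMC backend are installed, the data are simulated stand-ins rather than Regression.txt, and the model is fitted with bsem(), blavaan's sem()-style wrapper, instead of the blavaan() call used in the R file. The intercept and residual variance are written out explicitly, and the parameter indices passed to pars are a guess that should be checked against the summary output.

```r
# Simulated stand-in data (not the contents of Regression.txt)
set.seed(123)
n <- 200
data.regression <- data.frame(x1 = rnorm(n, 40, 10), x2 = rnorm(n, 5, 2))
data.regression$y1 <- 2 + 0.1 * data.regression$x1 +
  0.5 * data.regression$x2 + rnorm(n)

# lavaan-style model syntax
model.regression <- '
  y1 ~ 1 + x1 + x2   # intercept and two regression coefficients
  y1 ~~ y1           # residual variance
'

# Only run the Bayesian fit when blavaan is available
if (requireNamespace("blavaan", quietly = TRUE)) {
  library(blavaan)
  fit.bayes <- bsem(model.regression, data = data.regression)
  summary(fit.bayes)                                   # posterior summaries
  plot(fit.bayes, pars = 1:4, plot.type = "trace")     # trace plots; indices assumed
}
```

Matching each plot to a row of the summary output is one way to work out which plot belongs to the intercept, the slopes and the variance.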
Exercise 2 - Sensitivity Analysis in blavaan
In the data file data_IQ.txt there is information about the IQ scores of 20 children. Continue working with the R file (ExercisesBayes1.r) and follow the steps. Start with steps 1 and 2, run the model using maximum likelihood (ML) estimation in step 3, and fill in the first row of the table in the document table.docx.
Rerun the model following Step 4 in the R file with blavaan(). Fill in the second row of the table. Are there any differences compared to the ML estimates? Which prior was used for the mean score? Is this prior realistic for the mean IQ score?
Specify an alternative prior in Step 5. In blavaan, priors are specified with the precision instead of the variance. The precision is the inverse of the variance, i.e. the quotient 1/variance. For example, if you want a prior variance of 100, the precision is 1/variance = 1/100 = 0.01. To specify an alternative prior, add the argument dp = dpriors(nu = "dnorm(100, 0.01)") to the blavaan() function; this specifies a prior mean of 100 and a precision of 0.01 (variance of 100). The prior variance indicates how certain you are about this mean. A variance of 100 is large, which means that you are not very certain about the mean. Run the model and fill in the table.
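The variance-to-precision conversion can be sketched in a few lines of base R. The helper variance_to_precision() is a hypothetical name introduced here for illustration; it is not part of blavaan. The sketch follows the dnorm(mean, precision) form described above and only builds the prior string, without fitting a model.

```r
# Hypothetical helper: convert a prior variance to a precision
variance_to_precision <- function(variance) {
  stopifnot(is.numeric(variance), all(variance > 0))
  1 / variance
}

prior_mean      <- 100
prior_variance  <- 100
prior_precision <- variance_to_precision(prior_variance)

# Build the prior string in the form passed to dpriors(nu = ...)
prior_string <- sprintf("dnorm(%s, %s)", prior_mean, prior_precision)
print(prior_string)
```

A smaller variance gives a larger precision, so a very certain prior corresponds to a large second argument in dnorm().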
Run the model again in Step 6 with a very small variance, and in Step 7 with a mis-specified mean and a very small variance. Replace the X’s below with the right values. Don’t forget to use the precision when specifying the priors.
#STEP6: Specifying a new alternative prior for the mean IQ score.
fit.IQ4 <- blavaan(model.IQ1, data=data.IQ, dp = dpriors(nu= "dnorm(X, X)"))
#STEP7: Specifying a new alternative prior for the mean IQ score.
fit.IQ5 <- blavaan(model.IQ1, data=data.IQ, dp = dpriors(nu= "dnorm(X, X)"))
Then run two models (Steps 8 & 9) where this unrealistic mean is combined with a variance of either 100 or 1000. Replace the X’s with the right values.
#STEP8: Specifying a new alternative prior for the mean IQ score.
fit.IQ6 <- blavaan(model.IQ1, data=data.IQ, dp = dpriors(nu= "dnorm(X, X)"))
#STEP9: Specifying a new alternative prior for the mean IQ score.
fit.IQ7 <- blavaan(model.IQ1, data=data.IQ, dp = dpriors(nu= "dnorm(X, X)"))
Question: Compare the posterior means and C.I.’s across the models. What do you conclude?
Now compare the results of the posterior means for IQ score more formally, by computing the bias of the different informative priors with regard to the posterior mean with the default Bayes prior. You can use the following formula:
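One common definition, assumed here, is the percentage relative bias:

bias = 100 * (posterior mean with informative prior - posterior mean with default prior) / posterior mean with default prior

A minimal sketch in base R follows. The function name relative_bias() and the numbers are purely illustrative; they are not results from data_IQ.txt.

```r
# Percentage relative bias of an estimate against a reference estimate
relative_bias <- function(informative, default) {
  100 * (informative - default) / default
}

# Hypothetical posterior means, for illustration only
default_mean     <- 102
informative_mean <- c(prior_a = 101, prior_b = 95)

# One bias value per informative prior
bias <- relative_bias(informative_mean, default_mean)
print(round(bias, 2))
```

A bias close to zero suggests that the informative prior barely moved the posterior mean away from the default-prior result.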
Task: Record your answers in the final column of the table above.
Question: What is your personal conclusion about the influence of priors on the outcomes of the model?
Exercise 3 - Violin plot
In the data file studentpor.txt there is information about secondary school students in a Portuguese language course. The dataset consists of several social, gender and study-specific variables that can be used as predictors of the final grade in Portuguese (G3). The variables in the model:
- G1 - First period grade (0-20)
- G2 - Second period grade (0-20)
- G3 - Final grade (0-20)
- Absences - Number of school absences (0-93)
- Health - Current health status (from 1-5)
- Walc - Weekend alcohol consumption (from 1-5)
- Dalc - Workday alcohol consumption (from 1-5)
- Goout - Going out with friends (from 1-5)
- Freetime - Free time after school (from 1-5)
- Sex - Student’s sex (0 = Male, 1 = Female)
- Studytime - Weekly study time (from 1-4)
Continue working with the R file (ExercisesBayes1.r) and follow the steps.
Part 1 Import the dataset, recode the ‘sex’ variable, specify the regression model and run the model with the blavaan() function. Obtain the summary statistics. What can be concluded about the predictors? Give some conclusions about the model and a substantive interpretation of the significant predictors.
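The recoding step in Part 1 might look like the sketch below. It assumes that the raw file stores sex as the letters "M" and "F", which the exercise text does not state; the small data frame is a made-up stand-in for studentpor.txt.

```r
# Made-up stand-in for a few rows of the student data
data.student <- data.frame(
  sex = c("F", "M", "F", "M"),
  G3  = c(12, 10, 15, 11),
  stringsAsFactors = FALSE
)

# Recode to the coding listed above: 0 = Male, 1 = Female
data.student$sex <- ifelse(data.student$sex == "F", 1, 0)

# Check the recoded values
print(table(data.student$sex))
```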
Part 2 Follow steps 1 through 6 and create the violin plot. What does it look like?
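As a rough idea of what a violin plot of posterior draws involves, the sketch below uses geom_violin() from ggplot2. The draws are simulated with rnorm() as stand-ins; in the exercise they would come from the fitted blavaan model via the steps in the R file, which may build the plot differently.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-ins for posterior draws of two regression coefficients
draws <- data.frame(
  parameter = rep(c("G1", "G2"), each = 1000),
  value     = c(rnorm(1000, 0.15, 0.04), rnorm(1000, 0.85, 0.03))
)

# One violin per parameter, with a dashed reference line at zero
p <- ggplot(draws, aes(x = parameter, y = value)) +
  geom_violin(fill = "grey80") +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(x = "Predictor", y = "Regression coefficient (posterior draws)")

print(p)
```

The width of each violin shows where the posterior draws are concentrated, so a violin that sits entirely away from the zero line indicates a coefficient whose posterior mass excludes zero.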