Mplus: how to get started


Developed by Naomi Schalken and Rens van de Schoot


This tutorial expects: 

  • Basic knowledge of correlation and regression
  • An installed version of MPLUS on your electronic device (if no version is installed, look at Preparation - Installing MPLUS)
  • An installed version of SPSS on your electronic device


Preparation - Preparing Data for MPLUS

To use the dataset in Mplus, all missing values must be recoded into a single extreme value, for example -999, and the data must be saved in a plain-text format, e.g. tab delimited. Download the file popular_regr_1.xlsx and open it in SPSS.

To recode all user and system missing values into -999, use Transform -> Recode into Same Variables, select all variables and move them to the Variables box, click Old and New Values, select System- or user-missing, enter -999 as the new value, and click Add, Continue, and OK.


Or use the following syntax:

RECODE respnr Dutch gender sd covert overt (MISSING=-999).
EXECUTE.


All missing values should now be coded -999. You can verify this by inspecting the dataset. Now, we will save the data file in a different format. Again, you may use the menus or opt for the syntax method.

When using the menus, go to File -> Save as, give the file a name (e.g., popular_regr_1.dat), choose Tab delimited (.dat) as the file type, uncheck the option Write variable names to spreadsheet, and click Save.


When using syntax, copy-paste the following commands, but make sure to change the path so that it points to your preferred folder and ends in the file name (e.g., popular_regr_1.dat):

SAVE TRANSLATE OUTFILE='<path directory to preferred folder>'
/TYPE=TAB
/TEXTOPTIONS DECIMAL=DOT
/REPLACE.


Always inspect the saved .dat file. You can open it with a plain-text editor such as Notepad to check whether the data preparation succeeded. The file should contain one row per subject, with the values separated by tabs and no variable names in the first row.

Make a habit of scanning the .dat file for empty cells and make sure that decimal numbers use a dot, NOT a comma, as the decimal separator. The /textoptions subcommand in the SPSS syntax takes care of this. The menu system offers no equivalent of /textoptions, so if you saved the file through the menus and it contains decimal commas, use the Replace command in Notepad to change them into dots. Another option is to separate the values by commas instead of tabs, but we do not recommend this option.


Preparation - Installing MPLUS


A free demonstration version of Mplus may be obtained from

Make sure to follow the instructions that pertain to your operating system.

An icon for the Mplus demo version should now appear in your start menu (Windows) or launchpad (Mac OS X). Open this demo version and go to File -> New to open a brand new syntax file.


Exercise 1 - Multiple Regression in MPLUS

Exercise 1a. Let’s first take a look at the sample statistics to get familiar with the Mplus environment. You can copy-paste the following syntax into the new syntax file you just opened:

DATA: FILE IS popular_regr_1.dat;
VARIABLE: NAMES ARE respnr Dutch gender sd covert overt;
USEVARIABLES ARE covert sd overt;
MISSING ARE ALL (-999);
OUTPUT: sampstat;

Let's take a closer look at the syntax above. In the first line, we use the DATA command to tell Mplus what the data file is called. Next comes the VARIABLE command, which consists of three lines. First, we tell Mplus what the variable names are with the NAMES ARE statement. Note that the order of the variable names has to mirror the actual order in the dataset. You can simply copy-paste the variable names from SPSS. Note that each statement should end with a semicolon. You can spread a statement over multiple lines in Mplus to structure your syntax. For example:
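A minimal sketch of a statement spread over two lines, assuming the variable list from this exercise is broken after gender (the split point is an arbitrary choice for illustration):

```mplus
! One NAMES statement written on two lines; it ends at the semicolon
VARIABLE: NAMES ARE respnr Dutch gender
    sd covert overt;
```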



Here, Mplus will stop reading the syntax line after overt. You can use the exclamation mark to make comments. For example:

respnr !this part will not be read by Mplus


Second, we tell Mplus which variables we are actually going to use with USEVARIABLES ARE. This way Mplus knows which columns in the .dat file to use. Finally, we tell Mplus that for all variables missing values are coded by the number -999, using MISSING ARE ALL (-999). In the last command line, we use the OUTPUT command and ask for some descriptive statistics by requesting sampstat, which is short for sample statistics.


Before you can run this syntax, you need to save it as an .inp (input) file. Make sure to always save your input files in the same (sub)folder as your tab delimited dataset. Now you can run the syntax by pressing the blue ‘run’ button or pressing Alt+R. Mplus will generate an output file with the extension .out, which is automatically saved to the folder that contains the input file.

In the output file, two warnings will appear:


WARNING in MODEL command
All variables are uncorrelated with all other variables in the model.
Check that this is what is intended.

Data set contains cases with missing on all variables.
These cases were not included in the analysis.
Number of cases with missing on all variables: 145



The first warning occurs because we did not actually specify a model in this syntax and we can ignore this warning for now. The second warning refers to the fact that several participants had missing values on all variables specified in the USEVARIABLES command. These participants will not be used in determining the sample statistics.
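If only descriptive statistics are wanted, one possible variant is to request a basic analysis, in which no model is estimated. The sketch below assumes the same data file and variable names as above; the ANALYSIS command is an addition that is not part of the original exercise, and with it the warning about the MODEL command should no longer appear.

```mplus
! Sketch: sample statistics only, no model is estimated
DATA: FILE IS popular_regr_1.dat;
VARIABLE: NAMES ARE respnr Dutch gender sd covert overt;
    USEVARIABLES ARE covert sd overt;
    MISSING ARE ALL (-999);
ANALYSIS: TYPE = BASIC;   ! assumption: added here for illustration
```

The exercise itself keeps the default analysis type, so that the same input file can be extended with a MODEL command in the next step.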

Not all other output is relevant to us. For now, we only want to inspect the sample statistics. If you scroll down in the output file, you will find the Sample Statistics with the estimated means for the three variables we selected in the USEVARIABLES statement. We also see the covariance and correlation matrix of the three variables.

Question: Compare your results to the results you obtained in the tutorial SPSS: How to get started. Are the results similar? If not, can you explain differences between the Mplus output and the SPSS output?


Exercise 1b. In the previous exercise we only looked at sample statistics. Now, we are going to run the regression analysis in Mplus by adding a model statement to the syntax. Open a new syntax file in Mplus and enter the following syntax commands:

DATA: FILE IS popular_regr_1.dat;
VARIABLE: NAMES ARE respnr Dutch gender sd covert overt;
USEVARIABLES ARE sd covert overt;
MISSING ARE ALL (-999);
MODEL: sd ON covert overt;
OUTPUT: sampstat stand;


We specified a MODEL where an outcome variable (sd) is being regressed ON two predictors (covert and overt), and we asked for standardized results by requesting stand in the output. Dependent or Y variables always appear on the left hand side of the ON statement and independent or X variables always appear on the right hand side of the ON statement.
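To help read the model results, the MODEL command can also be written with the parameters that Mplus estimates by default for this regression made explicit. This is only a sketch for illustration; the two extra statements are not needed in the exercise, since the intercept and residual variance of sd are estimated anyway.

```mplus
MODEL:
    sd ON covert overt;   ! regression of sd on the two predictors
    [sd];                 ! intercept of sd (square brackets refer to means/intercepts)
    sd;                   ! residual variance of sd
```

In the output, these correspond to the sections for sd ON, Intercepts, and Residual Variances.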

Again, save the input file in the same folder as the .dat file and run the syntax. We can expect three warnings, all concerning the missing values.


Looking at the output, we can ignore the model fit information since this is a saturated model. The model results and standardized model results are most relevant to this exercise (see screenshot).

Question: How would you interpret these results? How do they compare to the results found in the tutorial SPSS: How to get started?





Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1-36.

Van de Schoot, R., van der Velden, F., Boom, J. & Brugman, D. (2010). Can at Risk Young Adolescents be Popular and Antisocial? Sociometric Status Groups, AntiSocial Behavior, Gender and Ethnic Background. Journal of Adolescence, 33, 583-592.

