Load required libraries
library(lavaan)
library(semPlot)
The following explanation contains material that wasn’t covered at the time when the assignment was handed out, but you should pay attention to it since it provides a big picture for all the models that will be covered in the course.
Consider the structural model (without intercept) \[\begin{aligned} \boldsymbol \eta &=\mathbf B \boldsymbol \eta +\boldsymbol \Gamma \boldsymbol\xi +\boldsymbol \zeta\\ \end{aligned} \] and the measurement model \[\begin{aligned} \mathbf x &=\boldsymbol \Lambda_x \boldsymbol \eta +\boldsymbol \epsilon\\ \mathbf y &=\boldsymbol \Lambda_y \boldsymbol \xi +\boldsymbol \delta\\ \end{aligned} \] where \(\boldsymbol \eta\) is a vector of endogenous latent variables, \(\boldsymbol\xi\) a vector of exogenous latent variables, \(\boldsymbol \zeta\) is a vector error variables, \(\mathbf y\) a vector of endogenous observed variables, \(\mathbf x\) a vector of exogenous observed variables such that \[\begin{aligned} Cov(\boldsymbol \eta) &= \boldsymbol \Phi \\ Cov(\boldsymbol\xi) &= \boldsymbol \Psi \\ Cov(\boldsymbol \epsilon) &= \boldsymbol \Theta_x\\ Cov(\boldsymbol \delta) &= \boldsymbol \Theta_y \end{aligned} \] Now a path analysis model is a structural equation model where \(\mathbf x\) and \(\mathbf y\) are observed without error such that \[\begin{aligned} \boldsymbol \Lambda_x &=\mathbf I\\ \boldsymbol \Lambda_y &= \mathbf I \\ \boldsymbol \Theta_x &= \boldsymbol{0} \\ \boldsymbol \Theta_y &= \boldsymbol{0} \end{aligned} \]
We can use the lisrelModel function (one way to find out about it is through a Google search. Look here for examples) in the semPlot package to plot the path diagram for the model. We just need to feed the function the matrices with a 1 specifiying a free parameter and zero for fixed parameter. Here is the code:
endogen_N <- 3 # number of endogenous observed variables
exogen_N <- 1 # number of exogenous observed variables
# Beta 3x3 matrix:
Beta <- matrix(c(0, 0, 0,
1, 0, 0,
1, 1, 0),
nrow = endogen_N,
ncol = endogen_N,
byrow = TRUE)
# Gamma 3x1 matrix :
Gamma <- matrix(c(1, 1, 1),
nrow = endogen_N ,
ncol = exogen_N )
# Psi 3x3 diagonal matrix :
Psi <- diag(1,
nrow = endogen_N,
ncol = endogen_N)
# Phi 1x1 diagonal matrix :
Phi <- diag(1, 1, 1)
# Lambda identity matrices :
Lambda_x <- diag(1, exogen_N, exogen_N)
Lambda_y <- diag(1, endogen_N, endogen_N)
# Combine model:
mod <- lisrelModel(LY = Lambda_y,
PS = Psi,
BE = Beta,
manNamesEndo = c("IQ", "nAch", "GPA"),
LX = Lambda_x,
PH = Phi,
GA = Gamma,
manNamesExo = "SES")
We can print out the matrices first
Beta
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 0 0
[3,] 1 1 0
Gamma
[,1]
[1,] 1
[2,] 1
[3,] 1
Psi
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
Phi
[,1]
[1,] 1
Lambda_x
[,1]
[1,] 1
Lambda_y
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
Here is the plot. Do you notice something unusual about the graph?
# Plot path diagram:
semPaths(mod,
as.expression = c("nodes", "edges"),
sizeMan = 5, # adjust size of boxes
sizeInt = 1,
sizeLat = 4)
Alternatively, we can plot a simplified path diagram without Greek letters like this
# Specify model
SES_model <- '
# regressions
IQ ~ SES
nAch ~ SES + IQ
GPA ~ SES + IQ + nAch '
semPaths(SES_model ,
"eq",
layout = "tree")
First assign the summarized data to a character vector and convert it to a full symmetric covariance/correlation matrix with names
# SES, IQ and Need Achievement data
SES_lower <- '
1
.30 1
.41 .16 1
.33 .57 .50 1'
# Sample size
N3 <- 306
SES_cor <- getCov(SES_lower,
names = c("SES", "IQ", "nAch", "GPA"))
Fit the model
SES_model_fit <- sem(SES_model,
sample.cov = SES_cor,
sample.nobs = N3)
Found more than one class "Model" in cache; using the first, from namespace 'lavaan'
Print a summary of the model estimates. Don’t forget to add the rsquare option to print out the squared multiple correlations. We can also output the list of model matrices.
summary(SES_model_fit , rsquare = TRUE)
lavaan (0.5-21) converged normally after 15 iterations
Number of observations 306
Estimator ML
Minimum Function Test Statistic 0.000
Degrees of freedom 0
Parameter Estimates:
Information Expected
Standard Errors Standard
Regressions:
Estimate Std.Err z-value P(>|z|)
IQ ~
SES 0.300 0.055 5.501 0.000
nAch ~
SES 0.398 0.055 7.285 0.000
IQ 0.041 0.055 0.745 0.457
GPA ~
SES 0.009 0.046 0.199 0.842
IQ 0.501 0.043 11.763 0.000
nAch 0.416 0.045 9.348 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.IQ 0.907 0.073 12.369 0.000
.nAch 0.828 0.067 12.369 0.000
.GPA 0.502 0.041 12.369 0.000
R-Square:
Estimate
IQ 0.090
nAch 0.170
GPA 0.496
(modelEst1 <- lavInspect(SES_model_fit, "est"))
$lambda
IQ nAch GPA SES
IQ 1 0 0 0
nAch 0 1 0 0
GPA 0 0 1 0
SES 0 0 0 1
$theta
IQ nAch GPA SES
IQ 0
nAch 0 0
GPA 0 0 0
SES 0 0 0 0
$psi
IQ nAch GPA SES
IQ 0.907
nAch 0.000 0.828
GPA 0.000 0.000 0.502
SES 0.000 0.000 0.000 0.997
$beta
IQ nAch GPA SES
IQ 0.000 0.000 0 0.300
nAch 0.041 0.000 0 0.398
GPA 0.501 0.416 0 0.009
SES 0.000 0.000 0 0.000
Note that in lavaan the psi matrix corresponds to the following compound matrix in the notes
\[ \begin{bmatrix} \boldsymbol \Psi & \mathbf 0\\ \mathbf 0 & \boldsymbol \Phi \end{bmatrix} \]
and the beta matrix to
\[ \begin{bmatrix} \mathbf B & \boldsymbol \Gamma\\ \mathbf 0 & \mathbf 0 \end{bmatrix} \]
This is how to extract them
(Psi <- modelEst1$psi[1:3, 1:3])
IQ nAch GPA
IQ 0.907 0.000 0.000
nAch 0.000 0.828 0.000
GPA 0.000 0.000 0.502
(Phi <- modelEst1$psi[4, 4])
[1] 0.997
(Beta <- modelEst1$beta[1:3, 1:3])
IQ nAch GPA
IQ 0.0000 0.000 0
nAch 0.0407 0.000 0
GPA 0.5007 0.416 0
(Gamma <- modelEst1$beta[1:3, 4])
IQ nAch GPA
0.30000 0.39780 0.00919
Assign the summarized data to a character vector
# Social status, actual and perceived income data
income_lower <- '
1.0000
0.3670 1.0000
0.5103 0.4310 1.0000
0.3633 0.1840 0.2772 1.0000
0.1574 0.2819 0.2311 0.2921 1.0000'
# Sample size
N4 <- 432
Convert to a full symmetric covariance/correlation matrix with names
income_cor <- getCov(income_lower,
names = c("subj_income",
"subj_prestige",
"subj_status",
"actual_income",
"actual_prestige"))
Specify model
income_model <- '
# regressions
subj_income ~ actual_income + subj_prestige
subj_prestige ~ actual_prestige + subj_income
subj_status ~ subj_income + subj_prestige
# correlations
actual_income ~~ actual_prestige
subj_income ~~ subj_prestige
subj_status ~~ subj_prestige
subj_status ~~ subj_income'
Note that the “actual_income ~~ actual_prestige” is not really needed. Do you know why?(hint: think regression). Now, we can fit the model and print a summary of the model estimates.
income_model_fit <- sem(income_model,
sample.cov = income_cor,
sample.nobs = N4)
Warning in lavaan::lavaan(model = income_model, sample.cov = income_cor, :
lavaan WARNING: syntax contains parameters involving exogenous covariates;
switching to fixed.x = FALSE
summary(income_model_fit,
rsquare = TRUE)
lavaan (0.5-21) converged normally after 44 iterations
Number of observations 432
Estimator ML
Minimum Function Test Statistic 0.000
Degrees of freedom 0
Minimum Function Value 0.0000000000000
Parameter Estimates:
Information Expected
Standard Errors Standard
Regressions:
Estimate Std.Err z-value P(>|z|)
subj_income ~
actual_income 0.322 0.054 5.992 0.000
subj_prestige 0.225 0.178 1.260 0.208
subj_prestige ~
actual_prestig 0.231 0.048 4.818 0.000
subj_income 0.320 0.131 2.449 0.014
subj_status ~
subj_income 0.485 0.165 2.940 0.003
subj_prestige 0.549 0.201 2.731 0.006
Covariances:
Estimate Std.Err z-value P(>|z|)
actual_income ~~
actual_prestig 0.291 0.050 5.828 0.000
.subj_income ~~
.subj_prestige -0.173 0.188 -0.918 0.359
.subj_prestige ~~
.subj_status -0.239 0.166 -1.438 0.150
.subj_income ~~
.subj_status -0.109 0.133 -0.822 0.411
Variances:
Estimate Std.Err z-value P(>|z|)
.subj_income 0.780 0.061 12.843 0.000
.subj_prestige 0.812 0.055 14.680 0.000
.subj_status 0.762 0.101 7.531 0.000
actual_income 0.998 0.068 14.697 0.000
actual_prestig 0.998 0.068 14.697 0.000
R-Square:
Estimate
subj_income 0.218
subj_prestige 0.186
subj_status 0.236
We can also output a list of model matrices
(modelEst2 <- lavInspect(income_model_fit, "est"))
$lambda
sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income 1 0 0 0 0
subj_prestige 0 1 0 0 0
subj_status 0 0 1 0 0
actual_income 0 0 0 1 0
actual_prestige 0 0 0 0 1
$theta
sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income 0
subj_prestige 0 0
subj_status 0 0 0
actual_income 0 0 0 0
actual_prestige 0 0 0 0 0
$psi
sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income 0.780
subj_prestige -0.173 0.812
subj_status -0.109 -0.239 0.762
actual_income 0.000 0.000 0.000 0.998
actual_prestige 0.000 0.000 0.000 0.291 0.998
$beta
sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income 0.000 0.225 0 0.322 0.000
subj_prestige 0.320 0.000 0 0.000 0.231
subj_status 0.485 0.549 0 0.000 0.000
actual_income 0.000 0.000 0 0.000 0.000
actual_prestige 0.000 0.000 0 0.000 0.000
Note that in lavaan the psi matrix corresponds to the following compound matrix in the notes
\[ \begin{bmatrix} \boldsymbol \Psi & \mathbf 0\\ \mathbf 0 & \boldsymbol \Phi \end{bmatrix} \]
and the beta matrix to
\[ \begin{bmatrix} \mathbf B & \boldsymbol \Gamma\\ \mathbf 0 & \mathbf 0 \end{bmatrix} \]
This is how to extract them
(Psi <- modelEst2$psi[1:3, 1:3])
subj_income subj_prestige subj_status
subj_income 0.780 -0.173 -0.109
subj_prestige -0.173 0.812 -0.239
subj_status -0.109 -0.239 0.762
(Phi <- modelEst2$psi[4:5, 4:5])
actual_income actual_prestige
actual_income 0.998 0.291
actual_prestige 0.291 0.998
(Beta <- modelEst2$beta[1:3, 1:3])
subj_income subj_prestige subj_status
subj_income 0.000 0.225 0
subj_prestige 0.320 0.000 0
subj_status 0.485 0.549 0
(Gamma <- modelEst2$beta[1:3, 4:5])
actual_income actual_prestige
subj_income 0.322 0.000
subj_prestige 0.000 0.231
subj_status 0.000 0.000
Plot the path diagram
semPaths(income_model_fit,
"par",
layout = "circle")