HW3 Solution

Load required libraries

library(lavaan)
library(semPlot)

EX 1 (a + b)

The following explanation contains material that wasn’t covered at the time when the assignment was handed out, but you should pay attention to it since it provides a big picture for all the models that will be covered in the course.

Consider the structural model (without intercept) \[\begin{aligned} \boldsymbol \eta &=\mathbf B \boldsymbol \eta +\boldsymbol \Gamma \boldsymbol\xi +\boldsymbol \zeta\\ \end{aligned} \] and the measurement model \[\begin{aligned} \mathbf x &=\boldsymbol \Lambda_x \boldsymbol \eta +\boldsymbol \epsilon\\ \mathbf y &=\boldsymbol \Lambda_y \boldsymbol \xi +\boldsymbol \delta\\ \end{aligned} \] where \(\boldsymbol \eta\) is a vector of endogenous latent variables, \(\boldsymbol\xi\) a vector of exogenous latent variables, \(\boldsymbol \zeta\) is a vector error variables, \(\mathbf y\) a vector of endogenous observed variables, \(\mathbf x\) a vector of exogenous observed variables such that \[\begin{aligned} Cov(\boldsymbol \eta) &= \boldsymbol \Phi \\ Cov(\boldsymbol\xi) &= \boldsymbol \Psi \\ Cov(\boldsymbol \epsilon) &= \boldsymbol \Theta_x\\ Cov(\boldsymbol \delta) &= \boldsymbol \Theta_y \end{aligned} \] Now a path analysis model is a structural equation model where \(\mathbf x\) and \(\mathbf y\) are observed without error such that \[\begin{aligned} \boldsymbol \Lambda_x &=\mathbf I\\ \boldsymbol \Lambda_y &= \mathbf I \\ \boldsymbol \Theta_x &= \boldsymbol{0} \\ \boldsymbol \Theta_y &= \boldsymbol{0} \end{aligned} \]

We can use the lisrelModel function (one way to find out about it is through a Google search. Look here for examples) in the semPlot package to plot the path diagram for the model. We just need to feed the function the matrices with a 1 specifiying a free parameter and zero for fixed parameter. Here is the code:

endogen_N <- 3 # number of endogenous observed variables
exogen_N <- 1 # number of exogenous observed variables
# Beta 3x3 matrix:
Beta <- matrix(c(0, 0, 0,
                 1, 0, 0,
                 1, 1, 0), 
                nrow = endogen_N,
                ncol = endogen_N,
                byrow = TRUE)

# Gamma 3x1 matrix :
Gamma <- matrix(c(1, 1, 1), 
                nrow = endogen_N , 
                ncol = exogen_N )

# Psi 3x3 diagonal matrix :
Psi <- diag(1, 
            nrow = endogen_N, 
            ncol = endogen_N)

# Phi 1x1 diagonal matrix :
Phi <- diag(1, 1, 1)

# Lambda identity matrices :
Lambda_x <- diag(1, exogen_N, exogen_N)
Lambda_y <- diag(1, endogen_N, endogen_N)

# Combine model:
mod <- lisrelModel(LY = Lambda_y,
                   PS = Psi, 
                   BE = Beta, 
                   manNamesEndo = c("IQ", "nAch", "GPA"),
                   LX = Lambda_x,
                   PH = Phi, 
                   GA = Gamma,
                   manNamesExo = "SES")

We can print out the matrices first

Beta

     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    1    0    0
[3,]    1    1    0

Gamma

     [,1]
[1,]    1
[2,]    1
[3,]    1

Psi

     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

Phi

     [,1]
[1,]    1

Lambda_x

     [,1]
[1,]    1

Lambda_y

     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    1    0
[3,]    0    0    1

Here is the plot. Do you notice something unusual about the graph?

# Plot path diagram:
semPaths(mod, 
         as.expression = c("nodes", "edges"), 
         sizeMan = 5,  # adjust size of boxes
         sizeInt = 1, 
         sizeLat = 4)

Alternatively, we can plot a simplified path diagram without Greek letters like this

# Specify model
SES_model <- '
   # regressions
     IQ ~ SES
     nAch ~ SES + IQ
     GPA ~ SES + IQ + nAch '

semPaths(SES_model ,
         "eq",
         layout = "tree")

EX 1 (c)

First assign the summarized data to a character vector and convert it to a full symmetric covariance/correlation matrix with names

# SES, IQ and Need Achievement data

SES_lower <- '
 1
.30   1
.41 .16   1
.33 .57 .50   1'

# Sample size
N3 <- 306
SES_cor <- getCov(SES_lower,
                  names = c("SES", "IQ", "nAch", "GPA"))

Fit the model

SES_model_fit <- sem(SES_model,
           sample.cov = SES_cor,
           sample.nobs = N3)

Found more than one class "Model" in cache; using the first, from namespace 'lavaan'

Print a summary of the model estimates. Don’t forget to add the rsquare option to print out the squared multiple correlations. We can also output the list of model matrices.

summary(SES_model_fit , rsquare = TRUE)

lavaan (0.5-21) converged normally after  15 iterations

  Number of observations                           306

  Estimator                                         ML
  Minimum Function Test Statistic                0.000
  Degrees of freedom                                 0

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  IQ ~                                                
    SES               0.300    0.055    5.501    0.000
  nAch ~                                              
    SES               0.398    0.055    7.285    0.000
    IQ                0.041    0.055    0.745    0.457
  GPA ~                                               
    SES               0.009    0.046    0.199    0.842
    IQ                0.501    0.043   11.763    0.000
    nAch              0.416    0.045    9.348    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .IQ                0.907    0.073   12.369    0.000
   .nAch              0.828    0.067   12.369    0.000
   .GPA               0.502    0.041   12.369    0.000

R-Square:
                   Estimate
    IQ                0.090
    nAch              0.170
    GPA               0.496

(modelEst1 <- lavInspect(SES_model_fit, "est"))

$lambda
     IQ nAch GPA SES
IQ    1    0   0   0
nAch  0    1   0   0
GPA   0    0   1   0
SES   0    0   0   1

$theta
     IQ nAch GPA SES
IQ   0              
nAch 0  0           
GPA  0  0    0      
SES  0  0    0   0  

$psi
     IQ    nAch  GPA   SES  
IQ   0.907                  
nAch 0.000 0.828            
GPA  0.000 0.000 0.502      
SES  0.000 0.000 0.000 0.997

$beta
        IQ  nAch GPA   SES
IQ   0.000 0.000   0 0.300
nAch 0.041 0.000   0 0.398
GPA  0.501 0.416   0 0.009
SES  0.000 0.000   0 0.000

Note that in lavaan the psi matrix corresponds to the following compound matrix in the notes

\[ \begin{bmatrix} \boldsymbol \Psi & \mathbf 0\\ \mathbf 0 & \boldsymbol \Phi \end{bmatrix} \]

and the beta matrix to

\[ \begin{bmatrix} \mathbf B & \boldsymbol \Gamma\\ \mathbf 0 & \mathbf 0 \end{bmatrix} \]

This is how to extract them

(Psi   <- modelEst1$psi[1:3, 1:3])

        IQ  nAch   GPA
IQ   0.907 0.000 0.000
nAch 0.000 0.828 0.000
GPA  0.000 0.000 0.502

(Phi   <- modelEst1$psi[4, 4])

[1] 0.997

(Beta  <- modelEst1$beta[1:3, 1:3])

         IQ  nAch GPA
IQ   0.0000 0.000   0
nAch 0.0407 0.000   0
GPA  0.5007 0.416   0

(Gamma <- modelEst1$beta[1:3, 4])

     IQ    nAch     GPA 
0.30000 0.39780 0.00919

EX 2

Assign the summarized data to a character vector

# Social status, actual and perceived income data
income_lower <- '
 1.0000 
 0.3670  1.0000 
 0.5103  0.4310  1.0000 
 0.3633  0.1840  0.2772  1.0000 
 0.1574  0.2819  0.2311  0.2921  1.0000' 

# Sample size
N4 <- 432

Convert to a full symmetric covariance/correlation matrix with names

income_cor <- getCov(income_lower,
                       names = c("subj_income",
                                 "subj_prestige",
                                 "subj_status",
                                 "actual_income",
                                 "actual_prestige"))

Specify model

income_model <- '
   # regressions
     subj_income ~ actual_income + subj_prestige
     subj_prestige ~ actual_prestige + subj_income
     subj_status ~ subj_income + subj_prestige
   # correlations
     actual_income ~~ actual_prestige
     subj_income ~~ subj_prestige 
     subj_status  ~~ subj_prestige
     subj_status  ~~ subj_income'

Note that the “actual_income ~~ actual_prestige” is not really needed. Do you know why?(hint: think regression). Now, we can fit the model and print a summary of the model estimates.

income_model_fit <- sem(income_model,
           sample.cov = income_cor,
           sample.nobs = N4)

Warning in lavaan::lavaan(model = income_model, sample.cov = income_cor, :
lavaan WARNING: syntax contains parameters involving exogenous covariates;
switching to fixed.x = FALSE

summary(income_model_fit,
        rsquare = TRUE)

lavaan (0.5-21) converged normally after  44 iterations

  Number of observations                           432

  Estimator                                         ML
  Minimum Function Test Statistic                0.000
  Degrees of freedom                                 0
  Minimum Function Value               0.0000000000000

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  subj_income ~                                       
    actual_income     0.322    0.054    5.992    0.000
    subj_prestige     0.225    0.178    1.260    0.208
  subj_prestige ~                                     
    actual_prestig    0.231    0.048    4.818    0.000
    subj_income       0.320    0.131    2.449    0.014
  subj_status ~                                       
    subj_income       0.485    0.165    2.940    0.003
    subj_prestige     0.549    0.201    2.731    0.006

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  actual_income ~~                                    
    actual_prestig    0.291    0.050    5.828    0.000
 .subj_income ~~                                      
   .subj_prestige    -0.173    0.188   -0.918    0.359
 .subj_prestige ~~                                    
   .subj_status      -0.239    0.166   -1.438    0.150
 .subj_income ~~                                      
   .subj_status      -0.109    0.133   -0.822    0.411

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .subj_income       0.780    0.061   12.843    0.000
   .subj_prestige     0.812    0.055   14.680    0.000
   .subj_status       0.762    0.101    7.531    0.000
    actual_income     0.998    0.068   14.697    0.000
    actual_prestig    0.998    0.068   14.697    0.000

R-Square:
                   Estimate
    subj_income       0.218
    subj_prestige     0.186
    subj_status       0.236

We can also output a list of model matrices

(modelEst2 <- lavInspect(income_model_fit, "est"))

$lambda
                sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income          1      0      0      0      0
subj_prestige        0      1      0      0      0
subj_status          0      0      1      0      0
actual_income        0      0      0      1      0
actual_prestige      0      0      0      0      1

$theta
                sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income     0                                 
subj_prestige   0      0                          
subj_status     0      0      0                   
actual_income   0      0      0      0            
actual_prestige 0      0      0      0      0     

$psi
                sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income      0.780                            
subj_prestige   -0.173  0.812                     
subj_status     -0.109 -0.239  0.762              
actual_income    0.000  0.000  0.000  0.998       
actual_prestige  0.000  0.000  0.000  0.291  0.998

$beta
                sbj_nc sbj_pr sbj_st actl_n actl_p
subj_income      0.000  0.225      0  0.322  0.000
subj_prestige    0.320  0.000      0  0.000  0.231
subj_status      0.485  0.549      0  0.000  0.000
actual_income    0.000  0.000      0  0.000  0.000
actual_prestige  0.000  0.000      0  0.000  0.000