This section shows how to perform some matrix operations in R.
In R, the basic data structure is a vector. A matrix is a vector with a dimension attribute (dim), which can be read or set with dim(). To create a matrix, we use the matrix() function, which has the following arguments:
args(matrix)
## function (data = NA, nrow = 1, ncol = 1, byrow = FALSE, dimnames = NULL)
## NULL
Here we populate the matrix with a vector:
# fill matrix by column (default)
v <- c(1, 2, 3, 4, 5, 6)
v
## [1] 1 2 3 4 5 6
M <- matrix(v, nrow = 2, ncol = 3)
M
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
# transpose of M
Mt <- t(M)
Mt
##      [,1] [,2]
## [1,]    1    2
## [2,]    3    4
## [3,]    5    6
# matrix product of M and its transpose
MMt <- M %*% Mt
MMt
##      [,1] [,2]
## [1,]   35   44
## [2,]   44   56
# element-wise addition
dM <- M + M
dM
##      [,1] [,2] [,3]
## [1,]    2    6   10
## [2,]    4    8   12
# inverse of MMt (solve() with a single argument returns the inverse)
MMt_inverse <- solve(MMt)
MMt_inverse
##       [,1]  [,2]
## [1,]  2.33 -1.83
## [2,] -1.83  1.46
Consider the simple regression model
\[y_i = \beta_0 + \beta_1 x_i + \epsilon_i \textrm{ for } i = 1, 2, \ldots, n\]
In matrix form, the regression model can be represented as
\[\begin{bmatrix} y_{1} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1} \\ \vdots & \vdots \\ 1 & x_{n} \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \end{bmatrix} + \begin{bmatrix} \epsilon_{1} \\ \vdots \\ \epsilon_{n} \end{bmatrix}\]
or more concisely as
\[\mathbf{y} =\mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\]
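To see that the matrix form reproduces the scalar model row by row, here is a minimal sketch with made-up numbers. The names x_toy, beta_toy, eps_toy and X_toy are hypothetical and are not part of the exercise data.

```r
# Hypothetical toy values, chosen only for illustration
x_toy    <- c(1, 2, 3)
beta_toy <- c(0.5, 2)            # (beta_0, beta_1)
eps_toy  <- c(0.1, -0.2, 0.05)   # made-up error terms

# Design matrix: a column of ones (for the intercept) next to the x values
X_toy <- cbind(1, x_toy)

# Matrix form: y = X beta + epsilon
y_matrix <- as.vector(X_toy %*% beta_toy) + eps_toy

# Scalar form, evaluated element by element
y_scalar <- beta_toy[1] + beta_toy[2] * x_toy + eps_toy

# Both forms should give the same three values
all.equal(y_matrix, y_scalar)
```

Each row of X_toy pairs a 1 with one x value, so multiplying by beta_toy yields the intercept plus slope times x for that observation.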
It can be shown that the least-squares estimates of the parameters are
\[\hat{\boldsymbol \beta} = \left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1} \mathbf{X}^{\mathsf{T}} \mathbf{y}\]
and the fitted values are
\[\hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol \beta} = \mathbf{X} \left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1} \mathbf{X}^{\mathsf{T}} \mathbf{y}\]
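The estimator above comes from minimizing the residual sum of squares. A brief sketch of the derivation follows, assuming \(\mathbf{X}^{\mathsf{T}} \mathbf{X}\) is invertible; the symbol \(S\) is notation introduced only for this sketch.
\[S(\boldsymbol{\beta}) = \left( \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \right)^{\mathsf{T}} \left( \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \right)\]
\[\frac{\partial S}{\partial \boldsymbol{\beta}} = -2\, \mathbf{X}^{\mathsf{T}} \left( \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \right)\]
Setting this gradient to zero at \(\boldsymbol{\beta} = \hat{\boldsymbol \beta}\) gives the normal equations
\[\mathbf{X}^{\mathsf{T}} \mathbf{X} \hat{\boldsymbol \beta} = \mathbf{X}^{\mathsf{T}} \mathbf{y}\]
and multiplying both sides on the left by \(\left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1}\) yields the formula for \(\hat{\boldsymbol \beta}\) given above.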
The next three questions involve working with the following data:
x <- c(10, 8, 13, 9, 11, 14, 6 , 4, 12, 7 ,5 )
y <- c(8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68)
Compute the vector of estimated parameters \(\hat{\boldsymbol \beta}\) for the simple regression problem given the data provided above. Hint: A fast way to create the matrix \(\mathbf{X}\) is to use the column-bind function, cbind(), to bind two vectors into a matrix. If one argument is a scalar (a length-one vector), cbind() recycles it to match the length of the other argument.
# answer here
Obtain the fitted values, the vector \(\hat{\mathbf{y}}\), by using matrix multiplication.
# answer here
Obtain the vector of residuals \(\hat{\boldsymbol{\epsilon}}\). Note that the vector of residuals has a hat on it, whereas the vector of errors \(\boldsymbol{\epsilon}\) doesn’t. Now you know the difference!
# answer here
Whenever you see a matrix, think transformation! In the equation for obtaining the fitted values above, we see that \(\mathbf{y}\) gets multiplied (transformed) by several matrices to give us \(\hat{\mathbf{y}}\). Let’s denote the product of these matrices by
\[\mathbf{H}=\mathbf{X} \left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1} \mathbf{X}^{\mathsf{T}}\]
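As a quick dimension check on this product, assuming \(n\) observations and the two-column \(\mathbf{X}\) from the simple regression model:
\[\underbrace{\mathbf{H}}_{n \times n} = \underbrace{\mathbf{X}}_{n \times 2} \; \underbrace{\left( \mathbf{X}^{\mathsf{T}} \mathbf{X} \right)^{-1}}_{2 \times 2} \; \underbrace{\mathbf{X}^{\mathsf{T}}}_{2 \times n}\]
so \(\mathbf{H}\) is a square matrix with one row and one column per observation.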
Obtain \(\mathbf{H}\) for the data given above.
# answer here
Now play with the interactive visualization here and try to figure out what kind of transformation \(\mathbf{H}\) accomplishes. The answer is one word only. Think about the dimension of \(\mathbf{H}\) in relation to the dimension of the data vector \(\mathbf{y}\).
# answer here