In general, the broken stick model smooths the observed growth trajectory. What happens if all observations are already aligned to the break ages? Does the model perfectly represent the data? Is the covariance matrix of the random effects (Ω) equal to the covariance between the measurements? Is σ² equal to zero?
We adapt code from http://www.davekleinschmidt.com/sst-mixed-effects-simulation/simulations_slides.pdf to generate test data:
library("plyr")  # provides rdply(), used in the generator below
library("mvtnorm")
make_data_generator <- function(resid_var = 1,
                                ranef_covar = diag(c(1, 1)),
                                n = 100) {
  ni <- nrow(ranef_covar)
  generate_data <- function() {
    # sample data set under mixed effects model with random slope/intercepts
    simulated_data <- rdply(n, {
      b <- t(rmvnorm(n = 1, sigma = ranef_covar))
      epsilon <- rnorm(n = length(b), mean = 0, sd = sqrt(resid_var))
      b + epsilon
    })
    data.frame(
      subject = rep(1:n, each = ni),
      age = rep(1:ni, n),
      simulated_data)
  }
}
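The same simulation design can also be sketched in base R, without plyr or mvtnorm. This is only an illustration under stated assumptions: the function name simulate_aligned is hypothetical, Cholesky sampling stands in for rmvnorm(), and the .n column that rdply() adds is dropped.

```r
# Hypothetical base-R counterpart of the generator above (not the original code).
simulate_aligned <- function(n = 100, ranef_covar = diag(c(1, 1)), resid_var = 1) {
  ni <- nrow(ranef_covar)
  # n x ni matrix of random effects; each row has covariance ranef_covar
  # because chol() returns R with t(R) %*% R equal to ranef_covar
  b <- matrix(rnorm(n * ni), nrow = n) %*% chol(ranef_covar)
  # independent residual noise with variance resid_var
  epsilon <- matrix(rnorm(n * ni, mean = 0, sd = sqrt(resid_var)), nrow = n)
  y <- b + epsilon
  # long format: all ages of subject 1, then all ages of subject 2, ...
  data.frame(subject = rep(seq_len(n), each = ni),
             age = rep(seq_len(ni), times = n),
             X1 = as.vector(t(y)))
}

set.seed(1)
sim <- simulate_aligned(n = 3, ranef_covar = diag(2), resid_var = 0)
sim
```

Every subject is observed exactly once at each age, which is what makes the observations line up with the break ages.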
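In both scenarios the ages align perfectly with the knots. We choose between the perfect situation, where σ² = 0, and the noisy case, where σ² = 1.
resid_var <- 0  # perfect case: no residual noise
resid_var <- 1  # noisy case; this assignment overrides the line above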
set.seed(77711)
covar <- matrix(c(1,   0.7, 0.5, 0.3,
                  0.7, 1,   0.8, 0.5,
                  0.5, 0.8, 1,   0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
gen_dat <- make_data_generator(n = 10000,
                               ranef_covar = covar,
                               resid_var = resid_var)
data <- gen_dat()
head(data)
##   subject age .n      X1
## 1       1   1  1 -0.9478
## 2       1   2  1 -2.0837
## 3       1   3  1 -2.6512
## 4       1   4  1 -2.5526
## 5       2   1  2 -0.0825
## 6       2   2  2 -1.2707
We wish to reproduce the correlation matrix among the y’s from the mixed model estimates. The target correlation matrix is:
library("tidyr")
library("dplyr")
d <- as_tibble(data[,-3])
broad <- t(spread(d, subject, X1))[-1,]
cor(broad)
## [,1] [,2] [,3] [,4]
## [1,] 1.000 0.350 0.255 0.161
## [2,] 0.350 1.000 0.406 0.246
## [3,] 0.255 0.406 1.000 0.313
## [4,] 0.161 0.246 0.313 1.000
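The sample correlations can be compared with their population counterparts. Because the generator draws the random effects and the residuals independently, the covariance of the measurements is the random-effect covariance plus the residual variance on the diagonal. A minimal sketch for the noisy case, where the names pop_cov and pop_cor are introduced here for illustration only:

```r
# Generating covariance of the random effects, as defined in the text
covar <- matrix(c(1,   0.7, 0.5, 0.3,
                  0.7, 1,   0.8, 0.5,
                  0.5, 0.8, 1,   0.6,
                  0.3, 0.5, 0.6, 1), nrow = 4)
resid_var <- 1

# Population covariance of the measurements: covar plus noise on the diagonal
pop_cov <- covar + diag(resid_var, nrow(covar))

# Population time-to-time correlation matrix
pop_cor <- cov2cor(pop_cov)
round(pop_cor, 3)
```

With unit variances in covar and σ² = 1, every diagonal element of pop_cov equals 2, so the off-diagonal correlations are simply half of the corresponding entries of covar (for example 0.7 / 2 = 0.35), which is close to the sample values.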
Fit the broken stick model, with knots specified at ages 1:4.
library("brokenstick")
knots <- 1:3
boundary <- c(1, 4)
fit <- brokenstick(X1 ~ age | subject, data,
                   knots = knots, boundary = boundary,
                   method = "lmer")
## Warning: number of observations (=40000) <= number of random effects (=40000)
## for term (0 + age_1 + age_2 + age_3 + age_4 | subject); the random-effects
## parameters and the residual variance (or scale parameter) are probably
## unidentifiable
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## unable to evaluate scaled gradient
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge: degenerate Hessian with 1 negative eigenvalues
omega <- get_omega(fit, hide = "no")
beta <- coef(fit, hide = "no")
sigma2 <- fit$sigma2
round(beta, 2)
## age_1 age_2 age_3 age_4
## -0.02 -0.02 0.01 0.01
## [1] 0.818
## [,1] [,2] [,3] [,4]
## [1,] 1.0 0.7 0.5 0.3
## [2,] 0.7 1.0 0.8 0.5
## [3,] 0.5 0.8 1.0 0.6
## [4,] 0.3 0.5 0.6 1.0
## age_1 age_2 age_3 age_4
## age_1 1.23 0.71 0.52 0.33
## age_2 0.71 1.16 0.82 0.49
## age_3 0.52 0.82 1.22 0.63
## age_4 0.33 0.49 0.63 1.17
## age_1 age_2 age_3 age_4
## age_1 2.052 0.706 0.521 0.326
## age_2 0.706 1.982 0.816 0.489
## age_3 0.521 0.816 2.034 0.630
## age_4 0.326 0.489 0.630 1.992
## [,1] [,2] [,3] [,4]
## [1,] 2.052 0.706 0.521 0.326
## [2,] 0.706 1.982 0.816 0.489
## [3,] 0.521 0.816 2.034 0.630
## [4,] 0.326 0.489 0.630 1.992
## age_1 age_2 age_3 age_4
## age_1 1.000 0.350 0.255 0.161
## age_2 0.350 1.000 0.406 0.246
## age_3 0.255 0.406 1.000 0.313
## age_4 0.161 0.246 0.313 1.000
## [,1] [,2] [,3] [,4]
## [1,] 1.000 0.350 0.255 0.161
## [2,] 0.350 1.000 0.406 0.246
## [3,] 0.255 0.406 1.000 0.313
## [4,] 0.161 0.246 0.313 1.000
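The calls that produced the outputs above are not shown. The printed numbers are consistent with the following reading, in order: the estimated σ², the generating matrix covar, the estimated Ω, the model-implied covariance Ω + σ²I, the sample covariance of the measurements, and the two corresponding correlation matrices. The sketch below checks that reading using only the printed (rounded) values. The names implied_cov, omega_implied and cor_sample are introduced here for illustration, and the interpretation of each matrix is an assumption rather than something the text states.

```r
# Printed matrix assumed to be the model-implied covariance omega + sigma2 * I
implied_cov <- matrix(c(2.052, 0.706, 0.521, 0.326,
                        0.706, 1.982, 0.816, 0.489,
                        0.521, 0.816, 2.034, 0.630,
                        0.326, 0.489, 0.630, 1.992), nrow = 4)
sigma2 <- 0.818  # printed residual variance estimate

# Removing sigma2 from the diagonal should give back the printed omega
omega_implied <- implied_cov - diag(sigma2, 4)
round(omega_implied, 2)

# Sample time-to-time correlations printed in the text
cor_sample <- matrix(c(1.000, 0.350, 0.255, 0.161,
                       0.350, 1.000, 0.406, 0.246,
                       0.255, 0.406, 1.000, 0.313,
                       0.161, 0.246, 0.313, 1.000), nrow = 4)

# Largest discrepancy between model-implied and sample correlations
max(abs(cov2cor(implied_cov) - cor_sample))
```

Under this reading, Ω alone does not match the covariance of the measurements when σ² > 0; the residual variance has to be added to the diagonal before the two agree.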
z <- predict(fit, x = "knots", include_data = FALSE, shape = "wide")[, -1]
# off-diagonal elements of covariance of broken stick estimates approach correlation
# not enough variance in the diagonal because of smoothing
cov(z)
## 1 2 3
## 1 0.795 0.595 0.475
## 2 0.595 0.791 0.681
## 3 0.475 0.681 0.821
## 1 2 3
## 1 1.000 0.750 0.587
## 2 0.750 1.000 0.845
## 3 0.587 0.845 1.000
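The second matrix above is printed without the call that produced it. Its values are consistent with the correlation matrix of the broken stick estimates z, which can be checked from the printed covariance alone. In the sketch below, cov_z and cor_printed are copied from the output above; the identification of the second matrix as the correlation of z is an assumption.

```r
# Covariance of the broken stick estimates, copied from the printed cov(z)
cov_z <- matrix(c(0.795, 0.595, 0.475,
                  0.595, 0.791, 0.681,
                  0.475, 0.681, 0.821), nrow = 3)

# Second printed matrix, assumed to be the correlation of z
cor_printed <- matrix(c(1.000, 0.750, 0.587,
                        0.750, 1.000, 0.845,
                        0.587, 0.845, 1.000), nrow = 3)

# Converting the covariance to a correlation matrix should agree up to rounding
cor_z <- cov2cor(cov_z)
round(cor_z, 3)
max(abs(cor_z - cor_printed))
```

These correlations are considerably higher than the sample correlations among the observed measurements reported earlier (0.350, 0.255 and 0.406 for the first three ages), as expected when the estimates have the residual noise smoothed away.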
cov2cor(hatC) reproduces the sample time-to-time correlation matrix.
## age_1 age_2 age_3
## age_1 1.000 0.350 0.255
## age_2 0.350 1.000 0.406
## age_3 0.255 0.406 1.000