Intro to DSEM
This is a short introduction to DSEM, with pictures!
DSEM (dynamic structural equation modeling) is used to analyze time series data. Time series data consist of multiple observations for a given subject on one or multiple variables. For example, if we measure John’s urge to smoke every day for 50 days, we get exactly the kind of data we need. DSEM is also useful when we have multiple observations for multiple subjects. Thus, it is instructive to think of DSEM as being composed of two models: (1) a within-subject model, describing a subject’s mean and the variation around that mean over time, and (2) a between-subject model, capturing how these subject-specific characteristics differ between subjects. The number of parameters in a DSEM can vary depending on the substantive question you would like to answer. In this workshop we will go over the most popular DSEM, which consists of three parameters. In the figure below you can see a visual representation of the three-parameter DSEM. We will break down each of these parameters below using our running example of smoking urges.
1. Mean urge to smoke (\(\mu\))
The mean describes the average of a subject’s urge to smoke across all 50 days. This can be thought of as a stable point around which any variation happens. Subjects can differ in their mean value: as shown in the left panel of the figure below, the subject in blue has a higher mean urge to smoke. In the right panel, the model is described mathematically. We model a subject’s mean urge to smoke (\(\mu_{i}\)) as the sum of a population-level mean (\(\gamma_{1}\)) and a subject-specific deviation from that population-level mean (\(\upsilon_{1i}\)). In such a hierarchical framework, population-level effects are called fixed effects and the sds (or variances) of subject-specific effects are called random effects. Note that you estimate sds of subject-specific effects, because estimating all separate effects is computationally expensive (read: impossible).
2. Autoregression (AR1; \(\phi\))
How much the urge to smoke on a given day differs from the mean urge across all 50 days is called a deviation. How much these deviations linger over time, that is, for how long subsequent urges stay above or below the subject’s mean urge, is captured by the autoregressive parameter. More formally, it captures how deviations from the mean at time t-1 (i.e., \(Urge_{(t-1)i} - \mu_i\)) relate to deviations at time t (i.e., \(Urge_{ti} - \mu_i\)), if you restructure the formula in the figure below (and ignore the noise \(\epsilon_{ti}\)). For example, when a subject with a high positive autoregressive parameter (blue in the left panel of the figure below) reports an urge to smoke that is higher than their mean urge, you can expect them to stay at that elevated level for longer than someone with a lower autoregressive parameter (red in the left panel of the figure). In the right panel, the statistical model again shows how a subject’s autoregressive parameter (\(\phi_{i}\)) is composed of a population-level autoregressive parameter (\(\gamma_{2}\)) and a subject-specific deviation (\(\upsilon_{2i}\)).
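In formula form, the restructured relation reads (with the noise term added back in):
\[Urge_{ti} - \mu_i = \phi_i\,(Urge_{(t-1)i} - \mu_i) + \epsilon_{ti}\]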
3. Residual Variance (\(\psi\))
After taking into account the variation in urges that is explained by autoregression, our model still doesn’t fully capture all fluctuations. That is, we still have residual variation left. For example, two people who have the same level of urges on average and the same autoregression may still differ in their day-to-day residual variation in their urges. In the left panel of the figure below, the two people shown are equal on all other parameters except their residual variance (blue has greater residual variance). As can be seen in the right panel of the figure below, the general structure of the residual variances is the same as for the mean and autoregression: a subject’s residual variance (\(\psi_{i}\)) is composed of a population-level parameter (\(\gamma_{3}\)) and a subject-specific deviation (\(\upsilon_{3i}\)). You can ignore the exponential term for now; we explain it in the Fitting a simple DSEM module.
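For reference, the three between-subject equations side by side (including the exponential term for the residual variance):
\[\begin{aligned} \mu_i &= \gamma_1 + \upsilon_{1i}, \\ \phi_i &= \gamma_2 + \upsilon_{2i}, \\ \psi_i &= \exp(\gamma_3 + \upsilon_{3i}). \end{aligned}\]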
To get started, go back to the top.
Get started
Before we start coding, let’s take care of some prerequisites. If you do not have R (R Core Team, 2024) and its user-friendly interface RStudio installed, please do so following the steps on this website.
Once installed, you also need to install the RStan package to enable R to interface with the Stan computing platform.
Exercise 0: Install RStan using the R console.
if(!require(rstan)){install.packages("rstan")}
require(rstan)
The next step is accessing the example data on smoking urges through GitHub and getting a feel for them.
Exercise 1: Load the data. What variables are in the data set? How many subjects? And how many time points?
dat <- read.csv("https://raw.githubusercontent.com/JessicaSchaaf/dsemworkshop/main/Data/Two-Level%20Data.csv",header=T) #load McNeish data
There are 6 variables in the data set, for 100 subjects across 50 time points. These variables are subjects’ smoking urges (‘urge’), their depressive symptoms (‘dep’), job stress (‘js’), home stress (‘hs’), subject ids (‘subject’), and time (‘time’).
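One way to check this yourself in the R console (a quick sketch using the variable names above):
str(dat) # variable names and types
length(unique(dat$subject)) # number of subjects: 100
max(dat$time) # number of time points: 50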
We are interested in the following research questions:
Q1. What are subjects’ average smoking urges?
Q2. How do urges linger over time?
Q3. How do these effects differ across subjects?
Q4. How do urges relate to concurrent depressive symptoms?
Let’s prepare the data to enable you to fit a DSEM to answer the research questions.
Exercise 2: First create an object Nsubj that contains the number of subjects and an object Nobs that contains the number of observations/time points.
Nsubj <- length(unique(dat$subject))
Nobs <- max(dat$time)
Exercise 3: Create two matrices of dimensions Nsubj x Nobs that contain the urges data (call this object urges) and the depression data (call this object dep).
urges <- matrix(dat$urge, Nsubj, Nobs, byrow=T) # one row per subject, one column per time point
dep <- matrix(dat$dep, Nsubj, Nobs, byrow=T)
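A quick sanity check on the reshaping (this assumes the rows of dat are ordered by subject and then by time, as in the example data): the first row of the matrix should equal the first subject’s time series.
all.equal(urges[1,], dat$urge[dat$subject == unique(dat$subject)[1]]) # should be TRUE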
Exercise 4: Plot the time series of smoking urges with time on the x-axis and urges, averaged across subjects, on the y-axis.
df = data.frame(time = 1:Nobs,
urges = tapply(dat$urge,dat$time,mean),
urgesSE = tapply(dat$urge,dat$time,sd)/sqrt(Nsubj))
if(!require(ggplot2)){install.packages("ggplot2")}
require(ggplot2)
ggplot(df, aes(x=time, y=urges)) +
geom_line(size=2) +
geom_ribbon(aes(ymin=urges - urgesSE, ymax=urges + urgesSE), alpha=0.3) +
labs(x="Time", y="Smoking Urges") +
theme_minimal() + ylim(-0.6,0.6)
To continue to the next module and learn about Stan models go back to the top.
Implementing a DSEM in Stan
Before getting into how to fit models in Stan, first some background on how to construct Stan models. You may or may not wish to alter or fully implement models yourself. Either way, understanding the basics may help you understand what is going on behind the scenes. A Stan model is built up from several blocks. A simple model often requires a data block, a parameters block, and a model block.
The data block
In the data block, you specify which data to run the model on. You provide these data from a different program, for example, R. A standard DSEM requires three types of data to be transferred from R to Stan: the number of subjects, the number of time points or observations, and the time series data. In Stan, for each variable, you also have to indicate the variable type and its dimensions. The number of subjects (which we call N) is just a single integer, as is the number of observations (which we call T). The time series data is a two-dimensional object of dimensions N x T. In Stan, this is implemented as follows:
data {
int<lower=1> N; // number of subjects
int<lower=1> T; // number of observations
array[N] vector[T] Y; // time series data
}
Important to note here is that Stan doesn’t require a certain set of variable names. In other words, you can name variables whatever you wish. Just be sure to be consistent; otherwise Stan will give you a semantic error (“Identifier ‘…’ not in scope.”).
The parameters block
In the parameters block, you specify which parameters to sample in the model. In a standard DSEM the parameters consist of the population-level mean, autoregression, and residual variance (which we call gammas, as is common notation in hierarchical modeling), and the subject-specific deviations (u's). They also consist of parameters needed to construct the covariance matrix of subject-specific deviations (which we call Omega). Although this is necessary to fit the model, we do not go into the details here to avoid overly mathematical discussion. The population-level parameters (gamma) are implemented as a vector of size 3 and the subject-specific deviations (u) as a two-dimensional object of size N x 3:
parameters {
vector[3] gamma; // population-level effects
array[N] vector[3] u; // subject-specific deviations
}
The model block
In the model block, you specify the model to fit on the data and, as Stan uses Bayesian estimation, the prior distributions for all parameters.
The model
In our case the model is a standard DSEM, which mathematically can be described as follows:
\[\begin{equation} Y(t) = \mu + \delta(t)\label{eq1}\tag{1} \end{equation}\]
where Y indicates the subject’s smoking urges across trials t, \(\mu\) (mu) is the mean urge, and \(\delta\) (delta) is the deviation of the observed urges from the mean urge on a specific trial.
The lingering of deviations from the mean urge over time is captured in an additional parameter \(\phi\) (phi), called the autoregressive effect. Mathematically, this lingering is formulated as
\[\begin{equation} \delta(t) = \phi\delta(t-1) + \epsilon(t)\label{eq2}\tag{2} \end{equation}\]
where \(\epsilon\) (epsilon) is the residual after taking lingering of deviations into account.
In Stan we implement Equation \(\ref{eq2}\) by specifying that the deviation for subject i on trial t (delta[i][t]) comes from a normal distribution with a mean equal to the autoregressive effect for that subject (phi[i] * delta[i][t-1]). In Stan, we also need to specify the standard deviation of this normal distribution. Do you remember from the Intro to DSEM module that one of the parameters of a DSEM is the residual variance? This is where that residual variance comes into play. You have just learned from Equation \(\ref{eq2}\) that the residuals are called \(\epsilon\). From this follows that the residual variance is var(\(\epsilon_{it}\)), which we call \(\psi\) (psi). However, as Stan requires standard deviations instead of variances, we need to take the square root of the residual variance (sqrt(psi[i])).
model {
for (i in 1:N) { // loop across subjects
for (t in 2:T) { // loop across observations
// likelihood of observing data given model parameters
target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
}
}
}
Likely, you have never encountered the target += notation in the code snippet above. It is just a way of telling Stan to incrementally update the log likelihood (hence _lpdf, which stands for log probability density function). It is mathematically equivalent to delta[i][t] ~ normal(phi[i] * delta[i][t-1], sqrt(psi[i])), which may look more familiar.
Also in the model block, we specify the likelihood of the subject-specific deviations. As you may remember from simple, or even multilevel, regression, deviations are usually assumed to be normally distributed with a mean of zero. Here we specify something similar. Specifically, we tell Stan that subject-specific deviations (u) are multivariate normally distributed with means of zero (rep_vector(0,3), which is a vector of three zeros) and covariance matrix Omega. This multivariate implementation enables you to assess whether subject-specific effects are correlated, for example, whether subjects with a higher mean have lower residual variance or vice versa.
model {
for (i in 1:N) {
// likelihood of subject-specific deviations
target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
}
}
Prior distributions
In the model block, you also specify prior distributions for the parameters in the parameters block. For the population-level parameters (gamma), we implement a normal distribution with a mean of zero and a standard deviation of 3.
model {
// prior on population-level parameters
target += normal_lpdf(gamma | 0, 3);
}
The transformed parameters and generated quantities blocks
You may also add a transformed parameters block (in which you specify parameter transformations) and a generated quantities block (in which you compute additional outputs such as posterior predictives). Importantly, the model block is local in scope: if you want to use variables or parameters in the generated quantities block, you need to define them in the transformed parameters block (instead of the model block). But that is for later.
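A minimal sketch of this scoping rule (a toy program with hypothetical names, not part of the DSEM itself):
transformed parameters {
real theta = 2.0; // visible in the model AND generated quantities blocks
}
model {
real tmp = 1.0; // local: NOT visible in generated quantities
}
generated quantities {
real theta_sq = theta^2; // fine, because theta is a transformed parameter
}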
Go back to the top for the real fun to start in the next module.
Fitting a simple DSEM
In the Get Started module, you have prepared all necessary data to fit a DSEM to the example data on smoking urges. Stan requires data to be formatted in a list. In addition, in the Stan model, we have used the names N for the number of subjects, T for the number of time points or observations, and Y for the time series data. To enable Stan to recognize which data are which, you need to use these names (N, T, and Y) in the list. We start simple and only focus on the first two RQs, that is, the mean smoking urge and how urges linger over time.
Exercise 5: Create a list (called datUrges) that contains the number of subjects (N), the number of observations (T), and the time series data (Y), which in this case is smoking urges.
datUrges <- list(N = Nsubj,
T = Nobs,
Y = urges)
We are now going to fit a DSEM with three parameters: mean, autoregression, and residual variance. As you may have noticed while plotting the data in the previous module, the urges data are centered around zero and have a fairly stable average over time. This indicates the data are detrended. As such, we do not have to worry about trends in our data and can omit a parameter capturing a trend from the DSEM. Beware that your own data may contain trends and that you should deal with this either before modeling (through detrending, as in our example) or by including a trend in the DSEM. Code for a DSEM including a trend is available here.
We have already programmed the three-parameter DSEM in Stan for you below. In the Implementing a DSEM in Stan module, we explained the core of what we implemented. For now, you have to believe that we coded the model correctly. Remember that in the model gamma indicates population-level effects. Specifically, gamma[1] is the population-level mean, gamma[2] the population-level autoregression, and gamma[3] the population-level residual variance.
mymodel = "
data {
int<lower=1> N; // number of subjects
int<lower=1> T; // number of observations
array[N] vector[T] Y; // time series data
}
parameters {
vector[3] gamma; // population-level effects
array[N] vector[3] u; // subject-specific deviations
// for covariance matrix of subject-specific deviations
cholesky_factor_corr[3] L_Omega; // Cholesky factor
vector<lower=0>[3] tau_Omega; // vector of standard deviations
}
transformed parameters {
// construct covariance matrix from Cholesky factors and standard deviations
corr_matrix[3] R_Omega; // correlation matrix
R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
// quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
cov_matrix[3] Omega = quad_form_diag(R_Omega, tau_Omega);
// subject-specific parameters
vector[N] mu; // mean
vector[N] phi; // autoregression
vector[N] psi; // residual variance
// object for deviation from mean
array[N] vector[T] delta;
for (i in 1:N) {
// to obtain subject-specific effects,
// sum population-level effects and subject-specific deviations
mu[i] = gamma[1] + u[i][1];
phi[i] = gamma[2] + u[i][2];
// note gamma[3] and u[i][3] are estimated on a log-scale
// to assume normal distribution which simplifies estimation
psi[i] = exp(gamma[3] + u[i][3]);
// calculate deviations by subtracting mean
delta[i] = Y[i] - mu[i];
}
}
model {
// prior distributions
// .. for the population-level parameters
target += normal_lpdf(gamma | 0, 3);
// .. for the Cholesky factor
target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
// .. for the vector of standard deviations
target += cauchy_lpdf(tau_Omega | 0, 2.5);
// likelihood
for (i in 1:N) {
// subject-specific deviations
target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
for (t in 2:T) {
// data given model parameters
target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
}
}
}
"
Exercise 6: Fit the three-parameter DSEM using the stan() function from the rstan package. The argument model_code should be set to mymodel, the object in which we programmed the model code. The data argument should contain the data list you created in exercise 5. In the pars argument you may specify (using a character vector) which parameters to keep track of. For now, keep track of population-level effects (gamma) and standard deviations of subject-specific effects (tau_Omega). Set the iter argument to 1000 iterations. And for reproducibility, set the seed argument to 42.
options(mc.cores = parallel::detectCores())
fitUrges <- stan(model_code = mymodel,
data = datUrges,
pars = c("gamma","tau_Omega"),
iter = 1000,
seed = 42)
Exercise 7: Print the fit results for the population-level effects using the print() function.
print(fitUrges, pars="gamma", probs = c(0.025, 0.975), digits = 2)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 97.5% n_eff Rhat
## gamma[1] 0.00 0.01 0.09 -0.18 0.16 78 1.07
## gamma[2] 0.18 0.00 0.02 0.14 0.22 441 1.01
## gamma[3] 0.67 0.01 0.07 0.52 0.82 103 1.03
##
## Samples were drawn using NUTS(diag_e) at Thu May 8 10:24:03 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
The output from exercise 7 gives you information on many things. As Stan uses Bayesian estimation, the mean indicates the mean of the posterior distribution of each parameter. This value is often reported as ‘the best estimate of the parameter’. The standard error of the mean (se_mean) indicates how stably the mean is estimated across iterations, with large values indicating unstable estimates. The standard deviation (sd) indicates how uncertain that mean is (the higher, the more uncertain). The following columns the print() function returns indicate quantiles. This is very helpful as you may use these to report credible intervals (similar to confidence intervals in frequentist statistics). For instance, the 2.5% and 97.5% quantiles give you the 95% credible interval. The final two columns can be used to check how well your model performs and how well parameters are estimated. n_eff stands for the number of effective samples, which indicates how effectively sampling was done. Without going into too much detail, the Rhat statistic (Gelman & Rubin, 1992) gives you an indication of model convergence. Ideally Rhats are 1; values above 1.1 are typically interpreted as indicating convergence problems.
Exercise 8: Report results on the first two RQs, that is, what the average smoking urge is and how urges linger over time.
If you did everything correctly, all effects have converged with R-hat values below 1.1 (the closer to 1 the better). You’ll also find that the average smoking urge is practically zero. This may seem odd at first, but remember that the data were detrended (and thus demeaned) beforehand, so it makes total sense. The autoregression is small but positive, indicating that experiencing more-than-average smoking urges yesterday relates to experiencing more-than-average smoking urges today (and vice versa for less-than-average urges).
Go back to the top to move on to investigating individual differences.
Investigating individual differences
Moving on to answering the third RQ, we’re now not interested in population-level effects, but in individual differences in these effects. Specifically, we test whether subjects differ in their mean smoking urge and how urges linger over time (both answered on the population level in exercise 8).
In exercise 6, you not only tracked population-level effects (gamma), but also standard deviations in subject-specific effects (tau_Omega) in the pars argument. These parameter estimates indicate to what extent subjects differ in their mean, autoregression, and residual variance. tau_Omega contains three values: tau_Omega[1] is the sd of subject-specific means, tau_Omega[2] of subject-specific autoregressions, and tau_Omega[3] of subject-specific residual variance.
Exercise 9: Use the print() function to get fit results for the standard deviations of subject-specific effects. Then report on the third RQ. Tip: Compute the 95% credible interval to assess in which range most of the means and autoregressions fall.
print(fitUrges, pars="tau_Omega", probs = c(0.025, 0.975), digits = 2)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 97.5% n_eff Rhat
## tau_Omega[1] 0.81 0 0.07 0.69 0.95 1448 1.00
## tau_Omega[2] 0.14 0 0.02 0.10 0.18 382 1.01
## tau_Omega[3] 0.71 0 0.06 0.61 0.83 2240 1.00
##
## Samples were drawn using NUTS(diag_e) at Thu May 8 10:24:03 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
Looking at the estimates of tau_Omega, there are substantial individual differences. To get a better feel for the scale of individual differences, you may calculate the 95% credible interval, and thus assess in which parameter range most of the individuals fall. To do so, first take the mean of the estimated parameter as indicated by gamma in the summary of exercise 7. Then add/subtract 1.96 times the standard deviation (as indicated by tau_Omega in the summary of exercise 9) to/from the mean of the estimated parameter. With respect to the mean, most individuals’ means range between -1.6 and 1.6, whereas most individuals’ autoregressions range between -.09 and .45.
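In code, using the point estimates copied from the summaries of exercises 7 and 9 (a sketch; plug in your own estimates):
gammaEst <- c(mean = 0.00, ar = 0.18) # population-level effects (exercise 7)
tauEst <- c(mean = 0.81, ar = 0.14) # sds of subject-specific effects (exercise 9)
rbind(lower = gammaEst - 1.96 * tauEst,
upper = gammaEst + 1.96 * tauEst)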
If you wish to investigate subject-specific effects (instead of their standard deviation), you need to keep track of those effects while fitting the model. The subject-specific means are called mu, the subject-specific autoregressions phi, and the residual variances psi. Beware that there are 3 times N subject-specific effects in the model, so keeping track of them makes model fitting take substantially longer. To speed up estimation, you may run options(mc.cores = parallel::detectCores()) in the R console to run sampling chains in parallel.
Exercise 10: Refit the model but now keep track of the three subject-specific effects. Use the same settings as in exercise 6, except for the pars argument.
options(mc.cores = parallel::detectCores())
fitUrges_ind <- stan(model_code = mymodel,
data = datUrges,
pars = c("mu","phi","psi"),
iter = 1000,
seed = 42)
You can extract the posterior samples for easy plotting with the extract() function. This returns an object that contains the posterior samples for the tracked parameters. Beware that the elements in the object have dimensions number of samples x number of parameters for population-level effects, and number of samples x number of participants for subject-specific effects.
Exercise 11: Extract the posterior samples and then plot the subject-specific estimates to visualize individual differences.
post = extract(fitUrges_ind)
df = data.frame(pp = 1:Nsubj,
param = rep(c("mu","phi","psi"),each=Nsubj),
est = c(apply(post$mu,2,mean),apply(post$phi,2,mean),apply(post$psi,2,mean)))
ggplot(df, aes(x=est, y=pp)) +
geom_point() +
facet_wrap(~ param, scales="free") +
labs(x="Estimate", y="Participant") +
ylim(c(1,Nsubj))
Exercise 12: Compare the plots to your own results in exercise 9. Are there substantial individual differences in mean urges and lingering of urges?
Looking good! Now let’s move on to adding covariates to answer the fourth RQ.
Adding covariates
The fourth RQ concerns a covariate, namely how concurrent depressive symptoms relate to smoking urges. This covariate needs to be added to the Stan model. We have already implemented such a covariate in a Stan model. Yet, it may be helpful to understand what needs to be changed to include an additional predictor. First, in the data block, we declared that we have additional data (which we called X) to be modeled. Second, in the parameters block, we added a population-level parameter for the covariate effect (an additional gamma). Third, also in the parameters block, we added subject-specific effects (u’s) for the covariate. Beware that effects can only differ between subjects when a predictor varies from trial to trial, not for subject-specific predictors. Fourth, in the model block, we changed the prior distribution of the subject-specific deviations to indicate that we now have four (instead of three) deviations per subject. And fifth, also in the model block, we changed the formula for the actual model, stating that the mean is now not only determined by the autoregressive effect (Equation \(\ref{eq2}\) in the Implementing a DSEM in Stan module), but also by the covariate.
The complete model is implemented in the object mymodelcov below.
mymodelcov = "
data {
int<lower=1> N; // number of subjects
int<lower=1> T; // number of observations
array[N] vector[T] Y; // time series data
array[N] vector[T] X; // covariate
}
parameters {
vector[4] gamma; // population-level effects
array[N] vector[4] u; // subject-specific deviations
// for covariance matrix of subject-specific deviations
cholesky_factor_corr[4] L_Omega; // Cholesky factor
vector<lower=0>[4] tau_Omega; // vector of standard deviations
}
transformed parameters {
// construct covariance matrix from Cholesky factors and standard deviations
corr_matrix[4] R_Omega; // correlation matrix
R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
// quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
cov_matrix[4] Omega = quad_form_diag(R_Omega, tau_Omega);
// subject-specific parameters
vector[N] mu; // mean
vector[N] phi; // autoregression
vector[N] psi; // residual variance
vector[N] alpha; // effect of covariate
// object for deviation from mean
array[N] vector[T] delta;
for (i in 1:N) {
// to obtain subject-specific effects,
// sum population-level effects and subject-specific deviations
mu[i] = gamma[1] + u[i][1];
phi[i] = gamma[2] + u[i][2];
// note gamma[3] and u[i][3] are estimated on a log-scale
// to assume normal distribution which simplifies estimation
psi[i] = exp(gamma[3] + u[i][3]);
alpha[i] = gamma[4] + u[i][4];
// calculate deviations by subtracting mean
delta[i] = Y[i] - mu[i];
}
}
model {
// prior distributions
// .. for the population-level parameters
target += normal_lpdf(gamma | 0, 3);
// .. for the Cholesky factor
target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
// .. for the vector of standard deviations
target += cauchy_lpdf(tau_Omega | 0, 2.5);
// likelihood
for (i in 1:N) {
// subject-specific deviations
target += multi_normal_lpdf(u[i] | rep_vector(0,4), Omega);
for (t in 2:T) {
// data given model parameters
target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1] + alpha[i] * X[i][t], sqrt(psi[i]));
}
}
}
"
Exercise 13: Extend the data list datUrges (created in exercise 5) to include object X, containing the depression data, and call this new data list datUrges_dep.
datUrges_dep <- list(N = Nsubj,
T = Nobs,
Y = urges,
X = dep)
Exercise 14: Fit a DSEM including the depression covariate. Keep track of the population-level means and sds. Use seed = 2094. Report on the fourth RQ, that is, whether depression relates to urges. Also compare estimates of the other parameters to the ones you reported in exercises 8 and 9.
options(mc.cores = parallel::detectCores())
fitUrges_dep <- stan(model_code = mymodelcov,
data = datUrges_dep,
pars = c("gamma","tau_Omega"),
iter = 1000,
seed = 2094)
print(fitUrges_dep, pars=c("gamma","tau_Omega"), probs = c(0.025, 0.975), digits = 2)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 97.5% n_eff Rhat
## gamma[1] -0.01 0.01 0.08 -0.16 0.15 252 1.00
## gamma[2] 0.19 0.00 0.02 0.16 0.23 737 1.00
## gamma[3] 0.05 0.00 0.05 -0.04 0.14 453 1.00
## gamma[4] 0.79 0.01 0.09 0.61 0.95 138 1.03
## tau_Omega[1] 0.77 0.00 0.06 0.66 0.91 2681 1.00
## tau_Omega[2] 0.14 0.00 0.02 0.12 0.18 928 1.01
## tau_Omega[3] 0.42 0.00 0.04 0.35 0.50 1861 1.00
## tau_Omega[4] 0.89 0.00 0.07 0.77 1.03 2925 1.00
##
## Samples were drawn using NUTS(diag_e) at Thu May 8 10:27:15 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
The answer is yes: more concurrent depressive symptoms relate to more smoking urges, as indicated by a relatively large positive regression coefficient under gamma[4]. Also note that the estimates of the mean and autoregression (gamma[1] and gamma[2]) have barely changed. The residual variance (gamma[3]) dropped substantially, indicating that residual variance is explained by including concurrent depression as a predictor.
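Because the residual variance is modeled on a log scale (psi = exp(gamma[3] + u[3]) in the transformed parameters block), you can compare the two fits on the variance scale by exponentiating the estimates:
exp(0.67) # population-level residual variance without the covariate, roughly 2.0
exp(0.05) # with depression as covariate, roughly 1.1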
Great! Let’s move on to more advanced DSEMs. Go back to the top to learn how to allow for asymmetric temporal dynamics and how to add predictors of model parameters.
Fitting a DYNASTI DSEM
We are also interested in an additional fifth RQ. That is, we wish to test whether temporal dynamics, in this case autoregressive effects, differ for above- and below-average levels of smoking urges. It may be that periods of high urges tend to persist over time whereas periods of low urges do not. Or even better, the opposite. To test these expectations, you may fit a DYnamics of ASymmetric TIme series (DYNASTI) version of the model in which you estimate separate autoregressions for above and below mean values.
Mathematically, the DYNASTI DSEM can be expressed as follows:
\[\begin{equation} \delta_{it} = \begin{cases} \phi_i^{\text{above}} \delta_{i,t-1} + \epsilon_{it} & \text{if } \delta_{i,t-1} > 0, \\ \phi_i^{\text{below}} \delta_{i,t-1} + \epsilon_{it} & \text{if } \delta_{i,t-1} \leq 0. \end{cases}\label{eq3}\tag{3} \end{equation}\]
The model thus determines whether the previous urge was above or below the mean (i.e., whether \(\delta_{i,t-1}\) is above or below 0) and subsequently uses \(\phi^{\text{above}}\) or \(\phi^{\text{below}}\) to explain the urge on the current trial.
In the model, several adjustments need to be made to allow for asymmetric temporal dynamics. We only changed the model, not the data we fit the model on, so the data block remained the same. However, the DYNASTI DSEM consists of four (instead of three) parameters: the population-level mean, two population-level autoregressive parameters (above and below the time series mean, see Equation \(\ref{eq3}\)), and the population-level residual variance. This additional population-level parameter (gamma) needs to be specified in the parameters block. That is, we change the number of gammas to 4. Similarly, we specify that we have subject-specific deviations (u’s) for an additional parameter. Another change, which we implemented in the transformed parameters block, is that we created an object phi_temp which we use to keep track of which autoregressive parameter (phi_above or phi_below) to use. We now also use this newly created object in the model specification in the model block. We have implemented these changes in mymodelasym.
mymodelasym = "
data {
int<lower=1> N; // number of subjects
int<lower=1> T; // number of observations
array[N] vector[T] Y; // time series data
}
parameters {
vector[4] gamma; // population-level effects
array[N] vector[4] u; // subject-specific deviations
// for covariance matrix of subject-specific deviations
cholesky_factor_corr[4] L_Omega; // Cholesky factor
vector<lower=0>[4] tau_Omega; // vector of standard deviations
}
transformed parameters {
// construct covariance matrix from Cholesky factors and standard deviations
corr_matrix[4] R_Omega; // correlation matrix
R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
// quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
cov_matrix[4] Omega = quad_form_diag(R_Omega, tau_Omega);
// subject-specific parameters
vector[N] mu; // mean
vector[N] phi_above; // autoregression (above mean)
vector[N] phi_below; // autoregression (below mean)
vector[N] psi; // residual variance
array[N] vector[T] delta; // object for deviation from mean
array[N] vector[T] phi_temp; // object to keep track of which autoregression to use
for (i in 1:N) {
// to obtain subject-specific effects,
// sum population-level effects and subject-specific deviations
mu[i] = gamma[1] + u[i][1];
phi_above[i] = gamma[2] + u[i][2];
phi_below[i] = gamma[3] + u[i][3];
// note gamma[4] and u[i][4] are estimated on a log-scale
// to assume normal distribution which simplifies estimation
psi[i] = exp(gamma[4] + u[i][4]);
// calculate deviation by subtracting mean
delta[i] = Y[i] - mu[i];
// make object in which autoregression is based on deviation
phi_temp[i][1] = 0;
for (t in 2:T) {
if(delta[i][t-1] > 0) // above mean
phi_temp[i][t] = phi_above[i];
else // below mean
phi_temp[i][t] = phi_below[i];
}
}
}
model {
// prior distributions
// .. for the population-level parameters
target += normal_lpdf(gamma | 0, 3);
// .. for the Cholesky factor
target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
// .. for the vector of standard deviations
target += cauchy_lpdf(tau_Omega | 0, 2.5);
// likelihood
for (i in 1:N) {
// subject-specific deviations
target += multi_normal_lpdf(u[i] | rep_vector(0,4), Omega);
for (t in 2:T) {
// data given model parameters
target += normal_lpdf(delta[i][t] | phi_temp[i][t] * delta[i][t-1], sqrt(psi[i]));
}
}
}
"
You’re now ready to fit the DYNASTI DSEM model (as implemented in mymodelasym). You do this in the same way you fitted the simple DSEM model in the Fitting a simple DSEM module.
Exercise 15: Fit the four-parameter DYNASTI DSEM using the stan() function from the rstan package. The argument model_code should be set to mymodelasym, the object with the implemented DYNASTI DSEM. The data argument should contain the data list you created in exercise 5. In the pars argument you may specify (using a character string) which parameters to keep track of. For now, keep track of population-level effects (gamma). Run 1000 iterations and use seed 23432.
options(mc.cores = parallel::detectCores())
fitUrgesasym <- stan(model_code = mymodelasym,
data = datUrges,
pars = "gamma",
iter = 1000,
seed = 23432)
Exercise 16: Print the fit results for the population-level effects using the print() function. Report results on the fifth research question: whether temporal dynamics differ for above- and below-average levels of smoking urges.
print(fitUrgesasym, pars="gamma", probs = c(0.025, 0.975), digits = 2)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 97.5% n_eff Rhat
## gamma[1] 0.02 0.01 0.10 -0.20 0.21 173 1.03
## gamma[2] 0.18 0.00 0.03 0.12 0.24 1625 1.00
## gamma[3] 0.17 0.00 0.04 0.09 0.23 632 1.00
## gamma[4] 0.66 0.01 0.07 0.52 0.81 213 1.01
##
## Samples were drawn using NUTS(diag_e) at Thu May 8 10:30:35 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
In the Fitting a simple DSEM module, we discussed how to interpret the output. If you are unsure, refer back to that section. All effects should have converged with R-hat values below 1.1 (the closer to 1 the better). In this case we are specifically interested in the population-level autoregressive effects (gamma[2] and gamma[3]). If you did everything correctly, you will have found similar values for the above- and below-average autoregressive effects. The small positive values mean that more-than-average urges and less-than-average urges similarly tend to persist over time.
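If you want to go beyond eyeballing the two estimates, one option (a sketch, assuming you tracked gamma as in exercise 15) is to compare the posterior draws of the two autoregressions directly:
post_asym <- extract(fitUrgesasym)
diffAR <- post_asym$gamma[,2] - post_asym$gamma[,3] # above minus below
mean(diffAR) # posterior mean of the difference
quantile(diffAR, c(0.025, 0.975)) # 95% credible interval of the difference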
Assessing model fit
Some of you may want to assess model fit through fit indices or to visualize how well the model describes the observed data. As a model fit index, you may compute Bayes Factors. This can be done using the bridgesampling package. Example code is given below.
if(!require(bridgesampling)){install.packages("bridgesampling")}
require(bridgesampling)
# Compute marginal likelihoods
# (note: bridge_sampler() needs posterior draws for all model parameters,
# so you may need to refit both models without restricting the pars argument)
fitsym = bridge_sampler(fitUrges)
fitasym = bridge_sampler(fitUrgesasym)
# Compute Bayes Factor
bf(fitsym, fitasym)
There are two ways to get posterior predictives in order to visualize model fit. The first is to extract the parameter estimates obtained from fitting the DYNASTI DSEM to the behavioral data (as you have done before using the print() function) and subsequently use these estimates to simulate data. To accomplish this, you have to write simulation code in R yourself (or adjust our script on OSF!). The second is to let Stan generate posterior predictives for you. To accomplish this, you may add a generated quantities block to the Stan model code. For the DYNASTI DSEM, this would be
generated quantities {
// obtain posterior predictives (to check model fit)
array[N] vector[T] delta_pred;
array[N] vector[T] Ypred;
for (i in 1:N) {
delta_pred[i][1] = normal_rng(0, sqrt(psi[i]));
Ypred[i][1] = mu[i] + delta_pred[i][1]; // first time point, no previous deviation
for (t in 2:T) {
// note: conditions on the observed delta[i][t-1], i.e., one-step-ahead predictions
delta_pred[i][t] = normal_rng(phi_temp[i][t] * delta[i][t-1], sqrt(psi[i]));
Ypred[i][t] = mu[i] + delta_pred[i][t];
}
}
}
in which you generate random values (hence _rng, which stands for random number generator), in our case deviations from the mean, from a normal distribution based on the parameter estimates. By adding the subject-specific mean (mu[i]) to these deviations you get posterior predictives, which we call Ypred. As this involves generating values, it may take a while to run a model including this generated quantities block. Once finished you may extract the posterior predictives from the fit object, for example, by running post <- extract(fitUrgesasym) and then post$Ypred. Don’t forget to tell Stan to keep track of these predictives by adding Ypred to the pars argument.
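As an illustration, a minimal sketch of how you might then compare observed and predicted average time series (this assumes you refitted the DYNASTI DSEM with the generated quantities block added and Ypred tracked):
post <- extract(fitUrgesasym)
Ypred_mean <- apply(post$Ypred, c(2,3), mean) # average over posterior draws; dimensions N x T
plot(1:Nobs, colMeans(urges), type="l", xlab="Time", ylab="Smoking Urges") # observed
lines(1:Nobs, colMeans(Ypred_mean), lty=2) # predicted, dashed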
Go back to the top for the final module on adding predictors of model parameters.
Predictors of DSEM parameters
A final extension discussed in this tutorial is how to include predictors of DSEM parameters. You may be interested in, for example, how depressed participants’ temporal dynamics differ from those of their healthy counterparts. Or, in case of a continuous predictor, whether autoregressive effects increase or decrease with age. You may first fit the model and then perform a t-test or regression analysis on the resulting parameter estimates. This two-step approach, however, is biased towards finding an effect (see Boehm et al., 2018). Therefore, it is a better idea to incorporate predictors within the model.
Let us investigate how job stress relates to mean smoking urges and to the autoregressive effect.
Exercise 17: Create a new data list called datUrgesJS that contains the number of subjects N, the number of observations T, the urges data Y, and a vector, called JS, that contains the job stress per subject.
datUrgesJS <- list(N = Nsubj,
T = Nobs,
Y = urges,
JS = tapply(dat$js,dat$subject,unique))
Let us take the standard symmetric DSEM as a starting point and extend this model with job stress as a predictor of mean smoking urges and temporal dynamics. In the data block, we declared that we now have a vector JS that contains the job stress data. In the parameters block, we changed the number of population-level effects (gammas) as we now have five such effects: the mean, the autoregression, the residual variance, and the roles of job stress in the mean and autoregression. The magic happens in the transformed parameters block where we added effects (gamma[4] and gamma[5]) of job stress (JS).
mymodelpred = "
data {
int<lower=1> N; // number of subjects
int<lower=1> T; // number of observations
array[N] vector[T] Y; // time series data
array[N] real JS; // job stress data
}
parameters {
vector[5] gamma; // population-level effects
array[N] vector[3] u; // subject-specific deviations
// for covariance matrix of subject-specific deviations
cholesky_factor_corr[3] L_Omega; // Cholesky factor
vector<lower=0>[3] tau_Omega; // vector of standard deviations
}
transformed parameters {
// construct covariance matrix from Cholesky factors and standard deviations
corr_matrix[3] R_Omega; // correlation matrix
R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
// quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
cov_matrix[3] Omega = quad_form_diag(R_Omega, tau_Omega);
// subject-specific parameters
vector[N] mu; // mean
vector[N] phi; // autoregression
vector[N] psi; // residual variance
// object for deviation from mean
array[N] vector[T] delta;
for (i in 1:N) {
// to obtain subject-specific effects,
// sum population-level effects and subject-specific deviations
mu[i] = gamma[1] + gamma[4]*JS[i] + u[i][1];
phi[i] = gamma[2] + gamma[5]*JS[i] + u[i][2];
// note gamma[3] and u[i][3] are estimated on a log-scale
// to assume normal distribution which simplifies estimation
psi[i] = exp(gamma[3] + u[i][3]);
// calculate deviations by subtracting mean
delta[i] = Y[i] - mu[i];
}
}
model {
// prior distributions
// .. for the population-level parameters
target += normal_lpdf(gamma | 0, 3);
// .. for the Cholesky factor
target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
// .. for the vector of standard deviations
target += cauchy_lpdf(tau_Omega | 0, 2.5);
// likelihood
for (i in 1:N) {
// subject-specific deviations
target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
for (t in 2:T) {
// data given model parameters
target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
}
}
}
"
Exercise 18: Report on the expected role of job stress in mean smoking urges and autoregressive effects.
Exercise 19: Fit the model including predictors using the stan() function. Keep track of population-level effects (gammas), set the iter argument to 1000 iterations, and set the seed argument to 27. Report on the results.
options(mc.cores = parallel::detectCores())
fitUrges_pred <- stan(model_code = mymodelpred,
data = datUrgesJS,
pars = "gamma",
iter = 1000,
seed = 27)
print(fitUrges_pred, pars="gamma", probs = c(0.025, 0.975), digits = 2)
## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 97.5% n_eff Rhat
## gamma[1] 0.03 0 0.08 -0.13 0.18 424 1.00
## gamma[2] 0.19 0 0.02 0.15 0.22 2207 1.00
## gamma[3] 0.66 0 0.07 0.51 0.80 315 1.02
## gamma[4] 0.40 0 0.10 0.21 0.59 417 1.00
## gamma[5] 0.10 0 0.02 0.05 0.14 1119 1.00
##
## Samples were drawn using NUTS(diag_e) at Thu May 8 10:32:29 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
If everything went right, you’ll see that gamma[4] is positive. This means that subjects who experience more job stress also experience higher smoking urges. Similarly, gamma[5] is positive, suggesting that for subjects who experience more job stress, smoking urges tend to linger more. Does this correspond to your expectations as formulated in exercise 18?
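To get a feel for what gamma[5] implies, you can plug some job-stress values into the formula for the subject-specific autoregression (phi[i] = gamma[2] + gamma[5]*JS[i]; the JS values below are hypothetical):
0.19 + 0.10 * c(-1, 0, 1) # implied autoregression at low, average, and high job stress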
This is the end of the tutorial. If you run into any problems or if you have questions, feel free to contact Jessica Schaaf at jessica.schaaf@radboudumc.nl. Suggestions for additional modules are also welcome. Hope you had a blast!