DSEM workshop

Welcome to this tutorial on Dynamic Structural Equation Models (DSEMs; Asparouhov et al., 2018; McNeish et al., 2020) and how to fit them in Stan! This tutorial requires basic R skills. Some experience with time series modeling and/or Stan is helpful but not required. After completing this tutorial, you will be able to understand the basics of DSEM and Stan, to fit a Standard DSEM and a DYNASTI DSEM with fixed and random effects in Stan, and to add covariates and predictors of parameters.

Below you see eight tabs. The first tab Intro to DSEM contains information on DSEMs and their parameters. The second tab Get started contains R code needed to use Stan and code to load and format example data. The third tab Implementing a DSEM in Stan contains some background information on Stan models. The rest of the tabs contain step-by-step exercises to get familiar with fitting DSEMs in Stan.

Content

Intro to DSEM

This is a short introduction to DSEM, with pictures!

DSEM is used to analyze time series data. Time series data has multiple observations for a given subject on one or multiple variables. For example, if we measure John’s urge to smoke every day for 50 days we will get the kind of data we need. DSEM is also useful for when we have multiple observations for multiple subjects. Thus, it is instructive to think of DSEM as being composed of two models: (1) a within-subject model, describing a subject’s mean and variation around that mean that happens over time and (2) a between-subject model, capturing how these subject-specific characteristics differ between subjects. The number of parameters in a DSEM can vary depending on the substantive question you would like to answer. In this workshop we will go over the most popular DSEM which consists of three parameters. In the figure below you can see a visual representation of the three-parameter DSEM. We will break down each of these parameters below using our running example of smoking urges.

1. Mean urge to smoke ($\mu$)

The mean describes the average of a subject’s urge to smoke across all 50 days. This can be thought of as a stable point around which any variation happens. Subjects can differ in their mean value: as shown in the left panel of the figure below, the subject in blue has a higher mean urge to smoke. In the right panel, the model is described mathematically. We model a subject’s mean urge to smoke ($\mu_{i}$) as the sum of a population-level mean ($\gamma_{1}$) and a subject-specific deviation from that population-level mean ($\upsilon_{1i}$). In such a hierarchical framework, population-level effects are called fixed effects and the sds (or variances) of subject-specific effects are called random effects. Note that you estimate sds of subject-specific effects, because estimating all separate effects is computationally expensive (read: impossible).

2. Autoregression (AR1; $\phi$)

How much the urge to smoke on a given day differs from the mean urge across all 50 days is called a deviation. How much these deviations linger over time, that is, for how long subsequent urges will stay above or below the subject’s mean urge, is captured by the autoregressive parameter. More frmally, it captures how deviations for the mean on time t-1 (i.e., $Urge_{(t-1)i} - \mu_i$) relate to deviations on time t (i.e., $Urge_{ti} - \mu_i$ if you restructure the formula in the figure below (and ignore the noise $\epsilon_{ti}$)). For example, when a subject with a high positive autoregressive parameter (blue in left panel of the figure below) reports an urge to smoke that is higher than their mean urge, you can expect them to stay at that elevated level for longer than someone with a lower autoregressive parameter (red in left panel of figure). In the right panel, the statistical model again shows how a subject’s autoregressive parameter ($\phi_{i}$) is composed of a population-level autoregressive parameter ($\gamma_{2}$) and a subject-specific deviation ($\upsilon_{2i}$).

3. Residual Variance ($\psi$)

After taking into account the variation in urges that is explained by autoregression, our model still doesn’t fully capture all fluctuations. That is, we still have residual variation left. For example, two people who have the same level of urges on average and the same autoregression, may still differ in their day-to-day residual variation in their urges. In the left panel of the figure below, the two people shown are equal on all other parameters except their residual variance (blue has greater residual variance). As can be seen in the right panel of the figure below, the general structure of the residual variances is the same as for the mean and autoregression: a subject’s residual variance ($\psi_{i}$) is composed of a population-level parameter ($\gamma_{3}$) and a subject-specific deviation ($\upsilon_{3i}$). You can ignore the exponential term for now, we explain it in the Fitting a simple DSEM module.

To get started, go back to the top.

Get started

Before we start coding, let’s take care of some prerequisites. If you do not have R (R Core Team, 2024) and its user-friendly interface RStudio installed, please do so following the steps on this website.

Once installed, you also need to install the RStan package to enable R to use the computing platform.

Exercise 0: Install RStan using the R console.

if(!require(rstan)){install.packages("rstan")}
require(rstan)

The next step is accessing the example data on smoking urges through GitHub and getting a feel for them.

Exercise 1: Load the data. What variables are in the data set? How many subjects? And how many time points?

dat <- read.csv("https://raw.githubusercontent.com/JessicaSchaaf/dsemworkshop/main/Data/Two-Level%20Data.csv",header=T) #load McNeish data

There are 6 variables in the data set, for 100 subjects across 50 time points. These variables are subject’s smoking urges (‘urge’), their depressive symptoms (‘dep’), job stress (‘js’), home stress (‘hs’), subject ids (‘subject’), and time (‘time’).

We are interested in the following research questions
Q1. What are subject’s average smoking urges?
Q2. How do urges linger over time?
Q3. How do these effects differ across subjects? and
Q4. How do urges relate to concurrent depressive symptoms?

Let’s prepare the data to enable you to fit a DSEM to answer the research questions.

Exercise 2: First create an object Nsubj that contains the number of subjects and an object Nobs that contains the number of observations/time points.

Nsubj <- length(unique(dat$subject))
Nobs <- max(dat$time)

Exercise 3: Create two matrices of dimensions Nsubj x Nobs that contain the urges data (call this object urges) and the depression data (call this object dep).

urges <- matrix(dat$urge, Nsubj, Nobs, byrow=T)
dep <- matrix(dat$dep, Nsubj, Nobs, byrow=T)

Exercise 4: Plot the time series of smoking urges with time on the x-axis and urges, averaged across subjects, on the y-axis.

df = data.frame(time    = 1:Nobs,
                urges   = tapply(dat$urge,dat$time,mean),
                urgesSE = tapply(dat$urge,dat$time,sd)/sqrt(Nsubj))

if(!require(ggplot2)){install.packages("ggplot2")}
require(ggplot2)

ggplot(df, aes(x=time, y=urges)) +
  geom_line(size=2) +
  geom_ribbon(aes(ymin=df$urges - df$urgesSE, ymax=df$urges + df$urgesSE), alpha=0.3) +
  labs(x="Time", y="Smoking Urges") +
  theme_minimal() + ylim(-0.6,0.6)

To continue to the next module and learn about Stan models go back to the top.

Implementing a DSEM in Stan

Before getting into how to fit models in Stan, first some background on how to construct Stan models. You may or may not wish to alter or fully implement models yourself. Either way, understanding the basics may help understand what is going on behind the scenes. A Stan model is built up using several blocks. A simple model often requires a data block, a model block, and a parameters block.

The data block

In the data block, you specify which data to run the model on. You provide these data from a different program, for example, R. A standard DSEM requires three types of data to be transferred from R to Stan: the number of subjects, the number of time points or observations, and the time series data. In Stan, for each variable, you also have to indicate the variable type and its dimensions. The number of subjects (which we call N) is just a single integer, as is the number of observations (which we call T). The time series data is a two-dimensional object of dimensions N x T. In Stan, this is implemented as follows:

data {
  int<lower=1> N;             // number of subjects
  int<lower=1> T;             // number of observations
  array[N] vector[T] Y;       // time series data
}

Important to note here is that Stan doesn’t require a certain set of variable names. In other words, you can call variables as you wish. Just be sure to be consistent; otherwise Stan will give you a semantic error (“Identifier ‘…’ not in scope.”).

The parameters block

In the parameters block, you specify which parameters to sample in the model. In a standard DSEM the parameters consist of the population-level mean, autoregression, and residual variance (which we call gammas as is common notation in hierarchical modeling), and the subject-specific deviations (u's). They also consist of parameters needed to construct the covariance matrix of subject-specific deviations (which we call Omega). Although this is necessary to fit the model, we do not go into the details here to avoid overly mathematical discussion. The population-level parameters (gamma) are implemented as a vector of size 3 and the subject-specific deviations (u) as a two-dimensional object of size N x 3:

parameters {
  vector[3] gamma;            // population-level effects
  array[N] vector[3] u;       // subject-specific deviations
}

The model block

In the model block, you specify the model to fit on the data and, as Stan uses Bayesian estimation, the prior distributions for all parameters.

The model

In our case the model is a standard DSEM, which mathematically can be described as follows:

\[\begin{equation} Y(t) = \mu + \delta(t)\label{eq1}\tag{1} \end{equation}\]

where Y indicates the subject’s smoking urges across trials t, $\mu$ (mu) is the mean urge, and $\delta$ (delta) is the deviation of the observed urges from the mean urge on a specific trial.

The lingering of deviations from the mean urge over time is captured in an additional parameter $\phi$ (phi), called the autoregressive effect. Mathematically, this lingering is formulated as

\[\begin{equation} \delta(t) = \phi\delta(t-1) + \epsilon(t)\label{eq2}\tag{2} \end{equation}\]

where $\epsilon$ (epsilon) is the residual after taking lingering of deviations into account.

In Stan we implement Equation $\ref{eq2}$ by specifying that the deviation for subject i on trial t (delta[i][t]) comes from a normal distribution with a mean equal to the autoregressive effect for that subject (phi[i] * delta[i][t-1]). In Stan, we also need to specify the standard deviation of this normal distribution. Do you remember from the Intro to DSEM module that one of the parameters of a DSEM is the residual variance? This is where this residual variance comes into play. You have just learned from Equation $\ref{eq2}$ that the residuals are called $\epsilon$. From this follows that the residual variance is var($\epsilon_{it}$), which we call $\psi$ (psi). However, as Stan requires standard deviations instead of variances, we need to take the squared root of the residual variance (sqrt(psi[i])).

model {
  for (i in 1:N) {            // loop across subjects
    for (t in 2:T) {          // loop across observations
      // likelihood of observing data given model parameters
      target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
    }
  }
}

Likely, you have never encountered the target += notation in the code snippet above. It is just a way of telling Stan to incrementally update the log likelihood (hence _lpdf which stands for log probability density function). It is mathematically equivalent to delta[i][t] ~ normal(phi[i] * delta[i][t-1], sqrt(psi[i])), which may look more familiar.

Also in the model block, we specify the likelihood of the subject-specific deviations. As you may remember from simple, or even multilevel, regression, deviations are usually assumed normally distributed with a mean of zero. Here we specify something similar. Specifically, we tell Stan that subject-specific deviations (u) are multivariate normally distributed with means of zero (rep_vector(0,3), which is a vector of three zeros) and covariance matrix Omega. This multivariate implementation enables you to assess whether subject-specific effects are correlated. For example, whether subjects with a higher mean have lower residual variance or vice versa.

model {
  for (i in 1:N) {
    // likelihood of subject-specific deviations
    target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
  }
}

Prior distributions

In the model block, you also specify prior distributions for the parameters in the parameters block. For the population-level parameters (gamma), we implement a normal distribution with a mean of zero and a standard deviation of 3.

model {
  // prior on population-level parameters
  target += normal_lpdf(gamma | 0, 3);
}

The transformed parameters and generated quantities blocks

You may also add a transformed parameters block (in which you specify parameter transformations), and a generated quantities block (in which you compute additional outputs such as posterior predictives). Importantly, the model block runs locally. So if you want to use variables or parameters in the generated quantities block, you need to specify these in the transformed parameters block (instead of the model block). But that is for later.

Go back to the top for the real fun to start in the next module.

Fitting a simple DSEM

In the Get Started module, you have prepared all necessary data to fit a DSEM to the example data on smoking urges. Stan requires data to be formatted in a list. Besides, in the Stan model, we have used the names N for the number of subjects, T for the number of time points or observations, and Y for the time series data. To enable Stan to recognize which data are which, you need to use these names (N, T, and Y) in the list. We start simple and only focus on the first two RQs, that is, mean smoking urge and how urges linger over time.

Exercise 5: Create a list (called datUrges) that contains the number of subjects (N), the number of observations (T), and the time series data (Y) which in this case is smoking urges.

datUrges <- list(N = Nsubj,
                 T = Nobs,
                 Y = urges)

We are now going to fit a DSEM with three parameters: mean, autoregression, and residual variance. As you may have noticed while plotting the data in the previous module, the urges data are centered around zero and have a fairly stable average over time. This indicates the data are detrended. As such, we do not have to worry about trends in our data and can omit a parameter capturing this trend in the DSEM. Beware that your own data may contain trends and that you should either deal with this before modeling (through detrending like in our example) or by including a trend in the DSEM. Code for a DSEM including trend is available here.

We have already programmed the three-parameter DSEM in Stan for you below. In the Implementing a DSEM in Stan module, we explained the core of what we implemented. For now, you have to believe that we coded the model correctly. Remember that in the model gamma indicates population-level effects. Specifically, gamma[1] is the population-level mean, gamma[2] the population-level autoregression, and gamma[3] the population-level residual variance.

mymodel = "
data {
  int<lower=1> N;           // number of subjects
  int<lower=1> T;           // number of observations
  array[N] vector[T] Y;         // time series data
}

parameters {
  vector[3] gamma;              // population-level effects
  array[N] vector[3] u;         // subject-specific deviations

  // for covariance matrix of subject-specific deviations
  cholesky_factor_corr[3] L_Omega;  // Cholesky factor
  vector<lower=0>[3] tau_Omega;     // vector of standard deviations
}

transformed parameters {
  // construct covariance matrix from Cholesky factors and standard deviations
  corr_matrix[3] R_Omega; // correlation matrix
  R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
  // quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
  cov_matrix[3] Omega = quad_form_diag(R_Omega, tau_Omega);
  
  // subject-specific parameters
  vector[N] mu;             // mean
  vector[N] phi;            // autoregression
  vector[N] psi;            // residual variance

  // object for deviation from mean
  array[N] vector[T] delta; 

  for (i in 1:N) {
    // to obtain subject-specific effects, 
    // sum population-level effects and subjects-specific deviations
    mu[i] = gamma[1] + u[i][1];
    phi[i] = gamma[2] + u[i][2];
    // note gamma[3] and u[i][3] are estimated on a log-scale
    // to assume normal distribution which simplifies estimation
    psi[i] = exp(gamma[3] + u[i][3]);
    // calculate deviations by subtracting mean
    delta[i] = Y[i] - mu[i];
  }
}

model {
  // prior distributions
  // .. for the population-level parameters
  target += normal_lpdf(gamma | 0, 3); 
  // .. for the Cholesky factor
  target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
  // .. for the vector of standard deviations
  target += cauchy_lpdf(tau_Omega | 0, 2.5);

  // likelihood
  for (i in 1:N) {
    // subject-specific deviations
    target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
    
    for (t in 2:T) {
      // data given model parameters
      target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
    }
  }
}
"

Exercise 6: Fit the three-parameter DSEM using the stan() function from the rstan package. The argument model_code should be set to mymodel, the object in which we programmed the model code. The data argument should contain the data list you created in exercise 5. In the pars argument you may specify (using a character string) which parameters to keep track of. For now, keep track of population-level effects (gamma) and standard deviations of subject-specific effects (tau_Omega). Set the iter argument to 1000 iterations. And for reproducibility, set the seed argument to 42.

options(mc.cores = parallel::detectCores())
fitUrges <- stan(model_code = mymodel,
                 data       = datUrges,
                 pars       = c("gamma","tau_Omega"),
                 iter       = 1000,
                 seed       = 42)

Exercise 7: Print the fit results for the population-level effects using the print() function.

print(fitUrges, pars="gamma", probs = c(0.025, 0.975), digits = 2)

## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##          mean se_mean   sd  2.5% 97.5% n_eff Rhat
## gamma[1] 0.00    0.01 0.09 -0.18  0.16    78 1.07
## gamma[2] 0.18    0.00 0.02  0.14  0.22   441 1.01
## gamma[3] 0.67    0.01 0.07  0.52  0.82   103 1.03
## 
## Samples were drawn using NUTS(diag_e) at Thu May  8 10:24:03 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

The output from exercise 7, gives you information on many things. As Stan uses Bayesian estimation, the mean indicates the mean of the posterior distribution of each parameter. This value is often reported as ‘the best estimate of the parameter’. The standard error of the mean (se_mean) indicates how stable the mean is estimated across iterations, with large values indicating unstable estimates. The standard deviation (sd) indicates how uncertain that mean is (the higher the more uncertain). The following columns the print() function returns indicate quantiles. This is very helpful as you may use these to report credible intervals (similar to confidence intervals in frequentist statistics). For instance, the 2.5% and 97.5% quantiles give you the 95% credible interval. The final two columns can be used to check how well your model performs and how well parameters are estimated. n_eff stand for number of effective samples which indicates how effective sampling was done. Without going into too much detail, the Rhat statistic (Gelman & Rubin, 1992) gives you an indication of model convergence. Ideally Rhats are 1; values above 1.1 are typically interpreted as convergence problems.

Exercise 8: Report results on the first two RQs, that is, what the average smoking urge is and how urges linger over time.

If you did everything correctly all effects have converged with R-hat values below 1.1 (the closer to 1 the better). You’ll also find that the average smoking urge is practically zero. This may seem odd at first, but remember that data were detrended (and thus demeaned) beforehand so it makes total sense. The autoregression is small but positive, indicating that experiencing more than average smoking urges yesterday relates to experiencing more than average smoking urges today (and vice versa for less than average urges).

Go back to the top to move on to investigating individual differences.

Investigating individual differences

Moving on to answering the third RQ, we’re now not interested in population-level effects, but in individual differences in these effects. Specifically, we test whether subjects differ in their mean smoking urge and how urges linger over time (both answered on the population level in exercise 8).

In exercise 6, you not only tracked population-level effects (gamma), but also standard deviations in subject-specific effects (tau_Omega) in the pars argument. These parameter estimates indicate to what extent subjects differ in their mean, autoregression, and residual variance. tau_Omega contains three values: tau_Omega[1] is the sd of subject-specific means, tau_Omega[2] of subject-specific autoregressions, and tau_Omega[3] of subject-specific residual variance.

Exercise 9: Use the print() function to get fit results for the standard deviations of subject-specific effects. Then report on the third RQ. Tip: Compute the 95% credible interval to assess in which range most of the means and autoregressions fall.

print(fitUrges, pars="tau_Omega", probs = c(0.025, 0.975), digits = 2)

## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##              mean se_mean   sd 2.5% 97.5% n_eff Rhat
## tau_Omega[1] 0.81       0 0.07 0.69  0.95  1448 1.00
## tau_Omega[2] 0.14       0 0.02 0.10  0.18   382 1.01
## tau_Omega[3] 0.71       0 0.06 0.61  0.83  2240 1.00
## 
## Samples were drawn using NUTS(diag_e) at Thu May  8 10:24:03 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

Looking at the estimates of tau_Omega there are substantial individual differences. To get a better feel for the scale of individual differences, you may calculate the 95% credible interval, and thus assess in which parameter range most of the individuals fall. To do so, first take the mean of the estimated parameter as indicated by gamma in the summary of exercise 7. Then add/subtract 1.96 times the standard deviation (as indicated by tau_Omega in the summary of exercise 9) to/from the mean of the estimated parameter. With respect to the mean, most individuals’ means range between -1.6 and 1.6, whereas most individuals’ autoregressions range between -.09 and .45.

If you wish to investigate subject-specific effects (instead of their standard deviation), you need to keep track of those effects while fitting the model. The subject-specific means are called mu, the subject-specific autoregressions phi, and the residual variances psi. Beware that there are 3 times N subject-specific effects in the model so keeping track of them makes model fitting take substantially longer. To speed up estimation, you may run options(mc.cores = parallel::detectCores()) in the R console to run sampling chains in parallel.

Exercise 10: Refit the model but now keep track of the three subject-specific effects. Use the same settings as in exercise 6, except for the pars argument.

options(mc.cores = parallel::detectCores())
fitUrges_ind <- stan(model_code = mymodel,
                     data       = datUrges,
                     pars       = c("mu","phi","psi"),
                     iter       = 1000,
                     seed       = 42)

You can extract the posterior samples for easy plotting with the extract() function. This returns an object that contains the posterior samples for the tracked parameters. Beware that the elements in the object have dimensions number of samples x number of parameters for population-level effects, and number of samples x number of participants for subject-specific effects.

Exercise 11: Extract the posterior samples and then plot the subject-specific estimates to visualize individual differences.

post = extract(fitUrges_ind)

df = data.frame(pp    = 1:Nsubj,
                param = rep(c("mu","phi","psi"),each=Nsubj),
                est   = c(apply(post$mu,2,mean),apply(post$phi,2,mean),apply(post$psi,2,mean)))

ggplot(df, aes(x=est, y=pp)) +
  geom_point() +
  facet_wrap(~ param, scales="free") +
  labs(x="Estimate", y="Participant") +
  ylim(c(1,Nsubj))

Exercise 12: Compare the plots to your own results in exercise 9. Are there substantial individual differences in mean urges and lingering of urges?

Looking good! Now let’s move on to adding covariates to answer the fourth RQ.

Adding covariates

The fourth RQ concerns a covariate, namely how concurrent depressive symptoms relate to smoking urges. This covariate needs to be added to the Stan model. We have already implemented such a covariate in a Stan model. Yet, it may be helpful to understand what needs to be changed to include an additional predictor. First, in the data block, we declared that we have additional data (which we called X) to be modeled. Second, in the parameters block, we added a population-level parameter for the covariate effect (an additional gamma). Third, also in the parameters block, we added subject-specific effects (u’s) for the covariate. Beware that effects can only differ between subjects when a predictor varies from trial-to-trial, not for subject-specific predictors. Fourth, in the model block, we changed the prior distribution of the subject-specific effects to indicate that we now have four (instead of three) deviations per subject. And fifth, also in the model block, we changed the formula for the actual model, stating that the mean is now not only determined by the autoregressive effect (Equation $\ref{eq2}$ in the Implementing a DSEM in Stan module), but also by the covariate.

The complete model is implemented in the object mymodelcov below.

mymodelcov = "
data {
  int<lower=1> N;               // number of subjects
  int<lower=1> T;               // number of observations
  array[N] vector[T] Y;         // time series data
  array[N] vector[T] X;         // covariate
}

parameters {
  vector[4] gamma;                  // population-level effects
  array[N] vector[4] u;             // subject-specific deviations

  // for covariance matrix of subject-specific deviations
  cholesky_factor_corr[4] L_Omega;  // Cholesky factor
  vector<lower=0>[4] tau_Omega;     // vector of standard deviations
}

transformed parameters {
  // construct covariance matrix from Cholesky factors and standard deviations
  corr_matrix[4] R_Omega; // correlation matrix
  R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
  // quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
  cov_matrix[4] Omega = quad_form_diag(R_Omega, tau_Omega);
  
  // subject-specific parameters
  vector[N] mu;             // mean
  vector[N] phi;            // autoregression
  vector[N] psi;            // residual variance
  vector[N] alpha;          // effect of covariate

  // object for deviation from mean
  array[N] vector[T] delta; 

  for (i in 1:N) {
    // to obtain subject-specific effects, 
    // sum population-level effects and subjects-specific deviations
    mu[i] = gamma[1] + u[i][1];
    phi[i] = gamma[2] + u[i][2];
    // note gamma[3] and u[i][3] are estimated on a log-scale
    // to assume normal distribution which simplifies estimation
    psi[i] = exp(gamma[3] + u[i][3]);
    alpha[i] = gamma[4] + u[i][4];
    // calculate deviations by subtracting mean
    delta[i] = Y[i] - mu[i];
  }
}

model {
  // prior distributions
  // .. for the population-level parameters
  target += normal_lpdf(gamma | 0, 3); 
  // .. for the Cholesky factor
  target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
  // .. for the vector of standard deviations
  target += cauchy_lpdf(tau_Omega | 0, 2.5);

  // likelihood
  for (i in 1:N) {
    // subject-specific deviations
    target += multi_normal_lpdf(u[i] | rep_vector(0,4), Omega);
    
    for (t in 2:T) {
      // data given model parameters
      target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1] + alpha[i] * X[i][t], sqrt(psi[i]));
    }
  }
}
"

Exercise 13: Extend the data list datUrges (created in exercise 5) to include object X, containing the depression data, and call this new data list datUrges_dep.

datUrges_dep <- list(N = Nsubj,
                     T = Nobs,
                     Y = urges,
                     X = dep)

Exercise 14: Fit a DSEM including the depression covariate. Keep track of the population-level means and sds. Use seed = 2094. Report on the fourth RQ, that is, whether depression relates to urges. Also compare estimates of the other parameters to the ones you reported in exercise 8 and 9.

options(mc.cores = parallel::detectCores())
fitUrges_dep <- stan(model_code = mymodelcov,
                     data       = datUrges_dep,
                     pars       = c("gamma","tau_Omega"),
                     iter       = 1000,
                     seed       = 2094)

print(fitUrges_dep, pars=c("gamma","tau_Omega"), probs = c(0.025, 0.975), digits = 2)

## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##               mean se_mean   sd  2.5% 97.5% n_eff Rhat
## gamma[1]     -0.01    0.01 0.08 -0.16  0.15   252 1.00
## gamma[2]      0.19    0.00 0.02  0.16  0.23   737 1.00
## gamma[3]      0.05    0.00 0.05 -0.04  0.14   453 1.00
## gamma[4]      0.79    0.01 0.09  0.61  0.95   138 1.03
## tau_Omega[1]  0.77    0.00 0.06  0.66  0.91  2681 1.00
## tau_Omega[2]  0.14    0.00 0.02  0.12  0.18   928 1.01
## tau_Omega[3]  0.42    0.00 0.04  0.35  0.50  1861 1.00
## tau_Omega[4]  0.89    0.00 0.07  0.77  1.03  2925 1.00
## 
## Samples were drawn using NUTS(diag_e) at Thu May  8 10:27:15 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

The answer is yes, more concurrent depressive symptoms relate to more smoking urges, as indicated by a relatively large positive regression coefficient under gamma[4]. Also note that the estimates of the mean and autoregression (gamma[1] and gamma[2]) have barely changed. The residual variance (gamma[3]) dropped substantially, indicating residual variance is explained by including concurrent depression as a predictor.

Great! Let’s move on to more advanced DSEMs. Go back to the top to learn how to allow for asymmetric temporal dynamics and how to add predictors of model parameters.

Fitting a DYNASTI DSEM

We are also interested in an additional fifth RQ. That is, we wish to test whether temporal dynamics, in this case autoregressive effects, differ for above- and below-average levels of smoking urges. It may be that periods of high urges tend to persist over time whereas periods of low urges do not. Or even better, the opposite. To test these expectations, you may fit a DYnamics of ASymmetric TIme series (DYNASTI) version of the model in which you estimate separate autoregressions for above and below mean values.

Mathematically, the DYNASTI DSEM can be expressed as follows:

\[\begin{equation} \delta_{it} = \begin{cases} \phi_i^{\text{above}} \delta_{i,t-1} + \epsilon_{it} & \text{if } \delta_{i,t-1} > 0, \\ \phi_i^{\text{below}} \delta_{i,t-1} + \epsilon_{it} & \text{if } \delta_{i,t-1} \leq 0. \end{cases}\label{eq3}\tag{3} \end{equation}\]

The model thus determines whether the previous urge was above of below the mean (i.e., whether $\delta_{i,t-1}$ is above or below 0) and subsequently uses $\phi^{\text{above}}$ or $\phi^{\text{below}}$ to explain the urge on the current trial.

In the model, there are several adjustments that need to be done to allow for asymmetric temporal dynamics. We only changed the model, not the data we will fit the model on, so the data block remained the same. However, the DYNASTI DSEM consists of four (instead of three parameters): the population-level mean, two population-level autoregressive parameters (above and below the time series mean, see Equation $\ref{eq3}$), and the population-level residual variance. This additional population-level parameter (gamma) needs to be specified in the parameters block. That is, we change the number of gammas to 4. Similarly, we specify that we have subject-specific deviations (u’s) for an additional parameter. Another change, which we implemented in the transformed parameters block, is that we created an object phi_temp which we use to keep track of which autoregressive parameter (phi_above or phi_below) to use. We now also use this newly created object in the model specification in the model block. We have implemented these changes in mymodelasym.

mymodelasym = "
data {
    int<lower=1> N;               // number of subjects
    int<lower=1> T;               // number of observations 
    array[N] vector[T] Y;         // time series data
}

parameters {
    vector[4] gamma;                // population-level effects
    array[N] vector[4] u;           // subject-specific deviations
    
    // for covariance matrix of subject-specific deviations
    cholesky_factor_corr[4] L_Omega;  // Cholesky factor
    vector<lower=0>[4] tau_Omega;     // vector of standard deviations
}

transformed parameters {
    // construct covariance matrix from Cholesky factors and standard deviations
    corr_matrix[4] R_Omega; // correlation matrix
    R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
    // quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
    cov_matrix[4] Omega = quad_form_diag(R_Omega, tau_Omega);
    
    // subject-specific parameters
    vector[N] mu;               // mean
    vector[N] phi_above;          // autoregression (above mean)
    vector[N] phi_below;          // autoregression (below mean)
    vector[N] psi;              // residual variance
    vector[N] alpha;            // effect of covariate
    
    array[N] vector[T] delta;     // object for deviation from mean
    array[N] vector[T] phi_temp;      // object to keep track of which autoregression to use
    
    for (i in 1:N) {
      // to obtain subject-specific effects,
      // sum population-level effects and subjects-specific deviations
      mu[i] = gamma[1] + u[i][1];
      phi_above[i] = gamma[2] + u[i][2];
      phi_below[i] = gamma[3] + u[i][3];
      // note gamma[4] and u[i][4] are estimated on a log-scale
      // to assume normal distribution which simplifies estimation
      psi[i] = exp(gamma[4] + u[i][4]);
      // calculate deviation by subtracting mean
      delta[i] = Y[i] - mu[i];   
    
      // make object in which autoregression is based on deviation
      phi_temp[i][1] = 0;
      for (t in 2:T) {
        if(delta[i][t-1] > 0) // above mean
          phi_temp[i][t] = phi_above[i];
        else // below mean
          phi_temp[i][t] = phi_below[i];
      }
    }
}

model {
    // prior distributions
    // .. for the population-level parameters
    target += normal_lpdf(gamma | 0, 3);
    // .. for the Cholesky factor
    target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
    // .. for the vector of standard deviations
    target += cauchy_lpdf(tau_Omega | 0, 2.5);
    
    // likelihood
    for (i in 1:N) {
      // subject-specific deviations
      target += multi_normal_lpdf(u[i] | rep_vector(0,4), Omega);
      
      for (t in 2:T) {
        // data given model parameters
        target += normal_lpdf(delta[i][t] | phi_temp[i][t] * delta[i][t-1], sqrt(psi[i]));
      }
    }
}
"

You’re now ready to fit the DYNASTI DSEM model (as implemented in mymodelasym). You do this in the same way you fitted the simple DSEM model in the Fitting a simple DSEM module.

Exercise 15: Fit the four-parameter DYNASTI DSEM using the stan() function from the rstan package. The argument model_code should be set to mymodelasym, the object with the implemented DYNASTI DSEM. The data argument should contain the data list you created in exercise 5. In the pars argument you may specify (using a character string) which parameters to keep track of. For now, keep track of population-level effects (gamma). Run 1000 iterations and use seed 23432.

options(mc.cores = parallel::detectCores())
fitUrgesasym <- stan(model_code = mymodelasym,
                     data       = datUrges,                  
                     pars       = "gamma",                  
                     iter       = 1000,
                     seed       = 23432)

Exercise 16: Print the fit results for the population-level effects using the print() function. Report results on the fifth research question: whether temporal dynamics differ for above- and below-average levels of smoking urges.

print(fitUrgesasym, pars="gamma", probs = c(0.025, 0.975), digits = 2)

## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##          mean se_mean   sd  2.5% 97.5% n_eff Rhat
## gamma[1] 0.02    0.01 0.10 -0.20  0.21   173 1.03
## gamma[2] 0.18    0.00 0.03  0.12  0.24  1625 1.00
## gamma[3] 0.17    0.00 0.04  0.09  0.23   632 1.00
## gamma[4] 0.66    0.01 0.07  0.52  0.81   213 1.01
## 
## Samples were drawn using NUTS(diag_e) at Thu May  8 10:30:35 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

In the Fitting a simple DSEM module, we discussed how to interpret the output. If you are unsure, refer back to that section. All effects should have converged with R-hat values below 1.1 (the closer to 1 the better). In this case we are specifically interested in the the population-level autoregressive effects (gamma[2] and gamma[3]). If you did everything correctly, you will have found similar values for the above- and below-average autoregressive effects. The small positive values mean that more-than-average urges and less-than-average urges similarly tend to persist over time.

Assessing model fit

Some of you may want to assess model fit through fit indices or to visualize how well the model describes the observed data. As a model fit index, you may compute Bayes Factors. This can be done using the bridgesampling package. Example code is given below.

if(!require(bridgesampling)){install.packages("bridgesampling")}
require(bridgesampling)

# Compute marginal likelihoods
fitsym = bridge_sampler(fitUrges)
fitasym = bridge_sampler(fitUrgesasym)
# Compute Bayes Factor
bf(fitsym, fitasym)

There are two ways to get posterior predictives in order to visualize model fit. The first is to extract the parameter estimates obtained from fitting the DYNASTI DSEM to the behavioral data (as you have done before using the print() function) and subsequently use these estimates to simulate data. To accomplish this, you have to write simulation code in R yourself (or adjust our script on OSF!). The second is to let Stan generate posterior predictives for you. To accomplish this, you may add a generated quantities block to the Stan model code. For the DYNASTI DSEM, this would be

generated quantities {
  // obtain posterior predictives (to check model fit)
  array[N] vector[T] delta_pred;
  array[N] vector[T] Ypred;
  
  for (i in 1:N) {
    delta_pred[i][1] = normal_rng(0, sqrt(psi[i]));
    
    for (t in 2:T) {
      delta_pred[i][t] = normal_rng(phi_temp[i][t] * delta[i][t-1], sqrt(psi[i]));
      Ypred[i][t] = mu[i] + delta_pred[i][t];
    }
  }
}

in which you generate random values (hence _rng which stands for random number generator), in our case deviations from the mean, from a normal distribution based on the parameter estimates. By adding the subject-specific mean (mu[i]) to these deviations you get posterior predictives, which we call Ypred. As this involves generating values, it may take a while to run a model including this generated quantities block. Once finished you may extract the posterior predictives from the fit object, for example, by running post <- extract(fitUrgesasym) and then post$Ypred. Don’t forget to tell Stan to keep track of these predictives by adding Ypred to the pars argument.

Go back to the top for the final module on adding predictors of model parameters.

Predictors of DSEM parameters

A final extension discussed in this tutorial is how to include predictors of DSEM parameters. You may be interested in, for example, how depressed participants’ temporal dynamics differ from their healthy counterparts. Or, in case of a continuous predictor, whether autoregressive effects in- or decrease with age. You may first fit the model and then perform a t-test or regression analysis on the resulting parameter estimates. This two-step approach, however, is biased towards finding an effect (see Boehm et al., 2018). Therefore, it is a better idea to incorporate predictors within the model.

Let us investigate how job stress relates to mean smoking urges and to the autoregressive effect.

Exercise 17: Create a new data list called datUrgesJS that contains the number of subjects N, the number of observations T, the urges data Y, and a vector, called JS, that contains the job stress per subject.

datUrgesJS <- list(N  = Nsubj,
                   T  = Nobs,
                   Y  = urges,
                   JS = tapply(dat$js,dat$subject,unique))

Let us take the standard symmetric DSEM as starting point and extend this model with job stress as predictor of mean smoking urges and temporal dynamics. In the data block, we declared that we now have a vector JS that contains the job stress data. In the parameters block, we changed the number of population-level effects (gammas) as we now five of such effects: the mean, the autoregression, the residual variable, and the role of job stress in the mean and autoregression. The magic happens in the transformed parameters block where we added effects (gamma[4] and gamma[5]) of job stress (JS).

mymodelpred = "
data {
  int<lower=1> N;               // number of subjects
  int<lower=1> T;               // number of observations
  array[N] vector[T] Y;         // time series data
  array[N] real JS;             // job stress data
}

parameters {
  vector[5] gamma;                  // population-level effects
  array[N] vector[3] u;             // subject-specific deviations

  // for covariance matrix of subject-specific deviations
  cholesky_factor_corr[3] L_Omega;  // Cholesky factor
  vector<lower=0>[3] tau_Omega;     // vector of standard deviations
}

transformed parameters {
  // construct covariance matrix from Cholesky factors and standard deviations
  corr_matrix[3] R_Omega; // correlation matrix
  R_Omega = multiply_lower_tri_self_transpose(L_Omega); // R = L* L'
  // quad_form_diag: diag_matrix(tau) * R * diag_matrix(tau)
  cov_matrix[3] Omega = quad_form_diag(R_Omega, tau_Omega);
  
  // subject-specific parameters
  vector[N] mu;             // mean
  vector[N] phi;            // autoregression
  vector[N] psi;            // residual variance

  // object for deviation from mean
  array[N] vector[T] delta; 

  for (i in 1:N) {
    // to obtain subject-specific effects, 
    // sum population-level effects and subjects-specific deviations
    mu[i] = gamma[1] + gamma[4]*JS[i] + u[i][1];
    phi[i] = gamma[2] + gamma[5]*JS[i] + u[i][2];
    // note gamma[3] and u[i][3] are estimated on a log-scale
    // to assume normal distribution which simplifies estimation
    psi[i] = exp(gamma[3] + u[i][3]);
    // calculate deviations by subtracting mean
    delta[i] = Y[i] - mu[i];
  }
}

model {
  // prior distributions
  // .. for the population-level parameters
  target += normal_lpdf(gamma | 0, 3); 
  // .. for the Cholesky factor
  target += lkj_corr_cholesky_lpdf(L_Omega | 1.0);
  // .. for the vector of standard deviations
  target += cauchy_lpdf(tau_Omega | 0, 2.5);

  // likelihood
  for (i in 1:N) {
    // subject-specific deviations
    target += multi_normal_lpdf(u[i] | rep_vector(0,3), Omega);
    
    for (t in 2:T) {
      // data given model parameters
      target += normal_lpdf(delta[i][t] | phi[i] * delta[i][t-1], sqrt(psi[i]));
    }
  }
}
"

Exercise 18: Report on the expected role of job stress in mean smoking urges and autoregressive effects.

Exercise 19: Fit the model including predictors using the stan() function. Keep track of population-level effects (gammas), set the iter argument to 1000 iterations, and set the seed argument to 27. Report on the results.

options(mc.cores = parallel::detectCores())
fitUrges_pred <- stan(model_code = mymodelpred,
                      data       = datUrgesJS,
                      pars       = "gamma",
                      iter       = 1000,
                      seed       = 27)

print(fitUrges_pred, pars="gamma", probs = c(0.025, 0.975), digits = 2)

## Inference for Stan model: anon_model.
## 4 chains, each with iter=1000; warmup=500; thin=1; 
## post-warmup draws per chain=500, total post-warmup draws=2000.
## 
##          mean se_mean   sd  2.5% 97.5% n_eff Rhat
## gamma[1] 0.03       0 0.08 -0.13  0.18   424 1.00
## gamma[2] 0.19       0 0.02  0.15  0.22  2207 1.00
## gamma[3] 0.66       0 0.07  0.51  0.80   315 1.02
## gamma[4] 0.40       0 0.10  0.21  0.59   417 1.00
## gamma[5] 0.10       0 0.02  0.05  0.14  1119 1.00
## 
## Samples were drawn using NUTS(diag_e) at Thu May  8 10:32:29 2025.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at 
## convergence, Rhat=1).

If everything went right, you’ll see that gamma[4] is positive. This means that subjects that experience more job stress, also experience higher smoking urges. Similarly, gamma[5] is positive, suggesting that for subjects that experience more job stress, smoking urges tend to linger more. Does this correspond to your expectations as formulated in exercise 18?

This is the end of the tutorial. If you run into any problems or if you have questions, feel free to contact Jessica Schaaf on jessica.schaaf@radboudumc.nl. Suggestions on additional modules are also welcome. Hope you had a blast!

Go back to the top.