The regular HMM formulation needs a key assumption to be applicable, namely the data need to be observed at regular, equidistant time-points such that the transition probabilties can be interpreted meaningfully w.r.t. a specific time unit. If this is not the case, the model used should acocunt for this by building on a mathematical formulation in continuous time. The obvious choice here is to retain most of the HMM model formulation, but replace the unobserved discrete-time Markov chain with a continuous-time Markov chain. However, here it is important to note that the so-called snapshot property needs to be fulfilled, i.e. the observed process at time \(t\) can only depend on the state at that time instant and not on the interval since the previous observation. For more details see Glennie et al. (2023).
As transition probabilities can only be specified w.r.t. to some time unit, but we have irregular observation times \(t_1, \dots, t_T\), such a chain is characterized by a so-called (infinitesimal) generator matrix
\[ Q = \begin{pmatrix} q_{11} & q_{12} & \cdots & q_{1N} \\ q_{21} & q_{22} & \cdots & q_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ q_{N1} & q_{N2} & \cdots & q_{NN} \\ \end{pmatrix}, \] where the diagonal entries are \(q_{ii} = - \sum_{j \neq i} q_{ij}\), \(q_{ij} \geq 0\) for \(i \neq j\). This matrix can be interpreted as the derivative of the transition probability matrix and completely describes the dynamics of the state process. For such a chain, the time-spent in a state \(i\) is exponentially distributed with rate \(-q_{ii}\) and conditional on leaving the state, the probability to transition to a state \(j \neq i\) is \(\omega_{ij} = q_{ij} / -q_{ii}\). For a more detailed introduction see Dobrow (2016) (pp. 265 f.). For observation times \(t_1\) and \(t_2\), we can then obtain the transition probability matrix between these points via the identity \[ \Gamma(t_1, t_2) = \exp(Q (t_2 - t_1)), \] where \(\exp()\) is the matrix expoential. This follows from the so-called Kolmogorov forward equations, but again, for more details see Dobrow (2016). We can easily see that such a model now accounts for irregular observation times.
# 2-state example
# generator matrix Q:
Q = matrix(c(-0.5,0.5,1,-1), nrow = 2, byrow = TRUE)
# state 1 has a smaller rate (dwell-time in state one ~ Exp(1)), i.e. it exhibits
# longer dwell times than state 3 with rate 3.
# parameters for the state-dependent (normal) distributions
mu = c(5, 20)
sigma = c(2, 5)
We simulate the continuous-time Markov chain by drawing the exponentially distributed state dwell-times. Within a stay, we can assume whatever structure we like for the observation times, as these are not explicitly modeled. Here we choose to generate them by a Poisson process with rate \(\lambda=1\), but this choice is arbitrary. For more details on Poisson point processes, see the MM(M)PP vignette.
set.seed(123)
k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points where the chain transitions
s[1] = sample(1:2, 1) # initial distribuion c(0.5, 0.5)
# exponentially distributed waiting times
trans_times[1] = rexp(1, -Q[s[1],s[1]])
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])
for(t in 2:k){
s[t] = c(1,2)[-s[t-1]] # for 2-states, always a state swith when transitioning
# exponentially distributed waiting times
trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
n_arrivals = rpois(1, trans_times[t]-trans_times[t-1])
obs_times = c(obs_times,
sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
Let’s visualize the simulated continuous-time HMM:
color = c("orange", "deepskyblue")
n = length(obs_times)
plot(obs_times[1:50], x[1:50], pch = 16, bty = "n", xlab = "observation times",
ylab = "x", ylim = c(-5,25))
segments(x0 = c(0,trans_times[1:48]), x1 = trans_times[1:49],
y0 = rep(-5,50), y1 = rep(-5,50), col = color[s[1:49]], lwd = 4)
legend("topright", lwd = 2, col = color,
legend = c("state 1", "state 2"), box.lwd = 0)
The likelhood of a continuous-time HMM for observations \(x_{t_1}, \dots, x_{t_T}\) at irregular time points \(t_1, \dots, t_T\) has the exact same structure as the regular HMM likelihood:
\[
L(\theta) = \delta^{(1)} \Gamma(t_1, t_2) P(x_{t_2}) \Gamma(t_2, t_3)
P(x_{t_3}) \dots \Gamma(t_{T-1}, t_T) P(x_{t_T}) 1^t,
\] where \(\delta^{(1)}\), \(P\) and \(1^t\) are as usual. Thus we can fit such
models using the standard implementation of the general forward
algorithm forward_g()
with time-varying transition
probability matrices.
mllk = function(theta.star, timediff, x, N=2){
mu = theta.star[1:N]
sigma = exp(theta.star[N+1:N])
Q = diag(N) # generator matrix
Q[!Q] = exp(theta.star[2*N+1:(N*(N-1))])
diag(Q) = 0
diag(Q) = -rowSums(Q)
delta = solve(t(Q+1), rep(1,N), tol = 1e-20) # stationary distribution of the
# continuous-time Markov chain
Qube = LaMa::tpm_cont(Q, timediff) # this computes exp(Q*timediff)
allprobs = matrix(1, nrow = length(x), ncol = N)
ind = which(!is.na(x))
for(j in 1:N){
allprobs[ind,j] = dnorm(x[ind], mu[j], sigma[j])
}
-LaMa::forward_g(delta, Qube, allprobs)
}
theta.star = c(5, 15, log(3), log(5), # mu and sigma
log(1), log(0.5)) # off-diagonals of Q
timediff = diff(obs_times)
t1 = Sys.time()
mod = nlm(mllk, theta.star, timediff=timediff, x=x, stepmax = 10)
# we often need the stepmax, as the matrix exponential can be numerically unstable
Sys.time()-t1
#> Time difference of 0.05602217 secs
The simulation is very similar but we now also have to draw which state to transition to, as explained in the beginning.
set.seed(123)
k = 200 # number of state switches
trans_times = s = rep(NA, k) # time points where the chain transitions
s[1] = sample(1:3, 1) # uniform initial distribuion
# exponentially distributed waiting times
trans_times[1] = rexp(1, -Q[s[1],s[1]])
n_arrivals = rpois(1, trans_times[1])
obs_times = sort(runif(n_arrivals, 0, trans_times[1]))
x = rnorm(n_arrivals, mu[s[1]], sigma[s[1]])
for(t in 2:k){
# off-diagonal elements of the s[t-1] row of Q divided by the diagonal element
# give the probabilites of the next state
s[t] = sample(c(1:3)[-s[t-1]], 1, prob = Q[s[t-1],-s[t-1]]/-Q[s[t-1],s[t-1]])
# exponentially distributed waiting times
trans_times[t] = trans_times[t-1] + rexp(1, -Q[s[t], s[t]])
n_arrivals = rpois(1, trans_times[t]-trans_times[t-1])
obs_times = c(obs_times,
sort(runif(n_arrivals, trans_times[t-1], trans_times[t])))
x = c(x, rnorm(n_arrivals, mu[s[t]], sigma[s[t]]))
}
N = 3
# mu
round(mod2$estimate[1:N],2)
#> [1] 4.90 15.45 29.10
# sigma
round(exp(mod2$estimate[N+1:N]),2)
#> [1] 1.80 2.58 5.06
Q = diag(N) # generator matrix
Q[!Q] = exp(mod2$estimate[2*N+1:(N*(N-1))])
diag(Q) = 0
diag(Q) = -rowSums(Q)
round(Q, 3)
#> [,1] [,2] [,3]
#> [1,] -0.888 0.565 0.323
#> [2,] 2.821 -3.469 0.647
#> [3,] 0.000 0.770 -0.770