Estimating reaction kinetics with Stan and R
Introduction
In chemical kinetics, the rate of a solid-state reaction is generally modeled by a reaction rate law (or rate equation) of the form:
\[ \frac{d\alpha}{dt} \ = \ k \cdot f(\alpha)\ = \ Ae^{-E_a/RT} \cdot f(\alpha) \]
Here, \(\alpha(t) \in [0,1]\) is the reaction’s conversion fraction in time and \(f(\alpha)\) is the reaction model. The reaction rate constant \(k = Ae^{E_a / RT}\) is represented by an Arrhenius equation based on the absolute temperature \(T\), model parameters \(A\) (pre-exponential factor) and \(E_a\) (activation energy), and ideal gas constant \(R\). The reaction model \(f(\alpha)\) can take on many different forms based on mechanistic considerations or empirical observation. The following table provides a list of common reaction models for \(f(\alpha)\) taken from (Khawam and Flanagan 2006) describing various types of reaction kinetics:
Reaction model \(f(\alpha)\) | |
---|---|
Power-law (P2) | \(2 \alpha^{1/2}\) |
Power-law (P3) | \(3 \alpha^{2/3}\) |
Power-law (P4) | \(4 \alpha^{3/4}\) |
Avrami-Erofeyev (A2) | \(2(1 - \alpha)[-\ln(1 - \alpha)]^{1/2}\) |
Avrami-Erofeyev (A3) | \(3(1-\alpha)[-\ln(1-\alpha)]^{2/3}\) |
Avrami-Erofeyev (A4) | \(4(1-\alpha)[-\ln(1-\alpha)]^{3/4}\) |
Contracting area (R2) | \(2(1-\alpha)^{1/2}\) |
Contracting volume (R3) | \(3(1-\alpha)^{2/3}\) |
1-D Difussion (D1) | \(1/(2\alpha)\) |
2-D Diffusion (D2) | \(-[1/\ln(1-\alpha)]\) |
3-D Diffusion-Jander (D3) | \([3(1-\alpha)^{2/3}]/[2(1-(1-\alpha)^{1/3})]\) |
Ginstling-Brounshtein (D4) | \(3/[2((1-\alpha)^{-1/3}-1)]\) |
Zero-order (P1/R1/F0) | \(1\) |
First-order (F1) | \((1-\alpha)\) |
Second-order (F2) | \((1-\alpha)^2\) |
Third-order (F3) | \((1-\alpha)^3\) |
Exponential (E1) | \(\alpha\) |
Using deSolve::ode()
, we can integrate the kinetic trends for all listed reaction models \(f(\alpha)\) with a fixed rate constant \(k\):
library(deSolve)
alpha <- ode(
y = c(P2 = 0, P3 = 0, P4 = 0, A2 = 0, A3 = 0, A4 = 0, R2 = 0, R3 = 0, D1 = 0,
D2 = 0, D3 = 0, D4 = 0, F0 = 0, F1 = 0, F2 = 0, F3 = 0, E1 = 1) + 1E-6,
times = (1:100)/100,
func = function(t, y, parms) {
list(
dadt = parms["k"] * c(
P2 = 2 * sqrt(y[1]),
P3 = 3 * y[2]^(2/3),
P4 = 4 * y[3]^(3/4),
A2 = 2 * (1 - y[4]) * sqrt(-log(1 - y[4])),
A3 = 3 * (1 - y[5]) * (-log(1 - y[5]))^(2/3),
A4 = 4 * (1 - y[6]) * (-log(1 - y[6]))^(3/4),
R2 = 2 * sqrt(1 - y[7]),
R3 = 3 * (1 - y[8])^(2/3),
D1 = 1 / (2 * y[9]),
D2 = -1 / log(1 - y[10]),
D3 = 3 * (1 - y[11])^(2/3) / (2 * (1 - (1 - y[11])^(1/3))),
D4 = 3 / (2 * (1 - y[12])^(-1/3) - 1),
F0 = 1,
F1 = 1 - y[14],
F2 = (1 - y[15])^2,
F3 = (1 - y[16])^3,
E1 = y[17]
)
)
},
parms = c(k = 0.5)
)
head(alpha)
#> time P2 P3 P4 A2 A3
#> [1,] 0.01 0.000001 1.0000e-06 1.000000e-06 1.000000e-06 1.000000e-06
#> [2,] 0.02 0.000036 3.3750e-06 1.798892e-06 3.599936e-05 3.374995e-06
#> [3,] 0.03 0.000121 8.0000e-06 3.001402e-06 1.209927e-04 7.999970e-06
#> [4,] 0.04 0.000256 1.5625e-05 4.724899e-06 2.559672e-04 1.562488e-05
#> [5,] 0.05 0.000441 2.7000e-05 7.101751e-06 4.409028e-04 2.699964e-05
#> [6,] 0.06 0.000676 4.2875e-05 1.027933e-05 6.757716e-04 4.287409e-05
#> A4 R2 R3 D1 D2 D3
#> [1,] 1.000000e-06 0.000001000 0.00000100 0.00000100 0.00000100 0.0000010
#> [2,] 1.798891e-06 0.009975995 0.01492612 0.07071117 0.09831976 0.1974870
#> [3,] 3.001399e-06 0.019900990 0.02970198 0.10000072 0.13804818 0.2710020
#> [4,] 4.724890e-06 0.029775985 0.04432935 0.12247531 0.16813049 0.3242628
#> [5,] 7.101728e-06 0.039600980 0.05880896 0.14142228 0.19321710 0.3670950
#> [6,] 1.027928e-05 0.049375975 0.07314158 0.15811481 0.21510937 0.4032969
#> D4 F0 F1 F2 F3 E1
#> [1,] 0.00000100 0.000001 0.000001000 0.000001000 0.000001000 1.000001
#> [2,] 0.01492624 0.005001 0.004988516 0.004976114 0.004963795 1.005014
#> [3,] 0.02970296 0.010001 0.009951156 0.009901970 0.009853428 1.010051
#> [4,] 0.04433262 0.015001 0.014889046 0.014779296 0.014671678 1.015114
#> [5,] 0.05881665 0.020001 0.019802307 0.019608804 0.019420267 1.020202
#> [6,] 0.07315644 0.025001 0.024691063 0.024391196 0.024100856 1.025316
We plot all integrated kinetic trends in a single display by rescaling the trends \(\alpha(t)\) to the unit interval. This provides a quick overview of the different types of reaction kinetics represented by the reaction models \(f(\alpha)\).
To get a more detailed view of the available types of kinetic behavior, we integrate the same kinetic trends by also varying the rate constant \(k\). The reaction models are grouped by the same categories as described in (Khawam and Flanagan 2006, Table 1).
Note that most of the reaction models \(f(\alpha)\) shown here are different representations of the following general form:
\[ f(\alpha) \ \propto \ \alpha^m(1 - \alpha)^n(-\ln(1 - \alpha))^p \] By varying the constants \(m\), \(n\), and \(p\), many different types of reaction kinetics behavior can be generated and modeled.
Statistical modeling
The aim of this post is to demonstrate the use of Stan in the context of estimation and prediction of reaction kinetics based on observed data. In practice, the go-to approach to fit any of the above reaction models to observed data is likely least squares (or maximum likelihood) optimization, as this is the most convenient to implement in most mathematical programming languages. The least squares model fits allow us to quickly assess which reaction models provide reasonable fits to the data, but statistical inference (e.g. construction of confidence regions) for the estimated kinetic trends is not always straightforward due to the nonlinear nature (or missing closed-form solutions) of the integrated reaction models.
In the following demonstrating examples, we benefit from Stan’s ODE solvers to assess the complete posterior distributions of the reaction model parameters and integrated reaction trends. Bayesian inference directly provides us with valuable information on the uncertainty in the estimated kinetic trends. In addition, because of Stan’s flexible and efficient modeling interface, it is possible to build more complex statistical models involving numerically integrated reaction models and/or (highly) correlated model parameters.
Note: the Stan programs in this post are compiled with cmdstanr
based on CmdStan >= 2.24. Stan 2.24 introduces new ode_***
functions that avoid packing and unpacking schemes required by the now deprecated integrate_ode_***
functions in previous Stan versions1.
Example 1: Error propagation
Model scenario
Consider a fictional scenario where \(\alpha(t)\) is known to follow a first-order reaction model:
\[ \frac{d\alpha}{dt} \ = \ k \cdot f(\alpha) \ = \ Ae^{-E_a/RT} \cdot (1 - \alpha) \] Instead of observing \(\alpha(t)\) directly, we observe a set of normally distributed rate constants \(k_1, \ldots, k_S\) centered around \(k = A e^{-E_a/RT}\). That is,
\[ k_s | A, E_a, \sigma \overset{\text{iid}}{\sim} N(A e^{-E_a/RT}, \sigma^2) \quad \text{for} \ \ s = 1, \ldots, S \]
The aim of this exercise is to predict the integrated trend \(\alpha(t)\) based on the Arrhenius model fitted to the observations \(k_1, \ldots, k_S\).
Note: the integrated first-order reaction model has a simple closed-form solution,
\[ \alpha(t) \ = \ 1 - e^{-kt} \] For the sake of generality, we solve for \(\alpha(t)\) based on its differential form, as not all reaction models listed above have such straightforward closed-form solutions. Moreover, in terms of model sampling it should make no real difference for Stan2 whether \(\alpha(t)\) is represented by its integrated form or integrated from its differential form using one of Stan’s ODE solvers.
Simulated data
A set of observed i.i.d. rate constants \(k_1,\ldots, k_S\) is simulated with 8 observations at each of three absolute temperatures \(T = \{273.15, 293.15, 313.15\}\). The model parameters \(A\) and \(E_a\) in the Arrhenius equation are specified as \(A = 1 \times 10^{15}\ \text{min}^{-1}\) and \(E_a = 80\ \text{kJ/mol}\) identical to (Khawam and Flanagan 2006, Figure 2) and the noise variance is set to \(\sigma = 2\).
## Arrhenius model parameters
set.seed(1)
pars <- list(log_A = log(1E15), Ea = 80, sigma = 2)
temp_K <- rep(c(283.15, 293.15, 313.15), each = 8)
## simulated observations
k_obs <- with(
pars,
data.frame(
temp_K = temp_K,
k = rnorm(n = length(temp_K), mean = exp(log_A - Ea / (0.00831446 * temp_K)), sd = sigma)
)
)
Maximum likelihood
Firstly, given the observations \(k_1,\ldots,k_S\), we can compute maximum likelihood estimates \(\hat{A}\) and \(\hat{E}_a\) by fitting the Arrhenius model \(k_s \sim N(A e^{-E_a/RT}, \sigma^2)\) as a generalized linear model with Gaussian error distribution and a log-link function:
## 1) Arrhenius model fit
glmfit <- glm(k ~ 1 + I(-1 / (0.0083144 * temp_K)), family = gaussian(link = "log"), data = k_obs)
summary(glmfit)
#>
#> Call:
#> glm(formula = k ~ 1 + I(-1/(0.0083144 * temp_K)), family = gaussian(link = "log"),
#> data = k_obs)
#>
#> Deviance Residuals:
#> Min 1Q Median 3Q Max
#> -4.6222 -0.9366 0.5815 1.1630 3.1075
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) 34.221 1.605 21.32 3.50e-16 ***
#> I(-1/(0.0083144 * temp_K)) 79.143 4.174 18.96 4.06e-15 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for gaussian family taken to be 3.881107)
#>
#> Null deviance: 9536.172 on 23 degrees of freedom
#> Residual deviance: 85.384 on 22 degrees of freedom
#> AIC: 104.57
#>
#> Number of Fisher Scoring iterations: 5
Next, by evaluating the fitted GLM at a specific temperature (e.g. \(T = 298.15 K\)), we obtain a point estimate \(\hat{k}(T)\), which can be plugged into the reaction model to predict the reaction kinetics at the specified temperature:
## 2) evaluate k-hat at room temperature
khat <- predict(glmfit, newdata = list(temp_K = 298.15), type = "response")
khat
#> 1
#> 9.927213
## 3) integrate alpha(t) by plugging in k-hat
alpha <- ode(
y = c(alpha = 0),
times = (0:100)/100,
func = function(t, y, parms) list(khat * (1 - y))
)
head(alpha)
#> time alpha
#> [1,] 0.00 0.00000000
#> [2,] 0.01 0.09450281
#> [3,] 0.02 0.18007554
#> [4,] 0.03 0.25756128
#> [5,] 0.04 0.32772478
#> [6,] 0.05 0.39125752
The following plot displays three such individual trajectories evaluated at three different temperatures:
Here the integrated trajectories are evaluated as deterministic curves conditional on the point estimates \(\hat{k}(T)\). Propagating the estimation error in \(\hat{k}(T)\) to the solution of the ODE is a not a trivial task and probably requires the use of computationally intensive Monte Carlo simulation or other (re)sampling techniques, such as bootstrapping.
MCMC sampling
A sensible solution to the previous issue is to simulate the posterior distribution of the integrated trajectories by means of Bayesian MCMC sampling. In Stan, we can combine both the initial generalized linear model fit and the integration of the reaction model making use of Stan’s ODE solvers. Since the numerical integration of the ODE is not part of the model itself, it is performed in the generated quantities
block of the Stan program. Using the new ode_rk45
function available in Stan >= 2.24, the Stan model can be encoded as follows:
// ex1.stan (Stan >= 2.24)
functions {
// derivative function
vector deriv(real time, vector y, real k) {
vector[1] dydt = k * (1 - y);
return dydt;
}
}
data {
// rate estimation
int<lower = 1> n;
vector[n] k_obs; // observed rate constants (1/min)
vector[n] temp_K; // observed temperatures (K)
// ode integration
int<lower = 1> n1;
vector[1] y0;
real times[n1]; // evaluated time points (min)
real temp_K_new; // evaluated temperature (K)
}
parameters {
real lnA; // pre-exponential factor
real<lower = 0> Ea; // activation energy
real<lower = 1E-6> sigma; // error standard deviation
}
model {
// (naive) priors
lnA ~ normal(25, 10);
Ea ~ normal(50, 10);
sigma ~ normal(0, 1);
// likelihood
k_obs ~ normal(exp(lnA - Ea ./ (0.0083144 * temp_K)), sigma);
}
generated quantities {
vector[1] y_new[n1];
// predicted rate constant
real k_hat = exp(lnA - Ea / (0.0083144 * temp_K_new));
// integrated reaction model
y_new = ode_rk45(deriv, y0, 0.0, times, k_hat);
}
The GLM likelihood is specified in the model
block and we assign (weakly informative) normal priors for the model parameters: \(\ln(A)\) (natural logarithm of the pre-exponential factor), \(E_a\) (activation energy) and \(\sigma\) (error standard deviation). The generated quantities
block contains the instructions to integrate a new reaction trend at each sampling iteration based on the the posterior predicted rate constant \(\hat{k}(T)\) and the derivative function defined in the functions
block.
The Stan model compilation and sampling is executed with CmdStan (>=2.24) through the cmdstanr
-package. First, we compile the model with cmdstan_model()
, second we call the sample()
method of the generated R6-object (ex1_model
) to sample a total of 4000 posterior predicted reaction trends:
library(cmdstanr)
## 1) compile model
ex1_model <- cmdstan_model("ex1.stan")
## 2) sample reaction trends
ex1_fit <- ex1_model$sample(
seed = 1,
data = list(
n = nrow(k_obs),
k_obs = k_obs$k, ## observed rate constants
temp_K = k_obs$temp_K, ## observed temperatures (K)
n1 = 100L,
y0 = array(0),
times = (1:100)/100, ## evaluated times (min)
temp_K_new = 298.15 ## evaluated temperature (K)
),
chains = 4,
iter_warmup = 1000,
iter_sampling = 1000
)
## model fit summary
ex1_fit$cmdstan_summary()
#> Inference for Stan model: ex1_model
#> 4 chains: each with iter=(1000,1000,1000,1000); warmup=(0,0,0,0); thin=(1,1,1,1); 4000 iterations saved.
#>
#> Warmup took (1.1, 1.0, 0.71, 0.70) seconds, 3.6 seconds total
#> Sampling took (1.3, 1.7, 0.93, 0.91) seconds, 4.8 seconds total
#>
#> Mean MCSE StdDev 5% 50% 95% N_Eff N_Eff/s R_hat
#>
#> lp__ -30 4.1e-02 1.3e+00 -32 -29 -28 931 192 1.0
#> accept_stat__ 9.5e-01 1.4e-03 8.2e-02 7.8e-01 9.8e-01 1.0e+00 3.3e+03 6.7e+02 1.0e+00
#> stepsize__ 7.1e-03 3.3e-04 4.7e-04 6.4e-03 7.3e-03 7.7e-03 2.0e+00 4.1e-01 2.1e+13
#> treedepth__ 6.9e+00 4.2e-02 2.4e+00 2.0e+00 8.0e+00 9.0e+00 3.3e+03 6.7e+02 1.0e+00
#> n_leapfrog__ 3.0e+02 3.9e+00 2.2e+02 3.0e+00 2.6e+02 5.5e+02 3.3e+03 6.7e+02 1.0e+00
#> divergent__ 0.0e+00 nan 0.0e+00 0.0e+00 0.0e+00 0.0e+00 nan nan nan
#> energy__ 3.1e+01 5.9e-02 1.8e+00 2.9e+01 3.1e+01 3.4e+01 8.9e+02 1.8e+02 1.0e+00
#>
#> lnA 33 4.2e-02 1.3e+00 31 33 35 977 202 1.0
#> Ea 75 1.1e-01 3.4e+00 70 75 81 977 201 1.0
#> sigma 1.9 8.0e-03 2.7e-01 1.5 1.9 2.4 1120 231 1.0
#> y_new[1,1] 0.10 2.0e-04 6.3e-03 0.091 0.10 0.11 1013 209 1.0
#> y_new[2,1] 0.19 3.6e-04 1.1e-02 0.17 0.19 0.21 1012 209 1.0
#> y_new[3,1] 0.27 4.8e-04 1.5e-02 0.25 0.27 0.30 1012 209 1.0
#> y_new[4,1] 0.35 5.8e-04 1.8e-02 0.32 0.35 0.38 1012 209 1.0
....
Looking at CmdStan’s sampling diagnostics and summary results, no obvious sampling problems are observed3. Also, the posterior distributions of \(\log(A)\), \(E_a\) and \(\sigma\) all cover the target parameters reasonably well, using a total of \(n = 24\) observations to fit the Arrhenius regression model. Error propagation from the estimated rate constants to the reaction model integration is no longer an issue, as the reaction model ODE is solved for each sampled value of \(\hat{k}(T)\).
Given the sampling results, we can directly assess the prediction uncertainty present in the integrated reaction trends based on fitting the Arrhenius model to the observed rate constants. To illustrate, the following plot displays a (random) subset of 500 individual integrated reaction trends, as well as the posterior medians and 2.5% and 97.5% posterior quantiles pointwise in time across all reaction trends (resp. the black solid and dashed lines):
Example 2: Reaction model regression
Model scenario
In the previous example, the reaction model was not part of the regression model itself, i.e. we only evaluated the reaction model conditional on an estimated rate constant \(\hat{k}(T)\). In this second example, we consider fitting an integrated reaction model directly to observations of multiple replicated reaction trends.
More precisely, let \(\alpha_1(t), \ldots, \alpha_S(t)\) be a collection of replicated reaction processes, subject to the same experimental conditions, where each process is assumed to follow a reaction order model of the form:
\[ \frac{d\alpha_s}{dt} \ = \ k_s \cdot f(\alpha_s) \ = \ k_s \cdot (1 - \alpha_s)^m \]
The reaction order \(m\) is unknown, but is assumed to be the same for all processes \(\alpha_1(t), \ldots, \alpha_S(t)\). The rate constants \(k_1, \ldots, k_S\) are also unknown and are assumed to randomly vary across experiments. In particular, we impose a Gaussian (random effects) distribution for the replicated rate constants:
\[ k_1,\ldots,k_S\, |\, k, \sigma_u \ \overset{\text{iid}}{\sim} \ N(k, \sigma_u^2) \]
Note that modeling the rate constants according to an Arrhenius equation is not necessary in this example, as the temperature covariate is assumed to be fixed and identical for each replicated reaction process.
In addition to the between-replicate variation introduced by the rate constants \(k_1, \ldots, k_S\), we assume that there is some within-replicate (noise) variation present in the observed reaction process \(y_s(t)\). This noise variation is modeled according to a Beta distribution to ensure that the observed reaction process \(y_s(t)\) is always contained within the unit interval:
\[ y_s(t)\, | \, \alpha_s(t), \phi \ \sim \ \mathcal{Be}\left(\alpha_s(t) \cdot \phi, (1 - \alpha_s(t)) \cdot \phi\right), \quad \text{for}\ s = 1,\ldots, S \] The two shape parameters are parameterized in such a way that:
\[ \begin{aligned} \mathbb{E}[y_s(t)\, |\, \alpha_s(t)] & \ = \ \alpha_s(t) \\ \text{Var}(y_s(t)\, |\, \alpha_s(t)) & \ = \ \frac{\alpha_s(t) \cdot(1 - \alpha_s(t))}{\phi + 1} \end{aligned} \]
This means that, conditional on \(\alpha_s(t)\), \(y_s(t)\) is centered around \(\alpha_s(t)\), and the noise variance tends to zero whenever \(\alpha_s(t)\) is close to zero or one, with \(\phi\) serving as a scale parameter for the noise variance.
Combining the above expressions, for each replicate \(s = 1,\ldots, S\), the generative model for the observed reaction process \(y_s(t)\) with model parameters \(k\), \(m\), \(\sigma_u\) and \(\phi\) is given by:
\[ \begin{aligned} & y_s(t)\, |\, \alpha_s(t), \phi \ \sim \ \mathcal{Be}\left(\alpha_s(t) \cdot \phi, (1 - \alpha_s(t)) \cdot \phi\right), \\[3mm] & \begin{cases} k_s\, |\, k, \sigma_u \ \sim \ N(k, \sigma_u^2), \\[2mm] \frac{d \alpha_s}{dt} \, \Big|\, k_s,m \ =\ k_s \cdot (1 - \alpha_s(t))^m, \quad \text{with} \ \alpha_s(0) \ = \ 0 \\[3mm] \end{cases} \end{aligned} \]
Simulated data
Based on the generative model above, we generate \(S = 10\) replicated second-order reaction trends with model parameters \(k = 0.1\ \text{min}^{-1}\), \(m = 2\), \(\sigma_u = 0.05\ \text{min}^{-1}\) and \(\phi^{-1} = 0.0025\), where each replicate is evaluated at the same 10 time points.
## model parameters
set.seed(1)
S <- 10
times <- (0:10)*10
pars <- list(k = 0.1, m = 2, sigma_u = 0.05, phi_inv = 0.0025)
pars[["ks"]] <- with(pars, rnorm(S, mean = k, sd = sigma_u))
## helper function to integrate alpha_s(t)
alpha_s <- function(pars, times, s = 1) {
as.data.frame(
ode(
y = c(alpha = 0),
times = times,
func = function(t, y, ...) list(pars[["ks"]][s] * (1 - y)^pars[["m"]])
)
)
}
## generate y_s(t)
y_obs <- lapply(seq_len(S), function(s) {
y_s <- alpha_s(pars, times, s)
y_s[["y"]] <- with(pars, rbeta(nrow(y_s), shape1 = y_s[["alpha"]] / phi_inv, shape2 = (1 - y_s[["alpha"]]) / phi_inv))
y_s[["rep"]] <- sprintf("Replicate %02d", s)
y_s
})
head(y_obs[[1]])
#> time alpha y rep
#> 1 0 0.0000000 0.0000000 Replicate 01
#> 2 10 0.4071514 0.4540743 Replicate 01
#> 3 20 0.5786884 0.5677154 Replicate 01
#> 4 30 0.6732356 0.6897440 Replicate 01
#> 5 40 0.7331251 0.7023899 Replicate 01
#> 6 50 0.7744618 0.7755208 Replicate 01
Nonlinear least squares
For a single observed reaction process \(y_s(t)\), we can fit the integrated reaction model \(\alpha_s(t)\) by means of least squares optimization using the deSolve
and FME
-packages. First, given reaction model parameters \(k_s\), \(m\), we define a function that returns the vector of residuals \(y_s(t_i) - \alpha_s(t_i)\) for \(i = 1,\ldots,n\), where the numerical integration of \(\alpha_s(t)\) relies on deSolve::ode()
.
library(FME)
## single replicate residuals y_s(t) - alpha_s(t)
resid_s <- function(p, times, y_s) {
y_s[["y"]] - alpha_s(p, times)[["alpha"]]
}
resid_s(p = list(ks = pars[["ks"]][1], m = 2), times, y_obs[[1]])
#> [1] 0.0000000000 0.0469228977 -0.0109729722 0.0165083908 -0.0307352512
#> [6] 0.0010589923 0.0003623959 -0.0219780481 -0.0180789262 -0.0122449312
#> [11] -0.0191027469
Secondly, we pass the defined residual function to FME::modFit()
, which provides a convenient wrapper for various least squares optimization routines, defaulting to the Levenberg-Marquardt algorithm available through minpack.lm::nls.lm()
. Note that FME::modFit()
is not restricted to only least squares optimization for ODE models, but historically is aimed at this context.
## single replicate nls optimization
alpha_nls <- modFit(
f = resid_s, ## residual function
p = c(ks = 0, m = 1), ## parameter initializations
times = times, ## additional arguments to resid_s
y_s = y_obs[[1]]
)
summary(alpha_nls)
#>
#> Parameters:
#> Estimate Std. Error t value Pr(>|t|)
#> ks 0.085421 0.006826 12.51 5.38e-07 ***
#> m 2.299285 0.097091 23.68 2.04e-09 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.01563 on 9 degrees of freedom
#>
#> Parameter correlation:
#> ks m
#> ks 1.0000 0.9238
#> m 0.9238 1.0000
The following plot displays the least squares fits \(\hat{\alpha}_s(t)\) obtained with FME::modFit()
for each individual replicate \(s = 1,\ldots,S\):
MCMC sampling
Based on the individual least squares fits we can quickly assess whether the reaction-order models provide reasonable fits to the observed data, but obtaining least squares or maximum likelihood estimates based on the complete Beta regression model (including random effects) is less straightforward4. In Stan, on the other hand, the regression model is implemented as a standard random effects model by specifying the relevant contributions to the joint (log-)probability function as usual. The fact that the conditional expectations \(\mathbb{E}[ y_s(t)\, |\, \alpha_s(t)]\) are based on numerical (ODE) integration is not fundamentally different for the MCMC sampler from any other nonlinear (or linear) function specification for \(\alpha_s(t)\).
The following Stan program encodes the regression model using a non-centered parameterization for the random effects \(k_1,\ldots,k_S\). The reaction trends \(\alpha_s(t)\) are integrated using the ode_rk45
function analogous to the previous example. Both the random effects parameters and the smooth reaction trends are specified in the transformed parameters
block, but these could also have been specified in the model
block. The specification of the Beta likelihood for \(y_s(t)\, |\, \alpha_s(t), \phi\) in the model
block is similar to the model description above. Note that the somewhat contrived data format for y_s
and alpha_s
(an array of vectors) is used because this is the format returned by ode_rk45
.
// ex2.stan (Stan >= 2.24)
functions {
vector deriv(real time, vector y, vector ks, real m) {
int S = rows(ks);
vector[S] dydt = ks .* pow(1 - y, rep_vector(m, S));
return dydt;
}
}
data {
int<lower = 1> n; // # observations
int<lower = 1> S; // # replicates
real times[n]; // observed times
vector<lower = 0, upper = 1>[S] y_s[n]; // observed trends
}
transformed data {
vector[S] y0 = rep_vector(0.0, S);
}
parameters {
real<lower = 0> k; // mean rate constant
real<lower = 0, upper = 3> m; // reaction order
real<lower = 0> sigma_u; // replicate sd
real<lower = 1E-10> phi_inv; // residual sd scale
vector[S] u_s; // random effects
}
transformed parameters{
vector[S] ks;
vector[S] alpha_s[n];
ks = k + sigma_u * u_s;
alpha_s = ode_rk45(deriv, y0, 0, times, ks, m);
}
model {
// (naive) normal priors
k ~ normal(0, 1);
m ~ normal(1, 1);
sigma_u ~ normal(0, 1);
phi_inv ~ normal(0, 1);
// likelihood
u_s ~ std_normal();
for(s in 1:S)
y_s[][s] ~ beta(alpha_s[][s] / phi_inv, (1 - alpha_s[][s]) / phi_inv);
}
Stan model compilation and sampling is again executed with CmdStan (>=2.24) using the cmdstanr
-package. Note that we remove the observations at time zero in the input data
argument passed to Stan.
## 1) compile model
ex2_model <- cmdstan_model("ex2.stan")
## 2) sample posteriors
ex2_fit <- ex2_model$sample(
seed = 1,
data = list(
n = length(times) - 1,
S = S,
times = times[-1],
y_s = sapply(y_obs, function(y_s) y_s$y[-1])
),
chains = 4,
parallel_chains = 4,
iter_warmup = 1000,
iter_sampling = 1000
)
Assessing CmdStan’s summary diagnostics, all MCMC chains seem to have converged correctly. Also, the posterior distributions of the main model parameters are all well-behaved as seen from the pairs plot below.
## model fit summary
ex2_fit$summary()
#> # A tibble: 125 x 10
#> variable mean median sd mad q5 q95 rhat ess_bulk
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 2.48e+2 2.49e+2 3.59e+0 3.43e+0 2.42e+2 2.53e+2 1.00 692.
#> 2 k 1.14e-1 1.14e-1 1.68e-2 1.58e-2 8.66e-2 1.42e-1 1.00 754.
#> 3 m 2.08e+0 2.08e+0 3.35e-2 3.34e-2 2.02e+0 2.13e+0 1.00 1798.
#> 4 sigma_u 4.88e-2 4.63e-2 1.41e-2 1.20e-2 3.15e-2 7.43e-2 1.00 701.
#> 5 phi_inv 2.22e-3 2.18e-3 3.58e-4 3.40e-4 1.71e-3 2.87e-3 1.00 2222.
#> 6 u_s[1] -9.36e-1 -9.22e-1 4.08e-1 4.00e-1 -1.63e+0 -2.73e-1 1.00 730.
#> 7 u_s[2] 3.75e-2 3.96e-2 3.35e-1 3.27e-1 -5.39e-1 5.80e-1 1.00 791.
....
library(bayesplot)
## pairs plot of posteriors
color_scheme_set("red")
mcmc_pairs(ex2_fit$draws(c("k", "m", "sigma_u", "phi_inv")), off_diag_fun = "hex")
As expected, the posterior reaction trend fits exhibit a smaller overall mean squared error (MSE) compared to the individual least squares fits, as seen in the plot below by comparing the target reaction trends \(\alpha_s(t)\) to the posterior predicted median (pointwise in time) trends5. This is attributed to the following facts: (i) the Stan model corresponds exactly to the generative model of the observed processes \(y_s(t)\) for \(s = 1,\ldots,S\), and (ii) the random effects model allows to share information across replicates, which is not possible in the individual least squares fits. Moreover, statistical (posterior) inference for any quantity of interest should be straightforward using the sampled posterior parameters and posterior predicted reaction trends. To illustrate, for each replicate \(s\), the plot also includes 0.5% and 99.5% posterior quantiles (pointwise in time) for the replicate-specific posterior predicted reaction trends.
Session Info
sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 18.04.5 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] bayesplot_1.8.0 FME_1.3.6.1 coda_0.19-4 rootSolve_1.8.2.1
#> [5] cmdstanr_0.3.0 deSolve_1.28 data.table_1.13.6 ggplot2_3.3.3
#> [9] kableExtra_1.3.1 knitr_1.31
#>
#> loaded via a namespace (and not attached):
#> [1] Rcpp_1.0.6 lattice_0.20-41 ps_1.5.0 utf8_1.1.4
#> [5] assertthat_0.2.1 digest_0.6.27 R6_2.5.0 plyr_1.8.6
#> [9] ggridges_0.5.3 backports_1.2.1 evaluate_0.14 httr_1.4.2
#> [13] highr_0.8 blogdown_1.2 pillar_1.4.7 rlang_0.4.10
#> [17] rstudioapi_0.13 minqa_1.2.4 hexbin_1.28.2 jquerylib_0.1.3
#> [21] checkmate_2.0.0 rmarkdown_2.6.6 labeling_0.4.2 webshot_0.5.2
#> [25] stringr_1.4.0 munsell_0.5.0 compiler_4.0.2 xfun_0.22
#> [29] pkgconfig_2.0.3 htmltools_0.5.1.1 tidyselect_1.1.0 gridExtra_2.3
#> [33] tibble_3.0.6 bookdown_0.21 fansi_0.4.2 viridisLite_0.3.0
#> [37] crayon_1.4.1 dplyr_1.0.4 withr_2.4.1 MASS_7.3-52
#> [41] grid_4.0.2 jsonlite_1.7.2 gtable_0.3.0 lifecycle_0.2.0
#> [45] DBI_1.1.1 magrittr_2.0.1 posterior_0.1.3 scales_1.1.1
#> [49] cli_2.3.0 stringi_1.5.3 reshape2_1.4.4 farver_2.0.3
#> [53] xml2_1.3.2 bslib_0.2.4 ellipsis_0.3.1 generics_0.1.0
#> [57] vctrs_0.3.6 tools_4.0.2 glue_1.4.2 purrr_0.3.4
#> [61] processx_3.4.5 abind_1.4-5 yaml_2.2.1 colorspace_2.0-0
#> [65] rvest_0.3.6 minpack.lm_1.2-1 sass_0.3.1
References
Khawam, A., and D. R. Flanagan. 2006. “Solid-State Kinetic Models: Basics and Mathematical Fundamentals.” Journal of Physical Chemistry B 110(45): 17315–28. https://doi.org/10.1021/jp062746a.
For additional details, see https://mc-stan.org/users/documentation/case-studies/convert_odes.html↩︎
Except possibly in terms of computational efficiency. For efficiently solving a system of linear ODE’s, see e.g. https://mc-stan.org/docs/2_26/stan-users-guide/solving-a-system-of-linear-odes-using-a-matrix-exponential.html↩︎
The Arrhenius parameters \(A\) and \(E_a\) are highly correlated in their posteriors, but this does not cause any major sampling issues for Stan.↩︎
The
FME::modMCMC()
function allows for MCMC sampling using a number of custom Metropolis algorithms, but is aimed primarily at fitting (ODE) signal plus i.i.d. noise models, see e.g. https://cran.r-project.org/web/packages/FME/vignettes/FMEmcmc.pdf↩︎For simplicity, in
ex2.stan
, the predicted reaction trends are only evaluated at the observed time points. To evaluate trend predictions at different time points, an additionalgenerated quantities
block can be included in the Stan program analogous toex1.stan
.↩︎